Finnish syllabifier and compound segmenter
FinnSyll
FinnSyll is a Python library that syllabifies words according to Finnish syllabification principles. It is also equipped with a Finnish compound splitter. More details/docs to come.
Installation
$ pip install FinnSyll
Basic usage
First, instantiate a FinnSyll object.
>>> from finnsyll import FinnSyll
>>> f = FinnSyll()
To syllabify:
>>> f.syllabify('runoja')
['ru.no.ja']  # internal syllable boundaries are indicated with '.'
To segment compounds:
>>> f.split('sosiaalidemokraattien')
'sosiaali=demokraattien'  # internal word boundaries are indicated with '='
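Since both methods return plain delimited strings, ordinary string operations are enough to get at the individual units. The sketch below takes the outputs shown above as literal inputs instead of calling the library, so it runs without FinnSyll installed. The helper names `syllables` and `constituents` are hypothetical and not part of FinnSyll.

```python
def syllables(syllabified):
    """Split a syllabified string such as 'ru.no.ja' into its syllables."""
    return syllabified.split('.')


def constituents(segmented):
    """Split a segmented string such as 'sosiaali=demokraattien' into words."""
    return segmented.split('=')


# Literal strings copied from the examples above.
print(syllables('ru.no.ja'))                   # ['ru', 'no', 'ja']
print(constituents('sosiaali=demokraattien'))  # ['sosiaali', 'demokraattien']
```

In practice the arguments would presumably come from `f.syllabify(...)` and `f.split(...)`. Note that with the default settings `syllabify` returns a list, so each element would be passed to `syllables` separately.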
Optional arguments
The syllabifier can be customized with two optional arguments: variation and split_compounds.
variation
Instantiating a FinnSyll object with variation=True (default) will allow the syllabifier to return multiple syllabifications if variation is predicted. When variation=True, the syllabifier will return a list. Setting variation to False will cause the syllabifier to return a string containing the first predicted syllabification.
Variation:
>>> f = FinnSyll(variation=True)
>>> f.syllabify('runoja')
['ru.no.ja']
>>> f.syllabify('vapaus')
['va.pa.us', 'va.paus']
No variation:
>>> f = FinnSyll(variation=False)
>>> f.syllabify('runoja')
'ru.no.ja'
>>> f.syllabify('vapaus')
'va.pa.us'
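Because the return type depends on the variation setting (a list when True, a string when False), downstream code may want to normalize the result before using it. A minimal sketch, assuming Python 3 and using the outputs shown above as literal inputs. The helper name `as_list` is hypothetical and not part of FinnSyll.

```python
def as_list(result):
    """Return syllabifications as a list, given either a list or a single string."""
    if isinstance(result, str):
        # variation=False style result: a single syllabified string
        return [result]
    # variation=True style result: already a list of syllabified strings
    return list(result)


print(as_list(['va.pa.us', 'va.paus']))  # ['va.pa.us', 'va.paus']
print(as_list('va.pa.us'))               # ['va.pa.us']
```

With this wrapper, calling code can always iterate over the result without first checking how the FinnSyll object was instantiated.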
split_compounds
When a FinnSyll object is instantiated with split_compounds=True (default), the syllabifier first attempts to split the input into its constituent words before syllabifying it. This forces a syllable boundary between any constituent words it identifies. If split_compounds is set to False, the syllabifier skips this step.
Compound splitting:
>>> f = FinnSyll(split_compounds=True)
>>> f.syllabify('rahoituserien')  # rahoitus=erien
['ra.hoi.tus.e.ri.en']
No compound splitting:
>>> f = FinnSyll(split_compounds=False)
>>> f.syllabify('rahoituserien')
['ra.hoi.tu.se.ri.en']  # incorrect
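The difference between the two outputs can be checked mechanically: a syllabification respects the compound structure when every word boundary falls on a syllable boundary. The sketch below illustrates that check on the literal strings from the examples above. It is not how FinnSyll works internally, and the function names are hypothetical.

```python
def boundary_offsets(marked, sep):
    """Return the character offsets (in the unmarked word) where `sep` occurs."""
    offsets = set()
    position = 0
    # Every part except the last one is followed by a boundary marker.
    for part in marked.split(sep)[:-1]:
        position += len(part)
        offsets.add(position)
    return offsets


def respects_compound_boundaries(syllabified, segmented):
    """True if every '=' word boundary coincides with a '.' syllable boundary."""
    return boundary_offsets(segmented, '=') <= boundary_offsets(syllabified, '.')


print(respects_compound_boundaries('ra.hoi.tus.e.ri.en', 'rahoitus=erien'))  # True
print(respects_compound_boundaries('ra.hoi.tu.se.ri.en', 'rahoitus=erien'))  # False
```

The word boundary in rahoitus=erien sits after the eighth character. The first syllabification breaks there (tus.e), while the second breaks one character earlier (tu.se), which is why it is marked incorrect.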