Finnish syllabifier and compound segmenter

## FinnSyll

FinnSyll is a Python library that syllabifies words according to Finnish syllabification principles. It is also equipped with a Finnish compound splitter. More details/docs to come.

### Installation

`$ pip install FinnSyll`

### Basic usage

First, instantiate a `FinnSyll` object.

    >>> from finnsyll import FinnSyll
    >>> f = FinnSyll()

To syllabify:

    >>> f.syllabify('runoja')
    ['ru.no.ja']  # internal syllable boundaries are indicated with '.'

To segment compounds:

    >>> f.split('sosiaalidemokraattien')
    'sosiaali=demokraattien'  # internal word boundaries are indicated with '='
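Since both methods return plain strings with boundary markers, downstream code usually needs to turn them back into pieces. The following is a minimal sketch of that step. The helper names `syllables` and `constituents` are hypothetical and not part of FinnSyll; they only assume the `.` and `=` markers shown above.

```python
# Hypothetical helpers (not part of FinnSyll) for unpacking its string output.

def syllables(syllabified):
    """Split a dotted syllabification such as 'ru.no.ja' into syllables."""
    return syllabified.split('.')


def constituents(segmented):
    """Split a segmented compound such as 'a=b' into its constituent words."""
    return segmented.split('=')


if __name__ == '__main__':
    # The input strings are copied from the examples above.
    print(syllables('ru.no.ja'))
    print(len(syllables('ru.no.ja')))
    print(constituents('sosiaali=demokraattien'))
```

With the library installed, the same helpers could presumably be applied to the values returned by `f.syllabify(...)` and `f.split(...)`.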

### Optional arguments

The syllabifier can be customized along two different parameters: variation and compound splitting.

#### variation

Instantiating a `FinnSyll` object with `variation=True` (default) will allow the syllabifier to return multiple syllabifications if variation is predicted. When `variation=True`, the syllabifier will return a list. Setting `variation` to `False` will cause the syllabifier to return a string containing the first predicted syllabification.

Variation:

    >>> f = FinnSyll(variation=True)
    >>> f.syllabify('runoja')
    ['ru.no.ja']
    >>> f.syllabify('vapaus')
    ['va.pa.us', 'va.paus']

No variation:

    >>> f = FinnSyll(variation=False)
    >>> f.syllabify('runoja')
    'ru.no.ja'
    >>> f.syllabify('vapaus')
    'va.pa.us'
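Because the return type depends on `variation` (a list when `True`, a string when `False`), code that must work with either setting may want to normalize the result first. A minimal sketch, assuming Python 3 and a hypothetical helper named `as_list` that is not part of FinnSyll:

```python
# Hypothetical helper (not part of FinnSyll): normalize the return value of
# syllabify() so callers can always iterate over a list of syllabifications.

def as_list(result):
    """Wrap a single syllabification string in a list; copy an existing list."""
    if isinstance(result, str):
        return [result]
    return list(result)


if __name__ == '__main__':
    # Values copied from the examples above.
    print(as_list('va.pa.us'))               # shape returned with variation=False
    print(as_list(['va.pa.us', 'va.paus']))  # shape returned with variation=True
```

Taking the first element of the normalized list then yields the first predicted syllabification under either setting.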

#### split_compounds

When instantiating a `FinnSyll` object with `split_compounds=True` (default), the syllabifier will first attempt to split the input into constituent words before syllabifying it. This forces the syllabifier to insert a syllable boundary in between identified constituent words. The syllabifier will skip this step if `split_compounds` is set to `False`.

Compound splitting:

    >>> f = FinnSyll(split_compounds=True)
    >>> f.syllabify('rahoituserien')  # rahoitus=erien
    ['ra.hoi.tus.e.ri.en']

No compound splitting:

    >>> f = FinnSyll(split_compounds=False)
    >>> f.syllabify('rahoituserien')
    ['ra.hoi.tu.se.ri.en']  # incorrect
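The difference between the two outputs is that only the first places a syllable boundary at the word boundary in `rahoitus=erien`. One way to check this property is sketched below. The functions `boundary_offsets` and `respects_compound_boundaries` are hypothetical illustrations, not part of FinnSyll, and say nothing about how the library itself enforces the boundary.

```python
# Hypothetical check (not part of FinnSyll): does a syllabification place a
# syllable boundary at every compound boundary?

def boundary_offsets(marked, sep):
    """Return the offsets, counted in the unmarked word, where `sep` occurs."""
    offsets = set()
    position = 0
    for ch in marked:
        if ch == sep:
            offsets.add(position)
        else:
            position += 1
    return offsets


def respects_compound_boundaries(syllabified, segmented):
    """True if every '=' boundary coincides with a '.' boundary."""
    word_breaks = boundary_offsets(segmented, '=')
    syllable_breaks = boundary_offsets(syllabified, '.')
    return word_breaks <= syllable_breaks


if __name__ == '__main__':
    # Strings copied from the examples above.
    print(respects_compound_boundaries('ra.hoi.tus.e.ri.en', 'rahoitus=erien'))
    print(respects_compound_boundaries('ra.hoi.tu.se.ri.en', 'rahoitus=erien'))
```

For the two syllabifications above, the check succeeds for `ra.hoi.tus.e.ri.en` and fails for `ra.hoi.tu.se.ri.en`, whose third boundary falls one character before the end of `rahoitus`.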

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

FinnSyll-1.0.0-py2.py3-none-any.whl (831.0 kB view details)

Uploaded for: Python 2, Python 3

Hashes for FinnSyll-1.0.0-py2.py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `eb2e99f0d6202dd97651dc9a89ab57996cd04dcd260f0021f26a231abc940ac4` |
| MD5 | `400151c94fb6219e060b98b156d6573c` |
| BLAKE2b-256 | `3d1d8846f9261181de7e3445ca4d344a25ddc0e3e7fc1c790d51bf6df1cc1578` |
