Splitting Czech words into syllables
Czech Syllable Splitter
Algorithm for splitting Czech words into syllables. It is inspired by David Lukeš's syllable-counting algorithm, which counts vowels.
Together with Klára Bendová, we put together rules that expand each vowel into a full syllable, and we empirically identified some common letter groups that should stay intact.
This is not a perfect solution, but it is a good starting point for Czech language processing. Measuring the algorithm's accuracy is still to be done, as is adding more rules where needed.
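The vowel-counting idea can be illustrated with a minimal sketch. This is not the package's implementation, nor David Lukeš's algorithm. The vowel set, the diphthong list, the syllabic `r`/`l` rule, and the name `count_syllables_naive` are all assumptions made here for illustration.

```python
import re

# Czech vowel letters, short and long (an assumption of this sketch).
VOWELS = "aeiouyáéíóúůýě"

# One syllable nucleus: the diphthongs "ou", "au", "eu", or a single vowel.
# Diphthongs come first in the alternation so they match as one unit.
NUCLEUS = re.compile(rf"ou|au|eu|[{VOWELS}]")

# Syllabic "r" or "l": preceded by a consonant and not followed by a vowel,
# as in "vlk" or "Petr". A word-initial "r"/"l" (e.g. "rty") is not counted.
SYLLABIC_LIQUID = re.compile(rf"(?<=[^{VOWELS}])[rl](?![{VOWELS}])")


def count_syllables_naive(word: str) -> int:
    """Estimate the syllable count as vowel nuclei plus syllabic r/l."""
    word = word.lower()
    return len(NUCLEUS.findall(word)) + len(SYLLABIC_LIQUID.findall(word))


print(count_syllables_naive("příliš"))
print(count_syllables_naive("vlk"))
```

A sketch like this miscounts words where two adjacent vowels belong to separate syllables, which is one reason hand-tuned rules are needed on top of plain vowel counting.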
Installation
pip install czech-syllable-splitter
or, using the Poetry package manager:
poetry add czech-syllable-splitter
Usage
from czech_syllable_splitter import count_syllables, split_to_syllables, split_to_characters
print(split_to_syllables("příliš"))
print(split_to_characters("přesný"))
print(count_syllables("přísný"))
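To apply a syllable counter to running text rather than single words, something like the following sketch could be used. The helper `syllable_histogram` is hypothetical and not part of the package; it accepts any function that maps a word to an integer, so it makes no claim about the package's internals.

```python
import re
from collections import Counter
from typing import Callable


def syllable_histogram(text: str, count_fn: Callable[[str], int]) -> Counter:
    """Return a mapping: syllable count -> number of words with that count."""
    # Runs of Unicode letters, so Czech diacritics stay inside words.
    words = re.findall(r"[^\W\d_]+", text)
    return Counter(count_fn(word) for word in words)


# Demonstration with a stand-in counter (word length), not real syllables.
print(syllable_histogram("ahoj světe, ahoj", len))
```

With the package installed, `count_fn` could be the imported `count_syllables`, assuming it accepts a single word and returns an integer, which this README does not state explicitly.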
Lint & Test
poetry run python3 -m pytest
poetry run mypy czech_syllable_splitter
poetry run pylint czech_syllable_splitter
Download files
Source Distribution
Built Distribution
File details
Details for the file czech_syllable_splitter-0.1.0.tar.gz.
File metadata
- Download URL: czech_syllable_splitter-0.1.0.tar.gz
- Upload date:
- Size: 3.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2bf0b0cc55417fa750a5fb51fbde1a98038bf611f9e9efd88ba349ed2bdba509
MD5 | 71ab4b5c61b895cc71e553c736db5b86
BLAKE2b-256 | 0bb023bec9b5d9aa36782610a247dc2dfd1eb8f837cd5cb229a24a9dbd9bf4e6
File details
Details for the file czech_syllable_splitter-0.1.0-py3-none-any.whl.
File metadata
- Download URL: czech_syllable_splitter-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | a787fc2d045c1df6d7b10196ed8d5f9db4dc4a7556fb5bd6305c23471cc793ec
MD5 | 092b645d8b32230132c0879e2d644c08
BLAKE2b-256 | 3723b2088e9eef5f94b5cfec8983403fca2b70320b7cb2f018970daf05428888
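A downloaded file can be checked against the SHA256 digests listed above using only the standard library. The sketch below is one way to do it; the helper name `sha256_matches` and the chunk size are choices made here, not part of the package.

```python
import hashlib
from pathlib import Path


def sha256_matches(path: Path, expected_hex: str, chunk_size: int = 65536) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For example, `sha256_matches(Path("czech_syllable_splitter-0.1.0.tar.gz"), "<digest from the table>")` should return `True` for an unmodified download of the source distribution.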