A Turkish syllable splitter implemented in C with Python bindings

Turkish Syllable Splitter

turkish-syllable is a library for syllabifying Turkish text, written in C and exposed to Python through bindings. It is fast, splits words according to Turkish syllabification rules, and can optionally include punctuation in its output.

Features

  • Turkish Syllabification: Splits words according to the syllabification rules of Turkish (for example, “merhaba” → ['mer', 'ha', 'ba']).
  • Punctuation Support: Optionally includes punctuation marks and spaces in the syllable list (with_punctuation parameter).
  • Fast Performance: The C-based algorithm stays fast even on large texts.
  • Platform Compatibility: Works on Linux-based systems (manylinux compatible).
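
To make the syllabification rule concrete, here is a minimal pure-Python sketch of the textbook Turkish rule: every syllable contains exactly one vowel, and a single consonant in front of a vowel starts the new syllable. The function name split_word and the VOWELS set are hypothetical and exist only for this illustration. This is not the package's C implementation, so its output may differ from the package on some words, and it does not handle loanword exceptions.

```python
# Turkish vowels in both cases, plus the circumflex forms used in some words.
# Listing both cases avoids str.lower(), which turns "İ" into two code points.
VOWELS = set("aeıioöuüâîûAEIİOÖUÜÂÎÛ")


def split_word(word):
    """Split a single word into syllables using the textbook rule (illustrative only)."""
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    # A word with zero or one vowel forms a single syllable.
    if len(vowel_positions) < 2:
        return [word] if word else []

    syllables = []
    start = 0
    for prev, nxt in zip(vowel_positions, vowel_positions[1:]):
        # Adjacent vowels split between them; otherwise the consonant
        # directly before the next vowel opens the new syllable.
        boundary = nxt if nxt - prev == 1 else nxt - 1
        syllables.append(word[start:boundary])
        start = boundary
    syllables.append(word[start:])
    return syllables


print(split_word("merhaba"))  # ['mer', 'ha', 'ba']
print(split_word("saat"))     # ['sa', 'at']
```

The sketch only looks at vowel positions, which is why it needs no dictionary or lookup table.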

Installation

You can install it via PyPI:

pip install turkish-syllable

Sample Usage

Using from Python:

from turkish_syllable import syllabify

# with punctuation
result = syllabify("Merhaba, dünya!") # default value of with_punctuation is True
print(result)
# output: ['Mer', 'ha', 'ba', ',', ' ', 'dü', 'nya', '!']

# without punctuation
result = syllabify("Merhaba, dünya!", with_punctuation=False)
print(result)
# output: ['Mer', 'ha', 'ba', 'dü', 'nya']
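
When punctuation is included, the flat list mixes syllables with separators, so a common follow-up step is to regroup the syllables word by word. The sketch below works on the sample list shown above rather than calling the library, and it assumes that every non-alphabetic token (space or punctuation mark) ends a word. The helper name group_by_word is hypothetical.

```python
# Syllable list copied from the sample output with punctuation enabled.
syllables = ['Mer', 'ha', 'ba', ',', ' ', 'dü', 'nya', '!']


def group_by_word(tokens):
    """Group syllable tokens into words; non-alphabetic tokens act as separators."""
    words, current = [], []
    for tok in tokens:
        if tok.isalpha():
            current.append(tok)
        elif current:
            # A space or punctuation mark closes the word collected so far.
            words.append(current)
            current = []
    if current:
        words.append(current)
    return words


words = group_by_word(syllables)
print(words)                         # [['Mer', 'ha', 'ba'], ['dü', 'nya']]
print(["-".join(w) for w in words])  # ['Mer-ha-ba', 'dü-nya']
```

Because an apostrophe also counts as a separator here, suffixed proper nouns such as "Ankara'ya" would be split into two groups; adjust the test in the loop if that is not what you want.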

Using from the command line:

# with punctuation (default)
python -m turkish_syllable -i input.txt -o output.txt -p
# or enter the text directly:
python -m turkish_syllable -p
# sample input: "Merhaba, dünya!"
# output: Mer ha ba ,   dü nya !

# without punctuation
python -m turkish_syllable -i input.txt -o output.txt --no-punctuation
# or:
python -m turkish_syllable --no-punctuation
# sample input: "Merhaba, dünya!"
# output: Mer ha ba dü nya
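
For readers curious how options like these fit together, the following is a hypothetical reconstruction of such a command-line interface with argparse. It is not the package's actual code: only the flag spellings (-i, -o, -p, --no-punctuation) come from the examples above, while the destination names input_path, output_path, and with_punctuation are assumptions made for this sketch.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags shown above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="turkish_syllable")
    # Assumed behaviour: when -i or -o is omitted, the tool would fall back
    # to standard input or standard output.
    parser.add_argument("-i", dest="input_path", default=None,
                        help="input text file")
    parser.add_argument("-o", dest="output_path", default=None,
                        help="output text file")
    # The two punctuation flags write to the same destination and exclude each other.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-p", dest="with_punctuation", action="store_true",
                       default=True, help="keep punctuation (default)")
    group.add_argument("--no-punctuation", dest="with_punctuation",
                       action="store_false", default=True,
                       help="drop punctuation and spaces")
    return parser


args = build_parser().parse_args(["-i", "input.txt", "--no-punctuation"])
print(args.input_path, args.with_punctuation)  # input.txt False
```

Mapping both flags onto one boolean keeps the default (punctuation included) in a single place.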

Technical Details

  • Language: The algorithm is written in C and bound to Python with ctypes.
  • Syllabification Algorithm: Splits words at the vowel and consonant boundaries defined by Turkish syllabification rules, with optimized handling of special cases (for example, words with 3 or 4 letters).
  • Dependencies: No extra Python dependencies are required; only the standard library is used.
  • File Structure:
    • syllable.c: C source code containing the syllabification logic.
    • libsyllable.so: Compiled shared library.
    • csyllable_en.py: Python wrapper.
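
The general ctypes pattern behind a wrapper like this can be sketched as follows. The functions exported by libsyllable.so are not documented here, so the example uses strlen from the C standard library as a stand-in; the helper name c_strlen is hypothetical. It also shows a detail that matters for Turkish text: C code receives UTF-8 bytes, not characters.

```python
import ctypes
import ctypes.util

# Load the C standard library as a stand-in for libsyllable.so.
# If find_library returns None, CDLL(None) still exposes libc symbols on Linux.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts arguments and results correctly:
# size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the byte length of text as seen by C after UTF-8 encoding."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("dünya"))  # 6, because "ü" occupies two bytes in UTF-8
```

A wrapper around a syllabification routine would follow the same steps: load the shared library, declare argtypes and restype, encode the input string, and decode the result.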

Requirements

  • Python 3.6 or higher
  • Linux operating system (manylinux-compatible build)

License

This project is distributed under the MIT License.

Contribution

If you want to contribute:

  1. Fork the repository on GitHub.
  2. Make your changes and open a pull request.

Contact

For questions or suggestions: ahmetozdemiir.ao@gmail.com

Version History

  • 0.1.1: Added with_punctuation parameter, shortened function name to syllabify.
  • 0.1.0: Initial release.

Download files

Source Distribution

turkish_syllable-0.1.1.tar.gz (12.4 kB)

Built Distribution

turkish_syllable-0.1.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.whl (21.1 kB)

Built for: CPython 3.10, manylinux (glibc 2.5+), x86-64

File details

Details for the file turkish_syllable-0.1.1.tar.gz.

File metadata

  • Download URL: turkish_syllable-0.1.1.tar.gz
  • Size: 12.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for turkish_syllable-0.1.1.tar.gz
Algorithm Hash digest
SHA256 47620735fc63f8c4b0008ead79eafc7e161baf8cddfc2a5d958ca0522ebf7c0e
MD5 31c8f2a4391a69ce5c701b29f7f0c130
BLAKE2b-256 86ef392e84bde8443e1e559abf97f0c6b590a7997cd4575b15838f13954e39ac

File details

Details for the file turkish_syllable-0.1.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.whl.

File hashes

Hashes for turkish_syllable-0.1.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6f358d2f1222ba991acaf9274ade8e1abf7c520e80ae68a263e7218b0bc98658
MD5 b97f610f61edb94d1e9bbe2f44dd39f6
BLAKE2b-256 eb6c8fe2629a3d58cd36c0ddf6500dc24a16590919da0fecbaa34be76e796241
