English word syllabifier and extended syllable analysis tool
# English Syllabifier (eng_syl)

This program implements a sequence-labelling bidirectional LSTM to identify syllable boundaries in English words. The model was trained on data from the WebCelex English wordform corpus.
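To illustrate what sequence labelling means here, the sketch below turns one label per character into a hyphenated word. The label scheme (1 marks the last character of a syllable, 0 everything else) and the helper name `apply_boundary_labels` are assumptions made for illustration. The encoding the trained model actually uses is not documented here and may differ.

```python
from typing import Sequence


def apply_boundary_labels(word: str, labels: Sequence[int]) -> str:
    """Insert a hyphen after every character whose label is 1.

    Hypothetical helper: the label scheme is an assumption, not
    necessarily the one used inside eng_syl.
    """
    if len(word) != len(labels):
        raise ValueError("need exactly one label per character")
    pieces = []
    for i, (char, label) in enumerate(zip(word, labels)):
        pieces.append(char)
        # Never emit a trailing hyphen after the final character.
        if label == 1 and i < len(word) - 1:
            pieces.append("-")
    return "".join(pieces)


# Hand-written labels: the 1 on "m" closes the first syllable.
print(apply_boundary_labels("chomsky", [0, 0, 0, 1, 0, 0, 0]))  # chom-sky
```

Framing the task this way means a tagger only has to make one decision per character, which is the kind of problem a bidirectional LSTM is commonly used for.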
Use the `syllabify()` function from the `Syllabel` class to syllabify your words:
    >>> from eng_syl.syllabify import Syllabel
    >>> syllabler = Syllabel()
    >>> syllabler.syllabify("chomsky")
    'chom-sky'
`syllabify()` parameters
- text: string - English text to be syllabified. Input should contain only alphabetic characters.
`syllabify()` returns the given word with hyphens inserted at syllable boundaries.
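Because the input must be purely alphabetic and the result is a hyphen-delimited string, a small amount of glue code can be useful around the call. The following is a minimal sketch: `is_valid_input` and `split_syllables` are hypothetical helpers, not part of eng_syl, and the string `'chom-sky'` is the output shown in the example above rather than the result of a live call.

```python
from typing import List


def is_valid_input(text: str) -> bool:
    """Return True if text contains only alphabetic characters.

    str.isalpha() is False for empty strings and for anything with
    digits, spaces, or punctuation. It also accepts non-ASCII letters;
    whether the model handles those is not stated in this document.
    """
    return text.isalpha()


def split_syllables(hyphenated: str) -> List[str]:
    """Split a hyphenated result such as 'chom-sky' into syllables."""
    return hyphenated.split("-")


print(is_valid_input("chomsky"))      # True
print(is_valid_input("noam chomsky")) # False, contains a space
print(split_syllables("chom-sky"))    # ['chom', 'sky']
```

Checking the input first keeps multi-word or punctuated text away from the model, which is documented to expect alphabetic input only.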
## Onceler (Onset, Nucleus, Coda Segmenter)
The `onc_split()` function from the `Onceler` class splits single syllables into their constituent Onset, Nucleus, and Coda components.
    >>> from eng_syl.onceler import Onceler
    >>> lorax = Onceler()
    >>> print(lorax.onc_split("sloan"))
    sl-oa-n
`onc_split()` parameters
- text: string - English single-syllable word or component to be segmented into Onset, Nucleus, and Coda. Input should contain only alphabetic characters.
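The segmented string can be unpacked into named fields for downstream analysis. This is a minimal sketch that assumes the output always has three hyphen-separated parts, as in the `'sl-oa-n'` example. How `onc_split()` represents a syllable with no onset or no coda is not documented here, so the helper raises an error instead of guessing. `ONC` and `parse_onc` are hypothetical names, not part of eng_syl.

```python
from typing import NamedTuple


class ONC(NamedTuple):
    """Onset, nucleus, and coda of a single syllable."""
    onset: str
    nucleus: str
    coda: str


def parse_onc(segmented: str) -> ONC:
    """Parse a string like 'sl-oa-n' into its three components.

    Assumes exactly three hyphen-separated parts; any other shape
    raises ValueError.
    """
    parts = segmented.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected 3 parts, got {len(parts)}: {segmented!r}")
    return ONC(*parts)


result = parse_onc("sl-oa-n")
print(result.onset, result.nucleus, result.coda)  # sl oa n
```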
Source Distribution
File details
Details for the file eng-syl-2.0.1.tar.gz.
File metadata
- Download URL: eng-syl-2.0.1.tar.gz
- Upload date:
- Size: 91.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | f182aff9e89749349c63f35934384b2d2dd77574d66272def3fad7447f8a05d4
MD5 | 82458e006627f26893d32fb8cc54e3de
BLAKE2b-256 | 0d38c74d7e0c58893c380d7788f6aa4fc8d1456abe95348cfd6396b1caa81189