pysyllables

An in-memory syllable count dictionary for North American English derived from the CMU Pronouncing Dictionary.

>>> from pysyllables import get_syllable_count
>>> get_syllable_count("fabulous")
3
>>> get_syllable_count("word-that-doesn't-exist")
None
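Since unknown words come back as None rather than raising, callers usually need to decide what to do with them. Below is a minimal sketch of one way to handle that; `total_syllables` is a hypothetical helper, not part of pysyllables, and it takes the lookup function as a parameter so it works with `get_syllable_count` or any stand-in that returns an int or None.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def total_syllables(
    words: Iterable[str],
    lookup: Callable[[str], Optional[int]],
) -> Tuple[int, List[str]]:
    """Sum syllable counts for known words and collect the unknown ones.

    `lookup` is assumed to behave like get_syllable_count: it returns an
    int for words it knows and None for words it does not.
    """
    total = 0
    unknown: List[str] = []
    for word in words:
        count = lookup(word)
        if count is None:
            # Keep track of misses instead of silently treating them as 0.
            unknown.append(word)
        else:
            total += count
    return total, unknown


# Possible usage with the library (requires pysyllables to be installed):
#   from pysyllables import get_syllable_count
#   total, unknown = total_syllables(["fabulous", "qwzx"], get_syllable_count)
```

Returning the unknown words alongside the total lets the caller choose a fallback (skip them, estimate them, or report them) rather than baking one policy into the helper.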

Where do these syllable counts come from?

From the CMU Pronouncing Dictionary, an open-source machine-readable pronunciation dictionary for North American English that contains over 134,000 words and their pronunciations.

Each vowel phoneme in a CMU Pronouncing Dictionary pronunciation carries a lexical stress marker (0, 1, or 2), so counting the stress markers in a word's pronunciation gives its number of syllables. This library ships with a file that maps each word to its syllable count: pysyllables/syllable-counts.txt.
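To illustrate the counting idea, here is a small sketch that assumes a pronunciation is given as a space-separated string of ARPAbet phonemes in the CMU Pronouncing Dictionary style, where vowels end in a stress digit. `count_syllables` is a hypothetical name used for illustration; it is not the code this library uses to build its counts.

```python
def count_syllables(pronunciation: str) -> int:
    """Count syllables in a CMUdict-style pronunciation string.

    Assumes phonemes are separated by whitespace and that only vowel
    phonemes end in a stress digit (0, 1, or 2), e.g. "AE1" or "AH0".
    """
    return sum(1 for phoneme in pronunciation.split() if phoneme[-1].isdigit())


# "fabulous" in CMUdict-style notation has three stressed/unstressed vowels.
print(count_syllables("F AE1 B Y AH0 L AH0 S"))  # 3
```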

How does one generate pysyllables/syllable-counts.txt?

scripts/download_syllable_counts.sh downloads the CMU Pronouncing Dictionary, computes each word's syllable count, and emits pysyllables/syllable-counts.txt.
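The actual work is done by the shell script above. Purely as an illustration of the "compute each word's syllable count" step, the following Python sketch parses dictionary lines into a word-to-count mapping. It assumes the classic CMUdict text layout (comment lines starting with `;;;`, one `WORD  PHONEMES...` entry per line, alternate pronunciations marked as `WORD(1)`), and it makes two arbitrary choices that may not match the script: words are lowercased, and only the first listed pronunciation is kept. `syllable_counts` is a hypothetical name.

```python
import re
from typing import Dict, Iterable

# Matches the "(1)", "(2)", ... suffix on alternate pronunciations.
_VARIANT_SUFFIX = re.compile(r"\(\d+\)$")


def syllable_counts(lines: Iterable[str]) -> Dict[str, int]:
    """Build a word -> syllable count mapping from CMUdict-style lines."""
    counts: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phonemes = line.split()
        word = _VARIANT_SUFFIX.sub("", word).lower()
        if word in counts:
            continue  # assumption: keep the first pronunciation only
        # Vowel phonemes end in a stress digit; each one is a syllable.
        counts[word] = sum(1 for p in phonemes if p[-1].isdigit())
    return counts


sample = [
    ";;; sample entries in CMUdict-style format",
    "FABULOUS  F AE1 B Y AH0 L AH0 S",
    "READ  R EH1 D",
    "READ(1)  R IY1 D",
]
print(syllable_counts(sample))  # {'fabulous': 3, 'read': 1}
```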

Should there be a new version of the CMU Pronouncing Dictionary, update the source in scripts/download_syllable_counts.sh.

Contributing

Questions & contributions welcome -- please open an issue or, even better, a PR!
