A simple IPA tokeniser, as simple as in:
>>> from ipatok import tokenise
>>> tokenise('ˈtiːt͡ʃə')
['t', 'iː', 't͡ʃ', 'ə']
>>> tokenise('ʃːjeq͡χːʼjer')
['ʃː', 'j', 'e', 'q͡χːʼ', 'j', 'e', 'r']
tokenise(string, strict=False, replace=False, diphtongs=False, merge=None) takes an IPA string and returns a list of tokens. A token usually consists of a single letter together with its accompanying diacritics. If two letters are connected by a tie bar, they are also considered a single token. Except for length markers, suprasegmentals are excluded from the output. Whitespace is also ignored. The function accepts the following keyword arguments:
strict: if set to True, the function ensures that string complies with the IPA spec (the 2015 revision); a ValueError is raised if it does not. If set to False (the default), the role of non-IPA characters is guessed based on their Unicode category.
replace: if set to True, the function replaces some common substitutes with their IPA-compliant counterparts, e.g. g → ɡ, ɫ → l̴, ʦ → t͡s. Refer to ipatok/data/replacements.tsv for a full list. If both strict and replace are set to True, replacing is done before checking for spec compliance.
diphtongs: if set to True, the function groups together non-syllabic vowels with their syllabic neighbours (e.g. aɪ̯ would form a single token). If set to False (the default), vowels are not tokenised together unless there is a connecting tie bar (e.g. a͡ɪ).
merge: expects a str, str → bool function to be applied to each pair of consecutive tokens; those for which the output is True are merged together. You can use this to, e.g., plug in your own diphthong detection algorithm:
>>> tokenise(string, diphtongs=False, merge=custom_func)
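As a rough illustration of what such a function might look like, here is a minimal sketch of a str, str → bool predicate that treats a vowel followed or preceded by a non-syllabic vowel as a diphthong. The names VOWELS, is_vowel, merge_diphthongs and apply_merge are hypothetical and not part of ipatok; apply_merge is only a stand-in that assumes the merge function is applied left to right over consecutive tokens, so the example can be run without the package installed.

```python
# Hypothetical helper names; none of these are part of ipatok itself.
NON_SYLLABIC = '\u032f'  # COMBINING INVERTED BREVE BELOW, the IPA non-syllabic mark

# A deliberately incomplete vowel inventory, for illustration only.
VOWELS = set('aeiouyɑɐɒæɛɜəɪɨɔøœʊʉɯʌɤɵɘɞɶ')


def is_vowel(token):
    """Return True if the token's first character is a known vowel letter."""
    return bool(token) and token[0] in VOWELS


def merge_diphthongs(prev, curr):
    """Merge two adjacent vowel tokens if either one is marked non-syllabic."""
    if not (is_vowel(prev) and is_vowel(curr)):
        return False
    return NON_SYLLABIC in prev or NON_SYLLABIC in curr


def apply_merge(tokens, func):
    """Stand-in for pairwise merging (an assumption, not ipatok's own code)."""
    merged = []
    for token in tokens:
        if merged and func(merged[-1], token):
            merged[-1] += token  # glue the token onto the previous one
        else:
            merged.append(token)
    return merged


# Tokens as they might come out of tokenise() for 'aɪ̯t'.
tokens = ['a', 'ɪ' + NON_SYLLABIC, 't']
merged = apply_merge(tokens, merge_diphthongs)
```

With ipatok installed, the same predicate could presumably be passed directly as tokenise(string, merge=merge_diphthongs).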
tokenize is an alias for tokenise.
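The tokenisation rules described above (a letter plus its diacritics forms a token, a tie bar joins two letters, suprasegmentals other than length markers are dropped, whitespace is ignored) can be approximated with the standard library alone. The following is a simplified sketch under those assumptions, not ipatok's actual implementation; sketch_tokenise and the constant names are hypothetical, and the suprasegmental set is intentionally incomplete.

```python
import unicodedata

# Hypothetical constants for this sketch; ipatok keeps its own data files.
TIE_BARS = {'\u0361', '\u035c'}           # combining double inverted breve (above/below)
DROPPED = {'\u02c8', '\u02cc', '.', '|', '\u2016', '\u203f'}  # stress, breaks, linking


def sketch_tokenise(string):
    """Group each letter with its diacritics; join letters linked by a tie bar."""
    tokens = []
    join_next = False  # True right after a tie bar: the next letter joins the token
    for char in string:
        category = unicodedata.category(char)
        if char in DROPPED:
            continue  # suprasegmentals (except length markers) are excluded
        if char in TIE_BARS:
            if tokens:
                tokens[-1] += char
                join_next = True
        elif category.startswith('M') or category == 'Lm':
            # combining marks and modifier letters (e.g. length, ejective) attach
            if tokens:
                tokens[-1] += char
        elif category.startswith('L'):
            if join_next:
                tokens[-1] += char
                join_next = False
            else:
                tokens.append(char)
        # anything else (whitespace, punctuation) is ignored
    return tokens


result = sketch_tokenise('\u02c8ti\u02d0t\u0361\u0283\u0259')  # 'ˈtiːt͡ʃə'
```

Guessing roles from Unicode categories is roughly what the non-strict mode is described as doing; a strict check against the IPA chart would need an explicit inventory of letters and diacritics instead.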
This is a standard Python 3 package without dependencies. It is offered at the Cheese Shop, so you can install it with pip:
pip install ipatok
or, alternatively, you can clone this repo (safe to delete afterwards) and do:
python setup.py test
python setup.py install
Of course, this could be happening within a virtualenv/venv as well.
other IPA packages
MIT. Do as you please and praise the snake gods.