ASJP conversion and tokenisation utils
A library of three functions. ipa2asjp takes an IPA-encoded sequence and converts it into an ASJP-encoded sequence. asjp2ipa tries to do the opposite. tokenise takes an ASJP-encoded string and returns a list of tokens.
>>> from asjp import ipa2asjp, asjp2ipa, tokenise
>>> ipa2asjp('zɛmʲa')
'zEmy~a'
>>> ipa2asjp(['z', 'ɛ', 'mʲ', 'a'])
['z', 'E', 'my~', 'a']
>>> asjp2ipa('zEmy~a')
'zɛmʲa'
>>> tokenise('zEmy~a')
['z', 'E', 'my~', 'a']
>>> ipa2asjp(asjp2ipa(tokenise('zEmy~a'))) == tokenise('zEmy~a')
True
>>> ipa2asjp(['z', 'ɛ', 'mʲ', 'a']) == tokenise(ipa2asjp('zɛmʲa'))
True
what is this?
ASJPcode, more commonly referred to as the ASJP alphabet or simply as ASJP, is a simplified transcription alphabet introduced in Brown et al. (2008) and then slightly modified in Brown et al. (2013); the latter is considered the alphabet’s spec for the purposes of this package.
The ASJP alphabet is used for transcribing the ASJP Database, an actively developed database that aims to provide translations of a set of 40 basic concepts into all the world’s languages. Both the alphabet and the database are employed in the field of computational historical linguistics, e.g. in Jäger (2013) or Wichmann et al. (2011).
api
ipa2asjp(ipa_seq) takes an IPA string or sequence of string tokens and converts it into an ASJP string or sequence of string tokens. Raises a ValueError if the input does not constitute a valid IPA sequence.
asjp2ipa(asjp_seq) takes an ASJP string or sequence of string tokens and converts it into an IPA string or sequence of string tokens. As ASJP encodes much less information than IPA, something like asjp2ipa(ipa2asjp(ipa_seq)) == ipa_seq would rarely hold true. Raises a ValueError if the input does not constitute a valid ASJP sequence.
tokenise(asjp_string) takes an ASJP string and converts it into a list of ASJP tokens. Raises a ValueError if the input cannot be unambiguously tokenised.
tokenize(asjp_string) is an alias for tokenise(asjp_string).
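To give a feel for what tokenising an ASJP string involves, here is a minimal toy sketch. It is not this package's implementation, and the name toy_tokenise is made up for illustration. It assumes the modifier conventions of the ASJP alphabet as commonly described: ~ binds the two preceding symbols into one token, $ binds the three preceding symbols, and " and * attach to the preceding token. The real tokenise may well handle more cases, or handle these differently.

```python
# Toy sketch only: NOT the asjp package's implementation.
# Assumed ASJP conventions (labelled assumptions, see the text above):
#   '~' joins the two preceding symbols into a single token
#   '$' joins the three preceding symbols into a single token
#   '"' and '*' attach to the preceding token

JOINERS = {'~': 2, '$': 3}  # modifier -> number of preceding symbols it binds
SUFFIXES = {'"', '*'}       # modifiers that attach to the preceding token


def toy_tokenise(asjp_string):
    """Split an ASJP string into a list of tokens under the assumptions above."""
    tokens = []
    for char in asjp_string:
        if char in JOINERS:
            size = JOINERS[char]
            # In this toy version only plain single symbols can be joined.
            if len(tokens) < size or any(len(t) != 1 for t in tokens[-size:]):
                raise ValueError('cannot apply {!r} in {!r}'.format(char, asjp_string))
            joined = ''.join(tokens[-size:]) + char
            del tokens[-size:]
            tokens.append(joined)
        elif char in SUFFIXES:
            # A suffix modifier needs a token to attach to.
            if not tokens:
                raise ValueError('dangling {!r} in {!r}'.format(char, asjp_string))
            tokens[-1] += char
        elif char.isspace():
            raise ValueError('whitespace in {!r}'.format(asjp_string))
        else:
            tokens.append(char)
    return tokens


print(toy_tokenise('zEmy~a'))
```

Like the real tokenise, the sketch raises a ValueError when it cannot make sense of the input, for instance when a ~ has fewer than two symbols in front of it.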
installation
This is a standard Python 3 package with a single dependency, ipatok. It is offered at the Cheese Shop, so you can install it with pip:
pip install asjp
or, alternatively, you can clone this repo (safe to delete afterwards) and do:
python setup.py test
python setup.py install
Of course, all of this could, and probably should, be happening within a virtual environment.
see also
licence
MIT. Do as you please and praise the snake gods.