ASJP conversion and tokenisation utils

A library of three functions. ipa2asjp takes an IPA-encoded sequence and converts it into an ASJP-encoded sequence. asjp2ipa tries to do the opposite. tokenise takes an ASJP-encoded string and returns a list of tokens.

>>> from asjp import ipa2asjp, asjp2ipa, tokenise
>>> ipa2asjp('zɛmʲa')
'zEmy~a'
>>> ipa2asjp(['z', 'ɛ', 'mʲ', 'a'])
['z', 'E', 'my~', 'a']
>>> asjp2ipa('zEmy~a')
'zɛmʲa'
>>> tokenise('zEmy~a')
['z', 'E', 'my~', 'a']
>>> ipa2asjp(asjp2ipa(tokenise('zEmy~a'))) == tokenise('zEmy~a')
True
>>> ipa2asjp(['z', 'ɛ', 'mʲ', 'a']) == tokenise(ipa2asjp('zɛmʲa'))
True

what is this?

ASJPcode, more commonly referred to as the ASJP alphabet or simply as ASJP, is a simplified transcription alphabet introduced in Brown et al. (2008) and then slightly modified in Brown et al. (2013); the latter is considered the alphabet’s spec for the purposes of this package.

The ASJP alphabet is used for transcribing the ASJP Database, an actively developed database aiming to provide the translations of a set of 40 basic concepts into all the world’s languages. Both alphabet and database are employed in the field of computational historical linguistics, e.g. in Jäger (2013) or Wichmann et al. (2011).

api

ipa2asjp(ipa_seq) takes an IPA string or sequence of string tokens and converts it into an ASJP string or sequence of string tokens. Raises a ValueError if the input does not constitute a valid IPA sequence.
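Because invalid input raises a ValueError, converting a whole word list usually means deciding what to do with the entries that fail. The following is a minimal sketch of one way to do that, not part of this package: convert_all and stand_in_converter are hypothetical names, and the stand-in only imitates the "raise ValueError on bad input" behaviour so that the snippet runs without the package installed.

```python
def convert_all(convert, words):
    """Apply `convert` to each string, collecting failures instead of stopping."""
    converted, rejected = {}, []
    for word in words:
        try:
            converted[word] = convert(word)
        except ValueError:
            # The converter refused this input; remember it and move on.
            rejected.append(word)
    return converted, rejected


def stand_in_converter(word):
    """Hypothetical stand-in for ipa2asjp: rejects any input containing a digit."""
    if any(ch.isdigit() for ch in word):
        raise ValueError("not a valid sequence: {!r}".format(word))
    return word.upper()


converted, rejected = convert_all(stand_in_converter, ["abc", "a1c"])
print(converted)  # {'abc': 'ABC'}
print(rejected)   # ['a1c']
```

With the package installed, passing ipa2asjp (or asjp2ipa) in place of stand_in_converter should work the same way, since both are documented as raising ValueError on invalid input.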

asjp2ipa(asjp_seq) takes an ASJP string or sequence of string tokens and converts it into an IPA string or sequence of string tokens. As ASJP encodes much less information than IPA, the round trip asjp2ipa(ipa2asjp(ipa_seq)) == ipa_seq rarely holds. Raises a ValueError if the input does not constitute a valid ASJP sequence.
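The reason the IPA round trip rarely holds is that the IPA-to-ASJP mapping is many-to-one, so the reverse direction has to pick a single IPA symbol per ASJP symbol. The sketch below illustrates this with a toy four-entry table; the table and the toy_* functions are assumptions made for illustration and are not the package's actual data or code.

```python
# Toy many-to-one table (illustrative only, not the package's mapping).
TOY_IPA_TO_ASJP = {"i": "i", "ɪ": "i", "o": "o", "ɔ": "o"}

# Inverting a many-to-one table forces a choice: here the first IPA symbol
# listed for each ASJP symbol wins.
TOY_ASJP_TO_IPA = {}
for ipa, asjp in TOY_IPA_TO_ASJP.items():
    TOY_ASJP_TO_IPA.setdefault(asjp, ipa)


def toy_ipa2asjp(tokens):
    return [TOY_IPA_TO_ASJP[t] for t in tokens]


def toy_asjp2ipa(tokens):
    return [TOY_ASJP_TO_IPA[t] for t in tokens]


original = ["ɪ", "ɔ"]
round_trip = toy_asjp2ipa(toy_ipa2asjp(original))
print(round_trip == original)  # False: the distinctions were lost in ASJP

# Starting from ASJP, the round trip is stable.
asjp_tokens = ["i", "o"]
print(toy_ipa2asjp(toy_asjp2ipa(asjp_tokens)) == asjp_tokens)  # True
```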

tokenise(asjp_string) takes an ASJP string and converts it into a list of ASJP tokens. Raises a ValueError if the input cannot be unambiguously tokenised.

tokenize(asjp_string) is an alias for tokenise(asjp_string).
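To give an idea of what tokenising an ASJP string involves, here is a rough sketch under stated assumptions; it is not this package's implementation. It assumes that ~ merges the two preceding symbols into one token, that $ merges the three preceding symbols, and that " and * attach to the preceding token. The name sketch_tokenise is hypothetical, and the sketch does not check that the remaining characters are valid ASJP symbols.

```python
JOINERS = {"~": 2, "$": 3}   # assumed: marker -> number of symbols it merges
SUFFIXES = {'"', "*"}        # assumed: markers that attach to the previous token


def sketch_tokenise(asjp_string):
    """Split a string into tokens, folding modifier characters into them."""
    tokens = []
    for char in asjp_string:
        if char in JOINERS:
            count = JOINERS[char]
            group = tokens[-count:]
            # Need enough preceding plain (single-character) symbols to merge.
            if len(tokens) < count or any(len(t) != 1 for t in group):
                raise ValueError("misplaced {!r} in {!r}".format(char, asjp_string))
            del tokens[-count:]
            tokens.append("".join(group) + char)
        elif char in SUFFIXES:
            if not tokens:
                raise ValueError("misplaced {!r} in {!r}".format(char, asjp_string))
            tokens[-1] += char
        else:
            tokens.append(char)
    return tokens


print(sketch_tokenise("zEmy~a"))  # ['z', 'E', 'my~', 'a']
```

The merge is done by looking backwards because the marker follows the symbols it groups, which is why my~ ends up as a single token in the example above.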

installation

This is a standard Python 3 package with a single dependency, ipatok. It is available from PyPI (the Cheese Shop), so you can install it with pip:

pip install asjp

Alternatively, you can clone this repo (safe to delete afterwards) and run:

python setup.py test
python setup.py install

Of course, all of this could, and probably should, be happening within a virtual environment.

see also

  • lingpy is an extensive library for computational historical linguistics that includes functions for converting IPA to ASJP.

  • ipatok is a library for tokenising IPA strings used by the ipa2asjp function for handling string input.

licence

MIT. Do as you please and praise the snake gods.
