Package for converting text in the legacy SIL IPA93 font encoding to Unicode

These details have not been verified by PyPI

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Overview

This package converts text encoded using the legacy SIL IPA93 font to Unicode.

It contains one function, convert_to_unicode(), which relies on a dictionary mapping IPA93 glyph codes to their corresponding Unicode code point(s). This is useful if, for example, you are working with a resource like the [MOSS Aphasia MAPPD dataset](https://www.mappd.org/about.html).
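To make the dictionary-based approach concrete, here is a minimal sketch of a table-driven conversion. It is not the package's implementation: the function name convert_with_table and the two-entry TOY_TABLE are hypothetical, the entries are illustrative placeholders rather than values copied from sil_to_unicode_dict, and the sketch assumes the table is keyed by single characters.

```python
# Hypothetical two-entry table for illustration only; these entries are
# placeholders and are NOT copied from the package's sil_to_unicode_dict.
TOY_TABLE = {
    "S": "\u0283",  # LATIN SMALL LETTER ESH
    "N": "\u014b",  # LATIN SMALL LETTER ENG
}


def convert_with_table(text, table):
    """Replace each character found in `table`; pass other characters through."""
    return "".join(table.get(ch, ch) for ch in text)


converted = convert_with_table("SiN", TOY_TABLE)
print(converted)
```

Because a table value may hold more than one code point, the converted string can be longer than the input.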

The package also exposes the dictionary itself, sil_to_unicode_dict, in case it is more convenient to use directly. Lastly, the package contains a list of all the Unicode diacritics (ipa_diacritics_unicode), which may be useful for removing diacritics from the converted text in a post-processing step.
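The post-processing step mentioned above could look roughly like the following. This is a sketch under assumptions: strip_diacritics and TOY_DIACRITICS are hypothetical names, and the short list stands in for ipa_diacritics_unicode, which is assumed (not verified) to hold one diacritic character per entry.

```python
# Hypothetical stand-in for ipa_diacritics_unicode; the real list is
# assumed (not verified) to hold one diacritic character per entry.
TOY_DIACRITICS = [
    "\u0303",  # COMBINING TILDE
    "\u0325",  # COMBINING RING BELOW
]


def strip_diacritics(text, diacritics):
    """Drop every character of `text` that appears in `diacritics`."""
    unwanted = set(diacritics)
    return "".join(ch for ch in text if ch not in unwanted)


plain = strip_diacritics("a\u0303b\u0325", TOY_DIACRITICS)
print(plain)
```

Note that this only removes diacritics stored as separate characters; a precomposed letter that already includes its diacritic would pass through unchanged.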

Notes

- The IPA93 glyph access codes (and the descriptions in the comments) were copied from the file Ipa93.pdf, which can be found in the [IPA93 fonts zip archive](https://scripts.sil.org/cms/scripts/render_download.php?format=file&media_id=silipa93-2.00.zip&filename=silipa93.zip).
- Glyph access code 202 "minute space" does not have an obvious Unicode equivalent and is not handled in this package.
- This package interprets glyph access codes 232, 134, 216, 128, 133, and 217 as tone bars, following [ipa-braille-final.pdf](http://brailleauthority.org/ipa/ipa-braille-final.pdf) (see also http://www.internationalphoneticalphabet.org/ipa-charts/tones-and-accents/ and the comments in the code).
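Since some glyph access codes have no mapping, it can help to audit a data set for characters a table does not cover before converting it. The sketch below is one possible way to do that and is not part of the package: unmapped_codes and toy_table are hypothetical, and it assumes the text was decoded so that each character's code point equals its glyph access code (as Latin-1 decoding does).

```python
def unmapped_codes(text, table):
    """Return the sorted code points in `text` that have no entry in `table`."""
    return sorted({ord(ch) for ch in text if ch not in table})


# Hypothetical one-entry table; a placeholder, not the package's data.
toy_table = {"S": "\u0283"}

# Decoding as Latin-1 keeps each byte value as the character's code point,
# so byte 202 becomes a character whose ord() is 202.
sample = bytes([83, 202]).decode("latin-1")
missing = unmapped_codes(sample, toy_table)
print(missing)
```

With the real dictionary, any code reported this way would need to be handled separately, for example by deleting it or replacing it with a space.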

Usage

The following code snippet illustrates the use of convert_to_unicode(), which takes a string of SIL IPA93 glyph access codes and returns the equivalent Unicode string. The example assumes that the input Excel file MAPPD.xlsx contains a structured data set in which the IPA93 data lives in a column called "Phonetic_response". Each value in this column is passed to convert_to_unicode(), the result is stored in a new column called "New_phonetic_response", and the new data set is written to a file called "MAPPD.new.xlsx":

import pandas as pd
# Import path assumed from the package name; adjust if it differs.
from ipa2unicode import convert_to_unicode

mappd_df = pd.read_excel('MAPPD.xlsx')
# convert_to_unicode() expects a string, so replace null values with
# empty strings before converting.
mappd_df['New_phonetic_response'] = mappd_df['Phonetic_response'].fillna('').map(convert_to_unicode)
mappd_df.to_excel('MAPPD.new.xlsx', index=False)
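As an alternative to filling null values up front, the null handling can live in a small wrapper around the converter. This is a sketch only: safe_convert is a hypothetical helper, and str.upper stands in for convert_to_unicode so that the example runs without the package installed.

```python
import math


def safe_convert(value, convert):
    """Apply `convert` to strings; map None and NaN to an empty string."""
    if value is None:
        return ""
    if isinstance(value, float) and math.isnan(value):
        return ""
    return convert(str(value))


# str.upper stands in for convert_to_unicode so the sketch runs on its own.
results = [safe_convert(v, str.upper) for v in ["ab", None, float("nan")]]
print(results)
```

The NaN check matters because pandas represents empty spreadsheet cells as floating-point NaN rather than None.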

Release history

- 1.3 (Aug 20, 2019), this version
- 1.2 (Aug 12, 2019)
- 1.1 (Aug 12, 2019)
- 1.0 (Aug 12, 2019)
- 0.0.3 (Aug 12, 2019)
- 0.0.2 (Aug 12, 2019)
- 0.0.1 (Aug 12, 2019)

Download files

Source Distribution

ipa2unicode-1.3.tar.gz (5.7 kB)

Uploaded Aug 20, 2019 Source

File details

Details for the file ipa2unicode-1.3.tar.gz.

File metadata

Download URL: ipa2unicode-1.3.tar.gz
Upload date: Aug 20, 2019
Size: 5.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.33.0 CPython/3.6.8

File hashes

Hashes for ipa2unicode-1.3.tar.gz
Algorithm	Hash digest
SHA256	`eae2a54840a24a14f47817839c19867a21cb7de541139fd4a25567f9e79933c2`
MD5	`165ff7b7be21c01cf6c136ef7f31039f`
BLAKE2b-256	`2501943b884f51d089a94d01c72cde9274789f18c83a4d1b7e7bd93e64e8050d`

ipa2unicode 1.3
