Skip to main content

Package for converting SIL IPA93 legacy font to unicode

Project description

Overview

This package converts text encoded using the legacy SIL IPA93 font to unicode.

It contains one function, convert_to_unicode(), which relies on a dictionary mapping IPA93 glyph codes to their corresponding unicode code point(s). This is useful if, for example, you are working with a resource like the [MOSS Aphasia MAPPD dataset] (https://www.mappd.org/about.html).

The package also exposes the dictionary itself, sil_to_unicode_dict, in case it is more convenient to use that directly. Lastly, this package contains a list of all the unicode diacritics (ipa_diacritics_unicode), which may be useful for removing diacritics from the input in a post-processing step.

Notes

Usage

The following code snippet illustrates the usage of the function convert_to_unicode, which takes a string of SIL IPA93 glyph access codes and returns an equivalent unicode string. In this example, we assume the input excel file MAPPD.xlsx contains a structured data set in which the IPA93 data lives in a column called "Phonetic_response." We send each data point in this column to convert_to_unicode(), store the result in a new column called "New_phonetic_response," and write the new data set to a file called "MAPPD.new.xlsx":

import pandas as pd

mappd_df = pd.read_excel('MAPPD.xlsx')
# The input to convert_to_unicode() is a string so handle null values
# appropriately first.
mappd_df['New_phonetic_response'] = mappd_df.Phonetic_response.fillna('')
mappd_df.New_phonetic_response = mappd_df.New_phonetic_response.map(lambda x: convert_to_unicode(x))
mappd_df.to_excel('MAPPD.new.xlsx', index=False)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ipa2unicode-0.0.2.tar.gz (5.7 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page