Module for creating context-aware, rule-based G2P mappings that preserve indices
Gⁱ-2-Pⁱ
Grapheme-to-Phoneme transformations that preserve input and output indices!
This library is for handling arbitrary conversions between input and output segments while preserving indices.
Background
The initial version of this package was developed by Patrick Littell to allow for g2p from community orthographies to IPA and back again in ReadAlong-Studio. We then decided to pull the g2p mechanism out of Convertextract, which allows transducer relations to be declared in CSV files, and turn it into its own library. Here it is!
Install
The best way to install g2p is with pip:
$ pip install g2p
Otherwise, clone the repo and pip install it locally.
$ git clone https://github.com/roedoejet/g2p.git
$ cd g2p
$ pip install -e .
Usage
The easiest way to create a transducer is to use the g2p.make_g2p function.
To use it, first import the function:
from g2p import make_g2p
Then call it with in_lang and out_lang arguments. Both must be strings that match the name of an existing mapping.
>>> transducer = make_g2p('dan', 'eng-arpabet')
>>> transducer('hej').output_string
'HH EH Y'
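To make "preserving indices" concrete, here is a minimal toy sketch in plain Python. It is not the g2p implementation and does not use the g2p API: the rule table TOY_RULES and the function toy_transduce are hypothetical names invented for illustration. It rewrites the input with a greedy longest-match strategy and records which input positions produced which output positions.

```python
from typing import Dict, List, Tuple

# Hypothetical toy rules for illustration only; not taken from any real g2p mapping.
TOY_RULES: Dict[str, str] = {"sh": "ʃ", "a": "ɑ", "t": "t"}


def toy_transduce(text: str, rules: Dict[str, str]) -> Tuple[str, List[Tuple[int, int]]]:
    """Rewrite text with greedy longest-match rules.

    Returns the output string and a list of (input_index, output_index) pairs.
    """
    keys = sorted(rules, key=len, reverse=True)  # try longer rules first
    output: List[str] = []
    alignment: List[Tuple[int, int]] = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                replacement = rules[key]
                break
        else:
            # No rule matched: pass the character through unchanged.
            key = replacement = text[i]
        out_start = len(output)
        # Link every consumed input character to every produced output character.
        for in_idx in range(i, i + len(key)):
            for out_idx in range(out_start, out_start + len(replacement)):
                alignment.append((in_idx, out_idx))
        output.extend(replacement)
        i += len(key)
    return "".join(output), alignment


result, pairs = toy_transduce("shat", TOY_RULES)
print(result)  # ʃɑt
print(pairs)   # [(0, 0), (1, 0), (2, 1), (3, 2)]
```

In this sketch both input characters of "sh" point at the single output character "ʃ", which is the kind of alignment information that a plain string replacement would lose.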
There must be a valid path between in_lang and out_lang for this to work. If you've edited a mapping or added a custom mapping, you must update g2p to include it:
$ g2p update
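One way to picture the "valid path" requirement is as a search over a directed graph whose nodes are mapping names. The sketch below is an assumption about the idea, not the library's actual code: TOY_GRAPH and find_path are hypothetical, and the intermediate steps between dan and eng-arpabet are guessed for illustration.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical mapping graph; the real set of mappings and edges may differ.
TOY_GRAPH: Dict[str, List[str]] = {
    "dan": ["dan-ipa"],
    "dan-ipa": ["eng-ipa"],
    "eng-ipa": ["eng-arpabet"],
    "eng-arpabet": [],
}


def find_path(graph: Dict[str, List[str]], in_lang: str, out_lang: str) -> Optional[List[str]]:
    """Breadth-first search for a shortest chain of mappings, or None if there is none."""
    queue = deque([[in_lang]])
    seen = {in_lang}
    while queue:
        path = queue.popleft()
        if path[-1] == out_lang:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(find_path(TOY_GRAPH, "dan", "eng-arpabet"))
print(find_path(TOY_GRAPH, "eng-arpabet", "dan"))  # no path in the reverse direction
```

Under these assumptions, a request succeeds only when such a chain exists, which is why a newly added mapping has to be registered before it can take part in a path.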
CLI
update
If you edit or add new mappings in the g2p.mappings.langs folder, you need to update g2p. You do this by running:
$ g2p update
convert
To convert a string on the command line, use g2p convert <input_text> <in_lang> <out_lang>
For example, g2p convert hej dan eng-arpabet would produce HH EH Y
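If you need to call the convert command from a script, a thin wrapper along these lines may help. This is a sketch only: build_convert_command and run_convert are hypothetical helper names, and the wrapper assumes the g2p executable is on your PATH and prints the converted string to standard output.

```python
import shutil
import subprocess
from typing import List, Optional


def build_convert_command(text: str, in_lang: str, out_lang: str) -> List[str]:
    """Build the argument list for: g2p convert <input_text> <in_lang> <out_lang>."""
    return ["g2p", "convert", text, in_lang, out_lang]


def run_convert(text: str, in_lang: str, out_lang: str) -> Optional[str]:
    """Run the g2p CLI and return its stripped output, or None if g2p is not installed."""
    if shutil.which("g2p") is None:
        return None
    completed = subprocess.run(
        build_convert_command(text, in_lang, out_lang),
        capture_output=True,  # requires Python 3.7+
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.strip()


print(build_convert_command("hej", "dan", "eng-arpabet"))
```

Passing the arguments as a list avoids shell quoting problems when the input text contains spaces or special characters.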
generate-mapping
If your language has a mapping to IPA and you want to generate a mapping between that and the English IPA mapping, you can use g2p generate-mapping <in_lang> --ipa
For example, g2p generate-mapping dan --ipa will produce a mapping from dan-ipa to eng-ipa. The resulting mapping will be added to the g2p.mappings.langs.generated folder. You must run g2p update afterwards so that g2p picks up the new mapping.
Studio
You can also run the g2p Studio, a web interface for creating custom lookup tables to be used with g2p. To use the g2p Studio, either visit https://g2p-studio.herokuapp.com/ or run it locally using python run_studio.py
Alternatively, you can start the app through the g2p command line interface: g2p run
Maintainers
Contributing
Feel free to dive in! Open an issue or submit PRs.
This repo follows the Contributor Covenant Code of Conduct.
Contributors
This project exists thanks to all the people who contribute.
@littell, @finguist, @joanise, @eddieantonio, @dhdaines
License
MIT © Patrick Littell, Aidan Pine