Module for creating context-aware, rule-based G2P mappings that preserve indices
Project description
Gⁱ-2-Pⁱ
Grapheme-to-Phoneme transductions that preserve input and output indices!
This library is for handling arbitrary transductions between input and output segments while preserving indices.
Background
The initial version of this package was developed by Patrick Littell to allow g2p from community orthographies to IPA and back again in ReadAlong-Studio. We then decided to pull the g2p mechanism out of Convertextract, which allows transducer relations to be declared in CSV files, and turn it into its own library. Here it is!
Install
The best thing to do is install with pip:
$ pip install g2p
Otherwise, clone the repo and pip install it locally.
$ git clone https://github.com/roedoejet/g2p.git
$ cd g2p
$ pip install -e .
Usage
In order to initialize a Transducer, you must first create a Mapping object.
Mapping
You can create mappings either by initializing them directly with a list:
from g2p.mappings import Mapping
mappings = Mapping([{"in": 'a', "out": 'b'}])
Alternatively, you can add a CSV file to g2p/mappings/langs/<YourLang>/<YourLookupTable> and load it by language and table name:
from g2p.mappings import Mapping
mappings = Mapping(language={"lang": "<YourLang>", "table": "<YourLookupTable>"})
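As a rough sketch of how a lookup table relates to the list form shown earlier, the snippet below reads CSV rows into {"in": ..., "out": ...} dictionaries using only the standard library. It assumes a headerless two-column layout (input, output); that layout and the helper name rows_to_rules are assumptions made for illustration, not a description of the file format g2p itself expects.

```python
import csv
import io


def rows_to_rules(csv_text):
    """Turn headerless 'input,output' CSV rows into a list of rule dicts.

    Hypothetical helper: the two-column layout is an assumption.
    """
    rules = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        rules.append({"in": row[0], "out": row[1]})
    return rules


rules = rows_to_rules("a,b\nch,tʃ\n")
print(rules)
```

A list in this shape matches what the first Mapping example above is initialized with.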
Transducer
Initialize a Transducer with a Mapping object. Calling the Transducer then produces the output. To preserve the indices, pass index=True when calling the Transducer.
from g2p.mappings import Mapping
from g2p.transducer import Transducer
mappings = Mapping([{"in": 'a', "out": 'b'}])
transducer = Transducer(mappings)
transducer('a')
# 'b'
transducer('a', index=True)
# ('b', <g2p.transducer.indices.Indices object>)
To make sense of the Indices object that is produced, you can call it to get a list of relations, one per character. Doing that for the above produces [((0, 'a'), (0, 'b'))], a list of relation tuples where each relation tuple consists of an input and an output. Each input tuple and output tuple in turn consists of an index and the corresponding character. You can also call output() and input() to see the plain-text output and input respectively.
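To make the shape of those relation tuples concrete, here is a minimal pure-Python sketch of an index-preserving transduction. It is not how g2p is implemented: the function name transduce_with_indices, the longest-rule-first matching, and the position-by-position pairing of characters are all assumptions chosen to keep the example small.

```python
def transduce_with_indices(text, rules):
    """Apply rules left to right, longest input first, recording index pairs.

    Hypothetical illustration only; not the g2p library's algorithm.
    """
    ordered = sorted(rules, key=lambda r: len(r["in"]), reverse=True)
    output = []
    relations = []
    i = 0
    while i < len(text):
        rule = next(
            (r for r in ordered if r["in"] and text.startswith(r["in"], i)),
            None,
        )
        # Characters with no matching rule pass through unchanged.
        src = rule["in"] if rule else text[i]
        dst = rule["out"] if rule else text[i]
        start = len(output)
        output.extend(dst)
        # Pair characters position by position; when one side is shorter,
        # its last character is reused so every character gets a relation.
        # A rule with empty output (a deletion) records no relation.
        for k in range(max(len(src), len(dst))):
            in_idx = i + min(k, len(src) - 1)
            if dst:
                out_idx = start + min(k, len(dst) - 1)
                relations.append(
                    ((in_idx, text[in_idx]), (out_idx, output[out_idx]))
                )
        i += len(src)
    return "".join(output), relations


print(transduce_with_indices("a", [{"in": "a", "out": "b"}]))
# ('b', [((0, 'a'), (0, 'b'))])
```

Each relation pairs an (index, character) on the input side with an (index, character) on the output side, which is the same structure as the list described above.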
Studio
You can also run the g2p Studio, which is a web interface for creating custom lookup tables to be used with g2p. To run the g2p Studio, either visit ***** or run it locally using python run_studio.py.
You can also import the app directly from the package:
from g2p import app
app.run(host='0.0.0.0', port=5000, debug=True)
Maintainers
Contributing
Feel free to dive in! Open an issue or submit PRs.
This repo follows the Contributor Covenant Code of Conduct.
Contributors
This project exists thanks to all the people who contribute.
@littell. @finguist. @eddieantonio. @dhdaines.
License
MIT © Patrick Littell, Aidan Pine
Download files
Source Distribution
Built Distribution
File details
Details for the file g2p-0.2.20190919.tar.gz.
File metadata
- Download URL: g2p-0.2.20190919.tar.gz
- Upload date:
- Size: 1.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.8.0 tqdm/4.31.1 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 184a28b8e068228cc74fd829098b57dd7b40934f0ea9dfbd8ff30db62cbe35a3
MD5 | 1f7b3a4c28c26058d2ae036989eeaf83
BLAKE2b-256 | 1aa1d6bc95623b0fe596719fd3d822a131966d059e510b796b6089ae9be9a0d7
File details
Details for the file g2p-0.2.20190919-py3-none-any.whl.
File metadata
- Download URL: g2p-0.2.20190919-py3-none-any.whl
- Upload date:
- Size: 2.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.8.0 tqdm/4.31.1 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9fd574e842a715359dc8dd21201561890bba342a931934b7d3981c56d5c94758
MD5 | 5ee6f4cc2c03db105dab2c5688e8fa43
BLAKE2b-256 | b4dd041de95371a0b181ce20013d8af31974bc33c2fa2be7daf1333d2f9dc906