
Module for creating context-aware, rule-based G2P mappings that preserve indices


Gⁱ-2-Pⁱ


Grapheme-to-Phoneme transductions that preserve input and output indices!

This library is for handling arbitrary transductions between input and output segments while preserving indices.


Background

The initial version of this package was developed by Patrick Littell to allow g2p from community orthographies to IPA and back again in ReadAlong-Studio. We then decided to pull the g2p mechanism out of Convertextract, which allows transducer relations to be declared in CSV files, and turn it into its own library. Here it is!

Install

The best way to install is with pip:

$ pip install g2p

Otherwise, clone the repo and pip install it locally.

$ git clone https://github.com/roedoejet/g2p.git
$ cd g2p
$ pip install -e .

Usage

In order to initialize a Transducer, you must first create a Mapping object.

Mapping

You can create mappings either by initializing them directly with a list:

from g2p.mappings import Mapping

mappings = Mapping([{"in": 'a', "out": 'b'}])

Alternatively, you can add a CSV file to g2p/mappings/langs/<YourLang>/<YourLookupTable> and load it by language and table name:

from g2p.mappings import Mapping

mappings = Mapping(language={"lang": "<YourLang>", "table": "<YourLookupTable>"})
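The column layout that g2p expects in a lookup table is not described here. As a rough sketch only, assuming a plain two-column CSV (input, then output), the rows could be turned into the same list-of-dicts shape used when initializing a Mapping directly. The helper name rows_to_rules and the sample rows are hypothetical and not part of g2p.

```python
import csv
import io


def rows_to_rules(csv_text):
    """Turn two-column CSV text (input, output) into a list of rule dicts.

    Assumption: column 0 is the input segment and column 1 is the output.
    """
    rules = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        rules.append({"in": row[0], "out": row[1]})
    return rules


# Hypothetical table contents, for illustration only
sample = "a,b\naa,a:\n"
rules = rows_to_rules(sample)
print(rules)
```

The resulting list has the same shape as the one passed to Mapping in the first example, so it can be inspected before committing a table to the langs directory.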

Transducer

Initialize a Transducer with a Mapping object. Calling the Transducer then produces the output. In order to preserve the indices, pass index=True when calling the Transducer.

from g2p.mappings import Mapping
from g2p.transducer import Transducer

mappings = Mapping([{"in": 'a', "out": 'b'}])
transducer = Transducer(mappings)
transducer('a')
# 'b'
transducer('a', index=True)
# ('b', <g2p.transducer.indices.Indices object>)

To make sense of the Indices object, you can call it to produce a list of relations, one per character. Doing that for the example above produces [((0, 'a'), (0, 'b'))]: a list of relation tuples, where each relation tuple consists of an input tuple and an output tuple. Each input tuple and output tuple in turn consists of an index and the corresponding character. You can also call output() and input() to see the plain-text output and input, respectively.
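To make the relation-tuple format concrete, here is a minimal, self-contained toy that produces tuples of the same shape. It is not g2p's implementation: the function name toy_transduce is hypothetical, and both the longest-rule-first ordering and the choice to pair every input character of a match with every output character are assumptions made for illustration.

```python
def toy_transduce(text, rules):
    """Apply rules left to right and record ((in_idx, in_char), (out_idx, out_char)) pairs."""
    # Assumption: longer inputs are tried first so multi-character rules win.
    ordered = sorted(rules, key=lambda r: len(r["in"]), reverse=True)
    output = []     # output characters collected so far
    relations = []  # relation tuples linking input and output positions
    i = 0
    while i < len(text):
        for rule in ordered:
            src = rule["in"]
            if src and text.startswith(src, i):
                out = rule["out"]
                break
        else:
            # No rule matched: copy the character through unchanged.
            src = out = text[i]
        start = len(output)
        output.extend(out)
        # Pair each matched input character with each output character.
        for j, in_char in enumerate(src):
            for k, out_char in enumerate(out):
                relations.append(((i + j, in_char), (start + k, out_char)))
        i += len(src)
    return "".join(output), relations


result, relations = toy_transduce("a", [{"in": "a", "out": "b"}])
print(result)
print(relations)
```

For the single rule above, the toy returns the same relation list quoted in the text. Real mappings may align multi-character segments differently than this all-pairs scheme.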

Studio

You can also run the g2p Studio, a web interface for creating custom lookup tables to be used with g2p. To run the g2p Studio, either visit ***** or run it locally using python run_studio.py.

You can also import the app directly from the package:

from g2p import app

app.run(host='0.0.0.0', port=5000, debug=True)

Maintainers

@roedoejet.

Contributing

Feel free to dive in! Open an issue or submit PRs.

This repo follows the Contributor Covenant Code of Conduct.

Contributors

This project exists thanks to all the people who contribute.

@littell. @finguist. @eddieantonio. @dhdaines.

License

MIT © Patrick Littell, Aidan Pine

