Pronunciation and Transliteration module trained on CMU pronouncing dictionary, IIT Bombay and IIT Kharagpur text corpora

Transly

Transly is a sequence-to-sequence bidirectional LSTM encoder-decoder model with Bahdanau attention, trained on the CMU Pronouncing Dictionary, the IIT Bombay English-Hindi Parallel Corpus, and the IIT Kharagpur transliteration corpus.

The pronunciation module in Transly can predict the pronunciation of any given word (with an American accent, of course!).

Take a word from any language, transliterate it into English letters (all capitals), and you are good to go. New or old, seen or unseen, sensible or nonsensical, Transly can catch 'em all!
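As a small illustration of the all-capitals convention above, the sketch below upper-cases a romanised word before it is passed to the model. The helper name normalize_query is hypothetical and not part of Transly, and restricting input to the letters A-Z is an assumption made for this example.

```python
import re


def normalize_query(word):
    """Upper-case a romanised word and reject anything other than A-Z.

    Hypothetical helper, not part of Transly; the A-Z restriction is an
    assumption made for this sketch.
    """
    cleaned = word.strip().upper()
    if not re.fullmatch(r"[A-Z]+", cleaned):
        raise ValueError(f"expected only the letters A-Z, got {word!r}")
    return cleaned


print(normalize_query(" makaan "))  # MAKAAN
```

The returned string can then be used as the QUERY value in the usage examples.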

Another module in Transly is the transliteration module. It currently supports Hindi to English and English to Hindi transliterations.

Pre-trained models can be found inside the respective trained_models folders. New models can also be trained on custom data.
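For readers unfamiliar with the Bahdanau (additive) attention named in the description, here is a minimal pure-Python sketch of how the attention weights and the context vector are computed for one decoder step. It is not Transly's implementation: the function name additive_attention, the weight matrices w_dec and w_enc, and the vector v are illustrative assumptions, and the toy values stand in for learned parameters.

```python
import math


def additive_attention(decoder_state, encoder_states, w_dec, w_enc, v):
    """Additive attention: score_i = v . tanh(w_dec @ s + w_enc @ h_i)."""

    def matvec(matrix, vector):
        return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

    projected_dec = matvec(w_dec, decoder_state)
    scores = []
    for h in encoder_states:
        projected_enc = matvec(w_enc, h)
        hidden = [math.tanh(d + e) for d, e in zip(projected_dec, projected_enc)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Softmax over the scores (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three encoder states of size 2 and identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, context = additive_attention(
    decoder_state=[0.5, -0.5],
    encoder_states=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    w_dec=identity,
    w_enc=identity,
    v=[1.0, 1.0],
)
print(weights, context)
```

The weights sum to 1, and the encoder state with the highest score contributes most to the context vector that the decoder consumes at that step.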

Installation

Use the package manager pip to install transly:

pip install transly

Usage

Pronunciation

Using the pre-trained pronunciation model

import transly.pronunciation as tp

# let's try a Hindi word, transliterated into English capitals
# the predicted pronunciation uses an American accent
QUERY = 'MAKAAN'
a = tp.load_model(model_path='cmu')
a.infer(QUERY, separator=" ")
# use infer_batch function to infer batches
# use beamsearch function to perform a beam search

>> 'M AH0 K AA1 N'
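The predicted string uses the CMU Pronouncing Dictionary's ARPAbet notation, in which vowels carry a trailing stress digit (0 for no stress, 1 for primary, 2 for secondary). The following sketch separates phonemes from stress marks; the helper name split_stress is hypothetical and not part of Transly.

```python
def split_stress(pronunciation, separator=" "):
    """Split an ARPAbet string into (phoneme, stress) pairs.

    Vowels end in a stress digit; consonants have none, so their
    stress is reported as None.
    """
    pairs = []
    for symbol in pronunciation.split(separator):
        if symbol and symbol[-1].isdigit():
            pairs.append((symbol[:-1], int(symbol[-1])))
        else:
            pairs.append((symbol, None))
    return pairs


phonemes = split_stress("M AH0 K AA1 N")
print(phonemes)
# [('M', None), ('AH', 0), ('K', None), ('AA', 1), ('N', None)]
```

Here the primary stress falls on the second vowel, AA.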

Training a new model on custom data

from transly.seq2seq.config import SConfig
from transly.seq2seq.version0 import Seq2Seq

# define training_data_path, model_path and model_file_name for your own setup before running
config = SConfig(training_data_path=training_data_path, input_mode='character_level', output_mode='word_level')
s2s = Seq2Seq(config)
s2s.fit()
s2s.save_model(path_to_model=model_path, model_file_name=model_file_name)

The training data file should be a CSV with two columns: the input and the output.

| Input | Output |
|---|---|
| AA | AA1 |
| AABERG | AA1 B ER0 G |
| AACHEN | AA1 K AH0 N |
| AACHENER | AA1 K AH0 N ER0 |
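A file in this two-column layout can be produced and sanity-checked with Python's standard csv module, as in the sketch below. The file name pronunciation.csv is a placeholder, and writing a header row that mirrors the table is an assumption: the source does not say whether Transly's loader expects one.

```python
import csv
import os
import tempfile

rows = [
    ("AA", "AA1"),
    ("AABERG", "AA1 B ER0 G"),
    ("AACHEN", "AA1 K AH0 N"),
    ("AACHENER", "AA1 K AH0 N ER0"),
]

# Placeholder location; this is the path you would pass as training_data_path.
training_data_path = os.path.join(tempfile.mkdtemp(), "pronunciation.csv")

with open(training_data_path, "w", newline="", encoding="utf-8") as handle:
    writer = csv.writer(handle)
    writer.writerow(["Input", "Output"])  # assumed header row
    writer.writerows(rows)

# Read the file back and flag rows without exactly two non-empty columns.
with open(training_data_path, newline="", encoding="utf-8") as handle:
    loaded = list(csv.reader(handle))

bad_rows = [row for row in loaded if len(row) != 2 or not all(row)]
print(len(loaded), "rows,", len(bad_rows), "malformed")
```

Running a check like this before training catches stray delimiters or empty cells early.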

Transliteration

Hindi to English

Using the pre-trained model

import transly.transliteration as tl

QUERY = 'निखिल'
a = tl.load_model(model_path='hi2en')
a.infer(QUERY)
# use infer_batch function to infer batches
# use beamsearch function to perform a beam search

>> 'NIKHIL'

English to Hindi

Using the pre-trained model

import transly.transliteration as tl

QUERY = 'NIKHIL'
a = tl.load_model(model_path='en2hi')
a.infer(QUERY)
# use infer_batch function to infer batches
# use beamsearch function to perform a beam search

>> 'निखिल'
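The comments above mention a beam search option. As background, the sketch below shows the general idea on a toy next-token table: several partial outputs are kept at each step instead of only the single best one. It does not call or reproduce Transly's beamsearch function; beam_search, TOY_MODEL and toy_next_log_probs are hypothetical names invented for this illustration.

```python
import math


def beam_search(next_log_probs, start, end, beam_width=2, max_len=10):
    """Return (sequence, log_prob) pairs, best first.

    next_log_probs(sequence) must return a dict mapping each candidate
    next token to its log-probability given the sequence so far.
    """
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for sequence, score in beams:
            for token, log_prob in next_log_probs(sequence).items():
                candidates.append((sequence + [token], score + log_prob))
        # Keep only the beam_width highest-scoring hypotheses.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for sequence, score in candidates[:beam_width]:
            if sequence[-1] == end:
                finished.append((sequence, score))
            else:
                beams.append((sequence, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len without an end token
    return sorted(finished, key=lambda item: item[1], reverse=True)


# Toy model: next-token probabilities depend only on the last token.
TOY_MODEL = {
    "<s>": {"N": 0.6, "M": 0.4},
    "N": {"I": 0.5, "E": 0.5},
    "M": {"I": 0.9, "E": 0.1},
    "I": {"</s>": 1.0},
    "E": {"</s>": 1.0},
}


def toy_next_log_probs(sequence):
    return {tok: math.log(p) for tok, p in TOY_MODEL[sequence[-1]].items()}


results = beam_search(toy_next_log_probs, "<s>", "</s>", beam_width=2)
best_sequence, best_score = results[0]
print(best_sequence, round(math.exp(best_score), 2))
```

In this toy table a greedy decoder would start with N (probability 0.6) and end with an overall probability of 0.3, while a beam of width 2 also keeps M and finds the more probable sequence M I (0.36).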

Training a new model on custom data

from transly.seq2seq.config import SConfig
from transly.seq2seq.version0 import Seq2Seq

# define training_data_path, model_path and model_file_name for your own setup before running
config = SConfig(training_data_path=training_data_path)
s2s = Seq2Seq(config)
s2s.fit()
s2s.save_model(path_to_model=model_path, model_file_name=model_file_name)

License

The Python code in this module is distributed under the Apache License 2.0.
