Transliteration with Deep Learning

Project description

DeepTranslit: Towards better transliteration for Indic languages.

Telugu, Kannada, Tamil, Malayalam, Marathi, and Hindi are the currently supported languages.

Usage

Via docker

# Start the container in the background (shell)
docker run -d -p 8080:8080 notaitech/deeptranslit:hindi
# Query the running container from Python
import requests
requests.post('http://localhost:8080/sync', json={"data": ['mera naam amitab.']}).json()
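
For environments where requests is not installed, the same query can be sketched with only the standard library. This is a minimal sketch that assumes the endpoint and body shape shown above (/sync, a JSON body with a "data" list). The helper names build_payload and query_server are hypothetical, and because the response format is not documented here, the reply is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request

# Endpoint taken from the example above; adjust host and port to your setup.
SYNC_URL = "http://localhost:8080/sync"


def build_payload(sentences):
    """Wrap one sentence or a list of sentences in the {"data": [...]} body."""
    if isinstance(sentences, str):
        sentences = [sentences]
    return {"data": list(sentences)}


def query_server(sentences, url=SYNC_URL, timeout=30):
    """POST the sentences to a running container and return the parsed JSON reply.

    Hypothetical helper: it only assumes the server answers with JSON.
    """
    body = json.dumps(build_payload(sentences)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, query_server('mera naam amitab.') sends the same request as the requests example; the timeout keeps the call from blocking indefinitely if the container is still starting.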

As a Python module

# Install or upgrade from PyPI (shell)
pip install --upgrade deeptranslit
# Then, in Python
from deeptranslit import DeepTranslit

# Create a transliterator for Hindi
transliterator = DeepTranslit('hindi')
# Single sentence prediction
transliterator.transliterate('mera naam amitab.')
# [{'pred': 'मेरा नाम अमिताब.', 'prob': 0.25336900797483103}]

# Multiple sentence prediction
transliterator.transliterate(['mera naam amitab.', 'amitab-aur-abhishek'])
# [[{'pred': 'मेरा नाम अमिताब.', 'prob': 0.25336900797483103}],
#  [{'pred': 'अमिताब-और-अभिषेक', 'prob': 0.1027598988040056}]]
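
Each sentence comes back as a list of candidate dictionaries with 'pred' and 'prob' keys, so picking a single string per sentence takes one small step. The sketch below works on the sample outputs shown above rather than on a live model; best_prediction is a hypothetical helper, not part of the deeptranslit API.

```python
def best_prediction(candidates):
    """Return the 'pred' string of the highest-probability candidate.

    `candidates` is the list of {'pred': ..., 'prob': ...} dicts returned
    for one sentence.
    """
    return max(candidates, key=lambda candidate: candidate["prob"])["pred"]


# Sample outputs copied from the examples above.
single_result = [{'pred': 'मेरा नाम अमिताब.', 'prob': 0.25336900797483103}]
batch_result = [
    [{'pred': 'मेरा नाम अमिताब.', 'prob': 0.25336900797483103}],
    [{'pred': 'अमिताब-और-अभिषेक', 'prob': 0.1027598988040056}],
]

single_text = best_prediction(single_result)
batch_texts = [best_prediction(candidates) for candidates in batch_result]
```

For a single input string the result is passed to best_prediction directly; for a list of inputs it is applied once per inner list.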

Notes:

  • Characters outside the input space (the English alphabet) are copied to the output unchanged.
    • e.g. amitab. -> अमिताब., amitab-aur-abhishek -> अमिताब-और-अभिषेक
  • Predictions are cached at the word level, i.e. computationally, transliterate('amitab amitab') is equivalent to transliterate('amitab') or transliterate('amitab amitab amitab'): each distinct word is predicted only once.
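
The two notes above can be illustrated with a small sketch. It is not the library's implementation: transliterate_word is a hypothetical stand-in that upper-cases its input where a real system would run the model, and CALLS exists only to show how often that stand-in is invoked. The sketch assumes input is split into runs of English letters and runs of everything else.

```python
import re
from functools import lru_cache

# Records every word that actually reaches the stand-in "model".
CALLS = []


@lru_cache(maxsize=None)
def transliterate_word(word):
    """Hypothetical stand-in for the model; results are cached per word."""
    CALLS.append(word)
    return word.upper()


# Group "word" matches runs of English letters; "other" matches everything else.
TOKEN_RE = re.compile(r"(?P<word>[A-Za-z]+)|(?P<other>[^A-Za-z]+)")


def transliterate_text(text):
    """Transliterate letter runs and copy all other characters unchanged."""
    pieces = []
    for match in TOKEN_RE.finditer(text):
        if match.group("word") is not None:
            pieces.append(transliterate_word(match.group("word")))
        else:
            pieces.append(match.group("other"))
    return "".join(pieces)
```

Calling transliterate_text('amitab amitab amitab') invokes the stand-in once, because the second and third occurrences are served from the cache, while punctuation such as '.' and '-' passes through untouched.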

Download files

Source Distribution

deeptranslit-1.1.1.tar.gz (5.0 kB)

Built Distribution

deeptranslit-1.1.1-py2.py3-none-any.whl (16.5 kB, Python 2 and Python 3)
