
Transliterations to/from Indian languages

Project description


Transliterations to/from Indian languages are still generally low quality. One problem is access to data. Another is that there is no standard transliteration. For Hindi–English, we build a novel dataset of names using ESPNcricinfo, which publishes Hindi versions of its English scorecards. We also create a dataset from election affidavits, and we use the Google Dakshina dataset.

To overcome the fact that there isn’t one standard way of transliteration, we provide k-best transliterations.

Install

We strongly recommend installing indicate inside a Python virtual environment (see the venv documentation).

pip install indicate

General API

  1. transliterate.hindi2english takes Hindi text and transliterates it into English.

Examples

from indicate import transliterate

# Transliterate Devanagari text into the Latin (English) alphabet.
english_text = transliterate.hindi2english("हिंदी")
print(english_text)

Output: hindi

Functions

We expose one function, which takes Hindi text and transliterates it to English.

  • transliterate.hindi2english(input)

    • What it does:

      • Converts the given Hindi text into the English alphabet

    • Output

      • Returns text in English
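
When transliterating many names, the same string often appears more than once. The sketch below is not part of indicate: transliterate_unique is a hypothetical helper that calls whatever transliteration function it is given once per distinct, non-empty input. It assumes only that the function takes a string and returns a string, as transliterate.hindi2english is described above.

```python
from typing import Callable, Dict, Iterable


def transliterate_unique(
    words: Iterable[str], to_english: Callable[[str], str]
) -> Dict[str, str]:
    """Transliterate each distinct, non-empty word once and return a mapping.

    `to_english` is any str -> str callable; this helper is hypothetical
    and is not provided by the indicate package.
    """
    results: Dict[str, str] = {}
    for word in words:
        cleaned = word.strip()
        # Skip blanks and anything already transliterated.
        if cleaned and cleaned not in results:
            results[cleaned] = to_english(cleaned)
    return results


# Intended use (requires `pip install indicate`):
#   from indicate import transliterate
#   mapping = transliterate_unique(["हिंदी", "हिंदी "], transliterate.hindi2english)
```

Passing the function in as an argument keeps the helper independent of the package, so it can be tried with a stub before the model is installed.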

Data

The datasets used to train the model:

Evaluation

The model was evaluated on the test split of the Google Dakshina dataset, where it predicted 73.64% exact matches. Indic-trans predicted 63.12% exact matches on the Google Dakshina dataset. Below are the edit distance metrics on the test dataset (0.0 means an exact match; the farther from 0.0, the greater the difference between the predicted text and the actual text).

Edit distance metrics of model on Google Dakshina test dataset
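
To make the two metrics concrete, here is a minimal sketch of how an exact-match rate and per-item edit distances could be computed. It assumes plain, unnormalized Levenshtein distance. The evaluation above does not state which edit distance variant or normalization was used, so this may differ from the reported numbers. The function names and the sample strings are illustrative only.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def evaluate(
    predicted: Sequence[str], expected: Sequence[str]
) -> Tuple[float, List[int]]:
    """Return (exact-match rate, per-item edit distances)."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    distances = [edit_distance(p, e) for p, e in zip(predicted, expected)]
    exact = sum(d == 0 for d in distances) / len(distances) if distances else 0.0
    return exact, distances


# Made-up strings, not results from the model.
rate, distances = evaluate(["hindi", "dilli"], ["hindi", "delhi"])
print(rate, distances)
```

A distance of 0 corresponds to an exact match, so the exact-match rate is the share of zero distances.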

Authors

Rajashekar Chintalapati and Gaurav Sood

Contributor Code of Conduct

The project welcomes contributions from everyone! In fact, it depends on it. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.

License

The package is released under the MIT License.
