
UzTransliterator | Transliteration tool for Uzbek language - Cyrillic<>Latin<>NewLatin



UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language, Cyrillic<>Latin<>NewLatin

The main goal of this paper is to present a state-of-the-art machine transliteration tool for three scripts used for Uzbek, a low-resource language: the older Cyrillic alphabet, the currently official Latin alphabet, and the newly announced New Latin alphabet. The tool combines rule-based and statistical approaches. It is available as an open-source Python package and as a web-based application with a public API.

About The Project

Web-interface of the tool

Feel free to use the tool presented in this project. If you find it useful, please cite the paper here (coming soon...). A demo of the web-based transliteration tool can be seen here.

In this paper, we present Python code, a web tool, and an API for the Uzbek language that perform machine transliteration between the two widely used alphabets, Cyrillic and Latin, and a newly reformed version of the Latin alphabet. According to a governmental decree, all legal texts are to be fully adapted to the reformed alphabet by 2023.


Installation

Python

pip install UzTransliterator
Source: https://pypi.org/project/UzTransliterator/

Using
from UzTransliterator import UzTransliterator
obj = UzTransliterator.UzTransliterator()
print(obj.transliterate("мактаб", from_="cyr", to="lat"))
Output: maktab

Options

from_='cyr', to='lat'
from_='cyr', to='nlt'
from_='lat', to='cyr'
from_='lat', to='nlt'
from_='nlt', to='cyr'
from_='nlt', to='lat'
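
The six option pairs above are every ordered pair of two different scripts out of the three codes. As a minimal sketch (the names SCRIPTS, direction_pairs and check_direction are hypothetical helpers and not part of the package), the pairs can be generated and checked before calling transliterate:

```python
from itertools import permutations

# Script codes as they appear in the Options list above.
SCRIPTS = ("cyr", "lat", "nlt")


def direction_pairs():
    """Return every ordered (from_, to) pair of two different scripts."""
    return list(permutations(SCRIPTS, 2))


def check_direction(from_, to):
    """Raise ValueError if (from_, to) is not one of the listed pairs.

    Hypothetical helper for illustration; the package may do its own
    validation internally.
    """
    if (from_, to) not in direction_pairs():
        raise ValueError(f"unsupported direction: from_={from_!r}, to={to!r}")
    return from_, to


if __name__ == "__main__":
    for from_, to in direction_pairs():
        print(f"from_={from_!r}, to={to!r}")
```

A checked pair can then be passed on as obj.transliterate(text, from_=from_, to=to), following the call shown in the Using section.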

Web Interface

https://nlp.urdu.uz/?menu=translit

API

URL: https://uz-translit.herokuapp.com/translit
Methods: GET, POST
Parameters: text:str, from_:str, to:str
Example Request: https://uz-translit.herokuapp.com/translit?text=мактаб&from_=cyr&to=lat
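
A GET request like the one above can be built with the Python standard library, which also takes care of percent-encoding the Cyrillic text. This is only a sketch: the function names build_translit_url and fetch_translit are hypothetical, and because the response format is not documented here, the body is returned as raw text rather than parsed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint and parameter names are taken from the API section above.
API_URL = "https://uz-translit.herokuapp.com/translit"


def build_translit_url(text, from_, to):
    """Build the GET URL; urlencode percent-encodes non-ASCII text as UTF-8."""
    query = urlencode({"text": text, "from_": from_, "to": to})
    return f"{API_URL}?{query}"


def fetch_translit(text, from_, to, timeout=10):
    """Send the GET request and return the raw response body as a string.

    Assumption: the response is UTF-8 text. Its exact structure is not
    described in this document, so no parsing is attempted.
    """
    with urlopen(build_translit_url(text, from_, to), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the request URL only; call fetch_translit(...) to contact the service.
    print(build_translit_url("мактаб", "cyr", "lat"))
```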

Note

The New Latin alphabet differs from the current Latin alphabet in a few letters. The main changes are listed below, with the Latin form first and the New Latin form second:
“G‘, g‘” — “Ḡ, ḡ”
“O‘, o‘” — “Ō, ō”
“Sh, sh” — “Ş, ş”
“Ch, ch” — “Ç, ç”
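
The four changes listed in this note can be illustrated with a plain string replacement. This is a minimal sketch of the mapping only, not the package's implementation: it covers just the pairs listed here, assumes the apostrophe character is written exactly as shown in the note, and does not handle all-caps digraphs or words where "s" and "h" belong to separate sounds. The names LATIN_TO_NEW_LATIN and latin_to_new_latin_sketch are hypothetical.

```python
# Latin -> New Latin pairs, copied from the note above.
LATIN_TO_NEW_LATIN = {
    "G‘": "Ḡ", "g‘": "ḡ",
    "O‘": "Ō", "o‘": "ō",
    "Sh": "Ş", "sh": "ş",
    "Ch": "Ç", "ch": "ç",
}


def latin_to_new_latin_sketch(text):
    """Apply only the four letter changes listed in the note.

    Illustrative helper; real text may need extra rules that the
    package handles and this sketch does not.
    """
    for old, new in LATIN_TO_NEW_LATIN.items():
        text = text.replace(old, new)
    return text


if __name__ == "__main__":
    print(latin_to_new_latin_sketch("o‘qituvchi"))  # ōqituvçi
    print(latin_to_new_latin_sketch("Shahar"))      # Şahar
```

For real use, the package call with from_='lat', to='nlt' is the intended route; the sketch only makes the letter table concrete.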

Built With

Programming language used:

These are the major libraries used inside Python:


License

Distributed under the MIT License. See LICENSE.txt for more information.


