UzTransliterator | Transliteration tool for Uzbek language - Cyrillic<>Latin<>NewLatin
UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language, Cyrillic<>Latin<>NewLatin
The main goal of this paper is to present a state-of-the-art machine transliteration tool for the low-resource Uzbek language. The tool converts between the three scripts in common use: the old Cyrillic alphabet, the currently official Latin alphabet, and the newly announced New Latin alphabet. It was created using a combination of rule-based and statistical approaches and is available as an open-source Python package, as well as a web-based application with a public API.
About The Project
Feel free to use the tool presented in this project. If you find it useful, please make sure to cite the paper here (coming soon). A demo of the web-based transliteration tool can be seen here.
The paper presents a Python package, a web tool, and an API for the Uzbek language that perform machine transliteration between the two widely used alphabets, Cyrillic and Latin, as well as a newly reformed version of the Latin alphabet, to which, according to a governmental decree, all legal texts are to be fully adapted by 2023.
Installation
Python
pip install UzTransliterator
Source: https://pypi.org/project/UzTransliterator/
Using
from UzTransliterator import UzTransliterator
obj = UzTransliterator.UzTransliterator()
print(obj.transliterate("мактаб", from_="cyr", to="lat"))
Output: maktab
Options
from_='cyr', to='lat'
from_='cyr', to='nlt'
from_='lat', to='cyr'
from_='lat', to='nlt'
from_='nlt', to='cyr'
from_='nlt', to='lat'
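The six option pairs above are every ordered pair of the three script codes. As a minimal sketch, the helper below enumerates them and converts one input into both other scripts. The names `SCRIPTS`, `DIRECTIONS`, and `transliterate_to_all` are hypothetical and not part of the package; the only assumption about the package is the `transliterate(text, from_=..., to=...)` call shown in the Using section.

```python
from itertools import permutations

# Script codes taken from the Options list above.
SCRIPTS = ("cyr", "lat", "nlt")

# Every ordered (from_, to) pair; this yields the six combinations listed above.
DIRECTIONS = list(permutations(SCRIPTS, 2))


def transliterate_to_all(transliterator, text, source):
    """Convert `text` from `source` into each of the other two scripts.

    `transliterator` is any object exposing
    transliterate(text, from_=..., to=...), as in the Using section.
    This helper is illustrative only and not part of UzTransliterator.
    """
    if source not in SCRIPTS:
        raise ValueError(f"unknown script code: {source!r}")
    return {
        target: transliterator.transliterate(text, from_=source, to=target)
        for target in SCRIPTS
        if target != source
    }
```

With the package installed, passing `UzTransliterator.UzTransliterator()` as the first argument should return a dictionary keyed by the two target script codes.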
Web Interface
https://nlp.urdu.uz/?menu=translit
API
URL: https://uz-translit.herokuapp.com/translit
Methods: GET, POST
Parameters: text:str, from_:str, to:str
Example Request: https://uz-translit.herokuapp.com/translit?text=мактаб&from_=cyr&to=lat
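The example request above can be built from Python with the standard library alone. The sketch below is not an official client: the function names are hypothetical, and because the response format is not documented here, the body is returned as raw text rather than parsed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the API section above.
API_URL = "https://uz-translit.herokuapp.com/translit"


def build_translit_url(text, from_, to):
    """Return a GET URL with the three documented query parameters."""
    # urlencode percent-encodes non-ASCII input such as Cyrillic text.
    query = urlencode({"text": text, "from_": from_, "to": to})
    return f"{API_URL}?{query}"


def fetch_translit(text, from_, to, timeout=10):
    """Send the GET request and return the response body as a string.

    Assumption: the body is UTF-8 text. Its exact structure is not
    documented in this README, so it is not parsed here.
    """
    with urlopen(build_translit_url(text, from_, to), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_translit("мактаб", "cyr", "lat")` requires network access and a running service; `build_translit_url` can be used offline to inspect the request.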
Note
The New Latin alphabet differs from the current Latin alphabet in a few letters. The main changes are listed below in the format Latin - New Latin:
“G‘, g‘” - “Ḡ, ḡ”
“O‘, o‘” - “Ō, ō”
“Sh, sh” - “Ş, ş”
“Ch, ch” - “Ç, ç”
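To make the four substitutions listed above concrete, here is a deliberately naive sketch that applies them with plain string replacement. It is not the package's algorithm, which the project describes as combining rule-based and statistical approaches. The names `LATIN_TO_NEW_LATIN` and `latin_to_new_latin` are hypothetical, and the sketch assumes the apostrophe-like mark is written as ‘ (U+2018), as in the list above. It does not handle other apostrophe variants or all-caps digraphs such as "SH".

```python
# Letter pairs copied from the list above (Latin -> New Latin).
# Assumption: the modifier mark is U+2018, as written in that list.
LATIN_TO_NEW_LATIN = {
    "G‘": "Ḡ", "g‘": "ḡ",
    "O‘": "Ō", "o‘": "ō",
    "Sh": "Ş", "sh": "ş",
    "Ch": "Ç", "ch": "ç",
}


def latin_to_new_latin(text):
    """Apply the listed substitutions with simple string replacement.

    Illustrative only: real text needs extra rules, for example for
    other apostrophe characters and for fully capitalised words.
    """
    for latin, new_latin in LATIN_TO_NEW_LATIN.items():
        text = text.replace(latin, new_latin)
    return text
```

Words that contain none of the listed letters, such as "maktab", pass through unchanged.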
Built With
Programming language used: Python
These are the major libraries used inside Python:
License
Distributed under the MIT License. See LICENSE.txt for more information.