Transliterate Cyrillic → Latin in every possible way
Iuliia
Transliteration means representing Cyrillic data (mainly names and geographic locations) with Latin letters. It is used for international passports, visas, green cards, driving licenses, mail and goods delivery, and so on.
Iuliia makes transliteration as easy as:
>>> import iuliia
>>> iuliia.translate("Юлия Щеглова", schema=iuliia.WIKIPEDIA)
'Yuliya Shcheglova'
Why use Iuliia
- 19 transliteration schemas (rule sets), including all main international and Russian standards.
- Correctly implements not only the base mapping, but all the special rules for letter combinations and word endings (AFAIK, Iuliia is the only library which does so).
- Simple API and zero third-party dependencies.
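To illustrate what "special rules for letter combinations and word endings" means in practice, here is a minimal sketch of a rule set with three layers: a base mapping, one context rule, and whole-ending replacements. It is a toy example, not how Iuliia is implemented and not one of the real schemas. The names BASE, VOWELS, ENDINGS, transliterate_word and transliterate are all hypothetical.

```python
import re

# Hypothetical toy rule set, NOT one of the real schemas listed in this README.
BASE = {
    "а": "a", "в": "v", "д": "d", "е": "e", "и": "i", "й": "y",
    "к": "k", "л": "l", "н": "n", "о": "o", "р": "r", "с": "s",
    "т": "t", "ы": "y", "ь": "",
}
# Context rule (assumed for this sketch): "е" becomes "ye" at the start
# of a word or right after one of these vowels.
VOWELS = set("аеиоы")
# Ending rule (assumed for this sketch): these endings are replaced as a whole.
ENDINGS = {"ий": "y", "ый": "y"}


def transliterate_word(word: str) -> str:
    lower = word.lower()
    stem, suffix = lower, ""
    # 1. Split off a special ending, if the word is longer than the ending.
    for ending, replacement in ENDINGS.items():
        if len(lower) > len(ending) and lower.endswith(ending):
            stem, suffix = lower[: -len(ending)], replacement
            break
    # 2. Map the stem letter by letter, checking the context rule first.
    out = []
    for i, ch in enumerate(stem):
        prev = stem[i - 1] if i > 0 else ""
        if ch == "е" and (i == 0 or prev in VOWELS):
            out.append("ye")
        else:
            # Letters missing from the toy mapping pass through unchanged.
            out.append(BASE.get(ch, ch))
    result = "".join(out) + suffix
    # 3. Restore a leading capital letter.
    return result.capitalize() if word[:1].isupper() else result


def transliterate(text: str) -> str:
    # Only runs of Cyrillic letters are touched; everything else is kept.
    return re.sub(
        r"[а-яё]+",
        lambda m: transliterate_word(m.group()),
        text,
        flags=re.IGNORECASE,
    )
```

With this toy rule set, transliterate("Елена Вольский") gives 'Yelena Volsky': the first "е" hits the context rule, the second one does not, and "ий" is replaced as an ending instead of letter by letter.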
Supports current schemas:
- ALA-LC (iuliia.ALA_LC and iuliia.ALA_LC_ALT)
- BGN/PCGN (iuliia.BGN_PCGN and iuliia.BGN_PCGN_ALT)
- BS 2979:1958 (iuliia.BS_2979 and iuliia.BS_2979_ALT)
- GOST R 52290-2004 (iuliia.GOST_52290)
- GOST R 7.0.34-2014 (iuliia.GOST_7034)
- ICAO DOC 9303 (iuliia.ICAO_DOC_9303)
- ISO 9:1995 aka GOST 7.79-2000 (iuliia.GOST_779 and iuliia.GOST_779_ALT)
- UNGEGN 1987 V/18 (iuliia.UNGEGN_1987)
- Scientific (iuliia.SCIENTIFIC)
- Telegram (iuliia.TELEGRAM)
- Wikipedia (iuliia.WIKIPEDIA)
- Yandex.Maps (iuliia.YANDEX_MAPS)
- Yandex.Money (iuliia.YANDEX_MONEY)
And deprecated ones:
- GOST 16876-71 (iuliia.GOST_16876 and iuliia.GOST_16876_ALT)
- GOST R 52535.1-2006 (iuliia.GOST_52535)
- ISO/R 9:1954 (iuliia.ISO_9_1954)
- ISO/R 9:1968 (iuliia.ISO_9_1968 and iuliia.ISO_9_1968_ALT)
- MVD 310-1997 (iuliia.MVD_310 and iuliia.MVD_310_FR)
- MVD 782-2000 (iuliia.MVD_782)
Known issues:
- BS 2979:1958. This schema defines two alternative transliterations for Ы: Ы → Ȳ (used by the Oxford University Press) and Ы → UI (used by the British Library). iuliia uses Ы → Ȳ.
- GOST R 7.0.34-2014. This schema defines alternatives for many letters, but does not specify when to use which. Therefore, iuliia uses the first of the suggested transliterations for each such letter.
- MVD-310. This schema defines a "С between two vowels → SS" rule. There is no such rule in other schemas, and MVD-310 itself is deprecated, so I decided to ignore this specific rule for the sake of code simplicity.
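If you do need the skipped MVD-310 rule, one possible workaround is to pre-process the Cyrillic text before transliterating it. The sketch below doubles every "с" that sits between two vowels, on the assumption that the schema maps each "с" to "s", so the doubled letter comes out as "ss". The helper double_s_between_vowels and the vowel list are my own assumptions, not part of the library, and I have not checked the result against every case in the standard.

```python
import re

# Assumption: these ten letters are the vowels the rule refers to.
_VOWELS = "аеёиоуыэюя"
# Cyrillic "с" with a vowel directly before and directly after it.
_S_BETWEEN_VOWELS = re.compile(
    rf"(?<=[{_VOWELS}])с(?=[{_VOWELS}])", re.IGNORECASE
)


def double_s_between_vowels(text: str) -> str:
    """Double every Cyrillic "с" that sits directly between two vowels."""
    # m.group() keeps the original case: "С" becomes "СС", "с" becomes "сс".
    return _S_BETWEEN_VOWELS.sub(lambda m: m.group() * 2, text)
```

For example, double_s_between_vowels("Тарасов") returns "Тарассов", while "Соколов" is left unchanged because its "С" is at the start of the word. The pre-processed string could then be passed to iuliia.translate with iuliia.MVD_310.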
Installation
pip install iuliia
Usage
API:
import iuliia
# list all supported schemas
for schema_name in iuliia.Schemas.names():
    print(schema_name)
# transliterate using specified schema
source = "Юлия Щеглова"
iuliia.translate(source, schema=iuliia.ICAO_DOC_9303)
# "Iuliia Shcheglova"
# or pick schema by name
schema = iuliia.Schemas.get("wikipedia")
iuliia.translate(source, schema)
# "Yuliya Shcheglova"
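Since different schemas give different results for the same input, it can be handy to see several of them side by side. Here is a small sketch of such a helper. The function compare_schemas is hypothetical and not part of the library. It takes the lookup and transliteration functions as arguments, so it has no hard dependency on iuliia.

```python
from typing import Callable, Dict, Iterable


def compare_schemas(
    source: str,
    schema_names: Iterable[str],
    get_schema: Callable[[str], object],
    translate: Callable[[str, object], str],
) -> Dict[str, str]:
    """Transliterate `source` once per schema name and collect the results."""
    # get_schema turns a name into a schema object; translate applies it.
    return {name: translate(source, get_schema(name)) for name in schema_names}
```

With the library installed, I would expect a call like compare_schemas(source, ["wikipedia", "icao_doc_9303"], iuliia.Schemas.get, iuliia.translate) to return a dict with one transliteration per schema name, assuming Schemas.get accepts both of those names.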
Command line:
$ iuliia icao_doc_9303 "Юлия Щеглова"
Iuliia Shcheglova
Development setup
$ pip install black coverage flake8 pylint pytest tox
$ tox
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Make sure to add or update tests as appropriate.
Use Black for code formatting and Conventional Commits for commit messages.
Changelog
License