Iuliia

Transliterate Cyrillic → Latin in every possible way

Transliteration means representing Cyrillic text (mainly names and geographic locations) with Latin letters. It is used for international passports, visas, green cards, driving licenses, mail and goods delivery, etc.

Iuliia makes transliteration as easy as:

>>> import iuliia
>>> iuliia.translate("Юлия Щеглова", schema=iuliia.WIKIPEDIA)
'Yuliya Shcheglova'

Why use Iuliia

  • 19 transliteration schemas (rule sets), including all main international and Russian standards.
  • Correctly implements not only the base mapping but also all the special rules for letter combinations and word endings (AFAIK, Iuliia is the only library that does so).
  • Simple API and zero third-party dependencies.
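
To make "base mapping plus special rules" concrete, here is a minimal sketch of how such a rule set could be applied. It is not Iuliia's actual implementation: the tables `BASE`, `AFTER_PREVIOUS` and `ENDINGS` and the functions `translate_word` and `toy_translate` are hypothetical, and the rules are toy values that do not come from any real standard.

```python
import re

# Toy rule tables for illustration only; they are NOT taken from any real
# standard and cover just the letters used in the examples below.
BASE = {
    "а": "a", "в": "v", "е": "e", "и": "i", "й": "y", "л": "l",
    "н": "n", "о": "o", "р": "r", "с": "s", "ы": "y", "ь": "", "ю": "yu",
}
# (previous letter, current letter) -> replacement for the current letter
AFTER_PREVIOUS = {("ь", "е"): "ye"}
# word ending -> replacement for the whole ending
ENDINGS = {"ий": "y", "ый": "y"}


def translate_word(word: str) -> str:
    """Transliterate one word: ending rule first, then letter by letter."""
    stem, suffix = word, ""
    for ending, replacement in ENDINGS.items():
        # Only treat it as an ending if something precedes it.
        if len(word) > len(ending) and word.lower().endswith(ending):
            stem, suffix = word[: -len(ending)], replacement
            break

    result = []
    previous = ""
    for char in stem:
        lower = char.lower()
        # A combination rule wins over the base mapping;
        # characters missing from both tables pass through unchanged.
        latin = AFTER_PREVIOUS.get((previous, lower), BASE.get(lower, char))
        result.append(latin.capitalize() if char.isupper() else latin)
        previous = lower
    return "".join(result) + suffix


def toy_translate(text: str) -> str:
    """Transliterate every word, leaving spaces and punctuation untouched."""
    return re.sub(r"\w+", lambda match: translate_word(match.group()), text)


print(toy_translate("Юрий Васильев"))  # Yury Vasilyev
print(toy_translate("новый"))          # novy
```

With the base mapping alone, "новый" would come out as "novyy" and "Васильев" as "Vasilev"; the ending and combination tables are what change the result. The sketch ignores details such as all-uppercase words.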

Supported current schemas:

  • ALA-LC (iuliia.ALA_LC and iuliia.ALA_LC_ALT)
  • BGN/PCGN (iuliia.BGN_PCGN and iuliia.BGN_PCGN_ALT)
  • BS 2979:1958 (iuliia.BS_2979 and iuliia.BS_2979_ALT)
  • GOST R 52290-2004 (iuliia.GOST_52290)
  • GOST R 7.0.34-2014 (iuliia.GOST_7034)
  • ICAO DOC 9303 (iuliia.ICAO_DOC_9303)
  • ISO 9:1995 aka GOST 7.79-2000 (iuliia.GOST_779 and iuliia.GOST_779_ALT)
  • UNGEGN 1987 V/18 (iuliia.UNGEGN_1987)
  • Scientific (iuliia.SCIENTIFIC)
  • Telegram (iuliia.TELEGRAM)
  • Wikipedia (iuliia.WIKIPEDIA)
  • Yandex.Maps (iuliia.YANDEX_MAPS)
  • Yandex.Money (iuliia.YANDEX_MONEY)

And deprecated ones:

  • GOST 16876-71 (iuliia.GOST_16876 and iuliia.GOST_16876_ALT)
  • GOST R 52535.1-2006 (iuliia.GOST_52535)
  • ISO/R 9:1954 (iuliia.ISO_9_1954)
  • ISO/R 9:1968 (iuliia.ISO_9_1968 and iuliia.ISO_9_1968_ALT)
  • MVD 310-1997 (iuliia.MVD_310 and iuliia.MVD_310_FR)
  • MVD 782-2000 (iuliia.MVD_782)

Known issues:

  • BS 2979:1958. This schema defines two alternative transliterations for Ы: Ы → Ȳ (used by the Oxford University Press) and Ы → UI (used by the British Library). iuliia uses Ы → Ȳ.
  • GOST R 7.0.34-2014. This schema defines alternatives for many letters, but does not specify when to use which. Therefore, iuliia uses the first of the suggested transliterations for each such letter.
  • MVD-310. This schema defines a "С between two vowels → SS" rule. No other schema has such a rule, and MVD-310 itself is deprecated, so I decided to ignore this specific rule for the sake of code simplicity.
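
If the omitted MVD-310 rule matters for your data, one possible workaround is to pre-process the Cyrillic text before transliterating it. The sketch below is not part of iuliia. The helper `double_s_between_vowels` is hypothetical, and two things are assumptions: that "vowel" means the ten Russian vowel letters, and that the chosen schema maps С to S, so the doubled letter comes out as SS.

```python
import re

# Assumption: "vowel" means any of the ten Russian vowel letters.
VOWELS = "аеёиоуыэюя"
# A Cyrillic "с" with a vowel immediately before and after it.
S_BETWEEN_VOWELS = re.compile(f"(?<=[{VOWELS}])с(?=[{VOWELS}])", re.IGNORECASE)


def double_s_between_vowels(text: str) -> str:
    """Double every Cyrillic "с" that stands between two vowels."""
    return S_BETWEEN_VOWELS.sub(lambda match: match.group() * 2, text)


print(double_s_between_vowels("Масалов"))   # Массалов
print(double_s_between_vowels("Мостовой"))  # Мостовой (unchanged)
```

The result can then be passed to `iuliia.translate` with the schema of your choice.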

Installation

pip install iuliia

Usage

API:

import iuliia

# list all supported schemas
for schema_name in iuliia.Schemas.names():
    print(schema_name)

# transliterate using specified schema
source = "Юлия Щеглова"
iuliia.translate(source, schema=iuliia.ICAO_DOC_9303)
# "Iuliia Shcheglova"

# or pick schema by name
schema = iuliia.Schemas.get("wikipedia")
iuliia.translate(source, schema)
# "Yuliya Shcheglova"
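
The calls above can be combined to compare how every schema renders the same input. This is a small sketch rather than an official recipe: `compare_schemas` is a hypothetical helper, and it assumes that each name returned by `Schemas.names()` is accepted by `Schemas.get()`. The library module is passed in as a parameter, so the helper has no import of its own.

```python
def compare_schemas(source, lib):
    """Transliterate `source` with every schema that `lib` lists.

    `lib` is expected to be the imported iuliia module. Only the calls
    shown in the usage example are used: Schemas.names(), Schemas.get()
    and translate().
    """
    results = {}
    for schema_name in lib.Schemas.names():
        schema = lib.Schemas.get(schema_name)
        results[schema_name] = lib.translate(source, schema)
    return results
```

Usage, with iuliia installed: `import iuliia`, then `compare_schemas("Юлия Щеглова", iuliia)` returns a dictionary that maps each schema name to its transliteration.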

Command line:

$ iuliia icao_doc_9303 "Юлия Щеглова"
Iuliia Shcheglova

Development setup

$ pip install black coverage flake8 pylint pytest tox
$ tox

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Make sure to add or update tests as appropriate.

Use Black for code formatting and Conventional Commits for commit messages.

Changelog

License

MIT
