Transliterate Chechen text from Cyrillic to Latin script using the Chechen Latin alphabet

Chechen Transliterator

A Python library for transliterating Chechen text from Cyrillic to Latin script using the Chechen Latin alphabet.

Installation

pip install ce-translit

Quick Start

import ce_translit

# Simple usage - transliterate Chechen text
text = "Нохчийн мотт"
result = ce_translit.transliterate(text)
print(result)  # Outputs: "Noxçiyŋ mott"

Features

  • Simple API: Clean, single-function interface
  • Linguistically Accurate: Handles Chechen-specific rules such as position-dependent 'е', word-final 'н', and the letter combinations 'къ', 'хь', and 'гӏ'
  • Context-Aware: Output for some letters depends on their position in the word
  • Customizable: Advanced options for specialized use cases
  • Pure Python: No external dependencies
  • Memory Efficient: Efficient string handling keeps memory usage low

Detailed Usage

Basic Usage

import ce_translit

# Transliterate a single word
word_result = ce_translit.transliterate("дош")  # "doş"

# Transliterate a sentence
sentence = "Муха ду хьал де?"
sentence_result = ce_translit.transliterate(sentence)  # "Muxa du ẋal de?"

Advanced Usage with Custom Rules

from ce_translit import Transliterator

# Create a custom transliterator with your own rules
custom_transliterator = Transliterator(
    # Custom letter mapping
    mapping={
        # Start from the default mapping (note: _mapping is a private attribute)
        **Transliterator()._mapping,
        # Then override specific mappings
        "й": "j",
        # Add completely new mappings
        "1": "j"
    },
    # Override the blacklist (words that keep a regular 'n' at the end)
    blacklist=["дин", "гӏан", "сан"],
    # Override the unsurelist (words that get 'ŋ[REPLACE]' at the end)
    unsurelist=["шун", "бен", "цӏен"]
)

# Use the custom transliterator
result = custom_transliterator.transliterate("1аж дин шун")

If you omit the **Transliterator()._mapping entry from the custom mapping, the custom transliterator uses only the mappings you provide and inherits none of the defaults.
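
The merge behaviour described above is ordinary Python dictionary unpacking. The following is a minimal sketch using a small made-up base_mapping as a stand-in for the library's default table (the real table is larger and is not reproduced here):

```python
# Hypothetical stand-in for the library's default mapping (NOT the real table).
base_mapping = {"д": "d", "и": "i", "й": "y", "н": "n"}

# Unpack the base first, then list overrides: later keys win.
merged = {**base_mapping, "й": "j", "1": "j"}

# Without the unpacking, only the explicitly listed keys exist.
overrides_only = {"й": "j", "1": "j"}

print(merged["й"])            # j  (override replaced the base value)
print(merged["д"])            # d  (inherited from the base mapping)
print("д" in overrides_only)  # False (base entries were never copied in)
```

Because later keys overwrite earlier ones, the unpacked base has to come first in the dictionary literal; placing it last would undo the overrides.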

Override just one of the lists by defining it separately

from ce_translit import Transliterator

# Define your own list
my_blacklist = ["дин", "гӏан", "сан"]

# Create a custom transliterator with defined blacklist
custom_transliterator = Transliterator(blacklist=my_blacklist)
result = custom_transliterator.transliterate("дин")

Special Transliteration Rules

The library handles several special rules in Chechen transliteration:

  1. Letter 'е':

    • At the start of a word → 'ye' (ex: "елар" → "yelar")
    • After 'ъ' → 'ye' (ex: "шелъелча" → "şelyelça")
    • In other positions → 'e' (ex: "мела" → "mela")
  2. Letter 'н' at end of words:

    • Regular handling → 'ŋ' (ex: "сан" → "saŋ")
    • Blacklisted words keep 'n' (ex: "хан" → "xan")
    • Unsurelist words use 'ŋ[REPLACE]' (ex: "шун" → "şuŋ[REPLACE]")
  3. Standalone 'а':

    • When 'а' is a standalone word → 'ə' (ex: "а" → "ə")
  4. Special Character Combinations:

    • 'къ' → 'q̇'
    • 'хь' → 'ẋ'
    • 'гӏ' → 'ġ'

Technical Details

Performance

The library is optimized for both startup time and runtime performance:

  • Data is loaded once at import time
  • Efficient string handling for minimal memory usage
  • Uses sets for average-case O(1) membership checks against the blacklist and unsurelist
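
As an illustration of the set-based lookup, the sketch below applies the word-final 'н' rule with two small frozensets. The names BLACKLIST, UNSURELIST, and final_n are hypothetical, and the one-word lists are placeholders taken from the examples in this README rather than the library's shipped defaults.

```python
# Hypothetical word lists; the library ships its own defaults.
BLACKLIST = frozenset({"хан"})     # words that keep a plain final 'n'
UNSURELIST = frozenset({"шун"})    # words flagged for manual review


def final_n(word: str) -> str:
    """Return the Latin rendering of the word-final 'н' of a lowercase word."""
    if not word.endswith("н"):
        raise ValueError("word does not end in 'н'")
    if word in BLACKLIST:          # hash lookup, O(1) on average
        return "n"
    if word in UNSURELIST:
        return "ŋ[REPLACE]"
    return "ŋ"


for sample in ["хан", "шун", "сан"]:
    print(sample, "->", final_n(sample))
```

A list would give the same results, but each membership check would scan the whole list, which matters once the word lists grow.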

Development

Setting up the Development Environment

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install development tools
pip install --upgrade hatch pytest

# Run tests
hatch run test

# Build the package
hatch build

# Test the built package (the wheel file name depends on the version you built)
pip install --force-reinstall dist/ce_translit-*.whl

Running Tests

# Install test dependencies
pip install pytest

# Run tests
pytest

Repository Structure

ce-translit-py/
├── src/
│   └── ce_translit/
│       ├── __init__.py         # Public API
│       ├── _transliterator.py  # Core implementation
│       └── data/
│           └── cyrl_latn_map.json  # Character mapping
├── tests/
│   └── test_transliterator.py
├── LICENSE
├── README.md
└── pyproject.toml

License

This project is licensed under the MIT License.

Contributing

Contributions are welcome! Feel free to submit issues or pull requests on the GitHub repository.
