adyghe-latin-utils
🌐 English | Türkçe | Русский | עברית
Python utilities for the Adyghe (Western Circassian) language — Cyrillic↔Latin alphabet conversion and number-to-words conversion.
About Adyghe
Adyghe (адыгабзэ / adıǵabze) is a Northwest Caucasian language spoken by approximately 600,000 people, primarily in the Republic of Adygea (Russia), Turkey, Jordan, Syria, and diaspora communities worldwide. Its ISO 639-3 language code is ady.
Adyghe has been written in the Cyrillic script since 1938. A Latin-based Adyghe alphabet also exists as an official writing system. This package provides tools for converting between these two official alphabets and for converting numbers into Adyghe words.
Features
- Cyrillic → Latin conversion — context-aware conversion between the official Cyrillic and Latin Adyghe alphabets, handling compound characters (гу, гъ, дж, дз, жь, кӀ, ку, шъ, etc.)
- Latin → Cyrillic conversion — reverse conversion with vowel insertion and palochka (Ӏ) rules
- Number to words — converts integers (0 to 10¹⁵) into Adyghe words using the modern decimal (base-10) system
- Numbers in text — detects and converts 12 types of numeric patterns in mixed text: phone numbers, currencies ($), percentages (%), ranges (7-12), decimals (5.11), Roman numerals (IV), signed numbers (+14, -32), and more
- Case utilities — uppercase, lowercase, and capitalize with proper handling of special Latin characters (İ/ı) and Cyrillic palochka (Ӏ)
- Script detection — detect whether text is written in Cyrillic Adyghe
- CLI tools — command-line utilities for batch file conversion with multiprocessing support
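The compound characters named in the feature list (гу, гъ, дж, дз, жь, кӀ, ку, шъ) have to be recognized as single units before any letter-by-letter mapping can be applied. The following is a minimal sketch of that first step only, assuming a plain left-to-right scan that prefers a two-character match. `COMPOUNDS` and `split_units` are hypothetical names, and the tuple holds just the examples listed in this README, not the package's full table or its context rules.

```python
# Illustrative subset of compound characters, taken from the feature list above.
# "\u04c0" is the Cyrillic palochka.
COMPOUNDS = ("гу", "гъ", "дж", "дз", "жь", "к\u04c0", "ку", "шъ")


def split_units(word: str) -> list[str]:
    """Split a Cyrillic word into units, preferring two-character compounds."""
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in COMPOUNDS:
            # Treat the compound as one unit and skip both characters.
            units.append(pair)
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


print(split_units("гупшысэ"))  # ['гу', 'п', 'ш', 'ы', 'с', 'э']
```

A real converter would then map each unit to its Latin counterpart, taking neighbouring units into account where the mapping is context-dependent.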
Installation
```bash
pip install adyghe-latin-utils
```
Or with uv:
```bash
uv add adyghe-latin-utils
```
Quick Start
Alphabet Conversion
```python
from adyghe_latin_utils import AdigaCharacterUtils

utils = AdigaCharacterUtils()

# Cyrillic to Latin
utils.cyrillic_to_latin("гупшысэ")    # → "ǵupşıśé"
utils.cyrillic_to_latin("лъэхъаным")  # → "ĺéḣáním"
utils.cyrillic_to_latin("къещхы")     # → "kéşḣı"

# Latin to Cyrillic
utils.latin_to_cyrillic("selam")  # → "сэлам"
utils.latin_to_cyrillic("adıǵe")  # → "адыгэ"

# Script detection
utils.is_cyrillic_adyghe("гупшысэ")  # → True
utils.is_cyrillic_adyghe("ǵupşıśé")  # → False
```
Number to Words
This library uses the modern decimal (base-10) system for number-to-words conversion. Adyghe traditionally uses a vigesimal (base-20) counting system similar to French (e.g., French soixante-douze = 60 + 12 for "72"). In the traditional Adyghe system, "72" is ṫoćişıre ṫure (roughly "three-twenties-and-twelve"). Modern usage has shifted towards a simpler decimal (base-10) system:
| Number | Modern decimal (this library) | Traditional vigesimal (not supported) |
|---|---|---|
| 72 | blıć ṫu (7-tens and 2) | ṫoćişıre ṫure (3×20 + 12) |
```python
from adyghe_latin_utils import AdigaNumberUtils

AdigaNumberUtils.number_to_words(5)     # → "tfı"
AdigaNumberUtils.number_to_words(42)    # → "pĺ'ıć ṫu"
AdigaNumberUtils.number_to_words(100)   # → "şe"
AdigaNumberUtils.number_to_words(1000)  # → "min"
AdigaNumberUtils.number_to_words(2025)  # → "ṫu min ṫuć tfı"
```
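The difference between the two counting systems in the 72 row of the table above comes down to which base the number is divided by. The short sketch below only reproduces that arithmetic; it does not produce Adyghe words, and the function names are hypothetical rather than part of the package.

```python
def decimal_parts(n: int) -> tuple[int, int]:
    """Split n (0-99) into (tens, units), as in the modern decimal system."""
    return divmod(n, 10)


def vigesimal_parts(n: int) -> tuple[int, int]:
    """Split n (0-99) into (twenties, remainder), as in the traditional system."""
    return divmod(n, 20)


print(decimal_parts(72))    # (7, 2)   -> 7 tens and 2
print(vigesimal_parts(72))  # (3, 12)  -> 3 twenties and 12
```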
Numbers in Mixed Text
```python
from adyghe_latin_utils import AdigaNumberUtils

AdigaNumberUtils.convert_numbers_in_text("chapter 3")
# → "chapter şı"
AdigaNumberUtils.convert_numbers_in_text("agent 007")
# → "agent ziy ziy blı"
AdigaNumberUtils.convert_numbers_in_text("the year 2025")
# → "the year ṫu min ṫuć tfı"
```
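Before a number in running text can be read out, a converter has to decide what kind of token it is: in the examples above, 007 is read digit by digit while 2025 is read as a whole number. One way such a decision could be made is an ordered list of regular expressions where the first full match wins. The sketch below is an assumption for illustration only: the pattern names, the regexes, and in particular the "leading zero means digit-by-digit" rule are guesses and are not taken from the package's source.

```python
import re

# Ordered from most specific to most general; the first full match wins.
# All names and patterns here are hypothetical.
PATTERNS = [
    ("intl_phone", re.compile(r"\+\d{1,3}(?:-\d+){2,}")),
    ("dollar", re.compile(r"\$\d[\d,]*")),
    ("decimal", re.compile(r"\d+\.\d+")),
    ("range", re.compile(r"\d+-\d+")),
    ("leading_zero", re.compile(r"0\d+")),  # assumed to be read digit by digit
    ("plain", re.compile(r"\d[\d,]*")),
]


def classify(token: str) -> str:
    """Return the name of the first pattern that matches the whole token."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return name
    return "other"


for token in ["+972-58-206-2315", "$16,918", "5.11", "1042-1814", "007", "2025"]:
    print(token, "->", classify(token))
```

Ordering matters in a scheme like this: if `plain` came first, `007` would never reach the `leading_zero` rule.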
Case Utilities
```python
from adyghe_latin_utils import AdigaCharacterUtils

utils = AdigaCharacterUtils()

# Latin text
utils.to_lowercase("ADIGE", is_latin=True, is_cyrillic=False)
utils.to_uppercase("adıǵe", is_latin=True, is_cyrillic=False)
utils.capitalize("adıǵe", is_latin=True, is_cyrillic=False)

# Simplify special Latin chars to basic English
utils.special_chars_to_english_chars("ǵupşıśé")  # → "gupsise"
```
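Script-aware casing is needed because Python's built-in `str.lower()` and `str.upper()` pair `I` with `i`, whereas an alphabet with dotted and dotless i pairs `I` with `ı` and `İ` with `i`. The sketch below shows the general technique with translation tables applied before the built-in methods. It assumes Turkish-style pairing for these two letters; `lower_dotless` and `upper_dotless` are hypothetical helpers, not the package's implementation, and palochka handling is not covered.

```python
# Handle the dotted/dotless i pairs first, then let Python case everything else.
_TO_LOWER = str.maketrans({"I": "ı", "İ": "i"})
_TO_UPPER = str.maketrans({"ı": "I", "i": "İ"})


def lower_dotless(text: str) -> str:
    """Lowercase text, mapping I to dotless ı and İ to i."""
    return text.translate(_TO_LOWER).lower()


def upper_dotless(text: str) -> str:
    """Uppercase text, mapping ı to I and i to dotted İ."""
    return text.translate(_TO_UPPER).upper()


print("ADIGE".lower())         # adige  (built-in behaviour)
print(lower_dotless("ADIGE"))  # adıge
print(upper_dotless("adıge"))  # ADIGE
```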
CLI Usage
Two command-line tools are installed with the package:
Script Conversion
```bash
# Cyrillic to Latin (file to file)
adyghe-char-convert -i input.txt -o output.txt -d c2l

# Latin to Cyrillic (file to stdout)
adyghe-char-convert -i input.txt -d l2c

# Options:
#   -i, --input      Path to input text file (required)
#   -o, --output     Path to output file (default: stdout)
#   -d, --direction  c2l (Cyrillic→Latin) or l2c (Latin→Cyrillic) (required)
```
The script conversion CLI supports multiprocessing for large files and displays a progress bar.
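For readers curious how line-based conversion can be spread across processes, here is a minimal sketch using the standard library's `multiprocessing.Pool`. It is not the CLI's actual code: `convert_line` is a stand-in that merely uppercases its input, `convert_lines` and the `chunksize` value are assumptions, and the progress bar is left out.

```python
from multiprocessing import Pool


def convert_line(line: str) -> str:
    """Placeholder for a real per-line conversion (here: just uppercase)."""
    return line.upper()


def convert_lines(lines: list[str], workers: int = 1) -> list[str]:
    """Convert lines in order, optionally using several worker processes."""
    if workers <= 1:
        # Small inputs are cheaper to process in the current process.
        return [convert_line(line) for line in lines]
    with Pool(processes=workers) as pool:
        # pool.map preserves input order; chunksize batches lines per task.
        return pool.map(convert_line, lines, chunksize=256)
```

When run as a script, a call such as `convert_lines(lines, workers=4)` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.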
Number Conversion
```bash
# Convert numbers in a text string
adyghe-num-convert -t "chapter 3"

# Convert numbers in a file
adyghe-num-convert -i input.txt -o output.txt

# Options:
#   -t, --text    Input text string (mutually exclusive with -i)
#   -i, --input   Path to input text file (mutually exclusive with -t)
#   -o, --output  Path to output file (default: stdout)
```
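The mutually exclusive `-t` / `-i` options listed above map naturally onto `argparse`'s mutually exclusive groups. The sketch below shows how an interface with these flags could be declared; it is an illustration rather than the tool's real parser, `build_parser` is a hypothetical name, and making one of the two inputs required is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags documented for adyghe-num-convert."""
    parser = argparse.ArgumentParser(prog="adyghe-num-convert")
    # Exactly one of -t / -i must be given (the "required" part is assumed).
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-t", "--text", help="Input text string")
    source.add_argument("-i", "--input", help="Path to input text file")
    parser.add_argument("-o", "--output", help="Path to output file (default: stdout)")
    return parser


args = build_parser().parse_args(["-t", "chapter 3"])
print(args.text)    # chapter 3
print(args.output)  # None
```

Passing both `-t` and `-i` to a parser built this way makes `argparse` print a usage error and exit.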
API Reference
AdigaCharacterUtils
| Method | Description |
|---|---|
| cyrillic_to_latin(text: str) -> str | Convert Cyrillic Adyghe text to Latin script |
| latin_to_cyrillic(text: str) -> str | Convert Latin Adyghe text to Cyrillic script |
| is_cyrillic_adyghe(text: str, threshold: float = 0.5) -> bool | Detect if text is Cyrillic Adyghe |
| to_lowercase(text, is_latin, is_cyrillic) -> str | Lowercase with script-aware rules |
| to_uppercase(text, is_latin, is_cyrillic) -> str | Uppercase with script-aware rules |
| capitalize(text, is_latin, is_cyrillic) -> str | Capitalize first character |
| special_chars_to_english_chars(text: str) -> str | Simplify accented Latin chars to ASCII |
| cyrillic_extra_chars_to_basic_chars(text: str) -> str | Normalize Cyrillic character variants |
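The `threshold` parameter of `is_cyrillic_adyghe` suggests a ratio-based check. The sketch below shows one plausible reading, where the share of letters that fall in the Unicode Cyrillic block is compared against the threshold. This is an assumption about the general approach, not the package's actual logic: `cyrillic_ratio` and `looks_cyrillic` are hypothetical names, and whether the real comparison is `>=` or `>` is unknown.

```python
def cyrillic_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters in the Cyrillic block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    # U+0400..U+04FF is the basic Cyrillic block (it includes the palochka).
    cyrillic = sum(1 for ch in letters if "\u0400" <= ch <= "\u04ff")
    return cyrillic / len(letters)


def looks_cyrillic(text: str, threshold: float = 0.5) -> bool:
    """Treat text as Cyrillic when enough of its letters are Cyrillic."""
    return cyrillic_ratio(text) >= threshold


print(looks_cyrillic("гупшысэ"))  # True
print(looks_cyrillic("ǵupşıśé"))  # False
```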
AdigaNumberUtils
| Method | Description |
|---|---|
| number_to_words(number: int) -> str | Convert integer (0–10¹⁵) to Adyghe words |
| convert_numbers_in_text(text: str) -> str | Find and convert all numeric patterns in text |
Supported Numeric Patterns
convert_numbers_in_text() recognizes and converts these patterns:
| Pattern | Example | Description |
|---|---|---|
| International phone | +972-58-206-2315 | Digits read individually |
| Local phone | 058-206-2315 | Digits read individually |
| Prefix-dash | ya-20 | Number converted, prefix preserved |
| Postfix-dash | 13-re | Number converted, postfix preserved |
| Dollar amount | $16,918 | Full number conversion |
| Signed number | +14, -32 | Sign preserved, number converted |
| Range | 1042-1814 | Each number converted separately |
| Decimal | 5.11, 50.06% | Integer and fractional parts converted |
| Slash-separated | 2010/11 | Each part converted |
| Symbol postfix | 4%, 804+ | Number converted, symbol preserved |
| Roman numeral | III, IV | Converted to Arabic then to words |
| Plain number | 42, 1,000,000 | Full number conversion |
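The Roman numeral row describes a two-step process: convert to an Arabic number first, then to words. The first step can be sketched with the standard right-to-left subtraction rule. This is a generic illustration, not the package's code; `roman_to_int` is a hypothetical name, and the function does not validate malformed numerals.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral to an integer (no validation of the input form)."""
    total = 0
    largest_seen = 0
    # Scan right to left: a symbol smaller than one already seen is subtracted.
    for ch in reversed(numeral.upper()):
        value = ROMAN_VALUES[ch]
        if value < largest_seen:
            total -= value
        else:
            total += value
            largest_seen = value
    return total


print(roman_to_int("III"))  # 3
print(roman_to_int("IV"))   # 4
```

The resulting integer could then be passed to `AdigaNumberUtils.number_to_words` for the second step.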
Development
```bash
# Clone the repository
git clone https://github.com/showgan/adyghe-latin-utils.git
cd adyghe-latin-utils

# Set up development environment
uv venv
uv pip install -e ".[dev]"

# Run tests
uv run pytest tests/ -v
```
License
This project is licensed under the MIT License — see the LICENSE file for details.