A Python package for Roman to Nepali (Devanagari) transliteration
Nepali Unicoder
A robust Python package for converting Romanized Nepali text and Preeti font text into Unicode Devanagari script. It uses a greedy matching algorithm for Roman transliteration and a two-phase conversion process for Preeti with contextual rules.
Read the Full Documentation for detailed usage guides, Preeti mapping references, and API details.
Features
- Accurate Transliteration: Uses a greedy matching algorithm to prioritize longer phonetic matches (e.g., 'kha' is matched before 'k' and 'h').
- Preeti Font Support: Full support for Preeti to Unicode conversion with 30+ contextual rules for accurate transformation.
- Smart Vowel Handling: Distinguishes between independent vowels (e.g., 'aa' -> 'आ') and vowel signs/matras (e.g., 'ka' -> 'क', 'kaa' -> 'का').
- Contextual Rules: Handles complex Devanagari rules like reph positioning, matra reordering, and special character combinations.
- Mixed Content Support: Allows keeping English words or specific text in Roman script using {} blocks.
- Customizable: Supports custom word-level overrides via word_maps.json.
- CLI Support: Can be used directly from the command line.
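The greedy matching named in the first feature can be pictured with a small sketch. This is not the package's implementation: `ROMAN_TO_DEVANAGARI` and `greedy_transliterate` are hypothetical names, and the table holds only a few entries, four taken from this README's rules and two (`ha`, `h`) assumed by analogy.

```python
# Toy table for illustration only; the package's real tables are larger.
ROMAN_TO_DEVANAGARI = {
    "kha": "ख",
    "kh": "ख्",
    "ka": "क",
    "k": "क्",
    "ha": "ह",   # assumed by analogy with the entries above
    "h": "ह्",   # assumed by analogy with the entries above
}
MAX_KEY_LEN = max(len(key) for key in ROMAN_TO_DEVANAGARI)


def greedy_transliterate(text: str) -> str:
    """Convert text by always consuming the longest matching key."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(MAX_KEY_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in ROMAN_TO_DEVANAGARI:
                out.append(ROMAN_TO_DEVANAGARI[chunk])
                i += size
                break
        else:
            # No key matched: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(greedy_transliterate("kha"))  # ख
```

Because the three-letter key is tried before the shorter ones, 'kha' becomes a single ख rather than क् followed by ह.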
Installation
Install the package from PyPI:
pip install nepali-unicoder
Usage
Command Line Interface (CLI)
You can use the converter directly from the terminal:
# Using the installed command
nepali-unicoder "namaste"
# Output: नमस्ते
# Or using python module
python -m nepali_unicoder "namaste"
# Output: नमस्ते
# Pipe input
echo "mero naam sanjeev ho" | nepali-unicoder
# Output: मेरो नाम् सन्जीव् हो
Python API
from nepali_unicoder.convert import Converter
converter = Converter()
# Basic conversion
text = "namaste nepal"
print(converter.convert(text))
# Output: नमस्ते नेपाल
# Using 'as-is' blocks for English text
mixed_text = "mero naam {Sanjeev} ho"
print(converter.convert(mixed_text))
# Output: मेरो नाम् Sanjeev हो
Preeti Mode
Convert Preeti font text to Unicode with full support for contextual rules:
from nepali_unicoder.convert import Converter
# Create converter in Preeti mode
preeti_converter = Converter(mode="preeti")
# Basic conversion
preeti_text = "s{sf" # Preeti characters
print(preeti_converter.convert(preeti_text))
# Output: र्कका
# The converter handles:
# - Reph positioning: { → र् (moves before consonant)
# - Matra reordering: l (ि) moves after consonant
# - Special m transformations
# - Vowel combinations
# - Literal brackets: { and } are treated as normal characters in Preeti mode
Preeti Character Examples
| Preeti | Unicode | Description |
|---|---|---|
| s | क | Consonant ka |
| s{ | र्क | Reph + ka (contextual) |
| sl | कि | ka + short i (reordered) |
| qm | क्र | Special m transformation |
| !@# | १२३ | Nepali numbers |
| Ù / Ú | ; / : | Literal punctuation |
| « / » | ्र | Ra-foot (for ट, ठ, ड, ढ) |
| ¿ | रू | Combined ruu |
| å | द्व | Combined dva |
| ˆ | फ् | Half ph |
| ª | ङ | Consonant nga |
| æ / Æ | “ / ” | Curly quotes |
| ¥ | र् | Half ra |
| ¶ | ठ्ठ | Combined thth |
| § | ट्ट | Combined tt |
| £ | घ् | Half gh |
| Ë / Í | ङ्ग / ङ्क | Combined nga-ga / nga-ka |
| ‰ | झ् | Half jh |
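The two-phase idea behind Preeti mode (map each keystroke, then apply contextual rules) can be sketched as follows. This is an illustration under stated assumptions, not the package's code: `preeti_to_unicode`, the four-entry table, and the private-use `REPH` placeholder are all hypothetical, and only two contextual rules are shown out of the 30+ the package describes. The sketch assumes the short-i key is typed before its consonant, as the "moves after consonant" note suggests.

```python
import re

# Phase 1 table: a tiny, illustrative subset of Preeti keystrokes.
REPH = "\uE000"  # private-use placeholder, resolved in phase 2
PREETI_TO_UNICODE = {
    "s": "क",
    "f": "ा",   # aa matra
    "l": "ि",   # short i matra, assumed to be typed before its consonant
    "{": REPH,  # reph, typed after the syllable it belongs to
}

CONSONANT = "[क-ह]"
MATRA = "[ा-ौ]"


def preeti_to_unicode(text: str) -> str:
    # Phase 1: map every keystroke on its own, keeping the typed order.
    mapped = "".join(PREETI_TO_UNICODE.get(ch, ch) for ch in text)
    # Phase 2a: move a short-i sign from before its consonant to after it.
    mapped = re.sub(f"ि({CONSONANT})", r"\1ि", mapped)
    # Phase 2b: turn the reph placeholder into र् in front of its syllable.
    mapped = re.sub(f"({CONSONANT})({MATRA}?){REPH}", r"र्\1\2", mapped)
    return mapped


print(preeti_to_unicode("s{sf"))  # र्कका
```

Keeping the reph as a placeholder until phase 2 lets the reordering rule tell a typed reph apart from any other र् in the text.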
CLI for Preeti
python -m nepali_unicoder --preeti "s{sf"
# Output: र्कका
Transliteration Rules
- Consonants: k->क्, ka->क, kh->ख्, kha->ख
- Vowels: a->अ, aa->आ, i->इ, u->उ
- Matras: ki->कि, ko->को
- Special: .->।, ..->॥
- Numbers: 0-9->०-९ (Decimal points are preserved: 1.5->१.५)
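The interaction between the Special and Numbers rules (a period becomes a danda unless it is a decimal point) can be illustrated with a short sketch. The function name `convert_numbers_and_stops` and the regular expression are assumptions made for this example; the package may resolve the ambiguity differently.

```python
import re

# ASCII digits to Devanagari digits.
DIGITS = str.maketrans("0123456789", "०१२३४५६७८९")

# Alternatives are tried left to right: decimal point, then "..", then ".".
STOPS = re.compile(
    r"(?P<decimal>(?<=[0-9])\.(?=[0-9]))|(?P<double>\.\.)|(?P<single>\.)"
)


def _replace_stop(match: re.Match) -> str:
    if match.group("decimal"):
        return "."   # a period between two digits is preserved
    if match.group("double"):
        return "॥"   # double danda
    return "।"       # single danda


def convert_numbers_and_stops(text: str) -> str:
    # Resolve periods first, while the digits are still ASCII.
    return STOPS.sub(_replace_stop, text).translate(DIGITS)


print(convert_numbers_and_stops("1.5"))   # १.५
print(convert_numbers_and_stops("ho.."))  # ho॥
```

Periods are handled before digit conversion so the lookarounds only need to recognize ASCII digits.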
Advanced Usage
Handling Complex Text
The converter handles mixed content gracefully. You can use {} to keep text as-is (e.g., for English words or code snippets).
text = "mero naam {Sanjeev} ho ra ma 12.5 barsa ko bhaye."
print(converter.convert(text))
# Output: मेरो नाम् Sanjeev हो र म १२.५ बर्स को भये।
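One way to picture how {} blocks could be kept as-is is to split the input into protected and convertible segments. The sketch below is an assumption about the mechanism, not the package's code: `convert_with_as_is` is a hypothetical helper, nested braces are not handled, and `str.upper` stands in for the real converter so the effect is visible without the package installed.

```python
import re
from typing import Callable

# A {...} block with no nested braces inside it.
AS_IS = re.compile(r"\{([^{}]*)\}")


def convert_with_as_is(text: str, convert_segment: Callable[[str], str]) -> str:
    parts = []
    last = 0
    for match in AS_IS.finditer(text):
        # Text before the block goes through the converter.
        parts.append(convert_segment(text[last:match.start()]))
        # Text inside the braces is copied unchanged, without the braces.
        parts.append(match.group(1))
        last = match.end()
    # Convert whatever follows the final block.
    parts.append(convert_segment(text[last:]))
    return "".join(parts)


# str.upper is a stand-in converter for demonstration purposes.
print(convert_with_as_is("mero naam {Sanjeev} ho", str.upper))
# MERO NAAM Sanjeev HO
```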
Configuration
Custom word-level overrides live in word_maps.json, located in the src/nepali_unicoder directory. Use it for words that don't follow the standard phonetic rules.
Example word_maps.json:
{
"nepal": "नेपाल",
"kathamandu": "काठमाडौँ"
}
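A word-level override of this kind can be thought of as a dictionary lookup that runs before the phonetic rules. The following sketch assumes exact, case-sensitive matching on space-separated words; `convert_with_overrides` and the `fallback` parameter are hypothetical, and the package's actual lookup order and matching rules may differ.

```python
import json
from typing import Callable

# The same content as the example word_maps.json, loaded from a string here.
WORD_MAP = json.loads('{"nepal": "नेपाल", "kathamandu": "काठमाडौँ"}')


def convert_with_overrides(
    text: str, word_map: dict, fallback: Callable[[str], str]
) -> str:
    converted = []
    for word in text.split(" "):
        if word in word_map:
            # An override wins over the phonetic rules.
            converted.append(word_map[word])
        else:
            # Anything else goes to the regular converter.
            converted.append(fallback(word))
    return " ".join(converted)


# A placeholder fallback that marks words it would have transliterated.
print(convert_with_overrides("nepal xyz", WORD_MAP, lambda w: f"<{w}>"))
# नेपाल <xyz>
```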
Contribution
We welcome contributions! Here's how you can help:
- Clone the repository:
  git clone https://github.com/realsanjeev/nepali_unicoder.git
  cd nepali_unicoder
- Set up a virtual environment:
  python3 -m venv .venv
  source .venv/bin/activate
  pip install -e .
- Run tests:
  python -m unittest discover tests
- Submit a Pull Request: Create a new branch, make your changes, and submit a PR.
Development
To run tests:
python -m unittest discover tests
License