A Python package for Roman to Nepali (Devanagari) transliteration

Nepali Unicoder

A robust Python package for converting Romanized Nepali text and Preeti font text into Unicode Devanagari script. It uses a greedy matching algorithm for Roman transliteration and a two-phase conversion process for Preeti with contextual rules.

Read the Full Documentation for detailed usage guides, Preeti mapping references, and API details.

Features

  • Accurate Transliteration: Uses a greedy matching algorithm to prioritize longer phonetic matches (e.g., 'kha' is matched before 'k' and 'h').
  • Preeti Font Support: Full support for Preeti to Unicode conversion with 30+ contextual rules for accurate transformation.
  • Smart Vowel Handling: Distinguishes between independent vowels (e.g., 'aa' -> 'आ') and vowel signs/matras (e.g., 'ka' -> 'क', 'kaa' -> 'का').
  • Contextual Rules: Handles complex Devanagari rules like reph positioning, matra reordering, and special character combinations.
  • Mixed Content Support: Allows keeping English words or specific text in Roman script using {} blocks.
  • Customizable: Supports custom word-level overrides via word_maps.json.
  • CLI Support: Can be used directly from the command line.
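
The greedy matching and vowel handling described above can be illustrated with a minimal sketch. The tables and the `transliterate` function below are hypothetical and cover only a handful of letters; they are not the package's actual mapping data or implementation.

```python
# Toy tables for illustration only (assumed, not the package's real data).
CONSONANTS = {"kh": "ख", "k": "क", "n": "न", "m": "म", "s": "स", "t": "त"}
VOWELS = {"aa": "आ", "a": "अ", "i": "इ", "e": "ए", "o": "ओ"}
MATRAS = {"aa": "ा", "a": "", "i": "ि", "e": "े", "o": "ो"}
HALANTA = "्"


def longest_match(text, pos, table):
    """Return the longest key of `table` found at `pos`, or None."""
    for length in range(max(map(len, table)), 0, -1):
        key = text[pos:pos + length]
        if key in table:
            return key
    return None


def transliterate(text):
    out = []
    pos = 0
    while pos < len(text):
        cons = longest_match(text, pos, CONSONANTS)
        if cons:
            pos += len(cons)
            vowel = longest_match(text, pos, MATRAS)
            if vowel is not None:
                # Consonant followed by a vowel: attach the vowel sign.
                out.append(CONSONANTS[cons] + MATRAS[vowel])
                pos += len(vowel)
            else:
                # Bare consonant: mark it with a halanta.
                out.append(CONSONANTS[cons] + HALANTA)
            continue
        vowel = longest_match(text, pos, VOWELS)
        if vowel:
            # Vowel with no preceding consonant: independent form.
            out.append(VOWELS[vowel])
            pos += len(vowel)
            continue
        # Anything unknown passes through unchanged.
        out.append(text[pos])
        pos += 1
    return "".join(out)


print(transliterate("kha"))      # ख  ('kh' wins over 'k' + 'h')
print(transliterate("kaa"))      # का
print(transliterate("aa"))       # आ
print(transliterate("namaste"))  # नमस्ते
```

Trying the longest key first is what keeps 'kha' from being split into 'k' and 'h'; the same lookup is reused for consonants, vowel signs, and independent vowels.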

Installation

You can install the package from PyPI:

pip install nepali-unicoder

Usage

Command Line Interface (CLI)

You can use the converter directly from the terminal:

# Direct argument
python -m nepali_unicoder "namaste"
# Output: नमस्ते

# Pipe input
echo "mero naam sanjeev ho" | python -m nepali_unicoder
# Output: मेरो नाम सन्जीव् हो

Python API

from nepali_unicoder.convert import Converter

converter = Converter()

# Basic conversion
text = "namaste nepal"
print(converter.convert(text))
# Output: नमस्ते नेपाल

# Using 'as-is' blocks for English text
mixed_text = "mero naam {Sanjeev} ho"
print(converter.convert(mixed_text))
# Output: मेरो नाम Sanjeev हो

Preeti Mode

Convert Preeti font text to Unicode with full support for contextual rules:

from nepali_unicoder.convert import Converter

# Create converter in Preeti mode
preeti_converter = Converter(mode="preeti")

# Basic conversion
preeti_text = "s{sf"  # Preeti characters
print(preeti_converter.convert(preeti_text))
# Output: र्कर्का

# The converter handles:
# - Reph positioning: { → र् (moves before consonant)
# - Matra reordering: l (ि) moves after consonant
# - Special m transformations
# - Vowel combinations
# - Literal brackets: { and } are treated as normal characters in Preeti mode
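
To make the two-phase idea concrete, here is a minimal sketch of a character mapping pass followed by one contextual rule (reph repositioning). The `PREETI_MAP` subset and the `preeti_to_unicode` function are assumptions made for illustration; the package's real tables and its 30+ rules are not reproduced here.

```python
# Tiny illustrative subset of a Preeti mapping (assumed, not the real table).
PREETI_MAP = {"s": "क", "f": "ा", "{": "र्"}
REPH = "र्"
MATRAS = {"ा", "ि", "ी", "ु", "ू", "े", "ै", "ो", "ौ"}


def preeti_to_unicode(text):
    # Phase 1: map each Preeti character to its Unicode counterpart.
    pieces = [PREETI_MAP.get(ch, ch) for ch in text]

    # Phase 2: contextual rule. In Preeti the reph is typed after the
    # syllable, while Unicode stores it before the consonant.
    out = []
    for piece in pieces:
        if piece == REPH and out:
            i = len(out) - 1
            # Step back over any vowel signs to reach the consonant.
            while i > 0 and out[i] in MATRAS:
                i -= 1
            out.insert(i, REPH)
        else:
            out.append(piece)
    return "".join(out)


print(preeti_to_unicode("s{"))   # र्क
print(preeti_to_unicode("sf{"))  # र्का
```

Separating the mapping pass from the reordering pass keeps each contextual rule small, since every rule only has to reason about Unicode pieces rather than raw Preeti keystrokes.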

Preeti Character Examples

| Preeti | Unicode | Description |
|--------|---------|-------------|
| s | क | Consonant ka |
| s{ | र्क | Reph + ka (contextual) |
| sl | कि | ka + short i (reordered) |
| qm | क्र | Special m transformation |
| !@# | १२३ | Nepali numbers |
| Ù / Ú | ; / : | Literal punctuation |
| « / » | ्र | Ra-foot (for ट, ठ, ड, ढ) |
| ¿ | रू | Combined ruu |
| å | द्व | Combined dva |
| ˆ | फ् | Half ph |
| ª | ङ | Consonant nga |
| æ / Æ | “ / ” | Curly quotes |
| ¥ | र् | Half ra |
|  | ठ्ठ | Combined thth |
| § | ट्ट | Combined tt |
| £ | घ् | Half gh |
| Ë / Í | ङ्ग / ङ्क | Combined nga-ga / nga-ka |
|  | झ् | Half jh |

CLI for Preeti

python -m nepali_unicoder --preeti "s{sf"
# Output: र्कर्का

Transliteration Rules

  • Consonants: k -> क्, ka -> क, kh -> ख्, kha -> ख
  • Vowels: a -> अ, aa -> आ, i -> इ, u -> उ
  • Matras: ki -> कि, ko -> को
  • Special: . -> ।, .. ->
  • Numbers: 0-9 -> ०-९ (Decimal points are preserved: 1.5 -> १.५)
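
One way to convert digits while keeping decimal points intact is sketched below. The `convert_numbers_and_stops` helper and its regular expression are assumptions for illustration, not the package's code: a period is turned into a danda only when it does not sit between two digits.

```python
import re

# Translation table from ASCII digits to Devanagari digits.
DIGITS = str.maketrans("0123456789", "०१२३४५६७८९")


def convert_numbers_and_stops(text):
    # Replace a period with a danda unless it has a digit on both sides.
    text = re.sub(r"(?<!\d)\.|\.(?!\d)", "।", text)
    # Then map every ASCII digit to its Devanagari form.
    return text.translate(DIGITS)


print(convert_numbers_and_stops("1.5"))             # १.५
print(convert_numbers_and_stops("ma 12.5 barsa."))  # ma १२.५ barsa।
```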

Advanced Usage

Handling Complex Text

The converter handles mixed content gracefully. You can use {} to keep text as-is (e.g., for English words or code snippets).

from nepali_unicoder.convert import Converter

converter = Converter()
text = "mero naam {Sanjeev} ho ra ma 12.5 barsa ko bhaye."
print(converter.convert(text))
# Output: मेरो नाम Sanjeev हो र म १२.५ बर्स को भए।
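
A rough sketch of how {} blocks could be kept as-is: split the input on brace-delimited spans, pass the literal spans through untouched, and transliterate everything else. The `convert_with_literals` helper is hypothetical, and `str.upper` stands in for a real transliteration function so the example runs without the package.

```python
import re


def convert_with_literals(text, transliterate):
    # The capturing group makes re.split keep the {...} spans,
    # which then land at the odd indices of the result.
    parts = re.split(r"(\{[^{}]*\})", text)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part[1:-1])  # literal block: drop the braces
        else:
            out.append(transliterate(part))
    return "".join(out)


# str.upper is only a stand-in for the real conversion step.
print(convert_with_literals("mero naam {Sanjeev} ho", str.upper))
# MERO NAAM Sanjeev HO
```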

Configuration

Custom word-level overrides live in word_maps.json, located in the src/nepali_unicoder directory. Use this file for words that don't follow the standard phonetic rules.

Example word_maps.json:

{
    "nepal": "नेपाल",
    "kathamandu": "काठमाडौँ"
}
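
As a sketch of how word-level overrides might be applied, the snippet below looks each word up in the map first and falls back to normal conversion otherwise. The `convert_with_overrides` helper is an assumption for illustration, the JSON is inlined rather than read from the file, and `str.upper` stands in for the real transliteration step.

```python
import json

# Same content as the example word_maps.json, inlined for the sketch.
WORD_MAPS_JSON = '{"nepal": "नेपाल", "kathamandu": "काठमाडौँ"}'


def convert_with_overrides(text, word_map, fallback):
    converted = []
    for word in text.split(" "):
        if word in word_map:
            # An override takes priority over phonetic conversion.
            converted.append(word_map[word])
        else:
            converted.append(fallback(word))
    return " ".join(converted)


word_map = json.loads(WORD_MAPS_JSON)
# str.upper is only a stand-in for the real conversion step.
print(convert_with_overrides("nepal ho", word_map, str.upper))
# नेपाल HO
```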

Contribution

We welcome contributions! Here's how you can help:

  1. Clone the repository:

    git clone https://github.com/realsanjeev/nepali_unicoder.git
    cd nepali_unicoder
    
  2. Set up a virtual environment:

    python3 -m venv .venv
    source .venv/bin/activate
    pip install -e .
    
  3. Run tests:

    python -m unittest discover tests
    
  4. Submit a Pull Request: Create a new branch, make your changes, and submit a PR.

Development

To run tests:

python -m unittest discover tests

License

MIT
