
A Python package for Roman to Nepali (Devanagari) transliteration


Nepali Unicoder


A robust Python package for converting Romanized Nepali text and Preeti font text into Unicode Devanagari script. It uses a greedy matching algorithm for Roman transliteration and a two-phase conversion process for Preeti with contextual rules.

Read the Full Documentation for detailed usage guides, Preeti mapping references, and API details.

Features

  • Accurate Transliteration: Uses a greedy matching algorithm to prioritize longer phonetic matches (e.g., 'kha' is matched before 'k' and 'h').
  • Preeti Font Support: Full support for Preeti to Unicode conversion with 30+ contextual rules for accurate transformation.
  • Smart Vowel Handling: Distinguishes between independent vowels (e.g., 'aa' -> 'आ') and vowel signs/matras (e.g., 'ka' -> 'क', 'kaa' -> 'का').
  • Contextual Rules: Handles complex Devanagari rules like reph positioning, matra reordering, and special character combinations.
  • Mixed Content Support: Allows keeping English words or specific text in Roman script using {} blocks.
  • Customizable: Supports custom word-level overrides via word_maps.json.
  • CLI Support: Can be used directly from the command line.
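The greedy matching idea can be illustrated with a minimal sketch. The table below is a tiny hypothetical subset chosen for illustration, and `greedy_transliterate` is not a function from the package; it only shows why trying the longest key first makes 'kha' win over 'k' followed by 'ha'.

```python
# Hypothetical mini-table for illustration; the package's real mapping is larger.
TABLE = {"kha": "ख", "kh": "ख्", "ka": "क", "k": "क्", "ha": "ह", "h": "ह्"}
MAX_LEN = max(len(key) for key in TABLE)


def greedy_transliterate(text: str) -> str:
    """Consume the longest matching key at each position."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible chunk first, then shrink.
        for length in range(min(MAX_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += length
                break
        else:
            # No key matched: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(greedy_transliterate("kha"))    # ख
print(greedy_transliterate("kakha"))  # कख
```

A shortest-first scan would instead read "kha" as 'k' then 'ha', which is the failure the longest-match ordering avoids.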

Installation

You can install the package from PyPI:

pip install nepali-unicoder

Usage

Command Line Interface (CLI)

You can use the converter directly from the terminal:

# Using the installed command
nepali-unicoder "namaste"
# Output: नमस्ते

# Or using python module
python -m nepali_unicoder "namaste"
# Output: नमस्ते

# Pipe input
echo "mero naam sanjeev ho" | nepali-unicoder
# Output: मेरो नाम् सन्जीव् हो

Python API

from nepali_unicoder.convert import Converter

converter = Converter()

# Basic conversion
text = "namaste nepal"
print(converter.convert(text))
# Output: नमस्ते नेपाल

# Using 'as-is' blocks for English text
mixed_text = "mero naam {Sanjeev} ho"
print(converter.convert(mixed_text))
# Output: मेरो नाम् Sanjeev हो
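One way such 'as-is' blocks could be handled is to split the input on {} groups and convert only the text outside them. The sketch below is an assumption about the mechanism, not the package's code; `convert_outside_braces` is a hypothetical helper, and `str.upper` stands in for the real conversion so the example runs without the package installed.

```python
import re
from typing import Callable

# One capturing group, so re.split returns [outside, inside, outside, ...].
BRACE_BLOCK = re.compile(r"\{([^{}]*)\}")


def convert_outside_braces(text: str, convert: Callable[[str], str]) -> str:
    """Apply `convert` to text outside {...}; keep brace contents verbatim."""
    parts = BRACE_BLOCK.split(text)
    out = []
    for index, part in enumerate(parts):
        # Odd indexes hold the captured brace contents (braces removed).
        out.append(part if index % 2 else convert(part))
    return "".join(out)


# str.upper is a stand-in for the real Roman-to-Devanagari conversion.
print(convert_outside_braces("mero naam {Sanjeev} ho", str.upper))
# MERO NAAM Sanjeev HO
```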

Preeti Mode

Convert Preeti font text to Unicode with full support for contextual rules:

from nepali_unicoder.convert import Converter

# Create converter in Preeti mode
preeti_converter = Converter(mode="preeti")

# Basic conversion
preeti_text = "s{sf"  # Preeti characters
print(preeti_converter.convert(preeti_text))
# Output: र्कका

# The converter handles:
# - Reph positioning: { → र् (moves before consonant)
# - Matra reordering: l (ि) moves after consonant
# - Special m transformations
# - Vowel combinations
# - Literal brackets: { and } are treated as normal characters in Preeti mode
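The two-phase idea (character mapping first, contextual reordering second) can be sketched as follows. This is a simplified illustration under stated assumptions, not the package's implementation: `CHAR_MAP` is a four-character hypothetical subset, `preeti_to_unicode` is an invented name, and the sketch assumes the short-i key is typed before its consonant and the reph key after it, as in the visual order of the Preeti font.

```python
import re

# Hypothetical subset of the Preeti mapping, for illustration only.
CHAR_MAP = {"s": "क", "f": "ा", "l": "ि", "{": "र्"}
CONSONANT = "[क-ह]"  # U+0915..U+0939
MATRA = "[ा-ौ]"      # U+093E..U+094C


def preeti_to_unicode(text: str) -> str:
    # Phase 1: map each Preeti character to its Unicode counterpart.
    mapped = "".join(CHAR_MAP.get(ch, ch) for ch in text)
    # Phase 2a: short i is typed before the consonant; Unicode stores it after.
    mapped = re.sub(f"ि({CONSONANT})", r"\1ि", mapped)
    # Phase 2b: reph is typed after the syllable; Unicode stores it before.
    mapped = re.sub(f"({CONSONANT}{MATRA}?)र्", r"र्\1", mapped)
    return mapped


print(preeti_to_unicode("s{sf"))  # र्कका
```

The real converter applies many more contextual rules than the two shown here.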

Preeti Character Examples

| Preeti | Unicode | Description |
|--------|---------|-------------|
| s | क | Consonant ka |
| s{ | र्क | Reph + ka (contextual) |
| ls | कि | ka + short i (reordered) |
| qm | क्र | Special m transformation |
| !@# | १२३ | Nepali numbers |
| Ù / Ú | ; / : | Literal punctuation |
| « / » | ्र | Ra-foot (for ट, ठ, ड, ढ) |
| ¿ | रू | Combined ruu |
| å | द्व | Combined dva |
| ˆ | फ् | Half ph |
| ª | ङ | Consonant nga |
| æ / Æ | “ / ” | Curly quotes |
| ¥ | र् | Half ra |
|  | ठ्ठ | Combined thth |
| § | ट्ट | Combined tt |
| £ | घ् | Half gh |
| Ë / Í | ङ्ग / ङ्क | Combined nga-ga / nga-ka |
|  | झ् | Half jh |

CLI for Preeti

python -m nepali_unicoder --preeti "s{sf"
# Output: र्कका

Transliteration Rules

  • Consonants: k -> क्, ka -> क, kh -> ख्, kha -> ख
  • Vowels: a -> अ, aa -> आ, i -> इ, u -> उ
  • Matras: ki -> कि, ko -> को
  • Special: . -> ।, .. ->
  • Numbers: 0-9 -> ०-९ (Decimal points are preserved: 1.5 -> १.५)
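The consonant, vowel, and matra rules above can be sketched in a few lines. The tables are small hypothetical subsets and `transliterate` is an illustrative name, not the package's API; the logic assumes that a consonant followed by a vowel takes the matra form, a consonant followed by 'a' keeps its inherent vowel, a bare consonant takes a halant, and a vowel anywhere else is written as an independent vowel.

```python
# Hypothetical subsets; keys are at most two characters long.
CONSONANTS = {"kh": "ख", "k": "क", "m": "म", "n": "न", "r": "र"}
INDEPENDENT = {"aa": "आ", "a": "अ", "i": "इ", "u": "उ", "e": "ए", "o": "ओ"}
MATRAS = {"aa": "ा", "a": "", "i": "ि", "u": "ु", "e": "े", "o": "ो"}
HALANT = "्"


def longest_match(text, pos, table):
    """Return the longest key of `table` starting at `pos`, or None."""
    for length in (2, 1):
        chunk = text[pos:pos + length]
        if chunk in table:
            return chunk
    return None


def transliterate(text: str) -> str:
    out = []
    i = 0
    while i < len(text):
        cons = longest_match(text, i, CONSONANTS)
        if cons:
            i += len(cons)
            vowel = longest_match(text, i, MATRAS)
            if vowel is not None:
                # Consonant + vowel: attach the matra ('a' adds nothing).
                out.append(CONSONANTS[cons] + MATRAS[vowel])
                i += len(vowel)
            else:
                # Bare consonant: suppress the inherent vowel with a halant.
                out.append(CONSONANTS[cons] + HALANT)
            continue
        vowel = longest_match(text, i, INDEPENDENT)
        if vowel:
            out.append(INDEPENDENT[vowel])
            i += len(vowel)
        else:
            out.append(text[i])  # Unknown character: pass through.
            i += 1
    return "".join(out)


print(transliterate("kaa"))        # का
print(transliterate("mero naam"))  # मेरो नाम्
```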

Advanced Usage

Handling Complex Text

The converter handles mixed content gracefully. You can use {} to keep text as-is (e.g., for English words or code snippets).

text = "mero naam {Sanjeev} ho ra ma 12.5 barsa ko bhaye."
print(converter.convert(text))
# Output: मेरो नाम् Sanjeev हो र म १२.५ बर्स को भये।
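The number handling in that output (a decimal point kept inside 12.5, a sentence-final period turned into a danda) could work roughly as below. This is a sketch of one possible approach, not the package's code; `convert_numbers_and_stops` is a hypothetical helper that only touches digits and periods and leaves all other text alone.

```python
import re

DIGITS = str.maketrans("0123456789", "०१२३४५६७८९")
# An integer, optionally followed by a decimal point and more digits.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def convert_numbers_and_stops(text: str) -> str:
    out = []
    pos = 0
    for match in NUMBER.finditer(text):
        # Outside numbers, a period becomes a danda.
        out.append(text[pos:match.start()].replace(".", "।"))
        # Inside a number, digits are translated and the point is kept.
        out.append(match.group().translate(DIGITS))
        pos = match.end()
    out.append(text[pos:].replace(".", "।"))
    return "".join(out)


print(convert_numbers_and_stops("12.5 barsa."))  # १२.५ barsa।
```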

Configuration

The package reads custom word-level overrides from word_maps.json, located in the src/nepali_unicoder directory. Use it for words that don't follow the standard phonetic rules.

Example word_maps.json:

{
    "nepal": "नेपाल",
    "kathamandu": "काठमाडौँ"
}
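A word-level override of this kind might be applied by checking each word against the map before falling back to rule-based conversion. The sketch below is an assumption about how such a lookup could work; `apply_word_overrides` is a hypothetical helper, the JSON is inlined instead of read from disk, and `str.upper` stands in for the rule-based converter.

```python
import json
from typing import Callable, Dict

# Inline stand-in for the contents of word_maps.json.
WORD_MAPS_JSON = '{"nepal": "नेपाल", "kathamandu": "काठमाडौँ"}'


def apply_word_overrides(
    text: str, word_map: Dict[str, str], fallback: Callable[[str], str]
) -> str:
    """Use the override for mapped words; convert the rest with `fallback`."""
    words = text.split(" ")
    return " ".join(
        word_map[word] if word in word_map else fallback(word) for word in words
    )


word_map = json.loads(WORD_MAPS_JSON)
# str.upper is a stand-in for the real phonetic conversion.
print(apply_word_overrides("namaste nepal", word_map, str.upper))
# NAMASTE नेपाल
```

This would explain why "nepal" can come out as नेपाल without a trailing halant even though the phonetic rules alone would add one.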

Contribution

We welcome contributions! Here's how you can help:

  1. Clone the repository:

    git clone https://github.com/realsanjeev/nepali_unicoder.git
    cd nepali_unicoder
    
  2. Set up a virtual environment:

    python3 -m venv .venv
    source .venv/bin/activate
    pip install -e .
    
  3. Run tests:

    python -m unittest discover tests
    
  4. Submit a Pull Request: Create a new branch, make your changes, and submit a PR.


License

MIT
