A highly customizable Python package for converting Preeti font text to Unicode (Nepali)
Preeti Unicode Converter
The package reads Preeti-encoded PDF, DOCX, and TXT files and writes Unicode output as PDF, DOCX, TXT, or HTML, with batch processing, caching, and structured logging for larger workloads.
Features
Core Functionality
- Text Conversion: Convert Preeti font text to proper Unicode Nepali text
- Multiple Formats: Read PDF, DOCX, and TXT files; write PDF, DOCX, TXT, and HTML
- Batch Processing: Convert multiple files simultaneously with parallel processing
- CLI Interface: Command-line tool for easy integration into workflows
Advanced Features
- Dynamic Font Support: Add custom font mappings and user-defined conversion rules
- Plugin Architecture: Extensible system for custom conversion logic
- Processing Pipelines: Configurable conversion workflows with middleware support
- Caching System: Improved performance with intelligent caching
- Progress Tracking: Real-time progress monitoring for large operations
- Error Handling: Comprehensive error management with graceful degradation
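As a rough illustration of how a processing pipeline and a cache can fit together, the sketch below composes plain text-to-text steps and wraps the result in `functools.lru_cache`. It does not use the package's own pipeline or caching classes; `build_pipeline`, `strip_whitespace`, and `placeholder_convert` are hypothetical names invented for this example.

```python
from functools import lru_cache
from typing import Callable, Iterable

# A step is any function that takes text and returns text.
Step = Callable[[str], str]


def build_pipeline(steps: Iterable[Step]) -> Step:
    """Compose steps left to right into a single text -> text function."""
    ordered = tuple(steps)

    def run(text: str) -> str:
        for step in ordered:
            text = step(text)
        return text

    return run


def strip_whitespace(text: str) -> str:
    """Example pre-processing step."""
    return text.strip()


def placeholder_convert(text: str) -> str:
    """Hypothetical stand-in for the real conversion step."""
    return text.upper()


pipeline = build_pipeline([strip_whitespace, placeholder_convert])

# Repeated inputs are served from memory instead of being reprocessed.
cached_pipeline = lru_cache(maxsize=1024)(pipeline)
```

Caching pays off mainly when the same strings recur, for example repeated headers or table cells across a batch of documents.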
Enterprise Features
- PDF Processing: Robust PDF handling with integrity validation and corruption detection
- Parallel Processing: Multi-threaded processing for high-performance batch operations
- Logging System: Structured logging with multiple output formats
- Configuration Management: Flexible configuration system with environment variable support
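The multi-threaded batch behaviour described above can be approximated with the standard library alone. The sketch below is not the package's implementation: `convert_many` and its `on_progress` callback are hypothetical, and `convert_one` is whatever single-file function you supply (for instance a small wrapper around `file_converter`).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Optional


def convert_many(
    paths: Iterable[str],
    convert_one: Callable[[str], bool],
    max_workers: int = 4,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> Dict[str, bool]:
    """Run convert_one on every path in a thread pool and collect the outcomes."""
    paths = list(paths)
    results: Dict[str, bool] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(convert_one, path): path for path in paths}
        for finished, future in enumerate(as_completed(futures), start=1):
            path = futures[future]
            try:
                results[path] = bool(future.result())
            except Exception:
                # One bad file should not abort the rest of the batch.
                results[path] = False
            if on_progress is not None:
                on_progress(finished, len(paths))
    return results
```

Recording a failure as `False` rather than re-raising is one way to get the graceful degradation mentioned in the feature list; a stricter caller could collect the exceptions instead.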
Installation
Using pip (Recommended)
pip install preeti-unicode
Using uv (Fast)
uv add preeti-unicode
From Source
git clone https://github.com/diwaskunwar/preeti-unicode.git
cd preeti-unicode
pip install -e .
Quick Start
Basic Text Conversion
from preeti_unicode import convert_text
# Convert Preeti text to Unicode
result = convert_text("g]kfn")
print(result) # Output: नेपाल
# Convert with custom options
result = convert_text("g]kfn @)!&", convert_numbers=True)
print(result) # Output: नेपाल २०१७
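To make the idea behind the conversion concrete, here is a toy character-mapping sketch. The table contains only the nine characters that can be read off the two examples above; the package's real mapping is much larger, and a full converter also has to reorder signs that Preeti places before the consonant. `toy_convert` and `extra_mapping` are hypothetical names, and leaving digits untouched when `convert_numbers` is false is an assumption about what that option means.

```python
# Partial table inferred from the examples above (not the package's real data).
PREETI_LETTERS = {"g": "न", "]": "े", "k": "प", "f": "ा", "n": "ल"}
PREETI_DIGITS = {"@": "२", ")": "०", "!": "१", "&": "७"}


def toy_convert(text, convert_numbers=True, extra_mapping=None):
    """Map each Preeti character to its Unicode counterpart, one at a time."""
    table = dict(PREETI_LETTERS)
    if convert_numbers:
        table.update(PREETI_DIGITS)
    if extra_mapping:
        # User-supplied rules override the defaults.
        table.update(extra_mapping)
    # Characters without a mapping pass through unchanged.
    return "".join(table.get(ch, ch) for ch in text)


print(toy_convert("g]kfn"))       # नेपाल
print(toy_convert("g]kfn @)!&"))  # नेपाल २०१७
```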
File Conversion
from preeti_unicode import file_converter

# Convert a PDF file to Unicode text
success = file_converter(
    input_file="document.pdf",
    input_format="pdf",
    output_file="converted.txt",
    output_format="txt",
)

# Convert DOCX to HTML
success = file_converter(
    input_file="document.docx",
    input_format="docx",
    output_file="converted.html",
    output_format="html",
)
Quick Testing
from preeti_unicode import test
# Test string conversion
test("string", "g]kfn")
# Test file conversion capabilities
test("txt") # Test TXT file conversion
test("pdf") # Test PDF file conversion
test("docx") # Test DOCX file conversion
test("all") # Test all formats
Command Line Interface
Text Conversion
# Convert text directly
preeti-unicode text "g]kfn"
# Convert without number conversion
preeti-unicode text "g]kfn @)!&" --no-convert-numbers
File Conversion
# Convert a single file
preeti-unicode file input.pdf output.txt --output-format txt
# Convert with explicit input format
preeti-unicode file input.pdf output.docx --input-format pdf --output-format docx
Batch Conversion
# Convert multiple files
preeti-unicode batch *.pdf --input-format pdf --output-format txt --output-dir converted/
# Convert all files in a directory
preeti-unicode batch documents/ --input-format pdf --output-format html --output-dir output/
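When scripting a batch from Python rather than the shell, it can help to work out the output paths up front. The helper below is a hypothetical sketch of one naming scheme (same file stem, new extension, inside the output directory); the CLI's actual naming rules are not documented here and may differ.

```python
from pathlib import Path


def plan_outputs(inputs, output_dir, output_format):
    """Map each input path to <output_dir>/<stem>.<output_format>."""
    out_dir = Path(output_dir)
    return {
        str(src): out_dir / f"{Path(src).stem}.{output_format}"
        for src in inputs
    }
```

Note that two inputs with the same stem in different directories would collide under this scheme, so a real batch job would need a rule for that case.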
Testing
The package ships a built-in test helper that you can use to check that conversion works in your environment:
from preeti_unicode import test
# Test basic string conversion
test("string") # Uses default sample text
test("string", "your_preeti_text_here") # Test with your own text
# Test file format support
test("txt") # Test TXT file conversion
test("pdf") # Test PDF file conversion (requires reportlab)
test("docx") # Test DOCX file conversion (requires python-docx)
# Test everything at once
test("all") # Comprehensive test of all features
Supported Formats
Input Formats
- PDF: Full support with integrity validation and password protection handling
- DOCX: Microsoft Word documents with formatting preservation
- TXT: Plain text files with encoding detection
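Encoding detection for plain-text input can be as simple as trying a short list of candidate encodings in order. The sketch below shows that fallback approach; it is an assumption about one workable strategy, not a description of how the package detects encodings, and `decode_with_fallback` is a hypothetical name.

```python
def decode_with_fallback(data, encodings=("utf-8-sig", "cp1252")):
    """Return (text, encoding) using the first candidate that decodes cleanly."""
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort never fails.
    return data.decode("latin-1"), "latin-1"
```

Because Preeti text is stored as ordinary Latin characters, a wrong guess here silently changes which glyphs get converted, so strict decoders are tried before permissive ones.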
Output Formats
- PDF: Generate Unicode PDF documents
- DOCX: Create formatted Word documents
- TXT: Plain text output with proper encoding
- HTML: Web-ready HTML with proper Unicode rendering
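For HTML output, correct rendering mostly comes down to declaring UTF-8 and escaping the text. The following is a minimal sketch of such a wrapper, assuming one paragraph per non-empty line; `wrap_as_html` is a hypothetical helper and the package's own HTML writer may produce different markup.

```python
import html


def wrap_as_html(unicode_text, title="Converted document"):
    """Wrap converted text in a small UTF-8 HTML page, one <p> per non-empty line."""
    paragraphs = "\n".join(
        f"<p>{html.escape(line)}</p>"
        for line in unicode_text.splitlines()
        if line.strip()
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="ne">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        f"<body>\n{paragraphs}\n</body>\n"
        "</html>\n"
    )
```

When saving the result, write it with `encoding="utf-8"` so the bytes on disk match the declared charset.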
Contributing
We welcome contributions! Please feel free to submit issues and pull requests.
Development Setup
git clone https://github.com/diwaskunwar/preeti-unicode.git
cd preeti-unicode
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -e .
Running Tests
# Run the built-in test suite
python -c "from preeti_unicode import test; test('all')"
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- Issues: GitHub Issues
- Documentation: Check the docstrings and examples in this README
Acknowledgments
- Thanks to the Nepali computing community for font specifications
- Built with modern Python packaging standards using uv and proper package structure
File details
Details for the file preeti_unicode_converter-0.1.1.tar.gz.
File metadata
- Download URL: preeti_unicode_converter-0.1.1.tar.gz
- Upload date:
- Size: 53.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 144000bc2b265e18b29d80d07065cfbd88f6371816c30dcf6fdacc1f535e0122 |
| MD5 | 20fd4b70de88fd4f0f13fcabe97a47a5 |
| BLAKE2b-256 | 7a2b014bda9c24b23a2cd50387442d5c65dabf39907fe180489a8a15c55f44c5 |
File details
Details for the file preeti_unicode_converter-0.1.1-py3-none-any.whl.
File metadata
- Download URL: preeti_unicode_converter-0.1.1-py3-none-any.whl
- Upload date:
- Size: 69.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cdf192a1576b301dfa5a257ab2f65f9c6e3b0ca287691616b8cb45dd68ae5a16 |
| MD5 | 385c5c5a3e8d4c13bc90b5b70d57fab6 |
| BLAKE2b-256 | f7f8db20ec92cf6d71dac0c5dc1da5046e1ad66352c1e818476b93e6eb64a83a |