TransFuzzy is a robust transliteration system that bridges the gap between Indic scripts and the Latin alphabet.
🔤 TransFuzzy
Multilingual AI-powered name matching: phonetic + semantic + ML in one CLI tool.
TransFuzzy matches names across Indic and Latin scripts by combining phonetic algorithms, string similarity, and transformer embeddings in a single pipeline.
🚀 Installation
pip install transfuzzy
⚡ Usage
Start API Server
transfuzzy
# or
transfuzzy serve
Runs at:
http://localhost:5000
🔍 CLI Prediction
transfuzzy predict "Rahul"
transfuzzy predict "Rahul" --top 5
transfuzzy predict "Rahul" --json
Example Output
🔍 Similar names:
1. Rahul
2. Raahul
3. Rahool
4. Rahil
🌐 Supported Languages
- English (Latin)
- Hindi (Devanagari)
- Telugu
- Tamil
- Kannada
- Malayalam
- Gujarati
- Punjabi (Gurmukhi)
You can input:
"Rahul"
"राहुल"
"రాహుల్"
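Accepting input in any of these scripts implies a script-detection step. The following is a minimal sketch of one way to do that, assuming detection by Unicode block ranges. The function name `detect_script` and the `SCRIPT_RANGES` table are hypothetical and are not taken from TransFuzzy's code; only the block boundaries come from the Unicode Standard.

```python
from collections import Counter

# Unicode block ranges for the Indic scripts listed above.
# This table and the function below are illustrative assumptions,
# not TransFuzzy's actual implementation.
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Gurmukhi": (0x0A00, 0x0A7F),
    "Gujarati": (0x0A80, 0x0AFF),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}


def detect_script(name: str) -> str:
    """Return the script that most characters of `name` belong to."""
    counts = Counter()
    for ch in name:
        if ch.isascii() and ch.isalpha():
            counts["Latin"] += 1
            continue
        code_point = ord(ch)
        for script, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                counts[script] += 1
                break
    if not counts:
        return "Unknown"
    # The most frequent script wins; spaces and punctuation are ignored.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for sample in ("Rahul", "राहुल", "రాహుల్"):
        print(sample, detect_script(sample))
```

Counting characters per script, rather than checking only the first character, keeps the guess stable when a name contains stray Latin punctuation or digits.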
🧠 How It Works
Input Name
↓
Script Detection → Transliteration
↓
Candidate Filtering (~73k names)
↓
Similarity Metrics (8 features)
↓
ML Model (Random Forest)
↓
Hybrid Scoring
↓
Top Matches
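The last stages of this flow (candidate filtering, similarity scoring, hybrid scoring, top matches) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's pipeline: it uses a single `difflib` ratio in place of the 8 features, an optional `model` callable in place of the Random Forest, and a hypothetical 50/50 weighting. All function names and parameters here are invented for the example.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stand-in for the 8 features)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def filter_candidates(query: str, names, max_len_diff: int = 2):
    """Cheap pre-filter: same first letter and a similar length (assumed rule)."""
    q = query.lower()
    if not q:
        return []
    return [
        n for n in names
        if n and n[0].lower() == q[0] and abs(len(n) - len(q)) <= max_len_diff
    ]


def hybrid_score(similarity: float, model_prob: float, weight: float = 0.5) -> float:
    """Blend the model probability with the raw similarity (assumed weighting)."""
    return weight * model_prob + (1.0 - weight) * similarity


def top_matches(query: str, names, top: int = 5, model=None):
    """Rank filtered candidates by hybrid score, best first."""
    scored = []
    for cand in filter_candidates(query, names):
        sim = string_similarity(query, cand)
        # Without a trained model, fall back to the similarity itself.
        prob = model(query, cand) if model is not None else sim
        scored.append((hybrid_score(sim, prob), cand))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [cand for _, cand in scored[:top]]


if __name__ == "__main__":
    pool = ["Rahul", "Raahul", "Rahool", "Rahil", "Sunil", "Ramachandran"]
    print(top_matches("Rahul", pool, top=4))
```

Filtering before scoring matters at the stated dataset size (~73k names): the expensive comparisons run only on the names that survive the cheap check.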
📡 API Usage
POST /similar_names
{
"name": "Rahul"
}
Response:
{
"similar_names": ["Rahul", "Raahul", "Rahool"]
}
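A small client for this endpoint might look like the sketch below. It uses only the standard library and assumes the server is running at the default address shown earlier and accepts a JSON body with a `Content-Type: application/json` header. The helper names (`build_request`, `parse_response`, `similar_names`) are hypothetical and are not part of the TransFuzzy package.

```python
import json
from urllib import request

BASE_URL = "http://localhost:5000"  # default address of `transfuzzy serve`


def build_request(name: str, base_url: str = BASE_URL) -> request.Request:
    """Build the POST /similar_names request for a single name."""
    body = json.dumps({"name": name}).encode("utf-8")
    return request.Request(
        f"{base_url}/similar_names",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed to be required
        method="POST",
    )


def parse_response(raw: bytes) -> list:
    """Extract the `similar_names` list from a raw JSON response body."""
    return json.loads(raw.decode("utf-8"))["similar_names"]


def similar_names(name: str, base_url: str = BASE_URL, timeout: float = 10.0) -> list:
    """Send the request to a running server and return the matches."""
    with request.urlopen(build_request(name, base_url), timeout=timeout) as resp:
        return parse_response(resp.read())


if __name__ == "__main__":
    # Requires a running server, e.g. started with `transfuzzy serve`.
    print(similar_names("Rahul"))
```

Splitting request construction from response parsing keeps both halves testable without a live server.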
🏗️ Project Structure
src/transfuzzy/
├── cli.py # CLI entrypoint
├── app.py # Flask API
├── core/ # ML pipeline
├── dir/ # processing steps
├── db/ # dataset + model
├── utils/ # helpers
├── templates/ # UI
└── static/ # frontend
🧪 Training
uv run python scripts/enrich.py
uv run python scripts/train.py
⚙️ Development
git clone https://github.com/your-username/transfuzzy.git
cd transfuzzy
uv sync
uv run transfuzzy
✨ Features
- 🔊 Phonetic matching (Soundex, Metaphone)
- 📐 String similarity (Levenshtein, Jaro-Winkler)
- 🧠 Semantic embeddings (Sentence Transformers)
- 🌲 ML model (Random Forest)
- ⚡ Optimized inference pipeline
- 💻 CLI + API + Web UI
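To make the first two features concrete, here is a minimal sketch of one phonetic measure (Soundex) and one string-similarity measure (Levenshtein distance). These are textbook versions written for illustration; which libraries or variants TransFuzzy actually uses is not stated here, so treat the function names and details as assumptions.

```python
# American Soundex letter-to-digit table.
_CODES = {}
for _letters, _digit in (
    ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
    ("l", "4"), ("mn", "5"), ("r", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Four-character Soundex code for a Latin-script name ('' if no letters)."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code is kept
        if len(result) == 4:
            break
    return result.ljust(4, "0")


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        row = [i]
        for j, cb in enumerate(b, start=1):
            row.append(min(
                prev_row[j] + 1,               # deletion
                row[j - 1] + 1,                # insertion
                prev_row[j - 1] + (ca != cb),  # substitution
            ))
        prev_row = row
    return prev_row[-1]


if __name__ == "__main__":
    for variant in ("Rahul", "Raahul", "Rahool", "Rahil"):
        print(variant, soundex(variant), levenshtein("Rahul", variant))
```

The two measures complement each other: all four spellings in the example output share one Soundex code, while the edit distance still separates them by how far each is from the query.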
📄 License
MIT © Goutham
🔥 Vision
TransFuzzy is designed for real-world systems like:
- KYC verification
- Government databases
- Search & deduplication
- Multilingual identity matching
Built with ❤️ for AI-powered applications