
TransFuzzy is a robust transliteration system that bridges the gap between Indic scripts and the Latin alphabet.


🔤 TransFuzzy

Multilingual, AI-powered name matching: phonetic, semantic, and ML scoring in one CLI tool.

TransFuzzy is a high-performance system for matching names across Indic and Latin scripts, combining phonetic algorithms, string similarity, and transformer embeddings into a single intelligent pipeline.


🚀 Installation

pip install transfuzzy

⚡ Usage

Start API Server

transfuzzy
# or
transfuzzy serve

Runs at:

http://localhost:5000

🔍 CLI Prediction

transfuzzy predict "Rahul"
transfuzzy predict "Rahul" --top 5
transfuzzy predict "Rahul" --json

Example Output

🔍 Similar names:

1. Rahul
2. Raahul
3. Rahool
4. Rahil

🌐 Supported Languages

  • English (Latin)
  • Hindi (Devanagari)
  • Telugu
  • Tamil
  • Kannada
  • Malayalam
  • Gujarati
  • Punjabi (Gurmukhi)

You can input:

"Rahul"
"राहुल"
"రాహుల్"
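
Because a name can arrive in any of these scripts, the first job is to work out which script it is written in. The sketch below is one simple way to do that using Unicode block ranges. It is an illustration only, not TransFuzzy's actual detector; the names `SCRIPT_RANGES` and `detect_script` are hypothetical.

```python
# Unicode block ranges for the Indic scripts listed above.
# SCRIPT_RANGES and detect_script are hypothetical names for this sketch,
# not part of the TransFuzzy package.
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Gurmukhi": (0x0A00, 0x0A7F),
    "Gujarati": (0x0A80, 0x0AFF),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}


def detect_script(name: str) -> str:
    """Return the script that most characters of `name` belong to."""
    counts: dict[str, int] = {}
    for ch in name:
        if ch.isascii() and ch.isalpha():
            counts["Latin"] = counts.get("Latin", 0) + 1
            continue
        code_point = ord(ch)
        for script, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                counts[script] = counts.get(script, 0) + 1
                break
    if not counts:
        return "Unknown"  # digits, punctuation, or an unsupported script
    return max(counts, key=counts.get)


print(detect_script("Rahul"))
print(detect_script("राहुल"))
print(detect_script("రాహుల్"))
```

Counting characters rather than checking only the first one keeps the result stable when a name contains stray punctuation or a mixed-script prefix.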

🧠 How It Works

Input Name
   ↓
Script Detection → Transliteration
   ↓
Candidate Filtering (~73k names)
   ↓
Similarity Metrics (8 features)
   ↓
ML Model (Random Forest)
   ↓
Hybrid Scoring
   ↓
Top Matches
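
The filtering, scoring, and ranking stages above can be sketched in a few lines. Everything here is an assumption made for illustration: the first-letter and length pre-filter, the use of `difflib.SequenceMatcher` as a stand-in for the 8 similarity features, the 50/50 blend in `hybrid_score`, and the placeholder for the Random Forest probability. The real pipeline may differ in every one of these choices.

```python
from difflib import SequenceMatcher


def filter_candidates(query, names, max_len_diff=2):
    """Cheap pre-filter (assumed): same first letter and a similar length."""
    q = query.lower()
    if not q:
        return []
    return [
        n for n in names
        if n and n[0].lower() == q[0] and abs(len(n) - len(q)) <= max_len_diff
    ]


def string_similarity(a, b):
    """Stand-in for the similarity features: a ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def hybrid_score(similarity, model_probability, weight=0.5):
    """Blend a model probability with a raw similarity (weight is assumed)."""
    return weight * model_probability + (1 - weight) * similarity


def top_matches(query, names, model_probability=None, top=5):
    """Filter, score, and rank candidates; return the best `top` names."""
    if model_probability is None:
        # Placeholder: a trained classifier's match probability would go here.
        model_probability = string_similarity
    scored = []
    for candidate in filter_candidates(query, names):
        sim = string_similarity(query, candidate)
        prob = model_probability(query, candidate)
        scored.append((hybrid_score(sim, prob), candidate))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for _, candidate in scored[:top]]


names = ["Rahul", "Raahul", "Rahool", "Rahil", "Priya", "Ramakrishnan"]
print(top_matches("Rahul", names, top=3))
```

The pre-filter matters because scoring every query against roughly 73k names with expensive features is wasteful; a cheap first pass shrinks the set before the costly metrics run.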

📡 API Usage

POST /similar_names

{
  "name": "Rahul"
}

Response:

{
  "similar_names": ["Rahul", "Raahul", "Rahool"]
}
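
A minimal Python client for this endpoint might look like the following. It uses only the standard library and assumes the server accepts a JSON body with a `Content-Type: application/json` header, which the README does not state explicitly. `build_request` and `fetch_similar_names` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # default address from the Usage section


def build_request(name, base_url=BASE_URL):
    """Build the POST /similar_names request without sending it."""
    payload = json.dumps({"name": name}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/similar_names",
        data=payload,
        # Assumption: the API expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_similar_names(name, base_url=BASE_URL, timeout=10):
    """Send the request and return the list under "similar_names"."""
    request = build_request(name, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["similar_names"]
```

With the server running, `fetch_similar_names("Rahul")` should return the list shown in the response above. Encoding the payload as UTF-8 means names in Devanagari or Telugu can be sent the same way.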

🏗️ Project Structure

src/transfuzzy/
├── cli.py          # CLI entrypoint
├── app.py          # Flask API
├── core/           # ML pipeline
├── dir/            # processing steps
├── db/             # dataset + model
├── utils/          # helpers
├── templates/      # UI
└── static/         # frontend

🧪 Training

uv run python scripts/enrich.py
uv run python scripts/train.py

⚙️ Development

git clone https://github.com/your-username/transfuzzy.git
cd transfuzzy
uv sync
uv run transfuzzy

✨ Features

  • 🔊 Phonetic matching (Soundex, Metaphone)
  • 📐 String similarity (Levenshtein, Jaro-Winkler)
  • 🧠 Semantic embeddings (Sentence Transformers)
  • 🌲 ML model (Random Forest)
  • ⚡ Optimized inference pipeline
  • 💻 CLI + API + Web UI
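
To make the phonetic and string-similarity features concrete, here is a small self-contained sketch of two of the named algorithms, Soundex and Levenshtein distance. These are textbook implementations written for illustration, not code taken from TransFuzzy, and they assume the name has already been transliterated to Latin letters.

```python
# Standard American Soundex digit groups.
SOUNDEX_CODES = {}
for letters, digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit


def soundex(name):
    """Return the 4-character Soundex code of a Latin-script name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    previous = SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = SOUNDEX_CODES.get(c, "")
        if digit:
            if digit != previous:  # skip repeats of the same code
                code += digit
            previous = digit
        elif c not in "HW":
            previous = ""  # vowels separate repeated codes; H and W do not
        if len(code) == 4:
            break
    return code.ljust(4, "0")


def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the working row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_similarity(a, b):
    """Map edit distance to a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


for variant in ("Rahul", "Raahul", "Rahool", "Rahil"):
    print(variant, soundex(variant), levenshtein("Rahul", variant))
```

All four spellings from the example output share one Soundex code while differing by at most two edits, which is why phonetic and edit-distance signals complement each other as features.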

📄 License

MIT © Goutham


🔥 Vision

TransFuzzy is designed for real-world systems like:

  • KYC verification
  • Government databases
  • Search & deduplication
  • Multilingual identity matching

Built with ❤️ for AI-powered applications
