Multilingual translator for Kabardian and Caucasian languages with speech synthesis

Kabardian Translator

Voice-Enabled Multilingual Translator for Caucasian Languages

🎯 Educational tool for learning Kabardian and other Caucasian languages with AI-powered translation and speech synthesis

✨ Features

  • 🧠 Smart Translation: 14 languages with specialized Kabardian models
  • 🔊 Voice Synthesis: Text-to-speech with automatic transliteration
  • 🔤 Phonetic Support: Georgian/Armenian alphabets → readable Cyrillic
  • ⚡ Apple Optimized: MPS acceleration for Apple Silicon (requires 16GB RAM)
  • 🎨 Modern UI: Dark/light themes, keyboard shortcuts

🚀 Quick Start

System Requirements

  • Python: 3.11 or higher
  • RAM: 16GB minimum (for MPS acceleration on Apple Silicon)
  • Storage: ~10GB for AI models
  • OS: macOS (Apple Silicon), Linux, or Windows

Method 1: Package Installation (Recommended)

# 1. Clone & setup
git clone https://github.com/kubataba/kabardian-translator.git
cd kabardian-translator

# 2. Create virtual environment
python3.11 -m venv venv
source venv/bin/activate

# 3. Install as package (auto-installs all dependencies)
pip install -e .

# 4. Download AI models (~10GB)
python download_models.py

# 5. Launch application
kabardian-translator --port 5500
# → Open http://localhost:5500

Method 2: Manual Installation

# 1. Clone & setup
git clone https://github.com/kubataba/kabardian-translator.git
cd kabardian-translator

# 2. Create virtual environment
python3.11 -m venv venv
source venv/bin/activate

# 3. Install dependencies manually
pip install -r requirements.txt

# 4. Download AI models (~10GB)
python3 download_models.py

# 5. Launch application
python3 app.py
# → Open http://localhost:5500

CLI Options

# Custom port
kabardian-translator --port 8080

# Localhost only (more secure)
kabardian-translator --host localhost --port 5500

# Debug mode
kabardian-translator --debug

# Help
kabardian-translator --help
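
For readers curious how flags like these are usually wired up, here is a minimal argparse sketch. It is not the project's actual CLI module: the `build_parser` name, the default host, and the help strings are assumptions, and only the flag names and port 5500 come from the examples above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown above."""
    parser = argparse.ArgumentParser(prog="kabardian-translator")
    # Assumed default; the real default host is not stated in this README.
    parser.add_argument("--host", default="0.0.0.0", help="interface to bind to")
    parser.add_argument("--port", type=int, default=5500, help="TCP port to listen on")
    parser.add_argument("--debug", action="store_true", help="enable debug output")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(["--host", "localhost", "--port", "8080"])
print(args.host, args.port, args.debug)
```

Binding to localhost keeps the server reachable only from the same machine, which is why the README calls that option more secure.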

⚡ Performance Optimizations

| Optimization | Benefit |
|---|---|
| Float16 instead of Float32 | ~50% memory savings (15GB → 7.5GB), <1% accuracy drop |
| torch.no_grad() for inference | 10–15% faster, no gradient cache |
| Lazy TTS loading | Startup time ↓ by ~5 sec, memory saved if unused |
| Automatic memory cleanup | Stable long-term operation |
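
The ~50% figure follows from storage width alone: a float32 value takes 4 bytes and a float16 value takes 2. A rough sketch of that arithmetic, where the 15GB starting point is the figure quoted above and the function name is purely illustrative:

```python
BYTES_PER_VALUE = {"float32": 4, "float16": 2}


def weights_size_gb(size_gb_float32: float, dtype: str) -> float:
    """Scale a float32 weight footprint to another dtype by bytes per value."""
    return size_gb_float32 * BYTES_PER_VALUE[dtype] / BYTES_PER_VALUE["float32"]


print(weights_size_gb(15.0, "float16"))  # 7.5
```

This covers model weights only; activations and framework overhead come on top.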

Performance on Mac Mini M4

| Operation | Time | Memory |
|---|---|---|
| Server start | ~10 sec | ~2GB |
| Translation (direct) | 200–500ms | +1GB |
| Translation (cascade) | 400–900ms | +1GB |
| TTS synthesis | 1–2 sec | +0.5GB |
| Peak memory | - | ~8GB |
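
The gap between startup memory and the later per-operation increments fits the lazy TTS loading mentioned among the optimizations: a component is only built the first time it is needed. Below is a small sketch of that pattern using `functools.cached_property`. The `SpeechService` class and the stand-in loader are hypothetical and say nothing about how the project structures its own code.

```python
import functools


class SpeechService:
    """Hypothetical wrapper that defers loading a TTS model until first use."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable that builds the model
        self.load_count = 0

    @functools.cached_property
    def model(self):
        # Runs once, on first access; the result is cached on the instance.
        self.load_count += 1
        return self._loader()

    def synthesize(self, text: str):
        return self.model(text)


# Stand-in loader: returns a fake "model" that just describes its input.
service = SpeechService(loader=lambda: (lambda text: f"<audio for {text!r}>"))
loads_at_startup = service.load_count      # nothing loaded yet
service.synthesize("hello")
service.synthesize("hello again")
print(loads_at_startup, service.load_count)
```

If speech synthesis is never requested, the loader never runs and its memory is never allocated.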

⚠️ Important: MPS acceleration requires 16GB RAM minimum. With 8GB RAM, use CPU mode (see Troubleshooting).


🎓 Practical Applications

  • For Students: Learn Kabardian, practice pronunciation, compare translations.
  • For Teachers: Prepare materials, generate audio examples, demonstrate phonetics.
  • For Researchers: Analyze transliteration, test MT quality, compare phonetics.
  • For Travelers: Communicate in the Caucasus region, understand signs, learn basic phrases.

📊 Quality and Limitations

Translation Quality

| Language Pair | BLEU | Quality | Method |
|---|---|---|---|
| Russian ↔ Kabardian | 35–42 | Excellent | Direct (fine-tuned) |
| Slavic ↔ Slavic | 30–38 | Good | Direct (base) |
| Any ↔ Kabardian | 28–35 | Good | Cascade (2 models) |
| European ↔ European | 32–40 | Good | Direct (base) |
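
"Cascade (2 models)" means two translation steps run back to back through an intermediate language. The sketch below shows that routing under one loud assumption: that Russian is the pivot, a guess based on the fine-tuned Russian ↔ Kabardian pair, since the README does not name it. The `translate_direct` stub and the language codes are placeholders, not the project's API.

```python
PIVOT = "ru"        # assumption: Russian is the pivot language
KABARDIAN = "kbd"   # ISO 639-3 code, used here only as a label


def translate_direct(text: str, src: str, tgt: str) -> str:
    """Stub for a single model call; tags the text so the route is visible."""
    return f"[{src}->{tgt}]{text}"


def translate(text: str, src: str, tgt: str) -> str:
    """Use one model when possible, otherwise go through the pivot."""
    involves_kabardian = KABARDIAN in (src, tgt)
    if involves_kabardian and PIVOT not in (src, tgt):
        intermediate = translate_direct(text, src, PIVOT)
        return translate_direct(intermediate, PIVOT, tgt)
    return translate_direct(text, src, tgt)


print(translate("hello", "en", "kbd"))   # two hops
print(translate("привет", "ru", "kbd"))  # one hop
```

Each hop can introduce its own errors, which is consistent with the lower BLEU range listed for cascade pairs.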

Voice Synthesis

| Language | TTS Quality | Method | Accuracy |
|---|---|---|---|
| Russian, Ukrainian, Belarusian | 95–98% | Direct | Excellent |
| Kabardian, Kazakh | 92–95% | Direct | Excellent |
| Georgian, Armenian | 88–92% | Transliteration → TTS | Good |
| Turkish, Azerbaijani | 85–88% | Transliteration → TTS | Good |
| German, Spanish, Latvian | 78–82% | Transliteration → TTS | Acceptable |
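
For the "Transliteration → TTS" rows, the text is first rewritten in Cyrillic so that a Cyrillic-trained voice can read it. Here is a toy sketch with a handful of Georgian letters. The mapping table and function name are illustrative only and far smaller than anything a real transliterator would need.

```python
# Partial Georgian -> Cyrillic table (illustrative subset only).
GEORGIAN_TO_CYRILLIC = {
    "ა": "а", "ბ": "б", "გ": "г", "დ": "д", "ე": "е",
    "ი": "и", "ლ": "л", "მ": "м", "ნ": "н", "ო": "о",
    "რ": "р", "ს": "с", "უ": "у", "ჯ": "дж",
}


def transliterate(text: str, table: dict[str, str]) -> str:
    """Map each character through the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


print(transliterate("გამარჯობა", GEORGIAN_TO_CYRILLIC))  # гамарджоба
```

A character-by-character table like this cannot capture stress or context-dependent sounds, which matches the "simplified phonetics" caveat under Limitations.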

Limitations

  • TTS: Maximum 200 characters per request; imperfect pronunciation for transliterated languages; no intonation control.
  • Translation: Cascade may lose nuance; technical terms may be inaccurate; context beyond 512 tokens is lost.
  • Transliteration: Simplified phonetics; stress marks not shown.
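
One way to work within the 200-character TTS limit is to split longer text into chunks before synthesis. The helper below is a sketch under the assumption that splitting at sentence boundaries is acceptable. `split_for_tts` is a hypothetical name, not a function shipped with the package.

```python
import re

MAX_CHARS = 200  # limit stated in the Limitations list


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack sentences into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself over the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


sample = "A" * 150 + ". " + "B" * 150 + "."
print([len(chunk) for chunk in split_for_tts(sample)])
```

Each chunk can then be synthesized separately and the audio concatenated, at the cost of a possible pause at chunk boundaries.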

🛠️ Troubleshooting

Insufficient RAM (Less than 16GB)

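For systems with 8GB RAM, reduce reliance on MPS or switch to CPU mode:

Option 1: Environment variable (temporary)

export PYTORCH_ENABLE_MPS_FALLBACK=1
kabardian-translator

PYTORCH_ENABLE_MPS_FALLBACK=1 only lets operations that MPS does not implement fall back to the CPU; it does not move the models off MPS. To run entirely on the CPU, use Option 2.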

Option 2: Edit app.py (permanent)

# Find this line:
device = "mps" if torch.backends.mps.is_available() else "cpu"

# Change to:
device = "cpu"  # Force CPU mode

⚠️ CPU mode runs 3–5× slower but works on any system.
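
If editing app.py by hand feels too permanent, the same choice could be driven by an environment variable. The sketch below is an assumption-laden illustration: `KABARDIAN_DEVICE` is a hypothetical variable that the application does not read today, and `choose_device` is not part of the package.

```python
import os


def choose_device(mps_available: bool, env=os.environ) -> str:
    """Pick a device, letting a (hypothetical) env var force the choice."""
    forced = env.get("KABARDIAN_DEVICE", "").strip().lower()
    if forced == "cpu":
        return "cpu"
    if forced == "mps" and mps_available:
        return "mps"
    # No valid override: same rule as the app.py line shown above.
    return "mps" if mps_available else "cpu"


print(choose_device(mps_available=True, env={"KABARDIAN_DEVICE": "cpu"}))  # cpu
print(choose_device(mps_available=True, env={}))                           # mps
```

In app.py this would be called as `choose_device(torch.backends.mps.is_available())`, replacing the hard-coded expression.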

Models Won't Load

# Try mirror if Hugging Face is blocked
export HF_ENDPOINT=https://hf-mirror.com
python3 download_models.py

MPS Unavailable

If MPS acceleration is not detected on Apple Silicon:

# Check PyTorch MPS support
python3 -c "import torch; print(torch.backends.mps.is_available())"

If this prints False:

  • Update macOS to the latest version (13.0+)
  • Reinstall PyTorch: pip install --upgrade torch torchaudio
  • Fall back to CPU mode (see "Insufficient RAM" above)
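
To tell which of these steps applies, it helps to separate "this PyTorch build has no MPS support" from "MPS is compiled in but the system cannot use it". PyTorch exposes both checks, `torch.backends.mps.is_built()` and `torch.backends.mps.is_available()`. The wrapper below is a small sketch: the `mps_status` name and the messages are made up for illustration.

```python
def mps_status() -> str:
    """Return a short diagnosis of MPS support in the current environment."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    mps = getattr(torch.backends, "mps", None)
    if mps is None:
        return "This PyTorch version predates MPS support; upgrade torch"
    if not mps.is_built():
        return "This PyTorch build was compiled without MPS; reinstall torch"
    if not mps.is_available():
        return "MPS is compiled in but unavailable; check macOS version and hardware"
    return "MPS is available"


print(mps_status())
```

A "compiled without MPS" result points at the PyTorch reinstall step, while "compiled in but unavailable" points at the macOS update.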

Out of Memory (OOM)

  • Reduce the beam search width, e.g. num_beams=3
  • Comment out unused models in app.py

Transliteration Inaccurate

Edit transliterator.py:

self.turkish_to_kazakh['h'] = 'х'  # Better than 'ҳ'

Command Not Found: kabardian-translator

If the command is not recognized after running pip install -e .:

# Reinstall package
pip uninstall kabardian-translator
pip install -e .

# Or use direct Python call
python -m kabardian_translator.cli --port 5500

📄 License and Usage

Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

✅ Allowed: Personal, educational, and research use; modifications; distribution with attribution.
❌ Prohibited: Commercial use, profit-driven services, integration into paid products.

🔗 Full license: https://creativecommons.org/licenses/by-nc/4.0/


🙏 Acknowledgments

  • anzorq – fine-tuned M2M100 models for Kabardian
  • Meta AI – base M2M100 model
  • Silero Team – high-quality TTS
  • Hugging Face – platform and Transformers
  • Kabardian language community – feedback and support

📞 Support and Contribution

  • Found a bug? → Open an Issue on GitHub
  • Want to contribute? → Fork → Branch → Commit → Pull Request
  • Need help? → Check TROUBLESHOOTING or Discussions

🗺️ Roadmap

  • v1.1 (Q1 2026): Expanded support for North Caucasian languages
  • v1.2 (Q2 2026): API, Redis caching, user history, batch translation
  • v2.0 (Q3 2026): Mobile app, offline mode, Telegram bot

Made with ❤️ for preserving and studying the Kabardian language
