Multilingual translator for Kabardian and Caucasian languages with speech synthesis

🌍 Kabardian Translator

Voice-Enabled Multilingual Translator for Caucasian Languages

🎯 Educational tool for learning Kabardian and Caucasian languages with AI-powered translation and speech synthesis

✨ Features

  • 🧠 Smart Translation: 14 languages with specialized Kabardian models
  • 🔊 Voice Synthesis: Text-to-speech with automatic transliteration
  • 🔤 Phonetic Support: Georgian/Armenian alphabets → readable Cyrillic
  • ⚡ Apple Optimized: MPS acceleration for Apple Silicon (requires 16GB RAM)
  • 🎨 Modern UI: Dark/light themes, keyboard shortcuts

🚀 Quick Start

System Requirements

  • Python: 3.11 or higher
  • RAM: 16GB minimum (for MPS acceleration on Apple Silicon)
  • Storage: ~10GB for AI models
  • OS: macOS (Apple Silicon), Linux, or Windows
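
The requirements above can be checked before downloading anything. The following is a minimal sketch using only the standard library; the function names are illustrative and not part of the package, and the RAM check is left out because the standard library has no portable way to query it.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)      # from the requirements list above
MODEL_STORAGE_GB = 10     # approximate size of the model download


def python_ok(version=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter version meets the minimum."""
    return tuple(version[:2]) >= minimum


def free_disk_gb(path="."):
    """Return free space at `path` in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def disk_ok(path=".", needed_gb=MODEL_STORAGE_GB):
    """Return True if `path` has room for the model download."""
    return free_disk_gb(path) >= needed_gb


if __name__ == "__main__":
    print("Python >= 3.11:", python_ok())
    print(f"Free disk: {free_disk_gb():.1f} GB, enough for models:", disk_ok())
```

Run it from the directory where the models will be stored, since free space is measured on the filesystem that holds the given path.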

Method 1: Package Installation (Recommended)

# 1. Clone & setup
git clone https://github.com/kubataba/kabardian-translator.git
cd kabardian-translator

# 2. Create virtual environment
python3.11 -m venv venv
source venv/bin/activate

# 3. Install as package (auto-installs all dependencies)
pip install -e .

# 4. Download AI models (~10GB)
python download_models.py

# 5. Launch application
kabardian-translator --port 5500
# → Open http://localhost:5500

Method 2: Manual Installation

# 1. Clone & setup
git clone https://github.com/kubataba/kabardian-translator.git
cd kabardian-translator

# 2. Create virtual environment
python3.11 -m venv venv
source venv/bin/activate

# 3. Install dependencies manually
pip install -r requirements.txt

# 4. Download AI models (~10GB)
python3 download_models.py

# 5. Launch application
python3 app.py
# → Open http://localhost:5500

CLI Options

# Custom port
kabardian-translator --port 8080

# Localhost only (more secure)
kabardian-translator --host localhost --port 5500

# Debug mode
kabardian-translator --debug

# Help
kabardian-translator --help
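
For readers curious how flags like these are typically wired up, here is a rough sketch with argparse. It is not the project's actual CLI code: the default port of 5500 is taken from the Quick Start examples, while the default host and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the flags listed above."""
    parser = argparse.ArgumentParser(prog="kabardian-translator")
    # 5500 matches the port used in the Quick Start examples.
    parser.add_argument("--port", type=int, default=5500, help="port to listen on")
    # Assumption: the real default host is not documented here.
    parser.add_argument("--host", default="localhost", help="interface to bind to")
    parser.add_argument("--debug", action="store_true", help="enable debug mode")
    return parser


args = build_parser().parse_args(["--host", "localhost", "--port", "8080"])
print(args.host, args.port, args.debug)  # localhost 8080 False
```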

⚡ Performance Optimizations

| Optimization | Benefit |
|---|---|
| Float16 instead of Float32 | ~50% memory savings (15GB → 7.5GB), <1% accuracy drop |
| torch.no_grad() for inference | 10–15% faster, no gradient cache |
| Lazy TTS loading | Startup time ↓ by ~5 sec, memory saved if unused |
| Automatic memory cleanup | Stable long-term operation |
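
Lazy loading, as listed in the table, can be sketched with functools.cached_property. This is an illustration of the pattern only, not the project's code: load_tts_model and the Services class are hypothetical stand-ins for an expensive model load.

```python
from functools import cached_property


def load_tts_model():
    """Hypothetical stand-in for an expensive model load."""
    return {"name": "tts-model"}


class Services:
    """Holds models; the TTS model is created only on first access."""

    def __init__(self, loader=load_tts_model):
        self._loader = loader
        self.load_count = 0

    @cached_property
    def tts(self):
        # Runs once; the result is cached on the instance afterwards.
        self.load_count += 1
        return self._loader()


services = Services()
print(services.load_count)  # 0: nothing loaded at startup
services.tts
services.tts
print(services.load_count)  # 1: loaded exactly once
```

If speech synthesis is never requested, the loader never runs, which is where the startup and memory savings would come from.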

Performance on Mac Mini M4

| Operation | Time | Memory |
|---|---|---|
| Server start | ~10 sec | ~2GB |
| Translation (direct) | 200–500 ms | +1GB |
| Translation (cascade) | 400–900 ms | +1GB |
| TTS synthesis | 1–2 sec | +0.5GB |
| Peak memory | - | ~8GB |

⚠️ Important: MPS acceleration requires 16GB RAM minimum. With 8GB RAM, use CPU mode (see Troubleshooting).


🎓 Practical Applications

  • For Students: Learn Kabardian, practice pronunciation, compare translations.
  • For Teachers: Prepare materials, generate audio examples, demonstrate phonetics.
  • For Researchers: Analyze transliteration, test MT quality, compare phonetics.
  • For Travelers: Communicate in the Caucasus region, understand signs, learn basic phrases.

📊 Quality and Limitations

Translation Quality

| Language Pair | BLEU | Quality | Method |
|---|---|---|---|
| Russian ↔ Kabardian | 35–42 | Excellent | Direct (fine-tuned) |
| Slavic ↔ Slavic | 30–38 | Good | Direct (base) |
| Any ↔ Kabardian | 28–35 | Good | Cascade (2 models) |
| European ↔ European | 32–40 | Good | Direct (base) |
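
The cascade method in the table chains two models through an intermediate language. The sketch below shows the idea with toy stand-ins; it assumes Russian is the pivot (suggested by the fine-tuned Russian ↔ Kabardian pair, but not confirmed here), and the function name and language codes are illustrative.

```python
from typing import Callable, Dict, Tuple

Translator = Callable[[str], str]


def cascade_translate(
    text: str,
    src: str,
    tgt: str,
    models: Dict[Tuple[str, str], Translator],
    pivot: str = "ru",
) -> str:
    """Translate directly if a model exists, otherwise go through the pivot."""
    if (src, tgt) in models:
        return models[(src, tgt)](text)
    # Two-step path: src -> pivot, then pivot -> tgt.
    intermediate = models[(src, pivot)](text)
    return models[(pivot, tgt)](intermediate)


# Toy stand-ins for real models: each one just tags the text it receives.
toy_models = {
    ("en", "ru"): lambda s: f"ru({s})",
    ("ru", "kbd"): lambda s: f"kbd({s})",
}

print(cascade_translate("hello", "en", "kbd", toy_models))  # kbd(ru(hello))
print(cascade_translate("hello", "ru", "kbd", toy_models))  # kbd(hello)
```

Because the second model only sees the output of the first, any error or lost nuance in the first step carries through, which is consistent with the lower BLEU range reported for cascade translation.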

Voice Synthesis

| Language | Accuracy | Method | TTS Quality |
|---|---|---|---|
| Russian, Ukrainian, Belarusian | 95–98% | Direct | Excellent |
| Kabardian, Kazakh | 92–95% | Direct | Excellent |
| Georgian, Armenian | 88–92% | Transliteration → TTS | Good |
| Turkish, Azerbaijani | 85–88% | Transliteration → TTS | Good |
| German, Spanish, Latvian | 78–82% | Transliteration → TTS | Acceptable |

Limitations

  • TTS: Maximum 200 characters; imperfect pronunciation for transliterated languages; no intonation.
  • Translation: Cascade translation may lose nuance; technical terms may be inaccurate; context beyond 512 tokens is lost.
  • Transliteration: Simplified phonetics; stress marks are not shown.
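
One way to work within the 200-character TTS limit is to split longer text on word boundaries and synthesize each piece separately. This is a hypothetical workaround, not a feature of the package; split_for_tts is an illustrative name, and the function collapses runs of whitespace while splitting.

```python
MAX_TTS_CHARS = 200  # limit stated in the Limitations list


def split_for_tts(text, limit=MAX_TTS_CHARS):
    """Split text into chunks of at most `limit` characters, on word boundaries."""
    chunks, current = [], ""
    for word in text.split():
        # Hard-split any single word that is longer than the limit.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


sample = "word " * 100  # 100 short words, 500 characters in total
chunks = split_for_tts(sample)
print([len(c) for c in chunks])  # [199, 199, 99]
```

Splitting on sentence boundaries instead would likely sound more natural, since each chunk is synthesized without knowledge of its neighbours.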

🛠️ Troubleshooting

Insufficient RAM (Less than 16GB)

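For systems with 8GB RAM, disable MPS and use CPU mode:

Option 1: Environment variable (temporary)

# Note: PYTORCH_ENABLE_MPS_FALLBACK=1 only lets operations that MPS does not
# support fall back to the CPU; it does not move the models off MPS.
# If memory is still insufficient, use Option 2.
export PYTORCH_ENABLE_MPS_FALLBACK=1
kabardian-translator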

Option 2: Edit app.py (permanent)

# Find this line:
device = "mps" if torch.backends.mps.is_available() else "cpu"

# Change to:
device = "cpu"  # Force CPU mode

⚠️ CPU mode runs 3–5× slower but works on any system.
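
If editing app.py by hand each time is inconvenient, the device choice could be driven by an environment variable instead. The sketch below is an assumption-laden illustration: KABARDIAN_FORCE_CPU is a hypothetical variable name that the package does not define, and pick_device is not an existing function.

```python
import os


def pick_device(mps_available, environ=os.environ):
    """Return "mps" or "cpu", letting an environment variable force CPU."""
    # KABARDIAN_FORCE_CPU is a hypothetical variable name, not a project setting.
    force_cpu = environ.get("KABARDIAN_FORCE_CPU", "") == "1"
    return "mps" if mps_available and not force_cpu else "cpu"


# In app.py this might be called as:
#   device = pick_device(torch.backends.mps.is_available())
print(pick_device(True, {"KABARDIAN_FORCE_CPU": "1"}))  # cpu
print(pick_device(True, {}))                            # mps
print(pick_device(False, {}))                           # cpu
```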

Models Won't Load

# Try mirror if Hugging Face is blocked
export HF_ENDPOINT=https://hf-mirror.com
python3 download_models.py

MPS Unavailable

If MPS acceleration is not detected on Apple Silicon:

# Check PyTorch MPS support
python3 -c "import torch; print(torch.backends.mps.is_available())"

If this prints False:

  • Update macOS to version 13.0 or later
  • Reinstall PyTorch: pip install --upgrade torch torchaudio
  • Fall back to CPU mode (see "Insufficient RAM" above)

Out of Memory (OOM)

  • Reduce the beam search width, e.g. num_beams=3
  • Comment out unused models in app.py

Transliteration Inaccurate

Edit transliterator.py:

self.turkish_to_kazakh['h'] = 'х'  # Better than 'ҳ'
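
To see why a one-line override like this changes the output, here is a minimal sketch of table-based transliteration. The transliterate function and the four-letter table are hypothetical and much smaller than whatever transliterator.py actually contains.

```python
def transliterate(text, table):
    """Replace each character found in `table`; leave the rest unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


# Hypothetical fragment of a Latin-to-Cyrillic table, for illustration only.
turkish_to_kazakh = {"b": "б", "a": "а", "h": "ҳ", "r": "р"}

print(transliterate("bahar", turkish_to_kazakh))  # баҳар

# Overriding one entry changes every later call that uses the table.
turkish_to_kazakh["h"] = "х"
print(transliterate("bahar", turkish_to_kazakh))  # бахар
```

Characters missing from the table, including uppercase letters in this fragment, pass through unchanged.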

Command Not Found: kabardian-translator

If the command is not recognized after running pip install -e . :

# Reinstall package
pip uninstall kabardian-translator
pip install -e .

# Or use direct Python call
python -m kabardian_translator.cli --port 5500

📄 License and Usage

Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

✅ Allowed: Personal, educational, research, modifications, distribution with attribution.
❌ Prohibited: Commercial use, profit-driven services, integration into paid products.

🔗 Full license: https://creativecommons.org/licenses/by-nc/4.0/


🙏 Acknowledgments

  • anzorq – fine-tuned M2M100 models for Kabardian
  • Meta AI – base M2M100 model
  • Silero Team – high-quality TTS
  • Hugging Face – platform and Transformers
  • Kabardian language community – feedback and support

📞 Support and Contribution

  • Found a bug? → Open an Issue on GitHub
  • Want to contribute? → Fork → Branch → Commit → Pull Request
  • Need help? → Check TROUBLESHOOTING or Discussions

🗺️ Roadmap

  • v1.1 (Q1 2026): Expanded support for North Caucasian languages
  • v1.2 (Q2 2026): API, Redis caching, user history, batch translation
  • v2.0 (Q3 2026): Mobile app, offline mode, Telegram bot

Made with ❤️ for preserving and studying the Kabardian language
