Multilingual translator for Kabardian and Caucasian languages with speech synthesis

Kabardian Translator

Voice-Enabled Multilingual Translator for Caucasian Languages

🎯 Educational tool for learning Kabardian and Caucasian languages with AI-powered translation and speech synthesis

✨ Features

  • 🧠 Smart Translation: 14 languages with specialized Kabardian models
  • 🔊 Voice Synthesis: Text-to-speech with automatic transliteration
  • 🔤 Phonetic Support: Georgian/Armenian alphabets → readable Cyrillic
  • ⚡ Apple Optimized: MPS acceleration for Apple Silicon (requires 16GB RAM)
  • 🎨 Modern UI: Dark/light themes, keyboard shortcuts

🚀 Quick Start

System Requirements

  • Python: 3.11 or higher
  • RAM: 16GB minimum (for MPS acceleration on Apple Silicon)
  • Storage: ~10GB for AI models
  • OS: macOS (Apple Silicon), Linux, or Windows
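
The Python and storage requirements above can be checked before downloading the models. The following is a minimal sketch using only the standard library; `check_requirements` is a hypothetical helper, not part of the project, and the thresholds are simply copied from the list above. RAM is not checked because the standard library has no portable API for it.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)          # from the requirements list above
MIN_FREE_BYTES = 10 * 10**9   # "~10GB" for models, treated here as 10 * 10^9 bytes


def check_requirements(python_version, free_bytes):
    """Return a list of human-readable problems (empty if everything looks fine)."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} found, 3.11 or higher required"
        )
    if free_bytes < MIN_FREE_BYTES:
        problems.append(
            f"only {free_bytes / 10**9:.1f} GB free, about 10 GB needed for models"
        )
    return problems


if __name__ == "__main__":
    # Check the interpreter running this script and the current directory's disk.
    free = shutil.disk_usage(".").free
    for line in check_requirements(sys.version_info, free) or ["All checks passed"]:
        print(line)
```

Run it from the directory where the models will be stored so that the free-space figure refers to the right disk.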

Method 1: Package Installation (Recommended)

# 1. Clone & setup
git clone https://github.com/kubataba/kabardian-translator.git
cd kabardian-translator

# 2. Create virtual environment
python3.11 -m venv venv
source venv/bin/activate

# 3. Install as package (auto-installs all dependencies)
pip install -e .

# 4. Download AI models (~10GB)
python download_models.py

# 5. Launch application
kabardian-translator --port 5500
# → Open http://localhost:5500

Method 2: Manual Installation

# 1. Clone & setup
git clone https://github.com/kubataba/kabardian-translator.git
cd kabardian-translator

# 2. Create virtual environment
python3.11 -m venv venv
source venv/bin/activate

# 3. Install dependencies manually
pip install -r requirements.txt

# 4. Download AI models (~10GB)
python3 download_models.py

# 5. Launch application
python3 app.py
# → Open http://localhost:5500

CLI Options

# Custom port
kabardian-translator --port 8080

# Localhost only (more secure)
kabardian-translator --host localhost --port 5500

# Debug mode
kabardian-translator --debug

# Help
kabardian-translator --help
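
For readers who want to see how flags like these are typically wired up, here is a small `argparse` sketch. Only the flag names and port 5500 come from the examples above; the default host, the help texts, and the function name `build_parser` are assumptions, and the project's actual CLI module may be organized differently.

```python
import argparse


def build_parser():
    # Flag names mirror the CLI examples above; defaults and help texts are assumptions.
    parser = argparse.ArgumentParser(prog="kabardian-translator")
    # Assumed default: the "localhost only" example suggests the default binds more widely.
    parser.add_argument("--host", default="0.0.0.0", help="interface to bind to")
    parser.add_argument("--port", type=int, default=5500, help="TCP port to listen on")
    parser.add_argument("--debug", action="store_true", help="enable debug mode")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--host", "localhost", "--port", "8080"])
    print(args.host, args.port, args.debug)
```

Binding to `localhost` keeps the server reachable only from the same machine, which is why the example above describes it as the more secure option.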

⚡ Performance Optimizations

| Optimization | Benefit |
|---|---|
| Float16 instead of Float32 | ~50% memory savings (15GB → 7.5GB), <1% accuracy drop |
| torch.no_grad() for inference | 10–15% faster, no gradient cache |
| Lazy TTS loading | Startup time ↓ by ~5 sec, memory saved if unused |
| Automatic memory cleanup | Stable long-term operation |
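
The lazy-loading row can be illustrated with a small wrapper that postpones an expensive load until first use. This is a generic sketch of the pattern, not the project's code: `LazyModel` and `load_tts_stub` are hypothetical names, and the stub stands in for whatever actually loads the TTS model.

```python
import threading


class LazyModel:
    """Defer an expensive load until the first call to get()."""

    def __init__(self, loader):
        self._loader = loader          # zero-argument callable that builds the model
        self._model = None
        self._lock = threading.Lock()  # avoid double loading under concurrent requests

    @property
    def loaded(self):
        return self._model is not None

    def get(self):
        if self._model is None:
            with self._lock:
                if self._model is None:   # re-check after acquiring the lock
                    self._model = self._loader()
        return self._model


def load_tts_stub():
    # Hypothetical stand-in for the real TTS loader, which this sketch does not know.
    return {"name": "tts-stub"}


tts = LazyModel(load_tts_stub)   # nothing is loaded at startup
```

The first request that needs speech calls `tts.get()` and pays the loading cost once; if no request ever asks for audio, the model never occupies memory.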

Performance on Mac Mini M4

| Operation | Time | Memory |
|---|---|---|
| Server start | ~10 sec | ~2GB |
| Translation (direct) | 200–500ms | +1GB |
| Translation (cascade) | 400–900ms | +1GB |
| TTS synthesis | 1–2 sec | +0.5GB |
| Peak memory | - | ~8GB |

⚠️ Important: MPS acceleration requires 16GB RAM minimum. With 8GB RAM, use CPU mode (see Troubleshooting).


🎓 Practical Applications

  • For Students: Learn Kabardian, practice pronunciation, compare translations.
  • For Teachers: Prepare materials, generate audio examples, demonstrate phonetics.
  • For Researchers: Analyze transliteration, test MT quality, compare phonetics.
  • For Travelers: Communicate in the Caucasus region, understand signs, learn basic phrases.

📊 Quality and Limitations

Translation Quality

| Language Pair | BLEU | Quality | Method |
|---|---|---|---|
| Russian ↔ Kabardian | 35–42 | Excellent | Direct (fine-tuned) |
| Slavic ↔ Slavic | 30–38 | Good | Direct (base) |
| Any ↔ Kabardian | 28–35 | Good | Cascade (2 models) |
| European ↔ European | 32–40 | Good | Direct (base) |
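
The difference between the direct and cascade methods in the table can be sketched as a routing function. This is an illustration under stated assumptions: Russian as the intermediate language is inferred from the fine-tuned Russian ↔ Kabardian pair, and the `translate` function, the `models` dictionary, and the language codes are hypothetical, with stub translators in place of real models.

```python
from typing import Callable, Dict, Tuple

# A translator takes text and returns text for one fixed (source, target) pair.
Translator = Callable[[str], str]

PIVOT = "ru"  # assumption: Russian is the intermediate language


def translate(text: str, src: str, tgt: str,
              direct: Dict[Tuple[str, str], Translator]) -> str:
    """Use a direct model when one exists, otherwise go through the pivot."""
    if src == tgt:
        return text
    if (src, tgt) in direct:
        return direct[(src, tgt)](text)
    # Cascade: two model calls in sequence, so errors from the first can compound.
    first = direct[(src, PIVOT)]
    second = direct[(PIVOT, tgt)]
    return second(first(text))


# Stub translators that only tag the text, to make the routing visible.
models = {
    ("de", "ru"): lambda t: f"ru({t})",
    ("ru", "kbd"): lambda t: f"kbd({t})",
}
print(translate("Hallo", "de", "kbd", models))  # prints kbd(ru(Hallo))
```

Two sequential model calls are consistent with the higher cascade latency reported in the performance table and with the lower BLEU range for Any ↔ Kabardian.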

Voice Synthesis

| Language | TTS Quality | Method | Accuracy |
|---|---|---|---|
| Russian, Ukrainian, Belarusian | 95–98% | Direct | Excellent |
| Kabardian, Kazakh | 92–95% | Direct | Excellent |
| Georgian, Armenian | 88–92% | Transliteration → TTS | Good |
| Turkish, Azerbaijani | 85–88% | Transliteration → TTS | Good |
| German, Spanish, Latvian | 78–82% | Transliteration → TTS | Acceptable |
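
The "Transliteration → TTS" method amounts to rewriting text in a script the speech model can read before synthesis. Below is a minimal character-mapping sketch for Georgian. The table is a small illustrative subset chosen for this example; the project's real tables live in transliterator.py and may make different choices.

```python
# Illustrative subset only; not the project's actual mapping.
GEORGIAN_TO_CYRILLIC = {
    "ა": "а", "ბ": "б", "გ": "г", "დ": "д", "ე": "е",
    "ი": "и", "ლ": "л", "მ": "м", "ნ": "н", "ო": "о",
    "რ": "р", "ს": "с", "უ": "у", "ჯ": "дж",
}


def transliterate(text: str, table: dict) -> str:
    """Replace each character found in the table; leave everything else unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


print(transliterate("გამარჯობა", GEORGIAN_TO_CYRILLIC))  # prints гамарджоба
```

A one-character-at-a-time mapping like this cannot express stress or context-dependent sounds, which matches the lower quality figures for transliterated languages in the table.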

Limitations

  • TTS: Maximum 200 characters; pronunciation is imperfect for transliterated languages; no intonation.
  • Translation: Cascade translation may lose nuance; technical terms may be inaccurate; context beyond 512 tokens is lost.
  • Transliteration: Simplified phonetics; stress marks are not shown.
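
One way to work within the 200-character TTS limit is to split longer text into chunks before synthesis. The sketch below is a possible client-side approach, not something the project is known to do: `split_for_tts` and `_pack` are hypothetical helpers that prefer sentence boundaries, fall back to word boundaries, and hard-cut any single word longer than the limit.

```python
import re

MAX_TTS_CHARS = 200  # limit stated in the list above


def _pack(pieces, limit):
    """Greedily join pieces with spaces so each result is at most `limit` chars."""
    out, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                out.append(current)
            current = piece
    if current:
        out.append(current)
    return out


def split_for_tts(text, limit=MAX_TTS_CHARS):
    """Split text into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces = []
    for sentence in sentences:
        if len(sentence) <= limit:
            pieces.append(sentence)
        else:
            # Sentence too long: fall back to words, hard-cutting oversized words.
            for word in sentence.split():
                pieces.extend(word[i:i + limit] for i in range(0, len(word), limit))
    return _pack(pieces, limit)
```

Each chunk can then be synthesized separately and the audio concatenated. Since the source notes that intonation is not modeled, joins between chunks may sound abrupt.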

🛠️ Troubleshooting

Insufficient RAM (Less than 16GB)

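For systems with 8GB RAM, avoid MPS and run on the CPU:

Option 1: Environment variable (temporary)

# Note: PYTORCH_ENABLE_MPS_FALLBACK=1 only lets PyTorch run operators that MPS
# does not support on the CPU. It does not by itself move the models off MPS,
# so if memory is still exhausted, use Option 2.
export PYTORCH_ENABLE_MPS_FALLBACK=1
kabardian-translator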

Option 2: Edit app.py (permanent)

# Find this line:
device = "mps" if torch.backends.mps.is_available() else "cpu"

# Change to:
device = "cpu"  # Force CPU mode

⚠️ CPU mode runs 3–5× slower but works on any system.
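
Instead of hard-coding the device, the choice in Option 2 could be made switchable at launch time. The sketch below shows the idea under a clear assumption: `KABARDIAN_FORCE_CPU` is a hypothetical variable name invented for this example, and `pick_device` is not part of the project.

```python
import os


def pick_device(mps_available: bool, env=None) -> str:
    """Return "mps" or "cpu".

    KABARDIAN_FORCE_CPU is a hypothetical variable name used for this sketch;
    the project does not document such a switch.
    """
    env = os.environ if env is None else env
    if env.get("KABARDIAN_FORCE_CPU", "") == "1":
        return "cpu"
    return "mps" if mps_available else "cpu"


# In app.py the availability flag would come from PyTorch:
#   import torch
#   device = pick_device(torch.backends.mps.is_available())
print(pick_device(mps_available=True, env={"KABARDIAN_FORCE_CPU": "1"}))  # prints cpu
```

Keeping the decision in one small function makes it easy to test without PyTorch installed and avoids editing the source again when moving between machines.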

Models Won't Load

# Try mirror if Hugging Face is blocked
export HF_ENDPOINT=https://hf-mirror.com
python3 download_models.py

MPS Unavailable

If MPS acceleration is not detected on Apple Silicon:

# Check PyTorch MPS support
python3 -c "import torch; print(torch.backends.mps.is_available())"

If this prints False:

  • Update to the latest macOS (13.0+)
  • Reinstall PyTorch: pip install --upgrade torch torchaudio
  • Fall back to CPU mode (see "Insufficient RAM" above)

Out of Memory (OOM)

  • Reduce beam search: num_beams=3
  • Comment out unused models in app.py
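
Between requests, cached accelerator memory can also be released explicitly. How the application performs its own automatic cleanup is not shown in this README, so the following is only one common approach, assuming PyTorch 2.0 or newer; `free_memory` is a hypothetical helper.

```python
import gc


def free_memory() -> str:
    """Run garbage collection and release cached accelerator memory if possible.

    Returns a short label describing what was done.
    """
    gc.collect()
    try:
        import torch  # optional here so the sketch also runs without PyTorch
    except ImportError:
        return "gc"
    if torch.backends.mps.is_available():
        torch.mps.empty_cache()      # releases cached MPS allocations
        return "gc+mps"
    if torch.cuda.is_available():
        torch.cuda.empty_cache()     # releases cached CUDA allocations
        return "gc+cuda"
    return "gc"
```

Emptying the cache does not shrink the models themselves. It only returns memory that PyTorch was holding for reuse, so it helps with gradual growth rather than with a model that does not fit in the first place.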

Transliteration Inaccurate

Edit transliterator.py:

self.turkish_to_kazakh['h'] = 'х'  # Better than 'ҳ'

Command Not Found: kabardian-translator

If the kabardian-translator command is not recognized after pip install -e . has finished:

# Reinstall package
pip uninstall kabardian-translator
pip install -e .

# Or use direct Python call
python -m kabardian_translator.cli --port 5500

📄 License and Usage

Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

✅ Allowed: Personal, educational, research, modifications, distribution with attribution.
❌ Prohibited: Commercial use, profit-driven services, integration into paid products.

🔗 Full license: https://creativecommons.org/licenses/by-nc/4.0/


🙏 Acknowledgments

  • anzorq – fine-tuned M2M100 models for Kabardian
  • Meta AI – base M2M100 model
  • Silero Team – high-quality TTS
  • Hugging Face – platform and Transformers
  • Kabardian language community – feedback and support

📞 Support and Contribution

  • Found a bug? → Open an Issue on GitHub
  • Want to contribute? → Fork → Branch → Commit → Pull Request
  • Need help? → Check TROUBLESHOOTING or Discussions

🗺️ Roadmap

  • v1.1 (Q1 2026): Expanded support for North Caucasian languages
  • v1.2 (Q2 2026): API, Redis caching, user history, batch translation
  • v2.0 (Q3 2026): Mobile app, offline mode, Telegram Bot

Made with ❤️ for preserving and studying the Kabardian language
