A simple text-to-speech program using the edge-tts library
TTS_ka 🚀 Ultra-Fast Text-to-Speech
Ultra-Fast Text-to-Speech CLI tool with maximum speed generation, smart chunking, and parallel processing. Auto-optimized by default - no complex flags needed! Converts text to high-quality speech in Georgian (🇬🇪), Russian (🇷🇺), and English (🇬🇧) languages.
✨ Simplified UX: Auto-optimization is now enabled by default. Just specify --lang and go!
✨ Features
- 🚀 Ultra-Fast Generation: 6-15 seconds for 1000 words (vs 25+ seconds traditional)
- 🔊 Streaming Playback: Audio starts playing while still generating (NEW!)
- 🧠 Smart Chunking: Automatic text splitting for optimal performance
- ⚡ Parallel Processing: Multi-threaded generation with up to 8 workers
- 📋 Clipboard Integration: Direct clipboard-to-speech workflow
- 🎯 Auto-Optimization: Turbo mode automatically optimizes all settings
- 🎵 High-Quality Voices: Premium neural voices for all languages
- 📁 File Support: Process text files directly
- 🔄 Real-time Playback: Automatic audio playback with system player
- Speakable text cleanup: Before TTS, the pipeline rewrites noisy input so the voice does not read raw syntax: fenced and inline code, URLs, shebang lines, HTML-like tags, file extensions (for example, .ts → “TypeScript”), common IT acronyms (HTTPS, JSON, API, …), math symbols (for example, ⇒ → “implies”), and very long digit runs. Implemented in TTS_ka.not_reading (replace_not_readable).
- Ctrl+C: Cancels generation and stops active streaming playback (including VLC) without waiting for the full join timeout.
🎯 Quick Start
1. Installation
# Install from PyPI (recommended)
pip install TTS_ka
# Or install from source
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e .
2. Basic Usage (Auto-Optimized by Default)
# Ultra-fast generation with auto-optimization (default behavior)
python -m TTS_ka "Hello, how are you today?" --lang en
# Georgian text with automatic optimization
python -m TTS_ka "გამარჯობა, როგორ ხართ?" --lang ka
# Russian text with smart chunking
python -m TTS_ka "Привет, как дела?" --lang ru
3. Clipboard Workflow (FASTEST)
# Copy any text, then run (fastest workflow):
python -m TTS_ka clipboard --lang en
# For different languages:
python -m TTS_ka clipboard --lang ka # Georgian
python -m TTS_ka clipboard --lang ru # Russian
4. File Processing
# Process text files directly (auto-optimized)
python -m TTS_ka document.txt --lang en
# Long files with custom settings
python -m TTS_ka large_file.txt --chunk-seconds 30 --parallel 6 --lang ru
📖 Complete Usage Guide
Command Syntax
python -m TTS_ka [TEXT_SOURCE] [OPTIONS]
Text Sources
- Direct text: "Your text here"
- Clipboard: clipboard (copy text first)
- File path: file.txt, document.md, etc.
Essential Options
| Option | Description | Examples |
|---|---|---|
| --lang | ka Georgian (female), ka-m Georgian (male), ru, en | --lang ka |
| -o, --output | Output MP3 path (default data.mp3) | -o speech.mp3 |
| --stream | 🆕 Enable streaming playback (audio starts while generating) | --stream |
| --chunk-seconds | Chunk size in seconds (0=auto, 20-60 optimal) | --chunk-seconds 30 |
| --parallel | Workers (0=auto, 2-8 recommended) | --parallel 6 |
| --no-play | Skip automatic audio playback | --no-play |
| --no-gui | With --stream: headless VLC (dummy UI). Default is one GUI window on Windows. | --stream --no-gui |
| --no-turbo | Disable auto-optimization (legacy mode) | --no-turbo |
| --help-full | Show comprehensive help with examples | --help-full |
| -V, --version | Print version, Python, platform, and PyPI package metadata | --version |
Text cleanup rules (summary)
| Kind of input | What you hear instead |
|---|---|
| ```code``` / `inline` | Short phrases like “omitted fenced code block” / “omitted inline code snippet” |
| https://… / www.… | “omitted hyperlink” |
| #!/usr/bin/env python | “omitted script shebang line” |
| <div>…</div>-style tags | “omitted markup tag” |
| file.ts, app.py | Spoken language or format name (TypeScript, Python, …) |
| API, HTTPS, JSON, … | Letter-by-letter or expanded forms (A P I, H T T P S, …) |
| =>, ≤, ∞, … | Words (“implies”, “less than or equal to”, “infinity”, …) |
| 7+ digit numbers | “a large number” |
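To make the table concrete, here is a minimal sketch of how a few of these rules could be applied with regular expressions. The function name make_speakable and the rule list are hypothetical and cover only part of the table; the actual implementation lives in TTS_ka.not_reading and may differ in order, wording, and coverage.

```python
import re

# Hypothetical rule list for illustration only; not the package's own rules.
_RULES = [
    (re.compile(r"`{3}.*?`{3}", re.DOTALL), " omitted fenced code block "),
    (re.compile(r"`[^`\n]+`"), " omitted inline code snippet "),
    (re.compile(r"(?:https?://|www\.)\S+"), " omitted hyperlink "),
    (re.compile(r"</?[A-Za-z][^<>]*>"), " omitted markup tag "),
    (re.compile(r"\d{7,}"), " a large number "),
    (re.compile(r"=>|⇒"), " implies "),
    (re.compile(r"≤"), " less than or equal to "),
    (re.compile(r"\bAPI\b"), "A P I"),
]


def make_speakable(text: str) -> str:
    """Apply each rule in order, then collapse repeated whitespace."""
    for pattern, spoken in _RULES:
        text = pattern.sub(spoken, text)
    return re.sub(r"\s+", " ", text).strip()
```

Fenced code is handled before inline code so that the triple backticks are consumed first and cannot be mistaken for inline snippets.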
Chunk playback order matches document order even when chunks finish generating in parallel.
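One common way to get this ordering guarantee is to let the executor return results in input order. The sketch below assumes a thread pool and uses a stand-in fake_synthesize function (hypothetical, in place of a real network call); it is not taken from the TTS_ka source.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_synthesize(chunk: str) -> str:
    """Stand-in for a network TTS call; shorter chunks finish sooner."""
    time.sleep(0.01 * len(chunk))
    return f"audio({chunk})"


def synthesize_in_order(chunks, workers=4):
    """Run chunks in parallel but yield results in document order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves input order regardless of completion order.
        yield from pool.map(fake_synthesize, chunks)
```

Because the generator yields each result as soon as the next chunk in order is ready, playback can begin with the first chunk while later ones are still in flight.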
🏃‍♂️ Performance Examples
Speed Comparison (1000 words)
- Traditional TTS: 25-40 seconds
- TTS_ka Direct: 15-25 seconds
- TTS_ka Turbo: 8-15 seconds
- TTS_ka Chunked: 6-12 seconds ⚡
- TTS_ka Streaming: 🔊 2-3 seconds to first audio (NEW!)
🆕 Streaming Playback - Audio Starts Immediately!
The new streaming feature starts playing audio within 2-3 seconds while the rest continues generating in the background. This provides an 85-90% reduction in perceived wait time!
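The percentage can be checked with a one-line calculation. In the sketch below, pairing a 3-second first-audio time with 20 and 30 seconds of full generation is an illustrative choice of inputs, not a measurement; the function name is hypothetical.

```python
def perceived_wait_reduction(first_audio_s: float, full_generation_s: float) -> float:
    """Fraction of the wait removed when playback starts at first_audio_s."""
    if full_generation_s <= 0:
        raise ValueError("full_generation_s must be positive")
    return 1.0 - first_audio_s / full_generation_s


# 3 s to first audio against 20 s and 30 s of total generation
low = perceived_wait_reduction(3, 20)   # about 0.85
high = perceived_wait_reduction(3, 30)  # about 0.90
```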
Quick Usage:
# Basic streaming - audio starts almost instantly!
python -m TTS_ka "Your long text..." --lang en --stream
# From file with streaming
python -m TTS_ka article.txt --lang ka --stream
# Clipboard with streaming (fastest workflow)
python -m TTS_ka clipboard --stream
How It Works:
- Text is split into chunks (if needed)
- Chunks generate in parallel (2-8 workers)
- First chunk plays quickly (~2-3 seconds); with VLC (default on Windows), one window builds a playlist in text order as chunks finish (--no-gui uses a headless session). Set TTS_KA_VLC_RC=0 to fall back to launching VLC once per chunk instead of one remote-control session.
- Remaining chunks continue generating in background
- Final merged audio file is saved
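The first step, splitting text into chunks of a target duration, can be sketched as follows. The speaking rate of 150 words per minute and the function name split_into_chunks are assumptions made for illustration; TTS_ka's own chunking logic may estimate duration differently.

```python
def split_into_chunks(text: str, chunk_seconds: int = 30, words_per_minute: int = 150):
    """Split text into word groups that each take roughly chunk_seconds to speak."""
    words = text.split()
    # The assumed speaking rate converts a time budget into a word budget.
    words_per_chunk = max(1, chunk_seconds * words_per_minute // 60)
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]
```

With these defaults, each chunk holds 75 words, so a 200-word text becomes chunks of 75, 75, and 50 words.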
Performance:
- Without streaming: Wait 10-30+ seconds for all audio
- With streaming: Hear audio in 2-3 seconds ⚡
- Platform support: Windows, Linux, macOS
Advanced Streaming:
# Custom chunking for optimal streaming
python -m TTS_ka longtext.txt --stream --chunk-seconds 25 --parallel 6
# Streaming without final playback
python -m TTS_ka text.txt --stream --no-play
Real-World Examples
# 1. Quick phrases (instant generation)
python -m TTS_ka "Thank you very much!" --lang en
# ⚡ Completed in 2.3s (optimized)
# 2. Medium text (paragraph)
python -m TTS_ka "Lorem ipsum dolor sit amet..." --lang en
# ⚡ Completed in 5.7s (direct)
# 3. Long document (chunked processing)
python -m TTS_ka large_document.txt --lang en
# Strategy: chunked generation, 6 workers
# ⚡ Completed in 12.4s (chunked)
# 4. Clipboard workflow (daily usage)
python -m TTS_ka clipboard --lang ka
# OPTIMIZED MODE - Georgian
# Processing: 45 words, 287 characters
# ⚡ Completed in 4.1s
🌍 Language Support
| Language | Code | Voice Quality | Speed | Example |
|---|---|---|---|---|
| Georgian 🇬🇪 | ka | Neural (Eka, female) | Fast | --lang ka |
| Georgian 🇬🇪 | ka-m | Neural (Giorgi, male) | Fast | --lang ka-m |
| Russian 🇷🇺 | ru | High Quality | Very Fast | --lang ru |
| English 🇬🇧 | en | Premium Neural | Maximum | --lang en |
Voice Details
- Georgian (female): ka-GE-EkaNeural (--lang ka)
- Georgian (male): ka-GE-GiorgiNeural (--lang ka-m)
- Russian: ru-RU-SvetlanaNeural - High-quality female voice
- English: en-GB-SoniaNeural - British English neural voice
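For readers who want to call edge-tts directly with the same voices, a minimal sketch follows. The voice names come from the list above; the VOICES dictionary and the helper names voice_for and speak_to_file are hypothetical and are not part of TTS_ka's API.

```python
# Voice names as listed above; the dictionary itself is illustrative.
VOICES = {
    "ka": "ka-GE-EkaNeural",
    "ka-m": "ka-GE-GiorgiNeural",
    "ru": "ru-RU-SvetlanaNeural",
    "en": "en-GB-SoniaNeural",
}


def voice_for(lang: str) -> str:
    """Return the neural voice name for a --lang code."""
    try:
        return VOICES[lang]
    except KeyError:
        raise ValueError(f"unsupported language code: {lang!r}") from None


async def speak_to_file(text: str, lang: str, path: str = "data.mp3") -> None:
    """Synthesize text with edge-tts directly (requires network access)."""
    import edge_tts  # imported lazily so the mapping works without it

    await edge_tts.Communicate(text, voice_for(lang)).save(path)
```

To run it, import asyncio and call asyncio.run(speak_to_file("Hello", "en")). Unlike the CLI, this bypasses chunking, cleanup, and playback.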
⚙️ Advanced Usage
Custom Optimization
# Manual chunking for very long texts
python -m TTS_ka book_chapter.txt --chunk-seconds 45 --parallel 4 --lang en
# Maximum parallelization (for powerful systems)
python -m TTS_ka large_text.txt --parallel 8 --lang ru
# Batch processing (no audio playback)
python -m TTS_ka document.txt --no-play --lang ka
# Legacy mode (disable auto-optimization)
python -m TTS_ka "text" --no-turbo --lang en
Workflow Integration
# Create alias for daily use
alias speak='python -m TTS_ka clipboard --lang en'
# Windows batch file (speak.bat)
@echo off
python -m TTS_ka clipboard --lang en
# Read web articles (with browser copy)
# 1. Copy article text
# 2. Run: python -m TTS_ka clipboard --lang en
🔧 Installation & Requirements
System Requirements
- Python: 3.8+ (required: async CLI and httpx)
- OS: Windows, macOS, Linux
- Memory: 256MB+ available RAM
- Network: Internet connection for voice synthesis
Dependencies
Required (same as pip install TTS_ka):
pip install "edge-tts>=7.2.7" # Core TTS engine
pip install "pydub>=0.25.1" # Audio processing (quoted so the shell does not treat > as a redirect)
pip install "tqdm>=4.65.0" # Progress bars
pip install "httpx>=0.28.1" # Async HTTP (CLI)
System Requirements:
- FFmpeg: Required for audio processing
- Windows: Download from ffmpeg.org
- macOS: brew install ffmpeg
- Ubuntu: sudo apt install ffmpeg
Complete Installation
# Method 1: PyPI installation (simplest)
pip install TTS_ka
# Method 2: Development installation
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e .
# Method 3: Manual dependencies
pip install "edge-tts>=7.2.7" pydub tqdm "httpx>=0.28.1"
# Verify installation
python -m TTS_ka "Installation successful!" --lang en
🎮 AutoHotkey Integration (Windows)
Quick Setup
- Install AutoHotkey v2
- Create tts_hotkeys.ahk:
; Ultra-fast TTS hotkeys
!e:: ; Alt+E - English
{
Run("cmd /k python -m TTS_ka clipboard --lang en")
}
!r:: ; Alt+R - Russian
{
Run("cmd /k python -m TTS_ka clipboard --lang ru")
}
!x:: ; Alt+X - Georgian
{
Run("cmd /k python -m TTS_ka clipboard --lang ka")
}
- Double-click to run, then:
- Copy text → Alt+E for English
- Copy text → Alt+R for Russian
- Copy text → Alt+X for Georgian
Daily Workflow
- Browse web → Copy interesting text
- Press Alt+E → Instant speech
- Continue browsing while listening
🔍 Troubleshooting
Common Issues
1. "No module named 'edge_tts'"
pip install "edge-tts>=7.2.7"
2. "FFmpeg not found"
# Windows: Download and add to PATH
# macOS: brew install ffmpeg
# Linux: sudo apt install ffmpeg
3. Slow generation
# Auto-optimization is enabled by default
python -m TTS_ka "text" --lang en
# Reduce parallel workers if network issues
python -m TTS_ka "text" --parallel 2 --lang en
# Use legacy mode only if needed
python -m TTS_ka "text" --no-turbo --lang en
4. Empty clipboard
# Ensure text is copied first
# Then run: python -m TTS_ka clipboard --lang en
5. 403 / Invalid response status (HTTP or edge-tts)
# Microsoft rotates access; upgrade edge-tts (includes updated websocket tokens)
pip install -U "edge-tts>=7.2.7"
# Optional: skip the unofficial Bing HTTP path and use edge-tts only
# Windows CMD (keep comments off the set line; CMD would store them in the value)
set TTS_KA_SKIP_HTTP=1
# macOS / Linux
# export TTS_KA_SKIP_HTTP=1
# Optional: log when the app falls back from HTTP to edge-tts (off by default)
set TTS_KA_VERBOSE=1
# If many parallel chunks still fail, reduce workers
python -m TTS_ka "your long text" --lang en --parallel 2
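When TTS_ka is launched from another Python program, the same variables can be passed through the child process environment instead of being set in the shell. This is a sketch under that assumption; the helper names tts_command and edge_only_env are hypothetical, while the variable names and flags are the ones shown above.

```python
import os
import sys


def tts_command(text: str, lang: str = "en", parallel: int = 2):
    """Build the CLI invocation shown above as an argument list."""
    return [sys.executable, "-m", "TTS_ka", text,
            "--lang", lang, "--parallel", str(parallel)]


def edge_only_env(verbose: bool = False):
    """Copy the current environment and set the variables described above."""
    env = dict(os.environ)
    env["TTS_KA_SKIP_HTTP"] = "1"
    if verbose:
        env["TTS_KA_VERBOSE"] = "1"
    return env
```

A call such as subprocess.run(tts_command("your long text"), env=edge_only_env(), check=True) then behaves the same on Windows, macOS, and Linux.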
6. Streaming / VLC (Windows)
- Default: one VLC window with a growing playlist (TCP remote control).
- TTS_KA_VLC_RC=0: disable that mode and use one VLC process per chunk (legacy).
7. Ctrl+C
Press Ctrl+C to cancel synthesis and stop streaming playback; partial part files are cleaned up.
Performance Optimization
For Maximum Speed:
# Use these exact settings for best performance (auto-optimized by default)
python -m TTS_ka clipboard --chunk-seconds 30 --parallel 6 --lang en
For System with Limited Resources:
# Reduce workers and chunk size
python -m TTS_ka text --parallel 2 --chunk-seconds 60 --lang en
📊 Performance Benchmarks
Text Length vs Generation Time
| Words | Direct Mode | Turbo Mode | Chunked (6 workers) |
|---|---|---|---|
| 10-50 | 2-4s | 1-3s | 2-4s |
| 100-300 | 8-12s | 5-8s | 4-6s |
| 500-1000 | 18-25s | 12-15s | 8-12s |
| 1000+ | 30-45s | 18-25s | 10-18s |
Optimal Settings by Text Length
# Short text (< 100 words): Direct generation (auto-optimized)
python -m TTS_ka "short text" --lang en
# Medium text (100-500 words): Auto-optimized mode
python -m TTS_ka medium_text.txt --lang en
# Long text (500+ words): Chunked processing (auto-detected)
python -m TTS_ka long_text.txt --chunk-seconds 30 --parallel 6 --lang en
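The guidance above can be written as a small helper for scripts that call the CLI. This is only a sketch: suggest_options is a hypothetical name, and treating exactly 500 words as "long" is an assumption, since the ranges above overlap at that boundary.

```python
def suggest_options(word_count: int):
    """Map a word count to the CLI options suggested above."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    if word_count < 500:
        # Short and medium texts: rely on the default auto-optimization.
        return []
    # Long texts: explicit chunking, as in the last command above.
    return ["--chunk-seconds", "30", "--parallel", "6"]
```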
🚀 Examples & Use Cases
Daily Workflows
1. Article Reading
# Copy web article → instant speech
python -m TTS_ka clipboard --lang en
2. Document Processing
# Process research papers, books, etc.
python -m TTS_ka research_paper.pdf.txt --lang en
3. Language Learning
# Practice pronunciation with different languages
python -m TTS_ka "სწავლობდი ქართულს" --lang ka
python -m TTS_ka "Learning Russian язык" --lang ru
4. Accessibility
# Screen reader alternative
python -m TTS_ka clipboard --no-play --lang en -o audio_file.mp3
Batch Processing
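# Process multiple files (one MP3 per input; the default output path is data.mp3)
for file in *.txt; do
  python -m TTS_ka "$file" --no-play --lang en -o "${file%.txt}.mp3"
done
# Windows batch processing (interactive CMD; use %%f inside a .bat file)
for %f in (*.txt) do python -m TTS_ka "%f" --no-play --lang en -o "%~nf.mp3"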
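The shell loops above can also be expressed in Python, which works the same way on every platform. The sketch below gives each input its own -o path so that runs do not all write to the default data.mp3; batch_commands is a hypothetical helper name, and only flags documented above are used.

```python
import sys
from pathlib import Path


def batch_commands(paths, lang: str = "en"):
    """Return one TTS_ka command per input file, each with its own MP3 path."""
    commands = []
    for item in paths:
        txt = Path(item)
        commands.append([
            sys.executable, "-m", "TTS_ka", str(txt),
            "--no-play", "--lang", lang,
            "-o", str(txt.with_suffix(".mp3")),
        ])
    return commands
```

To process a folder, pass sorted(Path(".").glob("*.txt")) and run each command with subprocess.run(cmd, check=True).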
🛠️ Advanced Configuration
Environment Variables
# Set default language
export TTS_DEFAULT_LANG=ka
# Set default mode
export TTS_DEFAULT_MODE=turbo
# Custom output directory
export TTS_OUTPUT_DIR=/path/to/audio/files
Configuration File
Create ~/.tts_config.json:
{
"default_lang": "en",
"turbo_mode": true,
"chunk_seconds": 30,
"parallel_workers": 6,
"auto_play": true
}
🔌 API Integration
Python Script Integration
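#!/usr/bin/env python3
import subprocess
import sys


def text_to_speech(text, lang="en", turbo=True):
    """Convert text to speech using TTS_ka."""
    # sys.executable runs TTS_ka with the same interpreter as this script
    cmd = [sys.executable, "-m", "TTS_ka", text, "--lang", lang]
    if not turbo:
        # Auto-optimization is the default; --no-turbo switches it off
        cmd.append("--no-turbo")
    # check=True raises CalledProcessError if the CLI exits with an error
    subprocess.run(cmd, check=True)


# Usage
text_to_speech("Hello world!", "en")
text_to_speech("გამარჯობა!", "ka")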
Web Integration
# URL to speech (with curl + TTS_ka)
curl -s "https://example.com/article" | \
python -m TTS_ka /dev/stdin --lang en
📱 Mobile & Remote Usage
SSH/Remote Usage
# Generate audio on remote server
ssh user@server "python -m TTS_ka 'Remote generation' --no-play"
# Download and play locally
scp user@server:data.mp3 ./remote_audio.mp3
Docker Usage
FROM python:3.9
RUN pip install TTS_ka
RUN apt-get update && apt-get install -y ffmpeg
ENTRYPOINT ["python", "-m", "TTS_ka"]
# Docker usage (build the image first, then run it)
docker build -t tts_container .
docker run tts_container "Hello Docker!" --lang en
🎯 Tips & Best Practices
Performance Tips
- Auto-optimization is enabled by default - no flags needed!
- Use clipboard workflow for fastest daily usage
- Chunk long texts with --chunk-seconds 30
- Optimize workers with --parallel (4-6 suits most systems)
- Pre-install FFmpeg for best audio processing
Quality Tips
- Georgian text: Use --lang ka for best quality
- Mixed languages: Process separately for optimal results
- Technical text: Use shorter chunks (--chunk-seconds 20)
- Clean input: Remove extra whitespace and formatting
Workflow Tips
- Create aliases for frequent commands
- Use hotkeys (AutoHotkey on Windows)
- Batch process large document collections
- Test settings with small text first
📄 File Format Support
Supported Input Formats
- Text files: .txt, .md, .rst
- Code files: .py, .js, .html (extracts text)
- Clipboard: Any copied text
- Direct input: Command-line strings
Output Format
- Audio: MP3 (high quality, compressed)
- Bitrate: 128kbps (optimal size/quality balance)
- Sample Rate: 24kHz (neural voice quality)
🔄 Updates & Maintenance
Keeping Updated
# Update to latest version
pip install --upgrade TTS_ka
# Check current version
python -m TTS_ka --version
# Update dependencies
pip install --upgrade edge-tts pydub tqdm httpx
Health Check
# Test installation
python -m TTS_ka "System check" --lang en
# Verify FFmpeg
ffmpeg -version
# Check Python version
python --version # Should be 3.8+
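The same checks can be collected from Python without invoking the CLI. This is a sketch: health_report is a hypothetical helper, and it only tests whether the TTS_ka module can be found, not whether synthesis actually works.

```python
import importlib.util
import shutil
import sys


def health_report(min_python=(3, 8)):
    """Check the Python version, FFmpeg on PATH, and the TTS_ka module."""
    return {
        "python_ok": sys.version_info >= min_python,
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        "tts_ka_installed": importlib.util.find_spec("TTS_ka") is not None,
    }
```

Printing health_report() gives a quick overview before filing a bug report.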
🤝 Contributing
We welcome contributions! See our GitHub repository for:
- Bug reports and feature requests
- Code contributions and pull requests
- Documentation improvements
- Language support additions
Development Setup
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e ".[dev]"
pytest # Run tests
📞 Support
Getting Help
- Documentation: Use --help-full for comprehensive help
- Issues: Report bugs on GitHub Issues
- Discussions: Join GitHub Discussions
Quick Diagnostics
# Check system compatibility
python -m TTS_ka --help-full
# Test with minimal command
python -m TTS_ka "test" --lang en
# Verify FFmpeg installation
ffmpeg -version
📜 License & Credits
License: MIT License - see LICENSE file
Credits:
- Edge-TTS: The edge-tts library, which uses Microsoft Edge's online text-to-speech service for voice synthesis
- PyDub: Audio processing and manipulation
- FFmpeg: Audio encoding and format conversion
Author: David Chincharashvili (davidchincharashvili@gmail.com)
⭐ Star this project on GitHub if you find it useful!
🐛 Report issues to help improve the tool
🤝 Contribute to make it even better