A simple text to speech program using edge-tts library
TTS_ka Ultra-Fast Text-to-Speech
An ultra-fast text-to-speech CLI tool with smart chunking and parallel processing. It is auto-optimized by default, so no complex flags are needed. It converts text to high-quality speech in Georgian (🇬🇪), Russian (🇷🇺), and English (🇬🇧).
Simplified UX: Auto-optimization is now enabled by default. Just specify --lang and go!
Features
- Ultra-Fast Generation: 6-15 seconds for 1000 words (vs 25+ seconds traditional)
- Streaming Playback: Audio starts playing while still generating (NEW!)
- Smart Chunking: Automatic text splitting for optimal performance
- Parallel Processing: Multi-threaded generation with up to 8 workers
- Clipboard Integration: Direct clipboard-to-speech workflow
- Auto-Optimization: Turbo mode automatically optimizes all settings
- High-Quality Voices: Premium neural voices for all languages
- File Support: Process text files directly
- Real-time Playback: Automatic audio playback with system player
Quick Start
1. Installation
# Install from PyPI (recommended)
pip install TTS_ka
# Or install from source
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e .
2. Basic Usage (Auto-Optimized by Default)
# Ultra-fast generation with auto-optimization (default behavior)
python -m TTS_ka "Hello, how are you today?" --lang en
# Georgian text with automatic optimization
python -m TTS_ka "გამარჯობა, როგორ ხართ?" --lang ka
# Russian text with smart chunking
python -m TTS_ka "Привет, как дела?" --lang ru
3. Clipboard Workflow (FASTEST)
# Copy any text, then run (fastest workflow):
python -m TTS_ka clipboard --lang en
# For different languages:
python -m TTS_ka clipboard --lang ka # Georgian
python -m TTS_ka clipboard --lang ru # Russian
4. File Processing
# Process text files directly (auto-optimized)
python -m TTS_ka document.txt --lang en
# Long files with custom settings
python -m TTS_ka large_file.txt --chunk-seconds 30 --parallel 6 --lang ru
Complete Usage Guide
Command Syntax
python -m TTS_ka [TEXT_SOURCE] [OPTIONS]
Text Sources
- Direct text: "Your text here"
- Clipboard: clipboard (copy text first)
- File path: file.txt, document.md, etc.
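The three text sources above suggest a simple dispatch on the first argument. The sketch below is an illustration only, not TTS_ka's actual code: `resolve_text_source` and its `read_clipboard` parameter are hypothetical names, and the order of checks (clipboard keyword, then file path, then literal text) is an assumption.

```python
from pathlib import Path
from typing import Callable


def resolve_text_source(source: str, read_clipboard: Callable[[], str]) -> str:
    """Return the text to speak for a TEXT_SOURCE argument (hypothetical helper)."""
    if source == "clipboard":
        # Clipboard access is injected so the function stays testable;
        # in practice this could be pyperclip.paste.
        return read_clipboard()
    try:
        is_file = Path(source).is_file()
    except (OSError, ValueError):
        # Very long or unusual strings are not valid paths; treat them as text.
        is_file = False
    if is_file:
        return Path(source).read_text(encoding="utf-8")
    # Anything else is treated as literal text.
    return source
```

Passing the clipboard reader as a parameter keeps the helper free of side effects, which makes it easy to exercise without a desktop session.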
Essential Options
| Option | Description | Examples |
|---|---|---|
| --lang | Language: ka (Georgian), ru (Russian), en (English) | --lang ka |
| --stream | Enable streaming playback (audio starts while generating) | --stream |
| --chunk-seconds | Chunk size in seconds (0=auto, 20-60 optimal) | --chunk-seconds 30 |
| --parallel | Workers (0=auto, 2-8 recommended) | --parallel 6 |
| --no-play | Skip automatic audio playback | --no-play |
| --no-turbo | Disable auto-optimization (legacy mode) | --no-turbo |
| --help-full | Show comprehensive help with examples | --help-full |
Performance Examples
Speed Comparison (1000 words)
- Traditional TTS: 25-40 seconds
- TTS_ka Direct: 15-25 seconds
- TTS_ka Turbo: 8-15 seconds
- TTS_ka Chunked: 6-12 seconds ⚡
- TTS_ka Streaming: 2-3 seconds to first audio (NEW!)
Streaming Playback - Audio Starts Immediately!
The new streaming feature starts playing audio within 2-3 seconds while the rest continues generating in the background. This provides an 85-90% reduction in perceived wait time!
Quick Usage:
# Basic streaming - audio starts almost instantly!
python -m TTS_ka "Your long text..." --lang en --stream
# From file with streaming
python -m TTS_ka article.txt --lang ka --stream
# Clipboard with streaming (fastest workflow)
python -m TTS_ka clipboard --stream
How It Works:
- Text is split into chunks (if needed)
- Chunks generate in parallel (2-8 workers)
- First chunk plays immediately (~2-3 seconds)
- Remaining chunks continue generating in background
- Final merged audio file is saved
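The steps above can be sketched in a few lines of Python. This is a minimal illustration of the pattern, not TTS_ka's implementation: `split_into_chunks`, `stream_speech`, and the `synthesize` and `play` callables are hypothetical, chunks are sized by word count rather than seconds, and joining bytes stands in for a real audio merge (which would normally go through pydub or FFmpeg).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def stream_speech(
    text: str,
    synthesize: Callable[[str], bytes],
    play: Callable[[bytes], None],
    max_words: int = 50,
    workers: int = 4,
) -> bytes:
    """Synthesize chunks in parallel and play each one as soon as it is ready."""
    chunks = split_into_chunks(text, max_words)
    parts: List[bytes] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() runs synthesize concurrently but yields results in input order,
        # so the first chunk can play while later chunks are still generating.
        for audio in pool.map(synthesize, chunks):
            play(audio)
            parts.append(audio)
    # Placeholder merge: concatenation stands in for a proper audio join.
    return b"".join(parts)
```

Because `pool.map` preserves order, playback never jumps ahead of a chunk that is still generating; the cost is that a slow first chunk delays everything behind it.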
Performance:
- Without streaming: Wait 10-30+ seconds for all audio
- With streaming: Hear audio in 2-3 seconds ⚡
- Platform support: Windows, Linux, macOS
Advanced Streaming:
# Custom chunking for optimal streaming
python -m TTS_ka longtext.txt --stream --chunk-seconds 25 --parallel 6
# Streaming without final playback
python -m TTS_ka text.txt --stream --no-play
Real-World Examples
# 1. Quick phrases (instant generation)
python -m TTS_ka "Thank you very much!" --lang en
# ⚡ Completed in 2.3s (optimized)
# 2. Medium text (paragraph)
python -m TTS_ka "Lorem ipsum dolor sit amet..." --lang en
# ⚡ Completed in 5.7s (direct)
# 3. Long document (chunked processing)
python -m TTS_ka large_document.txt --lang en
# Strategy: chunked generation, 6 workers
# ⚡ Completed in 12.4s (chunked)
# 4. Clipboard workflow (daily usage)
python -m TTS_ka clipboard --lang ka
# OPTIMIZED MODE - Georgian
# Processing: 45 words, 287 characters
# ⚡ Completed in 4.1s
Language Support
| Language | Code | Voice Quality | Speed | Example |
|---|---|---|---|---|
| Georgian 🇬🇪 | ka | Premium Neural | Fast | --lang ka |
| Russian 🇷🇺 | ru | High Quality | Very Fast | --lang ru |
| English 🇬🇧 | en | Premium Neural | Maximum | --lang en |
Voice Details
- Georgian: ka-GE-EkaNeural - Premium female voice
- Russian: ru-RU-SvetlanaNeural - High-quality female voice
- English: en-GB-SoniaNeural - British English neural voice
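For readers who want to call these voices from the edge-tts library directly, the mapping can be expressed as a small lookup. This is a sketch under assumptions: `VOICES`, `voice_for`, and `synthesize` are illustrative names, and nothing here is taken from TTS_ka's source. Only `edge_tts.Communicate(...).save(...)` is the library's own API.

```python
# Voice IDs copied from the list above.
VOICES = {
    "ka": "ka-GE-EkaNeural",
    "ru": "ru-RU-SvetlanaNeural",
    "en": "en-GB-SoniaNeural",
}


def voice_for(lang: str) -> str:
    """Return the edge-tts voice ID for a language code (hypothetical helper)."""
    try:
        return VOICES[lang]
    except KeyError:
        raise ValueError(f"Unsupported language code: {lang!r}") from None


async def synthesize(text: str, lang: str, out_path: str) -> None:
    """Write speech for text to out_path using edge-tts (needs network access)."""
    # Imported lazily so the voice lookup works without edge-tts installed.
    import edge_tts

    await edge_tts.Communicate(text, voice_for(lang)).save(out_path)


# Example (run manually): asyncio.run(synthesize("Hello", "en", "hello.mp3"))
```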
Advanced Usage
Custom Optimization
# Manual chunking for very long texts
python -m TTS_ka book_chapter.txt --chunk-seconds 45 --parallel 4 --lang en
# Maximum parallelization (for powerful systems)
python -m TTS_ka large_text.txt --parallel 8 --lang ru
# Batch processing (no audio playback)
python -m TTS_ka document.txt --no-play --lang ka
# Legacy mode (disable auto-optimization)
python -m TTS_ka "text" --no-turbo --lang en
Workflow Integration
# Create alias for daily use
alias speak='python -m TTS_ka clipboard --lang en'
# Windows batch file (speak.bat)
@echo off
python -m TTS_ka clipboard --lang en
# Read web articles (with browser copy)
# 1. Copy article text
# 2. Run: python -m TTS_ka clipboard --lang en
Installation & Requirements
System Requirements
- Python: 3.6+ (3.8+ recommended)
- OS: Windows, macOS, Linux
- Memory: 256MB+ available RAM
- Network: Internet connection for voice synthesis
Dependencies
Required:
# Quote the specifiers so the shell does not treat ">" as a redirect
pip install "edge-tts>=6.1.9" # Core TTS engine
pip install "pydub>=0.25.1" # Audio processing
pip install "tqdm>=4.65.0" # Progress bars
pip install "pyperclip>=1.8.2" # Clipboard support
System Requirements:
- FFmpeg: Required for audio processing
  - Windows: Download from ffmpeg.org
  - macOS: brew install ffmpeg
  - Ubuntu: sudo apt install ffmpeg
Complete Installation
# Method 1: PyPI installation (simplest)
pip install TTS_ka
# Method 2: Development installation
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e .
# Method 3: Manual dependencies
pip install edge-tts pydub tqdm pyperclip
# Verify installation
python -m TTS_ka "Installation successful!" --lang en
AutoHotkey Integration (Windows)
Quick Setup
- Install AutoHotkey v2
- Create tts_hotkeys.ahk:
; Ultra-fast TTS hotkeys
!e:: ; Alt+E - English
{
Run("cmd /k python -m TTS_ka clipboard --lang en")
}
!r:: ; Alt+R - Russian
{
Run("cmd /k python -m TTS_ka clipboard --lang ru")
}
!x:: ; Alt+X - Georgian
{
Run("cmd /k python -m TTS_ka clipboard --lang ka")
}
- Double-click to run, then:
  - Copy text → Alt+E for English
  - Copy text → Alt+R for Russian
  - Copy text → Alt+X for Georgian
Daily Workflow
- Browse web → Copy interesting text
- Press Alt+E → Instant speech
- Continue browsing while listening
Troubleshooting
Common Issues
1. "No module named 'edge_tts'"
pip install "edge-tts>=6.1.9"
2. "FFmpeg not found"
# Windows: Download and add to PATH
# macOS: brew install ffmpeg
# Linux: sudo apt install ffmpeg
3. Slow generation
# Auto-optimization is enabled by default
python -m TTS_ka "text" --lang en
# Reduce parallel workers if network issues
python -m TTS_ka "text" --parallel 2 --lang en
# Use legacy mode only if needed
python -m TTS_ka "text" --no-turbo --lang en
4. Empty clipboard
# Ensure text is copied first
# Then run: python -m TTS_ka clipboard --lang en
Performance Optimization
For Maximum Speed:
# Use these exact settings for best performance (auto-optimized by default)
python -m TTS_ka clipboard --chunk-seconds 30 --parallel 6 --lang en
For Systems with Limited Resources:
# Reduce workers and chunk size
python -m TTS_ka text --parallel 2 --chunk-seconds 60 --lang en
Performance Benchmarks
Text Length vs Generation Time
| Words | Direct Mode | Turbo Mode | Chunked (6 workers) |
|---|---|---|---|
| 10-50 | 2-4s | 1-3s | 2-4s |
| 100-300 | 8-12s | 5-8s | 4-6s |
| 500-1000 | 18-25s | 12-15s | 8-12s |
| 1000+ | 30-45s | 18-25s | 10-18s |
Optimal Settings by Text Length
# Short text (< 100 words): Direct generation (auto-optimized)
python -m TTS_ka "short text" --lang en
# Medium text (100-500 words): Auto-optimized mode
python -m TTS_ka medium_text.txt --lang en
# Long text (500+ words): Chunked processing (auto-detected)
python -m TTS_ka long_text.txt --chunk-seconds 30 --parallel 6 --lang en
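The rule of thumb above can be written as a small helper that maps a word count to option values. This is a sketch, assuming the 500-word threshold and the 30-second, 6-worker settings shown in the commands; `settings_for` and `to_flags` are hypothetical names, and the tool's real auto-detection logic may differ. A value of 0 means "auto", as in the options table.

```python
from typing import Dict, List


def settings_for(word_count: int) -> Dict[str, int]:
    """Pick chunking and worker settings from the text length (illustrative only)."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    if word_count >= 500:
        # Long text: explicit chunking with 6 workers, as in the example above.
        return {"chunk_seconds": 30, "parallel": 6}
    # Short and medium text: leave both options on auto (0).
    return {"chunk_seconds": 0, "parallel": 0}


def to_flags(settings: Dict[str, int]) -> List[str]:
    """Turn a settings dict into TTS_ka command-line flags."""
    return [
        "--chunk-seconds", str(settings["chunk_seconds"]),
        "--parallel", str(settings["parallel"]),
    ]
```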
Examples & Use Cases
Daily Workflows
1. Article Reading
# Copy web article → instant speech
python -m TTS_ka clipboard --lang en
2. Document Processing
# Process research papers, books, etc.
python -m TTS_ka research_paper.pdf.txt --lang en
3. Language Learning
# Practice pronunciation with different languages
python -m TTS_ka "სწავლობთ ქართულს" --lang ka
python -m TTS_ka "Learning Russian язык" --lang ru
4. Accessibility
# Screen reader alternative
python -m TTS_ka clipboard --no-play --lang en > audio_file.mp3
Batch Processing
# Process multiple files
for file in *.txt; do
python -m TTS_ka "$file" --no-play --lang en
done
# Windows batch processing (cmd.exe prompt; use %%f instead of %f inside a .bat file)
for %f in (*.txt) do python -m TTS_ka "%f" --no-play --lang en
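A cross-platform alternative to the two shell loops is a short Python script. The sketch below only uses flags documented above; `batch_commands` and `run_batch` are hypothetical helper names, and the script assumes TTS_ka is installed for the interpreter that runs it.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def batch_commands(directory: str, lang: str = "en") -> List[List[str]]:
    """Build one TTS_ka command per .txt file in directory, in sorted order."""
    return [
        [sys.executable, "-m", "TTS_ka", str(path), "--no-play", "--lang", lang]
        for path in sorted(Path(directory).glob("*.txt"))
    ]


def run_batch(directory: str, lang: str = "en") -> List[str]:
    """Run every command and return the input files whose run failed."""
    failed = []
    for cmd in batch_commands(directory, lang):
        # A non-zero exit code marks the file as failed; the loop keeps going.
        if subprocess.run(cmd).returncode != 0:
            failed.append(cmd[3])
    return failed
```

Collecting failures instead of stopping at the first one suits large document sets, where a single bad file should not block the rest.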
Advanced Configuration
Environment Variables
# Set default language
export TTS_DEFAULT_LANG=ka
# Set default mode
export TTS_DEFAULT_MODE=turbo
# Custom output directory
export TTS_OUTPUT_DIR=/path/to/audio/files
Configuration File
Create ~/.tts_config.json:
{
"default_lang": "en",
"turbo_mode": true,
"chunk_seconds": 30,
"parallel_workers": 6,
"auto_play": true
}
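To check a configuration file before relying on it, a script can load it and merge it over the values shown above. This is a sketch under assumptions: `DEFAULTS` and `load_config` are hypothetical, the defaults simply mirror the example file, and how TTS_ka itself reads the file is not described here.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union

# Defaults mirror the example file above; they are assumptions, not TTS_ka internals.
DEFAULTS: Dict[str, Any] = {
    "default_lang": "en",
    "turbo_mode": True,
    "chunk_seconds": 30,
    "parallel_workers": 6,
    "auto_play": True,
}


def load_config(path: Union[str, Path, None] = None) -> Dict[str, Any]:
    """Load a JSON config file and merge it over DEFAULTS."""
    config_path = Path(path) if path is not None else Path.home() / ".tts_config.json"
    config = dict(DEFAULTS)
    if config_path.is_file():
        user = json.loads(config_path.read_text(encoding="utf-8"))
        unknown = set(user) - set(DEFAULTS)
        if unknown:
            # Catch typos such as "parralel_workers" early.
            raise ValueError(f"Unknown config keys: {sorted(unknown)}")
        config.update(user)
    return config
```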
API Integration
Python Script Integration
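#!/usr/bin/env python3
import subprocess
import sys


def text_to_speech(text, lang="en", turbo=True):
    """Convert text to speech by invoking the TTS_ka CLI."""
    # Use the current interpreter so the call also works inside a virtualenv
    cmd = [sys.executable, "-m", "TTS_ka", text, "--lang", lang]
    if not turbo:
        # Auto-optimization is the default; --no-turbo switches it off
        cmd.append("--no-turbo")
    # check=True raises CalledProcessError if the CLI exits with an error
    subprocess.run(cmd, check=True)


# Usage
text_to_speech("Hello world!", "en")
text_to_speech("გამარჯობა!", "ka")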
Web Integration
# URL to speech (with curl + TTS_ka)
curl -s "https://example.com/article" | \
python -m TTS_ka /dev/stdin --lang en
Mobile & Remote Usage
SSH/Remote Usage
# Generate audio on remote server
ssh user@server "python -m TTS_ka 'Remote generation' --no-play"
# Download and play locally
scp user@server:data.mp3 ./remote_audio.mp3
Docker Usage
FROM python:3.9
RUN pip install TTS_ka
RUN apt-get update && apt-get install -y ffmpeg
ENTRYPOINT ["python", "-m", "TTS_ka"]
# Build the image, then run it
docker build -t tts_container .
docker run tts_container "Hello Docker!" --lang en
Tips & Best Practices
Performance Tips
- Auto-optimization is enabled by default - no flags needed!
- Use clipboard workflow for fastest daily usage
- Chunk long texts with --chunk-seconds 30
- Optimize workers with --parallel (4-6 for most systems)
- Pre-install FFmpeg for best audio processing
Quality Tips
- Georgian text: Use --lang ka for best quality
- Mixed languages: Process separately for optimal results
- Technical text: Use shorter chunks (--chunk-seconds 20)
- Clean input: Remove extra whitespace and formatting
Workflow Tips
- Create aliases for frequent commands
- Use hotkeys (AutoHotkey on Windows)
- Batch process large document collections
- Test settings with small text first
File Format Support
Supported Input Formats
- Text files: .txt, .md, .rst
- Code files: .py, .js, .html (extracts text)
- Clipboard: Any copied text
- Direct input: Command-line strings
Output Format
- Audio: MP3 (high quality, compressed)
- Bitrate: 128kbps (optimal size/quality balance)
- Sample Rate: 24kHz (neural voice quality)
Updates & Maintenance
Keeping Updated
# Update to latest version
pip install --upgrade TTS_ka
# Check current version
python -m TTS_ka --version
# Update dependencies
pip install --upgrade edge-tts pydub tqdm pyperclip
Health Check
# Test installation
python -m TTS_ka "System check" --lang en
# Verify FFmpeg
ffmpeg -version
# Check Python version
python --version # Should be 3.6+
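The same checks can be gathered into one script that reports on the interpreter, FFmpeg, and the required packages without generating any audio. This is a sketch: `health_report` is a hypothetical helper, and the module names are the import names of the dependencies listed earlier.

```python
import importlib.util
import shutil
import sys
from typing import Dict, Sequence, Tuple


def health_report(
    min_python: Tuple[int, int] = (3, 6),
    modules: Sequence[str] = ("edge_tts", "pydub", "tqdm", "pyperclip"),
) -> Dict[str, bool]:
    """Return a pass/fail flag for each prerequisite."""
    report = {
        # Compare the running interpreter against the documented minimum.
        "python": sys.version_info >= min_python,
        # shutil.which returns None when ffmpeg is not on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }
    for name in modules:
        # find_spec locates a top-level module without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in health_report().items():
        print(f"{item}: {'OK' if ok else 'MISSING'}")
```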
Contributing
We welcome contributions! See our GitHub repository for:
- Bug reports and feature requests
- Code contributions and pull requests
- Documentation improvements
- Language support additions
Development Setup
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e ".[dev]"
pytest # Run tests
Support
Getting Help
- Documentation: Use --help-full for comprehensive help
- Issues: Report bugs on GitHub Issues
- Discussions: Join GitHub Discussions
Quick Diagnostics
# Check system compatibility
python -m TTS_ka --help-full
# Test with minimal command
python -m TTS_ka "test" --lang en
# Verify FFmpeg installation
ffmpeg -version
License & Credits
License: MIT License - see LICENSE file
Credits:
- Edge-TTS: Microsoft's edge-tts library for voice synthesis
- PyDub: Audio processing and manipulation
- FFmpeg: Audio encoding and format conversion
Author: David Chincharashvili (davidchincharashvili@gmail.com)
Star this project on GitHub if you find it useful!
Report issues to help improve the tool
Contribute to make it even better