Ukrainian TTS using ESPNET
title: "Ukrainian TTS"
emoji: 🐌
colorFrom: blue
colorTo: yellow
sdk: gradio
sdk_version: 5.7.1
python_version: 3.10.3
app_file: app.py
pinned: false
Ukrainian TTS 📢🤖
Ukrainian TTS (text-to-speech) using ESPNET.
Link to online demo -> https://huggingface.co/spaces/robinhad/ukrainian-tts
Note: the online demo saves user input to improve the user experience; by using it, you consent to the analysis of this data.
Link to source code and models -> https://github.com/robinhad/ukrainian-tts
Telegram bot -> https://t.me/uk_tts_bot
Features ⚙️
- Completely offline
- Multiple voices
- Automatic stress with priority queue: acute -> user-defined > dictionary > model
- Control speech speed
- Python package works on Windows, Mac (x86/M1), Linux (x86/ARM)
- Inference on mobile devices (inference models through espnet_onnx without cleaners)
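One way to read the stress priority in the list above is as a fallback chain: each word keeps the marking from the first source that has an answer, tried in the listed order. The sketch below illustrates only that idea. `resolve_stress`, the toy sources, and the use of `+` as a user-defined marker are assumptions made for illustration, not the library's internals.

```python
from typing import Callable, Iterable, Optional

# Each source returns the stressed form of a word, or None if it has no answer.
StressSource = Callable[[str], Optional[str]]

ACUTE = "\u0301"  # Unicode combining acute accent


def resolve_stress(word: str, sources: Iterable[StressSource]) -> str:
    """Return the result of the first source that yields a stressed form."""
    for source in sources:
        stressed = source(word)
        if stressed is not None:
            return stressed
    return word  # no source knew the word; leave it unchanged


# Toy sources, ordered from highest to lowest priority (all hypothetical).
def from_acute(word: str) -> Optional[str]:
    return word if ACUTE in word else None


def from_user_mark(word: str) -> Optional[str]:
    return word if "+" in word else None  # "+" as a marker is an assumption


TOY_DICTIONARY = {"справи": "спра\u0301ви"}


def from_dictionary(word: str) -> Optional[str]:
    return TOY_DICTIONARY.get(word)


def from_model(word: str) -> Optional[str]:
    return None  # placeholder for a model-based fallback


PRIORITY = [from_acute, from_user_mark, from_dictionary, from_model]
```

With these toy sources, a word that already carries a mark is returned untouched, a known word is looked up, and anything else falls through unchanged.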
Support ❤️
If you like my work, please support ❤️ -> https://send.monobank.ua/jar/48iHq4xAXm
You're welcome to join the UA Speech Recognition and Synthesis community on Telegram: https://t.me/speech_recognition_uk
Examples 🤖
Oleksa (male):
https://github.com/robinhad/ukrainian-tts/assets/5759207/ace842ef-06d0-4b1f-ad49-5fda92999dbb
More voices 📢🤖
Tetiana (female):
https://github.com/robinhad/ukrainian-tts/assets/5759207/a6ecacf6-62ae-4fc5-b6d5-41e6cdd3d992
Dmytro (male):
https://github.com/robinhad/ukrainian-tts/assets/5759207/67d3dac9-6626-40ef-98e5-ec194096bbe0
Lada (female):
https://github.com/robinhad/ukrainian-tts/assets/5759207/fcf558b2-3ff9-4539-ad9e-8455b52223a4
Mykyta (male):
https://github.com/robinhad/ukrainian-tts/assets/5759207/033f5215-3f09-4021-ba19-1f55158445ca
How to use: 📢
Quickstart
Installation
Option 1: Using pip (Recommended)
pip install ukrainian-tts
Option 2: Using uv (Fast & Modern)
uv add ukrainian-tts
Option 3: Development Installation
# Using pip
pip install ukrainian-tts[dev]
# Using uv
uv add ukrainian-tts[dev]
Features Included
- ✅ Self-contained: No external git dependencies required
- ✅ All stress methods: Both dictionary and model-based stress
- ✅ Multiple voices: 5 different Ukrainian voices
- ✅ Cross-platform: Works on Windows, macOS, and Linux
- ✅ Fast installation: Optimized for modern Python package managers
Alternative Installation Methods
From Source (Development)
git clone https://github.com/robinhad/ukrainian-tts.git
cd ukrainian-tts
pip install -e .
Using uv for Development
git clone https://github.com/robinhad/ukrainian-tts.git
cd ukrainian-tts
uv pip install -e .
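After any of the installation routes above, it can help to confirm which version ended up in the active environment. A minimal sketch using only the standard library; the helper name `installed_version` is made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("ukrainian-tts")
    print("ukrainian-tts:", version if version else "not installed")
```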
Code example:
from ukrainian_tts.tts import TTS, Voices, Stress
import IPython.display as ipd
tts = TTS(device="cpu") # can try gpu, mps
# Option 1: Save to file
with open("test.wav", mode="wb") as file:
    _, output_text = tts.tts("Привіт, як у тебе справи?", Voices.Dmytro.value, Stress.Dictionary.value, file)
print("Accented text:", output_text)
# Option 2: Get audio as variable (numpy array)
audio_array, sample_rate, accented_text = tts.tts_to_array("Привіт, як у тебе справи?", Voices.Dmytro.value, Stress.Dictionary.value)
print("Audio shape:", audio_array.shape)
print("Sample rate:", sample_rate)
print("Accented text:", accented_text)
# Option 3: Get audio as bytes
audio_bytes, accented_text = tts.tts_to_bytes("Привіт, як у тебе справи?", Voices.Dmytro.value, Stress.Dictionary.value)
print("Audio bytes length:", len(audio_bytes))
print("Accented text:", accented_text)
# Play audio in Jupyter
ipd.Audio(audio_array, rate=sample_rate)
Audio Output Methods 🎵
The Ukrainian TTS library provides four flexible ways to get audio output:
1. File Output (Traditional)
# Save to file
with open("output.wav", "wb") as file:
    _, accented_text = tts.tts("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value, file)
2. Numpy Array (For Processing)
# Get raw audio data as numpy array
audio_array, sample_rate, accented_text = tts.tts_to_array("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value)
# Perfect for:
# - Audio analysis and processing
# - Machine learning pipelines
# - Real-time audio manipulation
# - Jupyter notebook playback
3. Bytes Output (For APIs)
# Get WAV file as bytes
audio_bytes, accented_text = tts.tts_to_bytes("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value)
# Perfect for:
# - Web APIs and HTTP responses
# - Database storage
# - Streaming applications
# - Microservices
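Because the bytes are a complete WAV file, they can be inspected in memory before being stored or sent. The sketch below uses the standard `wave` module and assumes the data is PCM-encoded, which is the only WAV encoding `wave` reads. `describe_wav` and `make_silent_wav` are illustrative helpers; the latter stands in for synthesized audio so the example runs without the model.

```python
import io
import wave


def describe_wav(audio_bytes: bytes) -> dict:
    """Read basic header fields from WAV data held in memory."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


def make_silent_wav(seconds: float = 0.5, rate: int = 22050) -> bytes:
    """Stand-in for synthesized audio: mono 16-bit PCM silence."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


if __name__ == "__main__":
    print(describe_wav(make_silent_wav()))
```

In real use, `describe_wav` would be called on the `audio_bytes` value returned by `tts.tts_to_bytes`.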
4. Memory Buffer (BytesIO)
# Get file-like object in memory
output_buffer, accented_text = tts.tts("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value)
audio_bytes = output_buffer.getvalue() # Extract bytes when needed
# Perfect for:
# - When you need file-like interface
# - Temporary storage
# - Integration with other libraries
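One detail worth keeping in mind with in-memory buffers: `getvalue()` returns the full contents regardless of the cursor position, while `read()` starts from wherever the cursor currently is. Whether the returned buffer is already rewound is not stated here, so calling `seek(0)` before passing it to a reader is a safe habit. A small standard-library illustration, with placeholder bytes instead of real audio:

```python
import io

buffer = io.BytesIO()
buffer.write(b"RIFF....WAVE")  # placeholder for audio written by a library

position_after_write = buffer.tell()  # cursor sits at the end of the data
read_without_rewind = buffer.read()   # nothing left to read from the end
all_bytes = buffer.getvalue()         # full contents, independent of cursor

buffer.seek(0)                        # rewind before handing to a reader
read_after_rewind = buffer.read()
```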
Use Case Examples:
🎯 Web API:
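from flask import Flask, Response, request
from ukrainian_tts.tts import TTS, Stress
app = Flask(__name__)
tts = TTS(device="cpu")  # load the model once at startup
@app.route('/synthesize', methods=['POST'])
def synthesize():
    text = request.json['text']
    voice = request.json['voice']  # a voice value such as Voices.Dmytro.value
    audio_bytes, _ = tts.tts_to_bytes(text, voice, Stress.Dictionary.value)
    return Response(audio_bytes, mimetype='audio/wav')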
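The endpoint above reads `text` and `voice` straight from the request body, so a public service would likely want to validate both before synthesis. The following is a sketch under stated assumptions: the voice spellings in `ALLOWED_VOICES` and the length limit are invented for illustration, and in real code the set could be built from the `Voices` enum instead (for example `{v.value for v in Voices}`, assuming it is a standard `Enum`).

```python
ALLOWED_VOICES = {"dmytro", "lada", "mykyta", "oleksa", "tetiana"}  # assumed spellings
MAX_TEXT_LENGTH = 500  # arbitrary illustrative limit


def parse_synthesis_request(payload):
    """Validate a decoded JSON body and return (text, voice)."""
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    text = payload.get("text")
    voice = payload.get("voice")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("'text' must be a non-empty string")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError("'text' is too long")
    if not isinstance(voice, str) or voice not in ALLOWED_VOICES:
        raise ValueError("'voice' is not a known voice")
    return text.strip(), voice
```

A handler could call this on `request.json` and turn a `ValueError` into an HTTP 400 response.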
🔬 Audio Analysis:
audio_array, sample_rate, _ = tts.tts_to_array("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value)
# Analyze audio features, apply filters, etc.
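As a concrete starting point for analysis, the sketch below computes duration, peak amplitude, and RMS level. It is written against plain Python sequences so it runs without extra dependencies, and it assumes mono samples; with a numpy array the same quantities would normally be computed with vectorized operations. The helper name `audio_stats` is illustrative.

```python
import math


def audio_stats(samples, sample_rate):
    """Return duration, peak amplitude and RMS level for mono samples."""
    values = [float(s) for s in samples]
    if not values:
        raise ValueError("samples must not be empty")
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    peak = max(abs(v) for v in values)
    rms = math.sqrt(sum(v * v for v in values) / len(values))
    return {
        "duration_seconds": len(values) / sample_rate,
        "peak": peak,
        "rms": rms,
    }


if __name__ == "__main__":
    # Four samples at 4 Hz: one second of a square-like signal.
    print(audio_stats([0.5, -0.5, 0.5, -0.5], 4))
```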
📱 Real-time Applications:
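# Stream audio in fixed-size chunks (the last chunk may be shorter)
chunk_size = 1024
for start in range(0, len(audio_array), chunk_size):
    chunk = audio_array[start:start + chunk_size]
    # Send chunk to audio output
    pass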
See example notebook: tts_example.ipynb
macOS Installation 🍎
Simple Installation (Recommended)
The package is now self-contained and doesn't require system dependencies for basic usage:
# Using pip
pip install ukrainian-tts
# Using uv (faster)
uv add ukrainian-tts
Note: The package now includes all dependencies and works out-of-the-box! No system dependencies required for basic usage.
For Development/Advanced Usage
If you need to build the package from source or encounter issues, you can use our automated installation script:
git clone https://github.com/robinhad/ukrainian-tts.git
cd ukrainian-tts
./install.sh
This script handles:
- ✅ System dependencies (SentencePiece, CMake, pkg-config)
- ✅ Python virtual environment setup
- ✅ Package installation and testing
Troubleshooting
Flash Attention Warning:
Failed to import Flash Attention, using ESPnet default: No module named 'flash_attn'
This warning is normal on macOS and can be safely ignored. Flash Attention is designed for NVIDIA GPUs and is not available on macOS.
System Dependencies (if needed):
brew install sentencepiece cmake pkg-config libsndfile
export PKG_CONFIG_PATH="/opt/homebrew/lib/pkgconfig:$PKG_CONFIG_PATH"
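Before building from source, a quick check that the command-line build tools are visible on `PATH` can save a failed install. This sketch only covers `cmake` and `pkg-config`, the two packages above that provide executables with those names; `missing_tools` is a helper name made up for this example.

```python
import shutil


def missing_tools(tools=("cmake", "pkg-config")):
    """Return the tools from the given list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("All checked build tools are available.")
```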
How to contribute: 🙌
See the list of current problems: https://github.com/robinhad/ukrainian-tts/issues/35
How to train: 🏋️
Link to guide: training/STEPS.md
Attribution 🤝
- Model training - Yurii Paniv @robinhad
- Open Source Ukrainian Text-to-Speech dataset - Yehor Smoliakov @egorsmkv
- Dmytro voice - Dmytro Chaplynskyi @dchaplinsky
- Silence cutting using HMM-GMM - Volodymyr Kyrylov @proger
- Autostress (with dictionary) using ukrainian-word-stress - Oleksiy Syvokon @asivokon
- Autostress (with model) using ukrainian-accentor - Bohdan Mykhailenko @NeonBohdan + Yehor Smoliakov @egorsmkv
File details
Details for the file ukrainian_tts-6.0.2.tar.gz.
File metadata
- Download URL: ukrainian_tts-6.0.2.tar.gz
- Upload date:
- Size: 1.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | db4ba3e48390639f74d35c4501b4ea191ef93b3d061b02d8b6819c87aab43cc6 |
| MD5 | 304e836ce7cd0cd06b432b1b9d477a61 |
| BLAKE2b-256 | 31718c4f13b2435349b5d7203e114e6f2ed2cb42883e9716dc56a60dfa21f859 |
File details
Details for the file ukrainian_tts-6.0.2-py3-none-any.whl.
File metadata
- Download URL: ukrainian_tts-6.0.2-py3-none-any.whl
- Upload date:
- Size: 1.1 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 38546d0f69469aeb5f26516c9148d963c0690796d73fd84da8f036caa840020a |
| MD5 | 5bc77168edcc8b139ff8689e66be2405 |
| BLAKE2b-256 | 83290fad99e20680820055698986fafb03b96d579657721408f2e98be1ed12b9 |
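The SHA256 digests listed above can be used to check a manually downloaded file. A minimal sketch with the standard `hashlib` module; the helper names are illustrative, and the file path in the usage comment is wherever the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example use (path is an assumption about where the file was saved):
# matches_digest("ukrainian_tts-6.0.2.tar.gz", "<SHA256 from the table>")
```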