
Ukrainian TTS using ESPnet



title: "Ukrainian TTS"
emoji: 🐌
colorFrom: blue
colorTo: yellow
sdk: gradio
sdk_version: 5.7.1
python_version: 3.10.3
app_file: app.py
pinned: false

Ukrainian TTS 📢🤖

Ukrainian TTS (text-to-speech) using ESPnet.


Link to online demo -> https://huggingface.co/spaces/robinhad/ukrainian-tts
Note: the online demo saves user input to improve the user experience; by using it, you consent to the analysis of this data.
Link to source code and models -> https://github.com/robinhad/ukrainian-tts
Telegram bot -> https://t.me/uk_tts_bot

Features ⚙️

  • Completely offline
  • Multiple voices
  • Automatic stress placement, with sources applied in priority order: acute -> user-defined > dictionary > model
  • Control speech speed
  • Python package works on Windows, Mac (x86/M1), and Linux (x86/ARM)
  • Inference on mobile devices (run the models through espnet_onnx, without cleaners)

Support ❤️

If you like my work, please support ❤️ -> https://send.monobank.ua/jar/48iHq4xAXm
You're welcome to join the UA Speech Recognition and Synthesis community on Telegram: https://t.me/speech_recognition_uk

Examples 🤖

Oleksa (male):

https://github.com/robinhad/ukrainian-tts/assets/5759207/ace842ef-06d0-4b1f-ad49-5fda92999dbb

More voices 📢🤖

Tetiana (female):

https://github.com/robinhad/ukrainian-tts/assets/5759207/a6ecacf6-62ae-4fc5-b6d5-41e6cdd3d992

Dmytro (male):

https://github.com/robinhad/ukrainian-tts/assets/5759207/67d3dac9-6626-40ef-98e5-ec194096bbe0

Lada (female):

https://github.com/robinhad/ukrainian-tts/assets/5759207/fcf558b2-3ff9-4539-ad9e-8455b52223a4

Mykyta (male):

https://github.com/robinhad/ukrainian-tts/assets/5759207/033f5215-3f09-4021-ba19-1f55158445ca

How to use: 📢

Quickstart

Installation

Option 1: Using pip (Recommended)

pip install ukrainian-tts

Option 2: Using uv (Fast & Modern)

uv add ukrainian-tts

Option 3: Development Installation

# Quote the requirement so shells such as zsh do not expand the brackets

# Using pip
pip install "ukrainian-tts[dev]"

# Using uv
uv add "ukrainian-tts[dev]"

Features Included

  • Self-contained: No external git dependencies required
  • All stress methods: Both dictionary and model-based stress
  • Multiple voices: 5 different Ukrainian voices
  • Cross-platform: Works on Windows, macOS, and Linux
  • Fast installation: Optimized for modern Python package managers

Alternative Installation Methods

From Source (Development)

git clone https://github.com/robinhad/ukrainian-tts.git
cd ukrainian-tts
pip install -e .

Using uv for Development

git clone https://github.com/robinhad/ukrainian-tts.git
cd ukrainian-tts
uv pip install -e .
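
After any of the installation routes above, a quick importability check can save a confusing first run. The following is a minimal sketch using only the standard library. The import name `ukrainian_tts` and the distribution name `ukrainian-tts` are taken from the commands and code in this README; `check_install` is a hypothetical helper name, not part of the package.

```python
import importlib.util
from importlib import metadata


def check_install(module_name: str, dist_name: str) -> str:
    """Return a short human-readable installation status."""
    # find_spec returns None when the top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        return f"{module_name}: not installed"
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found under this name.
        version = "unknown version"
    return f"{module_name}: installed ({version})"


if __name__ == "__main__":
    print(check_install("ukrainian_tts", "ukrainian-tts"))
```

Run it with the same interpreter (or virtual environment) you installed into; a "not installed" result usually means a different environment is active.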

Code example:

from ukrainian_tts.tts import TTS, Voices, Stress
import IPython.display as ipd

tts = TTS(device="cpu") # can try gpu, mps

# Option 1: Save to file
with open("test.wav", mode="wb") as file:
    _, output_text = tts.tts("Привіт, як у тебе справи?", Voices.Dmytro.value, Stress.Dictionary.value, file)
print("Accented text:", output_text)

# Option 2: Get audio as variable (numpy array)
audio_array, sample_rate, accented_text = tts.tts_to_array("Привіт, як у тебе справи?", Voices.Dmytro.value, Stress.Dictionary.value)
print("Audio shape:", audio_array.shape)
print("Sample rate:", sample_rate)
print("Accented text:", accented_text)

# Option 3: Get audio as bytes
audio_bytes, accented_text = tts.tts_to_bytes("Привіт, як у тебе справи?", Voices.Dmytro.value, Stress.Dictionary.value)
print("Audio bytes length:", len(audio_bytes))
print("Accented text:", accented_text)

# Play audio in Jupyter
ipd.Audio(audio_array, rate=sample_rate)

Audio Output Methods 🎵

The Ukrainian TTS library provides four ways to get audio output:

1. File Output (Traditional)

# Save to file
with open("output.wav", "wb") as file:
    _, accented_text = tts.tts("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value, file)

2. Numpy Array (For Processing)

# Get raw audio data as numpy array
audio_array, sample_rate, accented_text = tts.tts_to_array("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value)

# Perfect for:
# - Audio analysis and processing
# - Machine learning pipelines
# - Real-time audio manipulation
# - Jupyter notebook playback
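
Once the array has been processed, it may need to go back into a WAV container. The sketch below does this with the standard library only. It assumes the samples are mono floats in the range [-1.0, 1.0], which this README does not state, so check the dtype and range of `audio_array` first. `samples_to_wav_bytes` is a hypothetical helper, not part of the library.

```python
import io
import struct
import wave


def samples_to_wav_bytes(samples, sample_rate: int) -> bytes:
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range, then scale to signed 16-bit integers.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono (assumption)
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return buffer.getvalue()


if __name__ == "__main__":
    wav_bytes = samples_to_wav_bytes([0.0, 0.5, -0.5], 22050)
    print(len(wav_bytes), "bytes")
```

The per-sample loop keeps the example dependency-free; for long clips a vectorized NumPy conversion would be considerably faster.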

3. Bytes Output (For APIs)

# Get WAV file as bytes
audio_bytes, accented_text = tts.tts_to_bytes("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value)

# Perfect for:
# - Web APIs and HTTP responses
# - Database storage
# - Streaming applications
# - Microservices
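
Before sending or storing the bytes, it can be handy to confirm what they contain. This is a small sketch that reads the header with Python's `wave` module. It assumes the returned data is a standard PCM WAV file, which the standard-library reader requires. `describe_wav` is a hypothetical helper, and the demo builds silent stand-in data rather than calling the TTS model.

```python
import io
import wave


def describe_wav(audio_bytes: bytes) -> dict:
    """Read basic header fields from in-memory PCM WAV data."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


if __name__ == "__main__":
    # Build one second of 16-bit mono silence as stand-in data.
    demo = io.BytesIO()
    with wave.open(demo, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(22050)
        out.writeframes(b"\x00\x00" * 22050)
    print(describe_wav(demo.getvalue()))
```

With real output, pass the `audio_bytes` value from `tts.tts_to_bytes(...)` in place of the stand-in data.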

4. Memory Buffer (BytesIO)

# Get file-like object in memory
output_buffer, accented_text = tts.tts("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value)
audio_bytes = output_buffer.getvalue()  # Extract bytes when needed

# Perfect for:
# - When you need file-like interface
# - Temporary storage
# - Integration with other libraries

Use Case Examples:

🎯 Web API:

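from flask import Flask, Response, request  # assuming Flask, as the decorator suggests
from ukrainian_tts.tts import TTS, Stress

app = Flask(__name__)
tts = TTS(device="cpu")  # create once, reuse across requests

@app.route('/synthesize', methods=['POST'])
def synthesize():
    text = request.json['text']
    voice = request.json['voice']  # a voice value such as Voices.Dmytro.value
    audio_bytes, _ = tts.tts_to_bytes(text, voice, Stress.Dictionary.value)
    return Response(audio_bytes, mimetype='audio/wav')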
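
The endpoint above passes the request fields straight to the synthesizer. A hedged sketch of validating them first is shown below; it is framework-independent and uses only built-ins. `parse_synthesis_request`, the `max_chars` limit, and the placeholder voice strings in the demo are all hypothetical. In a real service the allowed set might be built as `{v.value for v in Voices}`, assuming `Voices` is a standard `Enum`, which the `.value` usage in this README suggests but does not confirm.

```python
def parse_synthesis_request(payload, allowed_voices, max_chars=1000):
    """Validate a decoded JSON body and return (text, voice).

    Raises ValueError with a client-facing message on invalid input.
    """
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    text = payload.get("text")
    voice = payload.get("voice")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("'text' must be a non-empty string")
    if len(text) > max_chars:
        raise ValueError(f"'text' exceeds {max_chars} characters")
    # Check the type first so unhashable values never reach the set lookup.
    if not isinstance(voice, str) or voice not in allowed_voices:
        raise ValueError(f"unknown voice: {voice!r}")
    return text.strip(), voice


if __name__ == "__main__":
    voices = {"voice_a", "voice_b"}  # placeholder values, not real voice names
    print(parse_synthesis_request({"text": "Привіт!", "voice": "voice_a"}, voices))
```

In a handler, catching `ValueError` and returning an HTTP 400 response keeps malformed requests from ever reaching the model.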

🔬 Audio Analysis:

audio_array, sample_rate, _ = tts.tts_to_array("Привіт!", Voices.Dmytro.value, Stress.Dictionary.value)
# Analyze audio features, apply filters, etc.
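
As one concrete instance of such analysis, the sketch below computes duration, peak amplitude, and RMS level. It is written against any sequence of numbers, so it runs without NumPy; it assumes mono samples, and `summarize_audio` is a hypothetical helper rather than a library function.

```python
import math


def summarize_audio(samples, sample_rate: int) -> dict:
    """Compute duration, peak amplitude and RMS level of mono samples."""
    values = [float(s) for s in samples]
    if not values:
        return {"duration_seconds": 0.0, "peak": 0.0, "rms": 0.0}
    peak = max(abs(v) for v in values)
    rms = math.sqrt(sum(v * v for v in values) / len(values))
    return {
        "duration_seconds": len(values) / sample_rate,
        "peak": peak,
        "rms": rms,
    }


if __name__ == "__main__":
    print(summarize_audio([0.5, -0.5, 0.5, -0.5], 4))
```

A peak close to the maximum representable amplitude hints at clipping, while a very low RMS may indicate near-silent output.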

📱 Real-time Applications:

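# Stream audio in fixed-size chunks (the last chunk may be shorter).
# Slicing avoids reshape(-1, 1024), which fails unless the length is a multiple of 1024.
chunk_size = 1024
for start in range(0, len(audio_array), chunk_size):
    chunk = audio_array[start:start + chunk_size]
    # Send chunk to audio output
    pass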

See example notebook: tts_example.ipynb

macOS Installation 🍎

Simple Installation (Recommended)

The package is now self-contained and doesn't require system dependencies for basic usage:

# Using pip
pip install ukrainian-tts

# Using uv (faster)
uv add ukrainian-tts

Note: The package now includes all dependencies and works out-of-the-box! No system dependencies required for basic usage.

For Development/Advanced Usage

If you need to build the package from source or encounter issues, you can use our automated installation script:

git clone https://github.com/robinhad/ukrainian-tts.git
cd ukrainian-tts
./install.sh

This script handles:

  • ✅ System dependencies (SentencePiece, CMake, pkg-config)
  • ✅ Python virtual environment setup
  • ✅ Package installation and testing

Troubleshooting

Flash Attention Warning:

Failed to import Flash Attention, using ESPnet default: No module named 'flash_attn'

This warning is normal on macOS and can be safely ignored. Flash Attention is designed for NVIDIA GPUs and is not available on macOS.

System Dependencies (if needed):

brew install sentencepiece cmake pkg-config libsndfile
export PKG_CONFIG_PATH="/opt/homebrew/lib/pkgconfig:$PKG_CONFIG_PATH"

How to contribute: 🙌

See the list of current problems: https://github.com/robinhad/ukrainian-tts/issues/35

How to train: 🏋️

Link to guide: training/STEPS.md

Attribution 🤝
