Text-to-Speech for Ukrainian

High-fidelity speech synthesis for Ukrainian using modern neural networks.

Demo

Try our demo in the Hugging Face Space, or just listen to the samples here.

Features

Multi-speaker model: 2 female (Tetiana, Lada) + 1 male (Mykyta) voices;
Fine-grained control over speech parameters, including duration, fundamental frequency (F0), and energy;
High-fidelity speech generation using the RAD-TTS++ acoustic model;
Fast vocoding using Vocos;
Synthesizes long sentences effectively;
Supports a sampling rate of 44.1 kHz;
Tested on Linux environments and Windows/WSL;
Python API (requires Python 3.9 or later);
CUDA-enabled for GPU acceleration.

Installation

# Install from PyPI
pip install tts-uk

# OR, for the latest development version:
pip install git+https://github.com/egorsmkv/tts_uk

# OR, use git and local setup
git clone https://github.com/egorsmkv/tts_uk
cd tts_uk
uv sync # uv will handle the virtual environment

If uv is not installed yet, read uv's installation section.

Also, you can download the repository as a ZIP archive.
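After installing, a quick sanity check can confirm that the interpreter meets the Python 3.9 requirement and that the package is visible. The sketch below uses only the standard library; the helper name `check_environment` is hypothetical, and the module name `tts_uk` and distribution name `tts-uk` are taken from the import and `pip install` lines in this README.

```python
import importlib.util
import sys
from importlib import metadata


def check_environment(module_name="tts_uk", dist_name="tts-uk", min_python=(3, 9)):
    """Hypothetical helper: report whether this interpreter looks ready."""
    report = {
        # The README states that Python 3.9 or later is required.
        "python_ok": sys.version_info >= min_python,
        # True when the top-level module can be located for import.
        "package_found": importlib.util.find_spec(module_name) is not None,
        # Filled in below when distribution metadata is available.
        "package_version": None,
    }
    try:
        report["package_version"] = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        pass
    return report


if __name__ == "__main__":
    print(check_environment())
```

If `package_found` is false, the active interpreter is probably not the environment where the package was installed.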

Getting started

Code example:

import torchaudio

from tts_uk.inference import synthesis

sampling_rate = 44_100

# Perform the synthesis. The `synthesis` function returns:
# - mels: Mel spectrograms of the generated audio.
# - wave: The waveform synthesized by the vocoder, as a PyTorch tensor.
# - stats: A dictionary of synthesis statistics (processing time, duration, speech rate, etc.).
mels, wave, stats = synthesis(
    text="Ви можете протестувати синтез мовлення українською мовою. Просто введіть текст, який ви хочете прослухати.",
    voice="tetiana",  # tetiana, mykyta, lada
    n_takes=1,
    use_latest_take=False,
    token_dur_scaling=1,
    f0_mean=0,
    f0_std=0,
    energy_mean=0,
    energy_std=0,
    sigma_decoder=0.8,
    sigma_token_duration=0.666,
    sigma_f0=1,
    sigma_energy=1,
)

print(stats)

# Save the generated audio to a WAV file.
torchaudio.save("audio.wav", wave.cpu(), sampling_rate, encoding="PCM_S")
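When experimenting with several voices or prosody settings, it can help to keep the keyword arguments in one place. The following is a minimal sketch, not part of the package: `make_settings` is a hypothetical helper, and the baseline values are simply copied from the example above. It also assumes that a `token_dur_scaling` above 1 lengthens token durations (slower speech), which the parameter name suggests but this README does not state.

```python
# Baseline values copied from the example above; treat them as a starting point.
EXAMPLE_SETTINGS = {
    "voice": "tetiana",
    "n_takes": 1,
    "use_latest_take": False,
    "token_dur_scaling": 1,
    "f0_mean": 0,
    "f0_std": 0,
    "energy_mean": 0,
    "energy_std": 0,
    "sigma_decoder": 0.8,
    "sigma_token_duration": 0.666,
    "sigma_f0": 1,
    "sigma_energy": 1,
}

# Voice names listed in the Features section.
VOICES = ("tetiana", "mykyta", "lada")


def make_settings(**overrides):
    """Hypothetical helper: merge overrides into the example settings."""
    unknown = set(overrides) - set(EXAMPLE_SETTINGS)
    if unknown:
        raise ValueError(f"Unknown synthesis parameters: {sorted(unknown)}")
    settings = {**EXAMPLE_SETTINGS, **overrides}
    if settings["voice"] not in VOICES:
        raise ValueError(f"Unknown voice: {settings['voice']!r}")
    return settings


# Assumption: a scaling above 1 stretches durations, giving slower speech.
settings = make_settings(voice="mykyta", token_dur_scaling=1.2)

# With the package installed, the call would then look like:
# mels, wave, stats = synthesis(text=text, **settings)
```

Rejecting unknown keys early catches typos such as `f0_sdt` before a model is loaded.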

You can also use these Google Colab notebooks:

CPU inference
GPU inference on a T4 card (synthesizing a long document)

Or run synthesis in a terminal:

uv run example.py

If you need to synthesize whole articles, we recommend considering wtpsplit for splitting the text into sentences first.
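To illustrate the idea of splitting an article before synthesis, here is a dependency-free sketch. It is a naive regex splitter used as a stand-in, not wtpsplit and not part of tts_uk; the function name `split_into_chunks` and the `max_chars` limit are assumptions made for this example. A proper sentence segmenter will handle abbreviations and unusual punctuation much better.

```python
import re

# Naive rule: a sentence ends at ., !, ? or … when followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?…])\s+")


def split_into_chunks(text, max_chars=300):
    """Group sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars is kept whole.
    """
    sentences = [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(split_into_chunks("One. Two! Three? Four.", max_chars=10))
```

Each chunk can then be passed as `text` to `synthesis`, and the resulting waveforms joined along the time axis (for example with `torch.cat`, after checking the tensor shape).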

Get help and support

Please feel free to contact us through the Issues section.

License

The code is released under the MIT license.

Model authors

Acoustic

Yehor Smoliakov, HF profile

Vocoder

Serhiy Stetskovych, HF profile

Community

Discord: https://bit.ly/discord-uds
Speech Recognition: https://t.me/speech_recognition_uk
Speech Synthesis: https://t.me/speech_synthesis_uk

Also, follow our Speech-UK initiative on Hugging Face!

Acknowledgements

Release history

1.3.7: Mar 15, 2025 (this version)
1.3.6: Mar 6, 2025
1.3.5: Mar 4, 2025
1.3.4: Mar 4, 2025
1.3.2: Mar 4, 2025
1.3.1: Mar 4, 2025
1.3.0: Mar 4, 2025
1.2.1: Mar 3, 2025
1.2.0: Mar 3, 2025
1.1.0: Mar 3, 2025

Download files

Source distribution: tts_uk-1.3.7.tar.gz (859.0 kB), uploaded Mar 15, 2025
Built distribution: tts_uk-1.3.7-py3-none-any.whl (56.8 kB), uploaded Mar 15, 2025, Python 3

File details

Details for the file tts_uk-1.3.7.tar.gz.

File metadata

Download URL: tts_uk-1.3.7.tar.gz
Upload date: Mar 15, 2025
Size: 859.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.9.6

File hashes

Hashes for tts_uk-1.3.7.tar.gz
Algorithm	Hash digest
SHA256	`6ffe120939f72e2fe3fbe99eca219a250e5ff983c824807cfba70aee0019db75`
MD5	`eba4ced47f4c9fa7ed61c61aee447f0c`
BLAKE2b-256	`a5c30f828ae40ea050358524fc0da56e36134efa90a465ea72b8d88eb3e21bd4`

File details

Details for the file tts_uk-1.3.7-py3-none-any.whl.

File metadata

Download URL: tts_uk-1.3.7-py3-none-any.whl
Upload date: Mar 15, 2025
Size: 56.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.9.6

File hashes

Hashes for tts_uk-1.3.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`809b29cc54017e18decf872cac585e2e42d8563ee8bdf0035ce7ab606a928c4f`
MD5	`8f09fdc7c06b92e8a190dc9f7fb2b090`
BLAKE2b-256	`f6d655c26c9a431ec4ebdb6877fd0d12c30c1672da1801057a2c917bd0c5431d`
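The published digests can be used to check a downloaded file before installing it. The sketch below uses only `hashlib` from the standard library; the SHA256 values are copied from the tables above, while the helper names `sha256_of` and `matches_expected` are hypothetical.

```python
import hashlib

# SHA256 digests copied from the hash tables above.
EXPECTED_SHA256 = {
    "tts_uk-1.3.7.tar.gz": "6ffe120939f72e2fe3fbe99eca219a250e5ff983c824807cfba70aee0019db75",
    "tts_uk-1.3.7-py3-none-any.whl": "809b29cc54017e18decf872cac585e2e42d8563ee8bdf0035ce7ab606a928c4f",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.lower()
```

For example, `matches_expected("tts_uk-1.3.7.tar.gz", EXPECTED_SHA256["tts_uk-1.3.7.tar.gz"])` should be true for an intact download of the source distribution saved under that name.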
