VietTTS: An Open-Source Vietnamese Text to Speech

VietTTS is an open-source toolkit providing the community with a powerful Vietnamese TTS model, capable of natural voice synthesis and robust voice cloning. Designed for effective experimentation, VietTTS supports research and application in Vietnamese voice technologies.

⭐ Key Features

TTS: Text-to-Speech generation with any voice via prompt audio
OpenAI-API-compatible: Compatible with OpenAI's Text-to-Speech API format

🛠️ Installation

VietTTS can be installed via a Python installer (Linux only, with Windows and macOS support coming soon) or Docker.

Python Installer (Python>=3.10)

git clone https://github.com/dangvansam/viet-tts.git
cd viet-tts

# (Optional) Create a Python environment with conda (virtualenv also works)
conda create --name viettts python=3.10
conda activate viettts

# Install
pip install -e . && pip cache purge

Docker

Install Docker, NVIDIA Driver, NVIDIA Container Toolkit, and CUDA.
Run the following commands:

git clone https://github.com/dangvansam/viet-tts.git
cd viet-tts

# Build docker images
docker compose build

# Run with docker-compose - will create server at: http://localhost:8298
docker compose up -d

# Or run with docker run - will create server at: http://localhost:8298
docker run -itd --gpus all -p 8298:8298 -v ./pretrained-models:/app/pretrained-models --name viet-tts-service viet-tts:latest viettts server --host 0.0.0.0 --port 8298
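The container may need some time before it answers requests. The sketch below polls the server until it responds. It assumes that the `/v1/voices` endpoint (taken from the CURL section) returns HTTP 200 once the server is ready; `http_probe` and `wait_for_server` are hypothetical helper names, not part of VietTTS.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as "not ready"
        return False


def wait_for_server(
    url: str = "http://localhost:8298/v1/voices",  # assumed readiness endpoint
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    probe: Callable[[str], bool] = http_probe,
) -> bool:
    """Poll `url` until the probe succeeds or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_server()` after `docker compose up -d` returns `True` once the server answers and `False` if the timeout expires first. The `probe` parameter exists so the polling loop can be exercised without a running server.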

🚀 Usage

Built-in Voices 🤠

You can use the built-in voices below to synthesize speech.

ID	Voice	Gender
1	nsnd-le-chuc	👨
2	speechify_10	👩
3	atuan	👨
4	speechify_11	👩
5	cdteam	👨
6	speechify_12	👩
7	cross_lingual_prompt	👩
8	speechify_2	👩
9	diep-chi	👨
10	speechify_3	👩
11	doremon	👨
12	speechify_4	👩
13	jack-sparrow	👨
14	speechify_5	👩
15	nguyen-ngoc-ngan	👩
16	speechify_6	👩
17	nu-nhe-nhang	👩
18	speechify_7	👩
19	quynh	👩
20	speechify_8	👩
21	speechify_9	👩
22	son-tung-mtp	👨
23	zero_shot_prompt	👩
24	speechify_1	👩

Command Line Interface (CLI)

The VietTTS Command Line Interface (CLI) allows you to quickly generate speech directly from the terminal. Here's how to use it:

# Usage
viettts --help

# Start API Server
viettts server --host 0.0.0.0 --port 8298

# List all built-in voices
viettts show-voices

# Synthesize speech from text with built-in voices
viettts synthesis --text "Xin chào" --voice 0 --output test.wav

# Clone voice from a local audio file
viettts synthesis --text "Xin chào" --voice Download/voice.wav --output cloned.wav
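To synthesize several sentences from a script, the CLI can be wrapped in Python. This is a minimal sketch that only uses the `--text`, `--voice`, and `--output` flags shown above; the function names and the `line_N.wav` output names are hypothetical, and the voice value `"0"` is copied from the CLI example.

```python
import subprocess
from typing import List


def build_synthesis_command(text: str, voice: str, output: str) -> List[str]:
    """Assemble the argv list for `viettts synthesis`."""
    return ["viettts", "synthesis", "--text", text, "--voice", voice, "--output", output]


def synthesize(text: str, voice: str, output: str) -> None:
    """Run the CLI; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_synthesis_command(text, voice, output), check=True)


# Build (but do not run) one command per sentence
sentences = ["Xin chào", "Xin chào Việt Nam."]
commands = [
    build_synthesis_command(sentence, "0", f"line_{index}.wav")
    for index, sentence in enumerate(sentences)
]
```

Passing the arguments as a list avoids shell quoting problems with spaces and Vietnamese diacritics. To actually generate audio, call `synthesize(...)` with the same arguments on a machine where `viettts` is installed.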

API Client

Python (OpenAI Client)

You need to set environment variables for the OpenAI Client:

# Set base_url and API key as environment variables
export OPENAI_BASE_URL=http://localhost:8298
export OPENAI_API_KEY=viet-tts # not used in the current version

To create speech from input text:

from pathlib import Path
from openai import OpenAI

# Reads OPENAI_BASE_URL and OPENAI_API_KEY from the environment
client = OpenAI()

# Save the audio next to this script
output_file_path = Path(__file__).parent / "speech.wav"

with client.audio.speech.with_streaming_response.create(
  model='tts-1',
  voice='cdteam',
  input='Xin chào Việt Nam.',
  speed=1.0,
  response_format='wav'
) as response:
  response.stream_to_file(output_file_path)
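Because the request above asks for `response_format='wav'`, a quick way to sanity-check the saved file is to read its duration with the standard-library `wave` module. This is only a sketch: it assumes the server writes PCM-encoded WAV (the only encoding `wave` can read), and `wav_duration_seconds` is a hypothetical helper name.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration = number of audio frames / frames per second
        return wav_file.getnframes() / wav_file.getframerate()
```

A duration of zero, or a `wave.Error` raised on open, suggests that the synthesis request did not produce valid PCM audio.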

CURL

# Get all built-in voices
curl --location http://localhost:8298/v1/voices

# OpenAI format (built-in voices)
curl http://localhost:8298/v1/audio/speech \
  -H "Authorization: Bearer viet-tts" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Xin chào Việt Nam.",
    "voice": "son-tung-mtp"
  }' \
  --output speech.wav

# API with voice from local file
curl --location http://localhost:8298/v1/tts \
  --form 'text="xin chào"' \
  --form 'audio_file=@"/home/viettts/Downloads/voice.mp4"' \
  --output speech.wav
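The OpenAI-format request above can also be sent from Python without the `openai` package. The sketch below mirrors the URL, headers, and JSON body of that curl command using only the standard library; `build_speech_request` and `save_speech` are hypothetical helper names, and the default base URL assumes a server started locally as described earlier.

```python
import json
import urllib.request


def build_speech_request(
    text: str,
    voice: str,
    base_url: str = "http://localhost:8298",  # assumed local server
    api_key: str = "viet-tts",
) -> urllib.request.Request:
    """Build a POST request equivalent to the OpenAI-format curl call."""
    payload = {"model": "tts-1", "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_speech(text: str, voice: str, output_path: str) -> None:
    """Send the request and write the returned audio bytes to `output_path`."""
    request = build_speech_request(text, voice)
    with urllib.request.urlopen(request, timeout=120) as response:
        with open(output_path, "wb") as output_file:
            output_file.write(response.read())
```

With the server running, `save_speech("Xin chào Việt Nam.", "son-tung-mtp", "speech.wav")` should correspond to the curl command. Building the request separately from sending it makes the payload easy to inspect.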

Node

import fs from "fs";
import path from "path";
import OpenAI from "openai";

// Reads OPENAI_BASE_URL and OPENAI_API_KEY from the environment
const openai = new OpenAI();

const speechFile = path.resolve("./speech.wav");

async function main() {
  // Request speech using a built-in voice ID
  const response = await openai.audio.speech.create({
    model: "tts-1",
    voice: "1",
    input: "Xin chào Việt Nam.",
  });
  console.log(speechFile);
  // Write the returned audio bytes to disk
  const buffer = Buffer.from(await response.arrayBuffer());
  await fs.promises.writeFile(speechFile, buffer);
}
main();

🙏 Acknowledgement

💡 Borrowed code from CosyVoice
🎙️ VAD model from silero-vad
📝 Text normalization with Vinorm

📜 License

The VietTTS source code is released under the Apache 2.0 License. Pre-trained models and audio samples are licensed under the CC BY-NC License because they are based on an in-the-wild dataset. We apologize for any inconvenience this may cause.

⚠️ Disclaimer

The content provided above is for academic purposes only and is intended to demonstrate technical capabilities. Some examples are sourced from the internet. If any content infringes on your rights, please contact us to request its removal.

💬 Contact

This version: 0.1.0 (Apr 24, 2025)

Download files

Source Distribution

viet_tts-0.1.0.tar.gz (468.3 kB)

Uploaded Apr 24, 2025 Source

Built Distribution

viet_tts-0.1.0-py3-none-any.whl (477.7 kB)

Uploaded Apr 24, 2025 Python 3

File details

Details for the file viet_tts-0.1.0.tar.gz.

File metadata

Download URL: viet_tts-0.1.0.tar.gz
Upload date: Apr 24, 2025
Size: 468.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.10.12 Linux/6.5.0-1018-azure

File hashes

Hashes for viet_tts-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`b8f1920cd2ff21de06275651a7e710c4a020b553a1d6ddc17bfb2c941627d308`
MD5	`814c0ba580b3f52effa1763c698891d9`
BLAKE2b-256	`9e06252b0f7b4ec0aca1959fafb9d63ba81fe71ba6296348fc341496ef3660da`

File details

Details for the file viet_tts-0.1.0-py3-none-any.whl.

File metadata

Download URL: viet_tts-0.1.0-py3-none-any.whl
Upload date: Apr 24, 2025
Size: 477.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.3 CPython/3.10.12 Linux/6.5.0-1018-azure

File hashes

Hashes for viet_tts-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f6d70b332328a7ad32332b1d6d1150cfb23509c23c095415c120c3794d4d1c9b`
MD5	`6f624947e0b7c47c46b3609d746d7b1b`
BLAKE2b-256	`ad9688f006281baff809dab12c46022ee3df69269853879d9d9f2fa49334c4e9`
