Kokoro TTS

A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.

Features

  • Multiple language and voice support
  • Voice blending with customizable weights
  • EPUB, PDF and TXT file input support
  • Standard input (stdin), including piping (|) from other programs
  • Streaming audio playback
  • Split output into chapters
  • Adjustable speech speed
  • WAV and MP3 output formats
  • Chapter merging capability
  • Detailed debug output option
  • GPU support

Demo

Kokoro TTS is an open-source CLI tool that delivers high-quality text-to-speech right from your terminal. Think of it as your personal voice studio, capable of transforming any text into natural-sounding speech with minimal effort.

https://github.com/user-attachments/assets/8413e640-59e9-490e-861d-49187e967526

Demo Audio (MP3) | Demo Audio (WAV)

TODO

  • Add GPU support (done, see Features)
  • Add PDF support (done, see Features)
  • Add GUI

Prerequisites

  • Python 3.11-3.12 (Python 3.13+ is not currently supported)
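
The version requirement above can be checked before installing. The following is a minimal sketch, not part of kokoro-tts: the helper name `is_supported` is hypothetical, and the supported set is simply copied from the prerequisite line.

```python
import sys

# Supported interpreter versions, copied from the prerequisite above.
SUPPORTED_MINORS = {(3, 11), (3, 12)}


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the interpreter's major.minor is in the supported set."""
    return (version_info[0], version_info[1]) in SUPPORTED_MINORS


if __name__ == "__main__":
    found = f"{sys.version_info[0]}.{sys.version_info[1]}"
    status = "supported" if is_supported() else "not supported"
    print(f"Python {found} is {status} by kokoro-tts")
```

Run it with the interpreter you plan to install into. With uv, a specific interpreter can be requested when creating the environment (for example `uv venv --python 3.12`).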

Installation

Method 1: Install from PyPI (Recommended)

The easiest way to install Kokoro TTS is from PyPI:

# Using uv (recommended)
uv tool install kokoro-tts

# Using pip
pip install kokoro-tts

After installation, you can run:

kokoro-tts --help

Method 2: Install from Git

Install directly from the repository:

# Using uv (recommended)
uv tool install git+https://github.com/nazdridoy/kokoro-tts

# Using pip
pip install git+https://github.com/nazdridoy/kokoro-tts

Method 3: Clone and Install Locally

  1. Clone the repository:
git clone https://github.com/nazdridoy/kokoro-tts.git
cd kokoro-tts
  2. Install the package:

With uv (recommended):

uv venv
uv pip install -e .

With pip:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .
  3. Run the tool:
# If using uv
uv run kokoro-tts --help

# If using pip with activated venv
kokoro-tts --help

Method 4: Run Without Installation

If you prefer to run without installing:

  1. Clone the repository:
git clone https://github.com/nazdridoy/kokoro-tts.git
cd kokoro-tts
  2. Install dependencies only:

With uv:

uv venv
uv sync

With pip:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
  3. Run directly:
# With uv
uv run -m kokoro_tts --help

# With pip (venv activated)
python -m kokoro_tts --help

Download Model Files

After installation, download the required model files to your working directory:

# Download voice data (bin format is preferred)
wget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/voices-v1.0.bin

# Download the model
wget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/kokoro-v1.0.onnx

The script requires voices-v1.0.bin and kokoro-v1.0.onnx to be present in the same directory where you run the kokoro-tts command.
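
Because the tool looks for both files in the current working directory, a quick pre-flight check can save a failed run. This is an illustrative sketch only: `missing_model_files` is a hypothetical helper, and the file names are the ones from the download step above.

```python
from pathlib import Path

# File names taken from the download step above.
REQUIRED_FILES = ("voices-v1.0.bin", "kokoro-v1.0.onnx")


def missing_model_files(directory=".") -> list[str]:
    """Return the required file names that are absent from `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found.")
```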

Supported voices:

| Category | Voices | Language Code |
| --- | --- | --- |
| 🇺🇸 👩 | af_alloy, af_aoede, af_bella, af_heart, af_jessica, af_kore, af_nicole, af_nova, af_river, af_sarah, af_sky | en-us |
| 🇺🇸 👨 | am_adam, am_echo, am_eric, am_fenrir, am_liam, am_michael, am_onyx, am_puck | en-us |
| 🇬🇧 | bf_alice, bf_emma, bf_isabella, bf_lily, bm_daniel, bm_fable, bm_george, bm_lewis | en-gb |
| 🇫🇷 | ff_siwis | fr-fr |
| 🇮🇹 | if_sara, im_nicola | it |
| 🇯🇵 | jf_alpha, jf_gongitsune, jf_nezumi, jf_tebukuro, jm_kumo | ja |
| 🇨🇳 | zf_xiaobei, zf_xiaoni, zf_xiaoxiao, zf_xiaoyi, zm_yunjian, zm_yunxi, zm_yunxia, zm_yunyang | cmn |

Usage

Basic Usage

kokoro-tts <input_text_file> [<output_audio_file>] [options]

Note:

  • If you installed via Method 1 (PyPI) or Method 2 (git install), use kokoro-tts directly
  • If you installed via Method 3 (local install), use uv run kokoro-tts or activate your virtual environment first
  • If you're using Method 4 (no install), use uv run -m kokoro_tts or python -m kokoro_tts with activated venv

Commands

  • -h, --help: Show help message
  • --help-languages: List supported languages
  • --help-voices: List available voices
  • --merge-chunks: Merge existing chunks into chapter files

Options

  • --stream: Stream audio instead of saving to file
  • --speed <float>: Set speech speed (default: 1.0)
  • --lang <str>: Set language (default: en-us)
  • --voice <str>: Set voice or blend voices (default: interactive selection)
    • Single voice: Use voice name (e.g., "af_sarah")
    • Blended voices: Use "voice1:weight,voice2:weight" format
  • --split-output <dir>: Save each chunk as separate file in directory
  • --format <str>: Audio format: wav or mp3 (default: wav)
  • --debug: Show detailed debug information during processing
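
To make the "voice1:weight,voice2:weight" format concrete, here is a small sketch of how such a string could be turned into normalized mix ratios. It is not the tool's own parser: `parse_voice_spec` is a hypothetical name, and the equal split for voices listed without weights is an assumption based on the 50-50 example in this README.

```python
def parse_voice_spec(spec: str) -> dict[str, float]:
    """Parse "voice1:weight,voice2:weight" into weights that sum to 1.

    Assumption: if no voice carries a weight, all voices share equally.
    """
    parts = [p.strip() for p in spec.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty voice specification")

    # Either every voice has a weight or none does; mixing is ambiguous.
    weighted = [":" in p for p in parts]
    if any(weighted) and not all(weighted):
        raise ValueError("give a weight for every voice or for none")

    raw: dict[str, float] = {}
    for part in parts:
        name, _, weight = part.partition(":")
        raw[name.strip()] = float(weight) if weight else 1.0

    if any(w <= 0 for w in raw.values()):
        raise ValueError("weights must be positive")

    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


if __name__ == "__main__":
    print(parse_voice_spec("af_sarah:60,am_adam:40"))
    print(parse_voice_spec("am_adam,af_sarah"))
```

Normalizing by the total means "60,40" and "3,2" describe the same mix in this sketch; whether kokoro-tts itself accepts weights that do not sum to 100 is not stated here.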

Input Formats

  • .txt: Text file input
  • .epub: EPUB book input (will process chapters)
  • .pdf: PDF document input (extracts chapters from TOC or content)
  • - or /dev/stdin (Linux/macOS) or CONIN$ (Windows): Standard input (stdin)
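
The list above amounts to a dispatch on the input argument. The sketch below restates it as code for readers scripting around the CLI; `classify_input` and the returned labels are hypothetical and do not reflect how kokoro-tts is implemented internally.

```python
from pathlib import PurePath

# Names that the README lists as meaning "read from standard input".
STDIN_NAMES = {"-", "/dev/stdin", "CONIN$"}

# File extensions from the Input Formats list, mapped to illustrative labels.
KINDS = {".txt": "text", ".epub": "epub", ".pdf": "pdf"}


def classify_input(arg: str) -> str:
    """Return 'stdin', 'text', 'epub', or 'pdf' for a command-line input argument."""
    if arg in STDIN_NAMES:
        return "stdin"
    suffix = PurePath(arg).suffix.lower()
    if suffix not in KINDS:
        raise ValueError(f"unsupported input: {arg!r}")
    return KINDS[suffix]


if __name__ == "__main__":
    for arg in ("-", "notes.txt", "book.epub", "paper.pdf"):
        print(arg, "->", classify_input(arg))
```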

Examples

# Basic usage with output file
kokoro-tts input.txt output.wav --speed 1.2 --lang en-us --voice af_sarah

# Read from standard input (stdin)
echo "Hello World" | kokoro-tts - --stream
cat input.txt | kokoro-tts - output.wav

# Cross-platform stdin support:
# "-" reads from stdin on Linux, macOS, and Windows alike.
# The platform's device name is also accepted in place of "-":
#   Linux/macOS: kokoro-tts /dev/stdin --stream
#   Windows:     kokoro-tts CONIN$ --stream

# Use voice blending (60-40 mix)
kokoro-tts input.txt output.wav --voice "af_sarah:60,am_adam:40"

# Use equal voice blend (50-50)
kokoro-tts input.txt --stream --voice "am_adam,af_sarah"

# Process EPUB and split into chunks
kokoro-tts input.epub --split-output ./chunks/ --format mp3

# Stream audio directly
kokoro-tts input.txt --stream --speed 0.8

# Merge existing chunks
kokoro-tts --merge-chunks --split-output ./chunks/ --format wav

# Process EPUB with detailed debug output
kokoro-tts input.epub --split-output ./chunks/ --debug

# Process PDF and split into chapters
kokoro-tts input.pdf --split-output ./chunks/ --format mp3

# List available voices
kokoro-tts --help-voices

# List supported languages
kokoro-tts --help-languages

Tip: If you installed via Method 3, replace kokoro-tts with uv run kokoro-tts in the examples above. If you are using Method 4, replace kokoro-tts with uv run -m kokoro_tts or python -m kokoro_tts.
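
When driving the CLI from a script, it can help to assemble the argument list programmatically. The following is a sketch under stated assumptions: `build_command` is a hypothetical helper, and it emits only flags documented in the Options section above. It builds the command without running it.

```python
import shlex


def build_command(input_path, output_path=None, *, stream=False, speed=None,
                  lang=None, voice=None, split_output=None, fmt=None,
                  debug=False):
    """Assemble a kokoro-tts argument list from the documented options."""
    cmd = ["kokoro-tts", input_path]
    if output_path is not None:
        cmd.append(output_path)
    if stream:
        cmd.append("--stream")
    # Value-taking options are emitted only when a value was given.
    for flag, value in (("--speed", speed), ("--lang", lang),
                        ("--voice", voice), ("--split-output", split_output),
                        ("--format", fmt)):
        if value is not None:
            cmd += [flag, str(value)]
    if debug:
        cmd.append("--debug")
    return cmd


if __name__ == "__main__":
    cmd = build_command("input.txt", "output.wav", speed=1.2,
                        lang="en-us", voice="af_sarah")
    # shlex.join renders the list as a copy-pasteable shell line.
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell-quoting problems with blended voice specifications.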

Features in Detail

EPUB Processing

  • Automatically extracts chapters from EPUB files
  • Preserves chapter titles and structure
  • Creates organized output for each chapter
  • Detailed debug output available for troubleshooting

Audio Processing

  • Chunks long text into manageable segments
  • Supports streaming for immediate playback
  • Voice blending with customizable mix ratios
  • Progress indicators for long processes
  • Handles interruptions gracefully
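
As an illustration of what "chunking long text into manageable segments" can look like, here is a simple sentence-based splitter. It is a generic sketch, not the algorithm kokoro-tts uses: the function name, the regular expression, and the `max_chars` limit are all assumptions chosen for the example.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept intact as its own chunk.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("One. Two. Three.", max_chars=9):
        print(repr(chunk))
```

Splitting on sentence boundaries rather than at a fixed character count keeps each segment a natural unit of speech, which matters for streaming playback.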

Output Options

  • Single file output
  • Split output with chapter organization
  • Chunk merging capability
  • Multiple audio format support

Debug Mode

  • Shows detailed information about file processing
  • Displays NCX parsing details for EPUB files
  • Lists all found chapters and their metadata
  • Helps troubleshoot processing issues

Input Options

  • Text file input (.txt)
  • EPUB book input (.epub)
  • PDF document input (.pdf)
  • Standard input (stdin), including piping from other programs

Contributing

This is a personal project, but contributions are welcome: feel free to submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Download files

Source Distribution

kokoro_tts-2.3.1.tar.gz (1.1 MB)

Uploaded Source

Built Distribution

kokoro_tts-2.3.1-py3-none-any.whl (18.6 kB)

Uploaded Python 3

File details

Details for the file kokoro_tts-2.3.1.tar.gz.

File metadata

  • Download URL: kokoro_tts-2.3.1.tar.gz
  • Upload date:
  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kokoro_tts-2.3.1.tar.gz
Algorithm Hash digest
SHA256 8abdf0a4620a383318803a3e0bddc2b0149fddad4e39d7589c2e48be11270316
MD5 b7ccbbee76d5692d45347df48e7e69ad
BLAKE2b-256 47321c7a401297d257f92c64b87ea987ee922b13fbaa6119e06736bff51300c6
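
A downloaded file can be compared against the SHA256 digest listed above using only the standard library. This is a minimal sketch: `sha256_of` and `matches` are hypothetical helper names, and the local file path in the usage block is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest for kokoro_tts-2.3.1.tar.gz, copied from the table above.
EXPECTED_SHA256 = "8abdf0a4620a383318803a3e0bddc2b0149fddad4e39d7589c2e48be11270316"


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches(path, expected=EXPECTED_SHA256) -> bool:
    """Return True if the file's digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    import os

    # Assumed location of the downloaded source distribution.
    path = "kokoro_tts-2.3.1.tar.gz"
    if os.path.exists(path):
        print("digest OK" if matches(path) else "digest MISMATCH")
    else:
        print(f"{path} not found")
```

Reading in fixed-size chunks keeps memory use flat regardless of file size.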

Provenance

The following attestation bundles were made for kokoro_tts-2.3.1.tar.gz:

Publisher: python-publish.yml on nazdridoy/kokoro-tts

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kokoro_tts-2.3.1-py3-none-any.whl.

File metadata

  • Download URL: kokoro_tts-2.3.1-py3-none-any.whl
  • Upload date:
  • Size: 18.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kokoro_tts-2.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 07f5748ede87f0ab77f41037f355fcb69cd7e04109ab32ed317a3ee0b74a5b00
MD5 2545a4df704fe38f42a107fa650f67fb
BLAKE2b-256 b48850a083483df89e60fbcde2e80254bcb53dfb02beedb950c5bac9c9a50eb7

Provenance

The following attestation bundles were made for kokoro_tts-2.3.1-py3-none-any.whl:

Publisher: python-publish.yml on nazdridoy/kokoro-tts
