kokoro-tts-tool
A CLI that provides local text-to-speech using Kokoro TTS on Apple Silicon. No API keys required.
Table of Contents
- About
- Features
- Installation
- Quick Start
- Usage
- Infinite Streaming
- Available Voices
- Multi-Level Verbosity Logging
- Shell Completion
- Development
- Testing
- Security
- Contributing
- License
- Author
About
kokoro-tts-tool is a Python CLI tool for local text-to-speech synthesis using the Kokoro-82M model. It runs entirely on your machine, with no cloud dependencies, and is optimized for Apple Silicon Macs.
Key highlights:
- Local inference: Uses ONNX Runtime for fast, CPU-optimized synthesis
- 60+ voices: Multiple languages and accents (English, Japanese, Mandarin, etc.)
- Near real-time: Fast enough for interactive use on Apple Silicon
- Infinite streaming: Continuous TTS for long documents without audio artifacts
- No API keys: Everything runs locally, completely free
Features
- Local TTS with Kokoro-82M (82 million parameters)
- 60+ voices across 8 languages
- Near real-time synthesis on Apple Silicon
- Auto-download of model files (~350MB)
- WAV output or direct speaker playback
- Infinite streaming for long documents (books, articles)
- Seamless audio without pop artifacts between chunks
- Fast offline rendering (20-50x real-time on M4)
- Type-safe with mypy strict mode
- Tested with pytest
- Multi-level verbosity logging (-v/-vv/-vvv)
- Shell completion for bash, zsh, and fish
- Security scanning with bandit, pip-audit, and gitleaks
Installation
Prerequisites
- Python 3.14 or higher
- uv package manager
- Apple Silicon Mac (recommended) or any platform with Python 3.14+
Install from source
# Clone the repository
git clone https://github.com/dnvriend/kokoro-tts-tool.git
cd kokoro-tts-tool
# Install globally with uv
uv tool install .
Install with mise (recommended for development)
cd kokoro-tts-tool
mise trust
mise install
uv sync
uv tool install .
Verify installation
kokoro-tts-tool --version
Quick Start
# 1. Initialize (downloads models on first run, ~350MB)
kokoro-tts-tool init
# 2. Synthesize text to speakers
kokoro-tts-tool synthesize "Hello world!"
# 3. Save to file
kokoro-tts-tool synthesize "Hello world!" --output hello.wav
# 4. Use different voice
kokoro-tts-tool synthesize "This is Adam." --voice am_adam
# 5. List available voices
kokoro-tts-tool list-voices
Usage
Commands
# Show all commands
kokoro-tts-tool --help
# Download/update models
kokoro-tts-tool init
# Synthesize text
kokoro-tts-tool synthesize "Your text here"
kokoro-tts-tool synthesize "Your text" --output speech.wav
kokoro-tts-tool synthesize "Your text" --voice bf_emma --speed 1.2
# Read from stdin
echo "Hello from stdin" | kokoro-tts-tool synthesize --stdin
# List voices
kokoro-tts-tool list-voices
kokoro-tts-tool list-voices --language English
kokoro-tts-tool list-voices --gender Female
kokoro-tts-tool list-voices --json
# Show configuration
kokoro-tts-tool info
Synthesize Options
| Option | Description | Default |
|---|---|---|
| --voice, -v | Voice ID (e.g., af_heart, am_adam) | af_heart |
| --output, -o | Output WAV file path | (plays to speakers) |
| --speed | Speech speed (0.5 to 2.0) | 1.0 |
| --stdin, -s | Read text from stdin | false |
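When calling the CLI from a Python script, it can help to build the argument list in one place and check the speed range before spawning a process. The sketch below is an illustration only: `build_synthesize_args` is a hypothetical helper, not part of kokoro-tts-tool, and it uses only the flags and defaults listed in the table above.

```python
import shlex


def build_synthesize_args(text, voice="af_heart", output=None, speed=1.0):
    """Build an argv list for `kokoro-tts-tool synthesize`.

    Hypothetical helper for illustration; it only assembles the
    documented flags and does not run the tool.
    """
    # The options table documents a speed range of 0.5 to 2.0.
    if not 0.5 <= speed <= 2.0:
        raise ValueError(f"speed must be between 0.5 and 2.0, got {speed}")

    args = ["kokoro-tts-tool", "synthesize", text,
            "--voice", voice, "--speed", str(speed)]
    # Without --output the tool plays to the speakers.
    if output is not None:
        args += ["--output", str(output)]
    return args


if __name__ == "__main__":
    argv = build_synthesize_args("Your text", voice="bf_emma",
                                 speed=1.2, output="speech.wav")
    # Print a copy-pasteable shell command.
    print(shlex.join(argv))
    # To actually run it (requires the tool to be installed):
    # import subprocess; subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting issues when the text contains quotes or other special characters.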
Infinite Streaming
Stream long documents (books, articles, study materials) without audio artifacts:
# Stream a markdown file to speakers
kokoro-tts-tool infinite --input book.md
# Render to WAV file (fast offline mode, 20-50x real-time on M4)
kokoro-tts-tool infinite --input book.md --output audiobook.wav
# Pipe from stdin
cat chapter.md | kokoro-tts-tool infinite --stdin
# With custom voice and speed
kokoro-tts-tool infinite --input notes.md --voice am_adam --speed 1.2
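To get a feel for what "20-50x real-time" means in practice, the arithmetic can be sketched as follows. This assumes the factor is audio duration divided by render time, which is the usual reading but is not spelled out here; `render_seconds` is a hypothetical name used for illustration.

```python
def render_seconds(audio_seconds, realtime_factor):
    """Estimate render time for a given audio length and real-time factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


if __name__ == "__main__":
    ten_hours = 10 * 3600  # a 10-hour audiobook, in seconds
    for factor in (20, 50):
        minutes = render_seconds(ten_hours, factor) / 60
        print(f"{factor}x real-time: about {minutes:.0f} minutes")
```

Under that assumption, a 10-hour audiobook would render in roughly 12 to 30 minutes on the hardware the figure was measured on.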
Infinite Streaming Options
| Option | Description | Default |
|---|---|---|
| --input, -i | Input text/markdown file | - |
| --stdin, -s | Read text from stdin | false |
| --output, -o | Save to WAV file (fast offline mode) | (plays to speakers) |
| --voice | Voice ID | af_heart |
| --speed | Speech speed (0.5 to 2.0) | 1.0 |
| --chunk-size | Target words per chunk (50-1000) | 200 |
| --pause | Pause between chunks in ms (0-2000) | 150 |
| --no-markdown | Treat input as plain text | false |
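To illustrate what a "target words per chunk" setting might do, here is a minimal sketch of sentence-aware chunking. It is not the tool's actual splitter: `chunk_text` is a hypothetical function, and the greedy packing and the sentence-boundary regex are assumptions made for this example.

```python
import re


def chunk_text(text, target_words=200):
    """Greedily pack whole sentences into chunks of about target_words words.

    Illustrative sketch only; the real splitter may behave differently.
    """
    # Mirror the documented --chunk-size range of 50-1000.
    if not 50 <= target_words <= 1000:
        raise ValueError("target_words must be between 50 and 1000")

    # Split after ., ! or ? followed by whitespace (a simplification).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the target.
        if current and count + n > target_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    sample = "One two three four five. " * 30
    for i, chunk in enumerate(chunk_text(sample, target_words=50), start=1):
        print(i, len(chunk.split()), "words")
```

Breaking only at sentence boundaries keeps each chunk a natural unit of speech, so a pause between chunks falls where a reader would pause anyway. A single sentence longer than the target is kept whole in this sketch.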
Available Voices
The tool includes 60+ voices across 8 languages:
American English (20 voices)
| Voice ID | Gender | Grade | Description |
|---|---|---|---|
| af_heart | Female | A | Default, emotional, soft (highest quality) |
| af_bella | Female | A- | Expressive, dynamic range |
| am_adam | Male | A- | Deep narrator (audiobooks) |
| am_michael | Male | B+ | Natural, casual |
British English (8 voices)
| Voice ID | Gender | Grade | Description |
|---|---|---|---|
| bf_emma | Female | B+ | Polished, formal (education) |
| bm_george | Male | B+ | Resonant, classic (history) |
Other Languages
- Japanese: jf_alpha, jm_kumo, and more
- Mandarin: zf_xiaobei, zm_yunjian, and more
- Spanish: ef_dora, em_alex
- French: ff_siwis
- Hindi: hf_alpha, hm_omega
- Italian: if_sara, im_nicola
- Portuguese (Brazilian): pf_dora, pm_alex
Run kokoro-tts-tool list-voices for the complete list.
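The voice IDs listed above appear to follow a two-letter prefix pattern: the first letter indicates the language or accent and the second the gender (for example, af_ for American female, bm_ for British male). The sketch below decodes that pattern. The mapping is inferred from the examples in this section rather than taken from the tool, and `describe_voice` is a hypothetical helper.

```python
# Prefix letters inferred from the voice IDs listed in this README.
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "j": "Japanese",
    "z": "Mandarin",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "p": "Portuguese (Brazilian)",
}
GENDERS = {"f": "Female", "m": "Male"}


def describe_voice(voice_id):
    """Decode a voice ID such as 'bf_emma' into language, gender, and name."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice ID format: {voice_id!r}")
    lang, gender = prefix  # unpack the two prefix characters
    if lang not in LANGUAGES or gender not in GENDERS:
        raise ValueError(f"unknown voice prefix: {prefix!r}")
    return {"language": LANGUAGES[lang], "gender": GENDERS[gender], "name": name}


if __name__ == "__main__":
    for vid in ("af_heart", "bm_george", "zf_xiaobei"):
        print(vid, describe_voice(vid))
```

For the authoritative list and metadata, `kokoro-tts-tool list-voices --json` remains the source of truth.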
Voice Quality Grades
- A/A-: Highest quality, recommended for production
- B+/B: Good quality
- B-: Acceptable quality
Multi-Level Verbosity Logging
The CLI supports progressive verbosity levels for debugging:
| Flag | Level | Output | Use Case |
|---|---|---|---|
| (none) | WARNING | Errors and warnings only | Production |
| -v | INFO | + High-level operations | Normal debugging |
| -vv | DEBUG | + Detailed info | Development |
| -vvv | TRACE | + Library internals | Deep debugging |
# Quiet mode
kokoro-tts-tool synthesize "Hello"
# With debug output
kokoro-tts-tool -vv synthesize "Hello"
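A count-based verbosity flag like this is commonly mapped to Python logging levels. The sketch below shows one way to do it. It is not the tool's logging_config.py: the `TRACE` level value of 5 and the `level_for_verbosity` name are assumptions for illustration, since Python's logging module has no built-in TRACE level.

```python
import logging

# Hypothetical custom level below DEBUG (10); the stdlib defines no TRACE.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")


def level_for_verbosity(count):
    """Map the number of -v flags to a logging level."""
    levels = [logging.WARNING, logging.INFO, logging.DEBUG, TRACE]
    # Clamp so that -vvvv and beyond behave like -vvv.
    return levels[min(max(count, 0), len(levels) - 1)]


if __name__ == "__main__":
    for count in range(5):
        level = level_for_verbosity(count)
        print(count, logging.getLevelName(level))
```

With Click, a count can be collected using an option declared with `count=True`, and the result passed to `logging.basicConfig(level=...)`.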
Shell Completion
The CLI provides native shell completion for bash, zsh, and fish:
# Bash - add to ~/.bashrc
echo 'eval "$(kokoro-tts-tool completion bash)"' >> ~/.bashrc
# Zsh - add to ~/.zshrc
echo 'eval "$(kokoro-tts-tool completion zsh)"' >> ~/.zshrc
# Fish - save to completions
mkdir -p ~/.config/fish/completions
kokoro-tts-tool completion fish > ~/.config/fish/completions/kokoro-tts-tool.fish
Development
Setup Development Environment
git clone https://github.com/dnvriend/kokoro-tts-tool.git
cd kokoro-tts-tool
make install
make help
Available Make Commands
make install # Install dependencies
make format # Format code
make lint # Run linting
make typecheck # Type checking
make test # Run tests
make security # Security scans
make check # All checks
make pipeline # Full pipeline
Project Structure
kokoro-tts-tool/
├── kokoro_tts_tool/
│   ├── __init__.py
│   ├── cli.py              # CLI entry point
│   ├── engine.py           # TTS engine wrapper
│   ├── models.py           # Model management
│   ├── voices.py           # Voice definitions
│   ├── splitter.py         # Text chunking for long documents
│   ├── streaming.py        # Audio streaming for speaker playback
│   ├── utils.py            # Utilities
│   ├── logging_config.py   # Logging setup
│   ├── completion.py       # Shell completion
│   └── commands/           # CLI commands
│       ├── synthesize_commands.py
│       ├── voice_commands.py
│       ├── init_commands.py
│       ├── info_commands.py
│       └── infinite_commands.py
├── tests/
├── references/             # Research documentation
├── plugins/                # Claude Code plugin
├── pyproject.toml
├── Makefile
├── README.md
└── CLAUDE.md
Testing
# Run all tests
make test
# Run tests with verbose output
uv run pytest tests/ -v
Security
The project includes security scanning:
# Run all security checks
make security
# Individual scans
make security-bandit # Python security linting
make security-pip-audit # Dependency CVE scanning
make security-gitleaks # Secret detection
Prerequisites
# Install gitleaks (macOS)
brew install gitleaks
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Run make pipeline
- Submit a Pull Request
License
MIT License - see LICENSE for details.
Author
Dennis Vriend - @dnvriend
Acknowledgments
- Kokoro-82M - The TTS model
- kokoro-onnx - ONNX implementation
- Click - CLI framework
- uv - Fast Python tooling
Generated with AI
This project was generated using Claude Code.
Made with Python 3.14