
OVOS TTS plugin for Kokoro — 82M parameter multilingual TTS by hexgrad


ovos-tts-plugin-kokoro

Status: Proof of Concept

Experimental proof of concept: not for production use, may be abandoned, and makes no API stability promise.

OVOS TTS plugin for Kokoro — an 82M parameter multilingual TTS model by hexgrad. Same engine used by VoiceMode, now wired up for the standard OVOS voice assistant.

Install

pip install ovos-tts-plugin-kokoro

espeak-ng is required for the underlying G2P stack:

# Debian/Ubuntu
sudo apt-get install espeak-ng
# macOS
brew install espeak-ng

English voices also need spaCy's en_core_web_sm model. Misaki (the Kokoro G2P library) attempts to download it on first use but does not reload it in the same process, so you'll want to install it ahead of time:

python -m spacy download en_core_web_sm
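Both prerequisites can be checked before the plugin is loaded. The following is a minimal sketch, not part of the plugin: the function name `missing_prerequisites` is hypothetical, and it assumes that the spaCy model is installed as an ordinary importable package named `en_core_web_sm` and that `espeak-ng` is expected on `PATH`.

```python
import importlib.util
import shutil


def missing_prerequisites(binary="espeak-ng", model="en_core_web_sm"):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(binary) is None:
        problems.append(f"{binary} not found on PATH")
    # spaCy models install as regular Python packages, so find_spec can
    # detect them without importing spaCy itself.
    if importlib.util.find_spec(model) is None:
        problems.append(f"{model} not installed (python -m spacy download {model})")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("missing:", problem)
```

Running the check in a fresh process after installing the model sidesteps the reload issue described above, since nothing has been imported yet.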

For Japanese or Chinese voices, install the optional G2P extras:

pip install "ovos-tts-plugin-kokoro[ja,zh]"

Linux: CPU-only torch (saves ~2GB)

On Linux, pip defaults to the CUDA torch wheel (~2.5GB). If you don't need GPU support, install torch from the CPU index first:

pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install ovos-tts-plugin-kokoro

On macOS, this is not needed — PyPI torch is already CPU-only (~60MB). With uv, torch automatically resolves to the CPU-only wheel via the tool.uv.sources block in pyproject.toml.

Configuration

{
  "tts": {
    "module": "ovos-tts-plugin-kokoro",
    "ovos-tts-plugin-kokoro": {
      "voice": "af_bella"
    }
  }
}

Voice options

Kokoro ships 56 built-in voices across 9 languages. The voice id encodes language + gender:

| Prefix    | Language             | Examples                                   |
|-----------|----------------------|--------------------------------------------|
| af_       | American English (F) | af_bella, af_heart, af_nicole              |
| am_       | American English (M) | am_michael, am_onyx, am_eric               |
| bf_       | British English (F)  | bf_alice, bf_emma, bf_lily                 |
| bm_       | British English (M)  | bm_george, bm_fable, bm_daniel             |
| jf_ / jm_ | Japanese             | jf_alpha, jm_kumo (needs [ja] extra)       |
| zf_ / zm_ | Mandarin             | zf_xiaoxiao, zm_yunjian (needs [zh] extra) |
| ef_ / em_ | Spanish              | ef_dora, em_alex                           |
| ff_       | French (F)           | ff_siwis                                   |
| hf_ / hm_ | Hindi                | hf_alpha, hm_omega                         |
| if_ / im_ | Italian              | if_sara, im_nicola                         |
| pf_ / pm_ | Brazilian Portuguese | pf_dora, pm_alex                           |

The voice id determines which Kokoro language pipeline is used, regardless of the OVOS active language. Picking bm_george will speak through the British pipeline even if lang is en-US.
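The prefix convention can be illustrated with a small parser. This is a sketch under the assumption that every voice id has the form `<language letter><gender letter>_<name>`, as in the table above; `parse_voice_id` and the two lookup dictionaries are hypothetical names, not part of the plugin or of Kokoro.

```python
# Prefix letters taken from the voice table above.
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "j": "Japanese",
    "z": "Mandarin",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "p": "Brazilian Portuguese",
}
GENDERS = {"f": "female", "m": "male"}


def parse_voice_id(voice):
    """Split a voice id such as 'bm_george' into (lang_code, gender, name)."""
    prefix, sep, name = voice.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"not a Kokoro-style voice id: {voice!r}")
    lang_code, gender = prefix[0], prefix[1]
    if lang_code not in LANGUAGES or gender not in GENDERS:
        raise ValueError(f"unknown voice prefix: {prefix!r}")
    return lang_code, GENDERS[gender], name


print(parse_voice_id("bm_george"))  # ('b', 'male', 'george')
```

The first letter of the prefix is the same single-letter code that selects the language pipeline, which is why bm_george speaks British English whatever the active language is.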

See the full hexgrad/Kokoro-82M VOICES.md for samples.

Language support

The plugin maps the active OVOS language (BCP-47, e.g. fr-FR) to a Kokoro single-letter language code:

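| OVOS lang  | Kokoro code | Language             |
|------------|-------------|----------------------|
| en / en-us | a           | American English     |
| en-gb      | b           | British English      |
| es         | e           | Spanish              |
| fr         | f           | French               |
| hi         | h           | Hindi                |
| it         | i           | Italian              |
| ja         | j           | Japanese             |
| pt / pt-br | p           | Brazilian Portuguese |
| zh         | z           | Mandarin             |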

Lookup tries the full BCP-47 tag first (e.g. en-gb), then the base subtag (en), and finally falls back to American English, writing a log line for the unknown language. The voice id always wins over the language map: a voice prefixed bm_ always uses the British pipeline.
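The lookup order can be sketched as follows. This is an illustration of the rules described here, not the plugin's actual code: `resolve_lang_code`, `DEFAULT_LANG_MAP`, and the logger name are hypothetical, and the handling of `aliases` assumes that overrides are simply merged over the default table.

```python
import logging

LOG = logging.getLogger("kokoro-lang-sketch")  # hypothetical logger name

# Defaults copied from the language table above.
DEFAULT_LANG_MAP = {
    "en": "a", "en-us": "a", "en-gb": "b", "es": "e", "fr": "f",
    "hi": "h", "it": "i", "ja": "j", "pt": "p", "pt-br": "p", "zh": "z",
}
KNOWN_CODES = set(DEFAULT_LANG_MAP.values())
FALLBACK = "a"  # American English


def resolve_lang_code(lang, voice=None, aliases=None):
    """Pick a Kokoro language code for an OVOS language tag."""
    # 1. A voice id always wins: its first letter is the pipeline code.
    if voice and voice[0] in KNOWN_CODES:
        return voice[0]
    # Assumption: user aliases are merged over the defaults, keys lowercased.
    table = dict(DEFAULT_LANG_MAP)
    table.update({key.lower(): code for key, code in (aliases or {}).items()})
    tag = (lang or "").lower().replace("_", "-")
    # 2. Full BCP-47 tag, e.g. "en-gb".
    if tag in table:
        return table[tag]
    # 3. Base subtag, e.g. "en" for "en-au".
    base = tag.split("-")[0]
    if base in table:
        return table[base]
    # 4. Unknown language: log and fall back to American English.
    LOG.warning("unknown language %r, falling back to %r", lang, FALLBACK)
    return FALLBACK


print(resolve_lang_code("en-GB"))                     # b
print(resolve_lang_code("de-DE"))                     # a
print(resolve_lang_code("en-US", voice="bm_george"))  # b
```

Because the full tag is tried first, an alias for "en" in this sketch only affects tags that have no entry of their own (such as en-au); en-us and en-gb keep their table entries unless they are aliased explicitly.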

Override the language map

{
  "tts": {
    "module": "ovos-tts-plugin-kokoro",
    "ovos-tts-plugin-kokoro": {
      "voice": "af_bella",
      "speed": 1.0,
      "language_aliases": {
        "en": "b"
      },
      "preload_languages": ["en", "fr"]
    }
  }
}

| Key               | Type        | Default  | Description                                                                    |
|-------------------|-------------|----------|--------------------------------------------------------------------------------|
| voice             | str         | af_bella | Any built-in voice id (see table above).                                       |
| speed             | float       | 1.0      | Playback speed multiplier passed to KPipeline.                                 |
| sample_rate       | int         | 16000    | Output sample rate in Hz. Kokoro's native rate is 24000; the plugin resamples. |
| device            | str or null | "cpu"    | Torch device: "cpu", "cuda", "mps", or null to let Kokoro auto-select.         |
| language_aliases  | dict        | {}       | Override or extend the BCP-47 → Kokoro code map.                               |
| preload_languages | list[str]   | []       | BCP-47 codes to load eagerly during plugin init instead of lazy-loading.       |
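The defaults in the table can be mirrored in a small dataclass, which is handy for seeing what a partial config block resolves to. This is only a sketch: `KokoroConfigSketch` is a hypothetical name, and silently ignoring unrecognised keys is an assumption, not documented plugin behaviour.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class KokoroConfigSketch:
    # Defaults copied from the table above.
    voice: str = "af_bella"
    speed: float = 1.0
    sample_rate: int = 16000
    device: Optional[str] = "cpu"  # None (JSON null) lets Kokoro auto-select
    language_aliases: dict = field(default_factory=dict)
    preload_languages: list = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw):
        """Build a config from a parsed JSON block, keeping only known keys."""
        known = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in raw.items() if key in known})


cfg = KokoroConfigSketch.from_dict({"voice": "bf_emma", "device": None})
print(cfg.sample_rate, cfg.device)  # 16000 None
```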

Memory note: Each loaded language pipeline holds the 82M parameter model + a g2p stack. The plugin caches one pipeline per (language, device) pair, so leaving preload_languages empty and letting the cache warm on demand keeps the resident set small.
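The caching behaviour described in the memory note can be sketched with a dictionary keyed by (language, device). The class name `PipelineCacheSketch` and the `factory` callable are hypothetical; in the real plugin the factory would build a Kokoro pipeline, which is replaced here by a stub so the example runs without the model.

```python
class PipelineCacheSketch:
    """Lazily builds and caches one pipeline per (lang_code, device) pair."""

    def __init__(self, factory):
        self._factory = factory  # callable(lang_code, device) -> pipeline
        self._pipelines = {}

    def get(self, lang_code, device):
        key = (lang_code, device)
        if key not in self._pipelines:
            # First request for this pair: pay the load cost once.
            self._pipelines[key] = self._factory(lang_code, device)
        return self._pipelines[key]

    def preload(self, lang_codes, device):
        # Eager loading: trades memory up front for no first-use delay.
        for code in lang_codes:
            self.get(code, device)

    def __len__(self):
        return len(self._pipelines)


loads = []


def fake_factory(lang_code, device):
    # Stand-in for the expensive model + G2P load.
    loads.append((lang_code, device))
    return object()


cache = PipelineCacheSketch(fake_factory)
first = cache.get("a", "cpu")
second = cache.get("a", "cpu")
cache.get("b", "cpu")
print(first is second, len(cache), loads)
# True 2 [('a', 'cpu'), ('b', 'cpu')]
```

With lazy loading, only the languages that are actually spoken ever occupy memory; every entry added through preloading stays resident whether or not it is used.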

Apple Silicon note: Despite MPS being available on M-series Macs, CPU is the fastest device for Kokoro on Apple Silicon. The vocoder leans heavily on torch.stft/istft, which are weak spots on the Metal backend — measured RTF on an M3 Max was ~0.08 on CPU vs ~0.40 on MPS. The default of "cpu" is intentional; only set device to "cuda" if you actually have a discrete NVIDIA GPU.
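To put the RTF figures in perspective, here is a short worked calculation. It assumes the usual definition of real-time factor, synthesis time divided by the duration of the audio produced (lower is faster); the README does not spell out how RTF was measured, and the function names are hypothetical.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesising / duration of the resulting audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def synthesis_time(rtf, audio_seconds):
    """Expected wall-clock time to synthesise audio of the given length."""
    return rtf * audio_seconds


# RTF values quoted above for an M3 Max.
for device, rtf in (("cpu", 0.08), ("mps", 0.40)):
    seconds = synthesis_time(rtf, 10.0)
    print(f"{device}: 10 s of speech takes about {seconds:.1f} s")
# cpu: 10 s of speech takes about 0.8 s
# mps: 10 s of speech takes about 4.0 s
```

Under that definition, the quoted numbers put CPU at roughly five times faster than MPS for the same utterance.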

License

Apache-2.0
