Voice Blender CLI for Kokoro MLX

Run text-to-speech (TTS) with the MLX implementation of Kokoro on Apple Silicon Macs (M1-M4), which speeds up processing considerably. Use one voice or blend two voices by specifying a mixing ratio.

The app comes with a user-friendly Gradio web interface.

Prerequisites

  • Python >= 3.10
  • Hugging Face access token

Installation

1. Clone this repo

git clone https://github.com/tsmdt/kokoro-MLX-blender.git

2. Change to project folder

cd kokoro-MLX-blender

3. Create a Python virtual environment and activate it

python3 -m venv venv_kokoro
source venv_kokoro/bin/activate

4. Install kokoro-MLX-blender

pip install .

5. Download Kokoro MLX model using huggingface-cli

Run the following command from the main project folder (./kokoro-MLX-blender/):

huggingface-cli download --local-dir models/Kokoro-82M-bf16 mlx-community/Kokoro-82M-bf16

Ensure that the folder Kokoro-82M-bf16 now exists within the models folder and contains a voices subfolder with various .pt files (e.g. af_heart, af_alloy). Your directory should look like this:

kokoro-MLX-blender
├── kb_mlx/
├── models/
│   └── Kokoro-82M-bf16/
│       ├── samples/
│       ├── voices/
│       ├── .gitattributes
│       ├── config.json
│       ├── DONATE.md
│       ├── kokoro-v1_0.safetensors
│       ├── README.md
│       ├── SAMPLES.md
│       └── VOICES.md
├── .gitignore
├── LICENSE
├── README.md
...
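
The layout check can also be scripted. The following is a minimal sketch, not part of kokoro-MLX-blender: the helper name `list_voice_names` is hypothetical, and it only assumes what is described above, namely that voices are stored as .pt files in the voices subfolder of the model directory.

```python
from pathlib import Path


def list_voice_names(model_dir: str = "models/Kokoro-82M-bf16") -> list[str]:
    """Return the sorted voice names (file stems) found in <model_dir>/voices.

    Hypothetical helper for illustration; it does not call kbx.
    """
    voices_dir = Path(model_dir) / "voices"
    if not voices_dir.is_dir():
        # Model not downloaded yet, or placed somewhere else.
        return []
    return sorted(p.stem for p in voices_dir.glob("*.pt"))


if __name__ == "__main__":
    names = list_voice_names()
    if names:
        print(f"Found {len(names)} voices, e.g. {names[0]}")
    else:
        print("No voices found; check the models folder (see step 5).")
```

An empty result suggests that the download in step 5 did not land in the expected folder.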

Note: You can also use other versions of the Kokoro MLX model. Download your preferred one from Hugging Face (cf. Installation step 5), make sure that the downloaded model folder exists within the models folder of kokoro-MLX-blender, and pass its path via --model-dir.

6. Check if everything works correctly

Run the following command in your terminal to check that everything works:

kbx list

If you see a list of voice names, kokoro-MLX-blender should work. If not, please make sure that you downloaded the Kokoro model in the previous step and placed it correctly in your models folder.

Usage

$ kbx run

 Usage: kbx run [OPTIONS]

 Run TTS with KokoroMLX for M1-M4. Use one voice or blend two voices.

╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --text        -t                   TEXT       Input text(s) as string, single .txt or directory path [default: None] [required]   │
│    --voice1      -v1                  TEXT       Name of the first voice (without .pt) [default: af_heart]                           │
│    --voice2      -v2                  TEXT       Name of second voice (without .pt); if omitted, use only voice1 [default: None]     │
│    --mix-ratio   -m                   FLOAT      Blend weight for voice1 and voice2 (0.5 = 50% each) [default: 0.5]                  │
│    --speed       -s                   FLOAT      Speed multiplier (1.5 = 50% faster, 0.5 = 50% slower) [default: 1]                  │
│    --model-dir   -md                  DIRECTORY  Path to the local Kokoro model directory [default: ./models/Kokoro-82M-bf16]        │
│    --output-dir  -o                   TEXT       Directory where output audio file will be saved [default: ./output]                 │
│    --verbose          --no-verbose               Enable verbose output [default: verbose]                                            │
│    --help                                        Show this message and exit.                                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Examples

Run TTS with two blended voices (60% voice1 and 40% voice2)

kbx run -t "This is a test in blending the male American voice of Eric with the female American voice of Heart." -v1 am_eric -v2 af_heart -m 0.6
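
To make the meaning of the mix ratio concrete, here is a small sketch of a weighted average in which the ratio is the weight of voice1 (so 0.6 means 60% voice1 and 40% voice2, as in the example above). This is an assumption for illustration: the README does not describe how kbx blends voices internally, the function name `blend_voices` is hypothetical, and plain Python lists stand in for the real voice data.

```python
def blend_voices(v1: list[float], v2: list[float], mix_ratio: float = 0.5) -> list[float]:
    """Linearly blend two equally sized vectors.

    mix_ratio is the weight of v1; v2 receives (1 - mix_ratio).
    Assumed blending scheme, not confirmed by the kbx source.
    """
    if not 0.0 <= mix_ratio <= 1.0:
        raise ValueError("mix_ratio must be between 0 and 1")
    if len(v1) != len(v2):
        raise ValueError("both voices must have the same length")
    return [mix_ratio * a + (1.0 - mix_ratio) * b for a, b in zip(v1, v2)]


if __name__ == "__main__":
    # Toy vectors: the result leans towards the first voice.
    print(blend_voices([1.0, 0.0], [0.0, 1.0], mix_ratio=0.6))
```

Under this reading, a ratio of 1.0 reproduces voice1 and a ratio of 0.0 reproduces voice2.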

Gradio App

Launch the Gradio web app like this:

kbx app

Acknowledgment

OpenAI's o4-mini-high was used for creating the CLI app.

Citation

@software{kokoro_mlx_blender,
  author       = {Thomas Schmidt},
  title        = {Voice Blender CLI for Kokoro MLX},
  year         = {2025},
  url          = {https://github.com/tsmdt/kokoro-MLX-blender},
  note         = {Accessed: 2025-05-29}
}
