Interactive CLI for kenkui — convert ebooks to audiobooks locally

kentui

Freaky fast audiobook generation from ebooks. No GPU. No nonsense.

kentui is the interactive CLI for kenkui — an ebook-to-audiobook converter powered by Kyutai's pocket-tts, running entirely on CPU.

kentui handles the interactive parts: configuration wizard, voice management, chapter selection, and progress display. The actual conversion engine is kenkui, which kentui depends on.

✨ Features

Freaky fast audiobook generation
No GPU needed, 100% CPU
Super high-quality text-to-speech
Interactive hub with live status panel and Escape to go back
Multi-voice narration — different voices for different characters, powered by an LLM
Chapter-voice mode — assign a distinct voice to each chapter
Voice pool template — persistent global defaults for automatic voice assignment
Credits chapter — synthesized audio appended to every m4b
Three tiers of voices: compiled, built-in, and custom
Flexible chapter selection with presets and manual override
Broadcast-quality audio post-processing chain
Supports EPUB, MOBI/AZW, and FB2

🚀 Quick Start

Requirements

Python 3.12+

Install

pip install kentui

Or with uv / pipx:

uv tool install kentui
pipx install kentui

Compiled voices (~440 MB) are downloaded automatically on first run. To download them ahead of time:

kentui voices download

Run

kentui book.epub

That's it. An interactive wizard walks you through the setup, then kentui runs the job and shows a live progress bar. You'll get a book.m4b alongside your ebook when it's done.

📚 Usage

Interactive wizard (default)

kentui book.epub

Opens a configuration hub showing a live status panel of your current settings, then a menu:

┌─ Current Settings ───────────────────────────────────────────┐
│  Mode:          Multi-voice                                   │
│  NLP:           Anthropic · claude-haiku-4-5                 │
│  TTS Provider:  pocket-tts · local                           │
│  Narrator:      sarah                                         │
│  Chapters:      content-only (42 selected)                   │
│  Quality:       temp 0.8 · 30 LSD steps · 96k               │
└───────────────────────────────────────────────────────────────┘

  > Submit Job
    Narrator Voice →
    Chapters →
    Narration Mode →
    Series →
    Advanced Options →
    Cancel

Press Escape at any step to go back. All settings persist to ~/.config/kenkui/last_job_profile.toml and pre-load on the next run.

Headless mode

Pass a config file with -c to skip the wizard entirely:

kentui book.epub -c my-config.toml

Exits 0 on success, 1 on failure.
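Because headless mode reports its result through the exit code, it can be driven from a script. The following is a minimal sketch, assuming `kentui` is on your PATH. The helper names `build_command`, `run_kentui`, and `convert_all` are illustrative and not part of kentui; the `runner` parameter exists only so the loop can be exercised without launching a real job.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable


def build_command(book: Path, config: str) -> list[str]:
    """Argument list for one headless run: kentui <book> -c <config>."""
    return ["kentui", str(book), "-c", config]


def run_kentui(cmd: list[str]) -> int:
    """Run the command and return its exit code (0 = success, 1 = failure)."""
    return subprocess.run(cmd).returncode


def convert_all(
    books: Iterable[Path],
    config: str,
    runner: Callable[[list[str]], int] = run_kentui,
) -> list[Path]:
    """Convert each book in turn and return the ones whose run failed."""
    failed = []
    for book in books:
        if runner(build_command(book, config)) != 0:
            failed.append(book)
    return failed
```

For example, `convert_all(sorted(Path(".").glob("*.epub")), "my-config.toml")` would convert every EPUB in the current directory and return the list of books that exited non-zero.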

`kentui add`

# Interactive wizard
kentui add book.epub

# Headless
kentui add book.epub -c my-config.toml

Pipeline step commands

kentui parse book.epub       # Stage 1-2 NLP: entity scan + character clustering
kentui attribute book.epub   # Stage 3-4 NLP: speaker attribution
kentui generate book.epub    # TTS + stitch (requires prior NLP cache)

🎙️ Narration Modes

Single voice

The default. One voice narrates everything.

Multi-voice (character narration)

kenkui uses an NLP pipeline to identify characters and assigns each a distinct voice. The narrator gets its own voice too.

Two NLP backends are available: Ollama (local, default) and cloud providers (Anthropic, OpenAI, Google).

Ollama (default)

Requirements:

Ollama running locally (ollama serve)
NLP model pulled (default: llama3.2) — ollama pull llama3.2
spaCy model — downloaded automatically if missing

Cloud providers (Anthropic, OpenAI, Google)

Run kentui config and answer yes to "Configure a cloud NLP provider API key?" to set up credentials.

Default models:

Provider	Default model
`anthropic`	`claude-sonnet-4-6`
`openai`	`gpt-4o`
`google`	`gemini/gemini-2.0-flash`

How voice assignment works:

After the scan completes, voices are assigned using a three-tier priority system:

Series record — named character → pinned voice (highest priority)
Voice pool template — role + gender + rank → voice
Round-robin pool — any remaining characters
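The priority order above can be sketched in a few lines. This is an illustration of the lookup order only, not kenkui's implementation: the `Character` class, the dictionary shapes, and the `assign_voices` function are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Character:
    name: str
    role: str    # e.g. "protagonist", "supporting", "minor"
    gender: str  # e.g. "male", "female"
    rank: int    # 1 = most prominent within its role and gender


def assign_voices(characters, series_record, template, fallback_pool):
    """Resolve one voice per character using the three tiers, highest first.

    series_record: {character name: voice}            (tier 1, hypothetical shape)
    template:      {(role, gender, rank): voice}      (tier 2, hypothetical shape)
    fallback_pool: list of voices handed out in turn  (tier 3)
    """
    fallback = cycle(fallback_pool)
    assigned = {}
    for ch in characters:
        key = (ch.role, ch.gender, ch.rank)
        if ch.name in series_record:        # tier 1: pinned by name
            assigned[ch.name] = series_record[ch.name]
        elif key in template:               # tier 2: role + gender + rank
            assigned[ch.name] = template[key]
        else:                               # tier 3: round-robin
            if not fallback_pool:
                raise ValueError("no voices left for round-robin assignment")
            assigned[ch.name] = next(fallback)
    return assigned
```

Note that a series record entry wins even when the template also has a match for the same character, which is what makes a pinned voice stable across books in a series.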

Chapter-voice mode

Assign a distinct voice to each chapter. The wizard presents each chapter title and lets you pick a voice.

🗂️ Voice Pool Template

The voice pool template (~/.config/kenkui/voice_pool.toml) pre-assigns voices by character role, gender, and rank. It applies automatically to every multi-voice job.

[protagonist.male]
1 = "david"
2 = "james"
pool = ["oliver", "ethan"]

[protagonist.female]
1 = "sarah"
pool = ["emma", "claire"]

[supporting.male]
pool = ["oliver", "ethan", "marcus"]

[minor]
pool = []  # fallback: any non-excluded voice

🎙️ Voice System

Voices come in three tiers:

Tier	Source	Auth required?
Compiled	Downloaded from HuggingFace on first run	No
Built-in	8 pocket-tts defaults	No
Custom	`.wav` files (user-provided or fetched)	Yes (HuggingFace)

Built-in voices:

alba, marius, javert, jean, fantine, cosette, eponine, azelma

Voice manager

kentui voices

Launches an interactive voice manager: browse, audition, manage the exclusion pool, and look up the character cast for a completed multi-voice book.

Voice commands

# List voices (with optional filters)
kentui voices list
kentui voices list --gender Female
kentui voices list --accent Scottish
kentui voices list --source compiled

# Audition a voice
kentui voices audition <voice>
kentui voices audition <voice> --text "Your preview text here."

# Download compiled voices
kentui voices download
kentui voices download --force

# Fetch custom voices from HuggingFace
kentui voices fetch --repo user/repo-name

# Manage auto-assignment pool
kentui voices exclude <voice>
kentui voices include <voice>

# Look up a book's character cast
kentui voices cast <title>

⚙️ Configuration

# Create or edit the default config
kentui config

# Create a named config profile
kentui config fast-mode

# Use a named config
kentui book.epub -c fast-mode

Key settings

Key	Default	Description
`workers`	`cpu_count - 2`	Parallel TTS worker processes
`m4b_bitrate`	`96k`	Output audio bitrate
`temp`	`0.7`	Sampling temperature
`lsd_decode_steps`	`1`	LSD decode steps (higher = better quality, slower)
`default_voice`	`alba`	Fallback voice
`default_chapter_preset`	`content-only`	Default chapter filter preset
`pause_line_ms`	`800`	Pause between lines (ms)
`pause_chapter_ms`	`2000`	Pause between chapters (ms)
`pause_scene_break_ms`	`4000`	Pause at scene breaks (ms)
`nlp_provider`	`ollama`	NLP backend
`nlp_model`	`llama3.2`	Model for speaker attribution
`credits_enabled`	`true`	Append synthesized credits audio

📖 Chapter Selection

Preset	Description
`content-only`	Body chapters only (default)
`chapters-only`	Titled chapters only
`with-parts`	Chapters and part headings
`all`	Every item in the ebook
`none`	Skip everything

After selecting a preset, the wizard shows a checkbox list of all chapters with the preset's defaults pre-selected.
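The preset-then-override flow can be pictured as a default selection followed by individual toggles. The sketch below models only that flow. How kenkui decides which items belong to each preset is not described here, so every `Item` simply carries a hypothetical set of presets that include it; `preselect` and `toggle` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    presets: frozenset[str]  # hypothetical: presets that select this item by default


def preselect(items: list[Item], preset: str) -> set[int]:
    """Indices checked by default when the wizard opens the checkbox list."""
    if preset == "all":
        return set(range(len(items)))
    if preset == "none":
        return set()
    return {i for i, item in enumerate(items) if preset in item.presets}


def toggle(selection: set[int], index: int) -> set[int]:
    """Manual override: flip one checkbox and return the new selection."""
    return selection ^ {index}
```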

🔊 Audio Post-Processing

kenkui applies a broadcast-quality effects chain: noise reduction → high-pass filter → low shelf EQ → presence boost → de-esser → compressor → limiter → autogain. All parameters are configurable via kentui config.

FAQ

Do I need a GPU? No. kenkui is 100% CPU-based.

What ebook formats does it support? EPUB, MOBI/AZW/AZW3/AZW4, and FB2.

What output format does it use? M4B, with chapters, metadata, and embedded covers.

Do I need Ollama for multi-voice? No. You can use Ollama (local) or Anthropic, OpenAI, or Google. Run kentui config to set up a cloud provider.

Does it upload my books anywhere? With the default Ollama backend: no. With a cloud NLP provider, the book text is sent to that provider's API for the character scan. Nothing else is uploaded.

🙏 Special Thanks

Thanks to Project Gutenberg for providing some of the public-domain books included with kenkui.

Voice Dataset Credits

kenkui's compiled voices are derived from two publicly available speech corpora.

CSTR VCTK Corpus

Veaux, Christophe; Yamagishi, Junichi; MacDonald, Kirsten. (2019). CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit. University of Edinburgh. The Centre for Speech Technology Research (CSTR).

Licensed under Creative Commons Attribution 4.0 (CC BY 4.0). Commercial use is permitted with attribution.

EARS Dataset

Licensed under Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0).

Note: Compiled voices sourced from EARS (identifiable by EARS in the voice name via kentui voices list) may not be used for commercial purposes. If you are building a commercial product with kenkui, use only VCTK-sourced or built-in voices.
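If you keep a list of voice names for a commercial project, the rule in the note can be applied mechanically. This is a minimal sketch: the `commercial_safe` helper and the sample names in the usage note are made up for illustration, and it assumes the marker appears in the name exactly as the uppercase string "EARS".

```python
def commercial_safe(voice_names: list[str]) -> list[str]:
    """Keep only voices whose name does not contain the marker 'EARS'.

    The match is case-sensitive on purpose: a case-insensitive test would
    also drop unrelated names that merely contain the letters 'ears'.
    """
    return [name for name in voice_names if "EARS" not in name]
```

For instance, given the hypothetical names `["alba", "EARS_p001", "VCTK_p225"]`, the function keeps `alba` and `VCTK_p225`.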

Version 0.1.0, released May 12, 2026
