Book to Audiobook

Convert ebooks (EPUB, MOBI, AZW3, PDF, TXT) to audiobooks (M4B / MP3) with chapter markers.

Prerequisites

Python >= 3.11
FFmpeg
Calibre (optional, for MOBI/AZW3 — note: Calibre is GPLv3, separate from this project's MIT license)
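A minimal sketch for checking these prerequisites before installing. It assumes the executables are named `ffmpeg` (FFmpeg's standard binary) and `ebook-convert` (Calibre's command-line converter); the project itself may locate them differently, and `check_prerequisites` is a hypothetical helper, not part of the package.

```python
import shutil
import sys

# Executable names are assumptions: `ffmpeg` is FFmpeg's standard binary and
# `ebook-convert` is Calibre's command-line converter.
REQUIRED_TOOLS = ["ffmpeg"]
OPTIONAL_TOOLS = ["ebook-convert"]


def check_prerequisites(which=shutil.which, version=sys.version_info):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if tuple(version[:2]) < (3, 11):
        problems.append("Python >= 3.11 is required")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    for tool in OPTIONAL_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH (optional, needed for MOBI/AZW3)")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and `PATH`; the `which` and `version` parameters exist only so the check can be exercised without touching the real system.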

Install

Create an isolated environment first to avoid dependency conflicts:

venv:

python3 -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

conda:

conda create -n book2audio python=3.12 -y
conda activate book2audio

Kokoro (the default provider on Intel Mac) needs Python ≤ 3.12 because onnxruntime has no macOS x86 wheel for Python 3.13.

Then choose one:

From PyPI:

pip install book-to-audiobook

From source:

git clone https://github.com/hugcosmos/book-to-audiobook.git
cd book-to-audiobook
pip install -e .

The default install pulls in a working local TTS engine: Qwen3 MLX on Apple Silicon, Kokoro on Intel Mac (82M-parameter ONNX/CPU model, 100 Chinese voices). No extra step is needed; the provider is selected automatically. Of the cloud providers, Edge TTS needs no configuration; ElevenLabs, Baidu, and iFlytek only need API keys.
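The automatic selection described above (Qwen3 MLX on Apple Silicon, Kokoro on Intel Mac) could be sketched roughly as follows. This is an illustration, not the project's actual logic: `default_provider` is a hypothetical helper, the identifier `kokoro` is an assumption (`qwen3_mlx` appears in the CLI examples), and platforms the README does not mention simply return `None`.

```python
import platform


def default_provider(system=None, machine=None):
    """Guess the default local provider from the platform.

    Returns None for platforms this README does not describe.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    if system != "Darwin":
        return None
    if machine == "arm64":
        return "qwen3_mlx"  # Apple Silicon
    if machine == "x86_64":
        return "kokoro"  # Intel Mac
    return None
```

Note that a Python interpreter running under Rosetta on Apple Silicon reports `x86_64`, so a check like this would treat it as an Intel Mac.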

Optional local providers

Supertonic (ONNX, 31 languages): pip install "book-to-audiobook[supertonic]"
Kokoro on Apple Silicon: pip install "book-to-audiobook[kokoro]"

Quick Start

# Web UI
book2audio serve                          # http://localhost:8000

# CLI
book2audio chapters book.epub            # preview chapters
book2audio convert book.epub -c 1-10     # convert chapters 1-10
book2audio doc                            # full command reference

`serve` vs `start.sh`

book2audio serve runs the server in the foreground with hot-reload — best for everyday use and development. To run it in the background (survives logout, logs to .server.log), use the helper scripts from the source checkout:

./start.sh     # background, reload off by default (B2A_RELOAD=1 to re-enable)
./stop.sh      # stop the background server

start.sh automatically finds a Python that has the project dependencies installed (it checks your active venv/conda environment, then any book2audio* conda env) — the default python3 usually won't have them. To override, set PYTHON=/path/to/venv/bin/python.
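The lookup order can be pictured with a short sketch. It covers only part of what is described above: the `PYTHON` override, then the active venv, then the active conda environment. It does not search for `book2audio*` conda environments or verify that the dependencies are installed. `pick_python` is a hypothetical name, the POSIX `bin/python` layout is assumed, and the real script's checks may differ.

```python
import os
from pathlib import Path


def pick_python(env=None):
    """Return the interpreter path to try first, or None if nothing is set.

    Order: explicit PYTHON override, then the active venv, then the active
    conda environment. VIRTUAL_ENV and CONDA_PREFIX are the variables those
    tools export on activation.
    """
    env = os.environ if env is None else env
    if env.get("PYTHON"):
        return env["PYTHON"]
    for var in ("VIRTUAL_ENV", "CONDA_PREFIX"):
        prefix = env.get(var)
        if prefix:
            return str(Path(prefix) / "bin" / "python")
    return None
```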

CLI Reference

Run book2audio doc for detailed help, or book2audio <command> --help.

Command	Description
`convert`	Convert ebook to audiobook
`chapters`	List, view (`-t N`), and edit (`-e N`) chapters
`voice`	Manage voices (list/add/delete)
`config`	Manage configuration
`library`	Manage audiobook library
`serve`	Start web server
`doc`	Show full documentation

Convert

book2audio convert book.epub
book2audio convert book.epub -c 1-10 -p edge -v zh-CN-XiaoyiNeural -s 1.2
book2audio convert --book-id a74e947e332e -c 11-20
book2audio convert book.pdf -p qwen3_mlx -l en-US
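
The `-c` values in these examples are inclusive chapter ranges. As a rough illustration of how such a spec expands, here is a sketch that handles only the two forms seen in this README, a single number and `N-M`. `parse_chapter_range` is a hypothetical helper; whether the real CLI accepts other forms (for example comma-separated lists) is not stated here.

```python
def parse_chapter_range(spec: str) -> list[int]:
    """Expand "N" or "N-M" (1-based, inclusive) into a list of chapter numbers."""
    start_text, sep, end_text = spec.strip().partition("-")
    start = int(start_text)
    end = int(end_text) if sep else start
    if start < 1 or end < start:
        raise ValueError(f"invalid chapter range: {spec!r}")
    return list(range(start, end + 1))
```

Under this sketch, `parse_chapter_range("11-20")` yields the ten chapters 11 through 20, and a reversed range such as `"5-2"` raises `ValueError`.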

Chapters

book2audio chapters book.epub                    # list chapters
book2audio chapters book.epub -t 3               # view chapter 3 text
book2audio chapters book.epub -t 3 --head 20     # first 20 lines only
book2audio chapters book.epub -e 3               # edit chapter 3 in $EDITOR
book2audio chapters --book-id abc123 -e 3-5      # edit chapters 3-5

Edited text is saved separately; the original is never modified. Conversion uses the edited text automatically, and edited chapters are marked with an [edited] tag.
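
The "saved separately" behaviour is essentially an overlay: edits live in their own files and are preferred at read time, while originals stay untouched. A minimal sketch of that idea follows. The directory layout (`original/`, `edits/`, `chapter_N.txt`) and both function names are invented for illustration and are not the project's actual storage format.

```python
import tempfile
from pathlib import Path


def save_edit(book_dir: Path, chapter: int, text: str) -> Path:
    """Write edited text to a separate file, leaving the original in place."""
    edit_path = book_dir / "edits" / f"chapter_{chapter}.txt"
    edit_path.parent.mkdir(parents=True, exist_ok=True)
    edit_path.write_text(text, encoding="utf-8")
    return edit_path


def load_chapter(book_dir: Path, chapter: int) -> tuple[str, bool]:
    """Return (text, edited), preferring the edited copy when one exists."""
    edit_path = book_dir / "edits" / f"chapter_{chapter}.txt"
    if edit_path.exists():
        return edit_path.read_text(encoding="utf-8"), True
    original = book_dir / "original" / f"chapter_{chapter}.txt"
    return original.read_text(encoding="utf-8"), False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        book = Path(tmp)
        (book / "original").mkdir()
        (book / "original" / "chapter_3.txt").write_text("Original text", encoding="utf-8")
        save_edit(book, 3, "Corrected text")
        print(load_chapter(book, 3))
```

The boolean returned by `load_chapter` is what a listing could use to decide whether to show the [edited] tag.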

Web UI

Open http://localhost:8000. Drag & drop ebooks, select chapters, configure TTS, convert. Chapter text can be edited inline via the edit button on each chapter.

CLI and Web share the same library — books, edits, and conversion records persist across both.

TTS Providers

Cloud

Provider	Setup	Cost
Edge TTS	No config needed	Free
ElevenLabs	`B2A_ELEVENLABS__API_KEY`	Paid
Baidu TTS	`B2A_BAIDU_TTS__API_KEY` + `B2A_BAIDU_TTS__SECRET_KEY`	Paid
iFlytek TTS	`B2A_IFLYTEK_TTS__APP_ID` + `B2A_IFLYTEK_TTS__API_KEY` + `B2A_IFLYTEK_TTS__API_SECRET`	Paid
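
Before starting a long conversion it can help to confirm the credentials for the chosen provider are actually set. The sketch below copies the variable names from the table above; the dictionary keys other than `edge` are illustrative labels rather than confirmed provider identifiers, and `missing_credentials` is a hypothetical helper.

```python
import os

# Required environment variables per cloud provider, copied from the table above.
# Keys other than "edge" are illustrative labels, not confirmed CLI names.
REQUIRED_ENV = {
    "edge": [],
    "elevenlabs": ["B2A_ELEVENLABS__API_KEY"],
    "baidu": ["B2A_BAIDU_TTS__API_KEY", "B2A_BAIDU_TTS__SECRET_KEY"],
    "iflytek": [
        "B2A_IFLYTEK_TTS__APP_ID",
        "B2A_IFLYTEK_TTS__API_KEY",
        "B2A_IFLYTEK_TTS__API_SECRET",
    ],
}


def missing_credentials(provider, env=None):
    """Return the variables that are unset or empty for the given provider."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]
```

An empty list means the provider has everything the table asks for; Edge TTS always returns an empty list because it needs no configuration.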

Local — Qwen3 TTS via MLX (Apple Silicon)

Runs on-device with no API key. Requires an Apple Silicon Mac (M1 or later). To speed up the model download from Hugging Face, enable hf-transfer:

pip install hf-transfer
export HF_HUB_ENABLE_HF_TRANSFER=1
# China users: export HF_ENDPOINT=https://hf-mirror.com

Model	Size	Quality	Memory
`0.6B-CustomVoice-8bit`	Default	Good	~0.6GB
`1.7B-CustomVoice-4bit`	Larger	Great	~0.85GB
`1.7B-CustomVoice-8bit`	Larger	Great	~1.7GB

Set the model on the Settings page or with book2audio config set qwen3_mlx.model_name <model>.

Local — Kokoro (ONNX / CPU, via kokoro-onnx)

Default on Intel Mac. Kokoro-82M is a compact 82M-parameter TTS model running on CPU via ONNX Runtime. The v1.1-zh model supports 100 Chinese voices + English. On Apple Silicon, install via pip install "book-to-audiobook[kokoro]".

Model files (~380MB) are downloaded automatically on first use to ~/.cache/book2audio/kokoro (override with B2A_KOKORO__MODEL_DIR). 12 voices are curated by default (6 female, 6 male); all 100 Chinese voices are available via the Voice Manager.
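
The model directory resolution described above (default path, overridable through `B2A_KOKORO__MODEL_DIR`) amounts to the following. `kokoro_model_dir` is a hypothetical helper written for illustration; the project most likely resolves this through its settings layer instead.

```python
import os
from pathlib import Path


def kokoro_model_dir(env=None) -> Path:
    """Return the override directory if set, else ~/.cache/book2audio/kokoro."""
    env = os.environ if env is None else env
    override = env.get("B2A_KOKORO__MODEL_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".cache" / "book2audio" / "kokoro"
```

Pointing the override at a shared or external disk is a simple way to keep the ~380MB download out of the home directory.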

Local — Supertonic (ONNX)

On-device, 31 languages, no API key. Works on any platform (CPU/GPU).

pip install "book-to-audiobook[supertonic]"
# or directly: pip install supertonic

10 built-in voices (5 male, 5 female). Supports English, Japanese, Korean, Arabic, German, French, Spanish, Russian, and 23 more languages.

Supported Languages

Availability depends on provider:

Language	Edge	Qwen3 MLX	ElevenLabs	Supertonic	Kokoro	Baidu	iFlytek
Chinese (zh-CN)	✓	✓	✓	—	✓	✓	✓
English (en-US)	✓	✓	✓	✓	✓	—	—
Japanese	✓	✓	✓	✓	—	—	—
Korean	✓	✓	✓	✓	—	—	—
French	✓	✓	✓	✓	—	—	—
German	✓	✓	✓	✓	—	—	—
Spanish	✓	✓	✓	✓	—	—	—
Russian	✓	✓	✓	✓	—	—	—
Portuguese	—	✓	✓	✓	—	—	—
Italian	—	✓	✓	✓	—	—	—
+ 22 more	—	—	—	✓	—	—	—
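
The table can also be read the other way round: given a language, which providers can speak it? The sketch below encodes the named rows as flag strings and looks them up. The data is transcribed from the table above (the "+ 22 more" Supertonic-only row is omitted), and `providers_for` is a hypothetical helper, not a project API.

```python
PROVIDERS = ["Edge", "Qwen3 MLX", "ElevenLabs", "Supertonic", "Kokoro", "Baidu", "iFlytek"]

# One row per language from the table above; "1" marks support, in PROVIDERS order.
SUPPORT = {
    "Chinese": "1110111",
    "English": "1111100",
    "Japanese": "1111000",
    "Korean": "1111000",
    "French": "1111000",
    "German": "1111000",
    "Spanish": "1111000",
    "Russian": "1111000",
    "Portuguese": "0111000",
    "Italian": "0111000",
}


def providers_for(language: str) -> list[str]:
    """Return the providers marked as supporting the given language."""
    flags = SUPPORT[language]
    return [provider for provider, flag in zip(PROVIDERS, flags) if flag == "1"]
```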

Configuration

Configure via environment variables with the B2A_ prefix, or with book2audio config set:

book2audio config show
book2audio config set tts.provider edge
book2audio config set qwen3_mlx.speed 1.2

Variable	Default	Description
`B2A_HOST`	`0.0.0.0`	Server bind address
`B2A_PORT`	`8000`	Server port
`B2A_UPLOAD_DIR`	`uploads`	Ebook storage
`B2A_OUTPUT_DIR`	`output`	Audio output
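
Variable names elsewhere in this README, such as `B2A_KOKORO__MODEL_DIR`, suggest that a double underscore separates a settings section from its key. The following stdlib-only sketch shows how such names could map onto nested settings. The `__` convention is inferred from those names, `settings_from_env` is a hypothetical helper, and the project itself uses pydantic-settings, whose actual parsing and type coercion may differ.

```python
import os

PREFIX = "B2A_"
NESTED_DELIMITER = "__"  # inferred from names such as B2A_KOKORO__MODEL_DIR


def settings_from_env(env=None) -> dict:
    """Collect B2A_* variables into a nested dict of lower-cased keys."""
    env = os.environ if env is None else env
    settings: dict = {}
    for name, value in env.items():
        if not name.startswith(PREFIX):
            continue
        *sections, key = name[len(PREFIX):].lower().split(NESTED_DELIMITER)
        node = settings
        for section in sections:
            node = node.setdefault(section, {})
        node[key] = value
    return settings
```

Values stay as strings in this sketch; a real settings layer would convert `B2A_PORT` to an integer.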

Project Structure

app/               # FastAPI web app (routes, templates, static)
cli/               # Click CLI commands
core/              # converter, models, parsers, TTS providers, audio builder
config/            # Settings (pydantic-settings) + user_settings.json
uploads/           # Uploaded ebooks + chapter edits + meta.json
output/            # Generated audiobook files

Disclaimer

Edge TTS: This project includes edge-tts as one TTS provider, which connects to Microsoft Edge's online text-to-speech service. This is not an official Microsoft API and may violate Microsoft's Terms of Service.

Users can choose alternative providers (Kokoro, Qwen3 MLX, Supertonic, ElevenLabs, Baidu, iFlytek) to avoid Edge TTS. Use at your own risk — the authors are not responsible for any violations of third-party terms of service.

License

MIT — see LICENSE.

Release history

0.2.0 (this version): Jun 24, 2026
0.1.23: May 24, 2026
0.1.17: May 22, 2026
