Convert ebooks (EPUB, PDF, TXT) to audiobooks (M4B/MP3) with chapter markers

These details have not been verified by PyPI

Project links

Project description

Book to Audiobook

Convert ebooks (EPUB, MOBI, AZW3, PDF, TXT) to audiobooks (M4B / MP3) with chapter markers.

Prerequisites

Python >= 3.11
FFmpeg (required for audio processing)
Calibre (optional, for MOBI/AZW3 support via ebook-convert)

Install

pip install book-to-audiobook

Or install from source:

git clone https://github.com/hugcosmos/book-to-audiobook.git
cd book-to-audiobook
pip install -e .

Usage

Web Interface

# Start server
./start.sh

# Stop server
./stop.sh

# Or run directly
python -m app.main

Open http://localhost:8000 in your browser.

Command Line Interface (CLI)

After installation, the book2audio command is available system-wide:

# Verify installation
book2audio --help

CLI Commands

Command	Description
`convert`	Convert ebook to audiobook
`chapters`	Preview book chapters (shows char count and estimated time)
`voice`	Manage voices (list/add/delete)
`config`	Manage configuration (show/get/set/reset)
`library`	Manage audiobook library (`list`, `delete`)
`serve`	Start web server

Convert Options

Option	Description
`-c, --chapters`	Chapter range (e.g., `1-5,7,10-` for chapters 1-5, 7, 10+)
`-p, --provider`	TTS provider: `edge-tts`, `elevenlabs`, `baidu-tts`, `iflytek-tts`, `qwen3_mlx`
`-v, --voice`	Voice name
`-l, --language`	Language code (zh-CN, en-US, ja-JP, etc.)
`-s, --speed`	Speech speed (0.5-2.0)
`--model-path`	Local model path (for qwen3_mlx provider)
`--book-id`	Convert using existing library book by ID (no INPUT_FILE needed)

CLI Examples

# Convert entire book with default settings
book2audio convert /path/to/book.pdf

# Convert specific chapters
book2audio convert book.epub -c 1-10

# Use specific provider and voice
book2audio convert book.pdf -p edge-tts -v zh-CN-XiaoyiNeural -s 1.2

# Use local qwen3_mlx model (avoid repeated downloads)
book2audio convert book.epub -p qwen3_mlx --model-path ~/.cache/huggingface/hub/models--mlx-community--Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit

# Add to existing library book (shares output with web app, preserves cover)
book2audio convert --book-id a74e947e332e -c 11-20

# Or provide file path (auto-registers if not in library)
book2audio convert new_chapters.pdf --book-id a74e947e332e -c 11-20

# Preview chapters (shows char count and estimated conversion time)
book2audio chapters my_book.pdf

# List available voices
book2audio voice list

# Add custom voice
book2audio voice add --provider elevenlabs --voice-id "xxx" --name "My Voice" --language en-US

# Configure settings
book2audio config show
book2audio config set tts.provider edge-tts
book2audio config set qwen3_mlx.model_path "/path/to/local/model"

# List library books (to get book_id)
book2audio library list

# Start web server
book2audio serve

CLI Quick Start Example

# 1. Clone and install
git clone https://github.com/hugcosmos/book-to-audiobook.git
cd book-to-audiobook
pip install -e .

# 2. Check available voices
book2audio voice list

# 3. Configure default provider (optional)
book2audio config set tts.provider edge-tts

# 4. Set local model path to avoid repeated downloads (qwen3_mlx users)
book2audio config set qwen3_mlx.model_path "~/.cache/huggingface/hub/models--mlx-community--Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit"

# 5. Preview book chapters (auto-registers book in library)
book2audio chapters my_book.pdf

# 6. Convert to audiobook
book2audio convert my_book.pdf --chapters 1-10 --speed 1.2

# 7. Find output files
# Output saved to: output/{book_id}/ (check with: book2audio library list)

CLI and Web App Integration

CLI and Web app share the same library — books, metadata, and conversion records:

chapters and convert commands auto-register books into the shared library
Configuration: config/user_settings.json
Book storage: uploads/{book_id}/ with meta.json
Output: output/{book_id}/ (per-chapter MP3s + combined M4B/MP3)
CLI conversions appear in Web, and vice versa
--book-id lets you convert without providing the file path again

Workflow

Web Interface

Upload — Drag & drop an ebook on the library page
Select chapters — Click the book card, pick chapters to convert
Configure TTS — Choose provider, language, voice, speed
Convert — Progress shows inline, no page navigation
Download — Files appear in Generated Files after completion

Command Line

Preview — book2audio chapters /path/to/book.pdf
Configure — book2audio config set tts.provider edge-tts (optional)
Convert — book2audio convert /path/to/book.pdf -c 1-10
Locate — Output files saved to output/ directory

Conversions are saved — restart the server and all books/history remain unless you delete them.

TTS Providers

Cloud Providers (require API keys)

Provider	Env Variables	Description
Edge TTS	None needed	Microsoft Edge online TTS. Free, no API key.
ElevenLabs	`B2A_ELEVENLABS__API_KEY`	High-quality multilingual TTS. Get key at elevenlabs.io.
Baidu TTS	`B2A_BAIDU_TTS__API_KEY`, `B2A_BAIDU_TTS__SECRET_KEY`	Baidu speech synthesis. Get credentials at cloud.baidu.com.
iFlytek TTS	`B2A_IFLYTEK_TTS__APP_ID`, `B2A_IFLYTEK_TTS__API_KEY`, `B2A_IFLYTEK_TTS__API_SECRET`	iFlytek speech synthesis. Get credentials at xfyun.cn.

Local Models (no API key needed)

Qwen3 TTS via MLX — Apple Silicon

Runs entirely on-device. Requires Apple Silicon Mac (M1/M2/M3/M4).

# Faster model downloads (parallel transfer)
pip install hf-transfer

# China users: use HuggingFace mirror for faster downloads
export HF_ENDPOINT=https://hf-mirror.com

# Enable parallel downloads
export HF_HUB_ENABLE_HF_TRANSFER=1

Available models (configured in Settings page or config/user_settings.json):

Model	Quantization	Quality	Speed	Memory
`0.6B-CustomVoice-4bit`	4-bit	Good	Fastest	~0.3GB
`0.6B-CustomVoice-8bit`	8-bit	Good	Fast	~0.6GB
`1.7B-CustomVoice-4bit`	4-bit	Great	Fast	~0.85GB
`1.7B-CustomVoice-8bit`	8-bit	Great	Fast	~1.7GB
`1.7B-CustomVoice-bf16`	bf16	Best	Slower	~3.4GB

Set model in config/user_settings.json:

{
  "qwen3_mlx": {
    "model_name": "mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit",
    "speed": 1.2
  }
}

Configuration

All settings use the B2A_ env prefix:

Variable	Default	Description
`B2A_HOST`	`0.0.0.0`	Server bind address
`B2A_PORT`	`8000`	Server port
`B2A_UPLOAD_DIR`	`uploads`	Uploaded ebook storage
`B2A_OUTPUT_DIR`	`output`	Generated audio output
`B2A_MAX_UPLOAD_SIZE_MB`	`500`	Max upload file size
`B2A_FFMPEG_PATH`	`ffmpeg`	Path to ffmpeg binary
`B2A_FFPROBE_PATH`	`ffprobe`	Path to ffprobe binary

Set via environment variables or .env file:

export B2A_PORT=8000
./start.sh

Supported Languages

Chinese, English (US/UK), Japanese, Korean, French, German, Spanish, Russian.

Project Structure

app/               # FastAPI web app
  main.py          # Entry point
  routes/          # HTTP routes
  templates/       # Jinja2 HTML templates
  static/          # CSS + JS
cli/               # Command Line Interface
  main.py          # CLI entry point with click commands
core/              # Core logic
  converter.py     # Conversion orchestrator + state persistence
  models.py        # Pydantic data models
  book_parser/     # EPUB, MOBI, PDF, TXT parsers
  tts_provider/    # TTS providers (Edge, Baidu, iFlytek, ElevenLabs, Qwen3 MLX)
  audio_builder/   # FFmpeg audio assembly (M4B/MP3)
  text_processor/  # Text cleaning + chunking
config/            # Settings (pydantic-settings)
uploads/           # Uploaded ebooks + meta.json state files
output/            # Generated audiobook files

Dependencies & Licensing

All dependencies use permissive licenses compatible with MIT:

License	Packages
MIT	fastapi, pydantic, pydantic-settings, beautifulsoup4, pdfplumber, pydub, sentencex, mlx-audio
BSD-3-Clause	uvicorn, jinja2, lxml, httpx, websockets, soundfile
Apache-2.0	python-multipart, aiofiles, hf-transfer
LGPL-3.0	edge-tts

EPUB parsing uses a built-in parser (zipfile + lxml) — no ebooklib dependency.

License

MIT — see LICENSE.

Disclaimer

Edge TTS: This project includes edge-tts as one TTS provider, which connects to Microsoft Edge's online text-to-speech service. This is not an official Microsoft API and may violate Microsoft's Terms of Service.

Alternative Providers: Users can choose alternative TTS providers (ElevenLabs, Baidu TTS, iFlytek TTS, or local Qwen3 MLX models) to avoid using Edge TTS. See the TTS Providers section for configuration details.

Use at your own risk: The authors are not responsible for any violations of third-party terms of service.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.5

May 13, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

book_to_audiobook-0.1.5.tar.gz (67.2 kB view details)

Uploaded May 13, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

book_to_audiobook-0.1.5-py3-none-any.whl (79.9 kB view details)

Uploaded May 13, 2026 Python 3

File details

Details for the file book_to_audiobook-0.1.5.tar.gz.

File metadata

Download URL: book_to_audiobook-0.1.5.tar.gz
Upload date: May 13, 2026
Size: 67.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for book_to_audiobook-0.1.5.tar.gz
Algorithm	Hash digest
SHA256	`9009b7d571602f6d75be7f6724e1d7f222ebf4bb439ad6f8fe2814f5ee12db9f`
MD5	`3e25d33411c143d79c6d3f8c1eb53acc`
BLAKE2b-256	`57e741033634c2c44b1d93a5d40c187cf106bd5e22f617291e34b6d0200eec6b`

See more details on using hashes here.

File details

Details for the file book_to_audiobook-0.1.5-py3-none-any.whl.

File metadata

Download URL: book_to_audiobook-0.1.5-py3-none-any.whl
Upload date: May 13, 2026
Size: 79.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for book_to_audiobook-0.1.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`58e6655cfeea4331ba1b04a574fd2cbfa05be963ddc335d86b4a16f0ca50fdc9`
MD5	`0c92bdad9c79a5e70c9e4b9dd9e03afc`
BLAKE2b-256	`ade0e3fcacc537dd0262b3576d8b755e424d1c819a31259edb85d4e5ddd4a0ac`

See more details on using hashes here.

book-to-audiobook 0.1.5

Navigation

Verified details

Maintainers

Meta

Unverified details

Project links

Meta

Classifiers

Project description

Book to Audiobook

Prerequisites

Install

Usage

Web Interface

Command Line Interface (CLI)

CLI Commands

Convert Options

CLI Examples

CLI Quick Start Example

CLI and Web App Integration

Workflow

Web Interface

Command Line

TTS Providers

Cloud Providers (require API keys)

Local Models (no API key needed)

Qwen3 TTS via MLX — Apple Silicon

Configuration

Supported Languages

Project Structure

Dependencies & Licensing

License

Disclaimer

Project details

Verified details

Maintainers

Meta

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes