Convert EPUB e-books into high-quality audiobooks using multiple Text-to-Speech providers (Azure, Doubao, Qwen)

EPUB to Speech

Convert EPUB e-books into high-quality audiobooks using multiple Text-to-Speech providers.

Features

📚 EPUB Support: Compatible with EPUB 2 and EPUB 3 formats
🎙️ Multiple TTS Providers: Supports Azure, Doubao, and Qwen TTS services
🔄 Auto-Detection: Automatically detects the configured provider when only one is set up
🌍 Multi-Language Support: Supports various languages and voices
📱 M4B Output: Generates standard M4B audiobook format with chapter navigation
🧹 Noise Filtering: Automatically removes common reading noise (table-of-contents lines, decorative separators, isolated page numbers)
🔧 CLI Interface: Easy-to-use command-line tool with progress tracking
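
To make the noise-filtering feature concrete, here is a minimal sketch of a line-level filter. The regular expressions and the function names `is_noise` and `filter_noise` are assumptions made for illustration; the rules epub2speech actually applies may differ.

```python
import re

# Hypothetical patterns, not the project's real rules.
_SEPARATOR = re.compile(r"^[\s\-\*_=~·•#]{3,}$")    # e.g. "* * *" or "-----"
_PAGE_NUMBER = re.compile(r"^\d{1,4}$")              # e.g. "42" alone on a line
_TOC_LINE = re.compile(r"^.+?(\.\s?){3,}\s*\d+$")    # e.g. "Chapter 1 ..... 7"


def is_noise(line: str) -> bool:
    """Return True if the line looks like reading noise rather than prose."""
    stripped = line.strip()
    if not stripped:
        return False  # blank lines are paragraph breaks, not noise
    return bool(
        _SEPARATOR.match(stripped)
        or _PAGE_NUMBER.match(stripped)
        or _TOC_LINE.match(stripped)
    )


def filter_noise(text: str) -> str:
    """Drop noise lines and keep everything else in order."""
    return "\n".join(line for line in text.splitlines() if not is_noise(line))
```

A stricter or looser rule set would correspond roughly to the different cleaning strictness levels, although how the tool maps those levels to rules is not documented here.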

Basic Usage

epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural

Installation

Prerequisites

Python 3.11 or higher
FFmpeg (for audio processing)
TTS provider credentials (Azure, Doubao, or Qwen)
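
If you want to verify the first two prerequisites from a script, the check below is one way to do it. It is only a sketch: `check_prerequisites` is a hypothetical helper, not part of epub2speech, and it only tests that an `ffmpeg` executable is on PATH, not that it works.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 11)) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing:", problem)
```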

Install Dependencies

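# Install the published package from PyPI
pip install epub2speech

# Or, from a source checkout, install Python dependencies with Poetry
pip install poetry
poetry install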

# Install FFmpeg
# macOS: brew install ffmpeg
# Ubuntu/Debian: sudo apt install ffmpeg
# Windows: Download from https://ffmpeg.org/download.html

Quick Start

Option 1: Using Azure TTS

Set environment variables and run:

export AZURE_SPEECH_KEY="your-subscription-key"
export AZURE_SPEECH_REGION="your-region"

epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural

Where to get credentials:

Create an Azure account at https://azure.microsoft.com
Create a Speech Service resource in Azure Portal
Get your subscription key and region from the dashboard

Available voices:

Voice list: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support?tabs=tts#voice-styles-and-roles
Voice gallery (preview): https://speech.microsoft.com/portal/voicegallery

Option 2: Using Doubao TTS

Set environment variables and run:

export DOUBAO_ACCESS_TOKEN="your-access-token"
export DOUBAO_BASE_URL="your-api-base-url"

epub2speech input.epub output.m4b --voice zh_male_lengkugege_emo_v2_mars_bigtts

Where to get credentials:

Get your Doubao access token and API base URL from the Volcengine console

Available voices: find voice IDs in the Doubao TTS documentation at https://www.volcengine.com/docs/6561/1257544

Option 3: Using Qwen TTS

Set environment variables and run:

export QWEN_ACCESS_TOKEN="your-access-token"
export QWEN_BASE_URL="your-api-base-url"

epub2speech input.epub output.m4b --provider qwen --voice Cherry

Where to get credentials:

Get your Qwen access token and API base URL from the Alibaba Cloud console

Available voices: find voice IDs in the Qwen TTS documentation at https://help.aliyun.com/zh/model-studio/qwen-tts#bac280ddf5a1u

Provider Auto-Detection

If you have configured only one provider, it will be automatically detected and used. If multiple providers are configured, specify which one to use:

# Explicitly use Azure
epub2speech input.epub output.m4b --provider azure --voice zh-CN-XiaoxiaoNeural

# Explicitly use Doubao
epub2speech input.epub output.m4b --provider doubao --voice zh_male_lengkugege_emo_v2_mars_bigtts

# Explicitly use Qwen
epub2speech input.epub output.m4b --provider qwen --voice Cherry
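
The selection rule described above can be sketched in a few lines of Python. The environment variable names come from the Quick Start section; the helpers `configured_providers` and `pick_provider`, and the choice to raise an error when several providers are configured, are assumptions for illustration and may not match what the tool does internally.

```python
import os

# Environment variables named in this README, grouped by provider.
PROVIDER_ENV = {
    "azure": ("AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION"),
    "doubao": ("DOUBAO_ACCESS_TOKEN", "DOUBAO_BASE_URL"),
    "qwen": ("QWEN_ACCESS_TOKEN", "QWEN_BASE_URL"),
}


def configured_providers(env=None):
    """List providers whose variables are all set and non-empty."""
    env = os.environ if env is None else env
    return [
        name
        for name, keys in PROVIDER_ENV.items()
        if all(env.get(key) for key in keys)
    ]


def pick_provider(explicit=None, env=None):
    """Hypothetical helper: an explicit choice wins, else require exactly one."""
    if explicit:
        return explicit
    found = configured_providers(env)
    if len(found) == 1:
        return found[0]
    if not found:
        raise RuntimeError("No TTS provider is configured")
    raise RuntimeError(
        f"Multiple providers configured ({', '.join(found)}); pass --provider"
    )
```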

Advanced Options

General Options

# Limit to first 5 chapters
epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural --max-chapters 5

# Use custom workspace directory
epub2speech input.epub output.m4b --voice zh-CN-YunxiNeural --workspace /tmp/my-workspace

# Quiet mode (no progress output)
epub2speech input.epub output.m4b --voice ja-JP-NanamiNeural --quiet

# Set maximum characters per TTS segment (default: 500)
epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural --max-tts-segment-chars 800

# Use more conservative cleaning (keep more short lines)
epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural --cleaning-strictness conservative

# Dump per-chapter cleaning reports into workspace
epub2speech input.epub output.m4b --voice zh-CN-XiaoxiaoNeural --dump-cleaning-report
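
To illustrate what a per-segment character limit means in practice, the sketch below packs whole sentences into segments of at most `max_chars` characters and hard-splits only when a single sentence is longer than the limit. This is an assumed strategy with a hypothetical function name, not the project's actual segmentation code.

```python
import re


def split_into_segments(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack sentences into segments no longer than max_chars."""
    # Break after Latin punctuation followed by whitespace, or after CJK
    # punctuation. The pattern matches empty strings, so no text is lost.
    pieces = re.split(r"(?<=[.!?])(?=\s)|(?<=[。！？])", text)
    segments, current = [], ""
    for piece in pieces:
        # A single sentence longer than the limit is cut at the limit.
        while len(piece) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(piece[:max_chars])
            piece = piece[max_chars:]
        if len(current) + len(piece) > max_chars:
            segments.append(current)
            current = piece
        else:
            current += piece
    if current:
        segments.append(current)
    return [s.strip() for s in segments if s.strip()]
```

A larger limit means fewer TTS requests but longer individual requests; the right value depends on what your provider accepts.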

Azure TTS Configuration

Pass credentials via command-line arguments:

epub2speech input.epub output.m4b \
  --voice zh-CN-XiaoxiaoNeural \
  --azure-key YOUR_KEY \
  --azure-region YOUR_REGION

Doubao TTS Configuration

Pass credentials via command-line arguments:

epub2speech input.epub output.m4b \
  --voice zh_male_lengkugege_emo_v2_mars_bigtts \
  --doubao-token YOUR_TOKEN \
  --doubao-url YOUR_BASE_URL

Qwen TTS Configuration

Pass credentials via command-line arguments:

epub2speech input.epub output.m4b \
  --provider qwen \
  --voice Cherry \
  --qwen-token YOUR_TOKEN \
  --qwen-url YOUR_BASE_URL
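
When driving the CLI from a script, note that values passed as command-line arguments are visible in the system process list, so supplying credentials through environment variables is often preferable. The sketch below builds an argument list from flags documented in this README and runs it with credentials injected into the environment. `build_command` and `run_conversion` are hypothetical helpers, not part of epub2speech.

```python
import os
import subprocess


def build_command(epub, output, voice, provider=None, max_chapters=None, quiet=False):
    """Assemble an epub2speech command line from documented flags only."""
    cmd = ["epub2speech", str(epub), str(output), "--voice", voice]
    if provider:
        cmd += ["--provider", provider]
    if max_chapters is not None:
        cmd += ["--max-chapters", str(max_chapters)]
    if quiet:
        cmd.append("--quiet")
    return cmd


def run_conversion(cmd, credentials):
    """Run the CLI with credentials supplied as environment variables."""
    env = {**os.environ, **credentials}
    return subprocess.run(cmd, env=env, check=True)


# Example (requires epub2speech to be installed and real credentials):
# cmd = build_command("input.epub", "output.m4b", "Cherry", provider="qwen")
# run_conversion(cmd, {"QWEN_ACCESS_TOKEN": "...", "QWEN_BASE_URL": "..."})
```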

How It Works

1. EPUB Parsing: Extracts text content and metadata from EPUB files
2. Chapter Detection: Identifies chapters using EPUB navigation data
3. Text Processing: Cleans and segments text for optimal speech synthesis
4. Audio Generation: Converts text to speech using your chosen TTS provider
5. M4B Creation: Combines audio files with chapter metadata into M4B format

Development

Using as a Library

You can integrate epub2speech into your own Python application:

from pathlib import Path
from epub2speech import convert_epub_to_m4b, ConversionProgress
from epub2speech.tts.azure_provider import AzureTextToSpeech
# Or use: from epub2speech.tts.doubao_provider import DoubaoTextToSpeech
# Or use: from epub2speech.tts.qwen_provider import QwenTextToSpeech

# Initialize TTS provider
tts = AzureTextToSpeech(
    subscription_key="your-key",
    region="your-region"
)

# Optional: Define progress callback
def on_progress(progress: ConversionProgress):
    print(f"{progress.progress:.1f}% - Chapter {progress.current_chapter}/{progress.total_chapters}")

# Convert EPUB to M4B
result = convert_epub_to_m4b(
    epub_path=Path("input.epub"),
    workspace=Path("./workspace"),
    output_path=Path("output.m4b"),
    tts_protocol=tts,
    voice="zh-CN-XiaoxiaoNeural",
    max_chapters=None,  # Optional: limit chapters
    max_tts_segment_chars=500,  # Optional: max characters per TTS segment (default: 500)
    cleaning_strictness="balanced",  # Optional: conservative / balanced / aggressive
    dump_cleaning_report=False,  # Optional: write cleaning_report.json per chapter
    progress_callback=on_progress  # Optional
)

if result:
    print(f"Success: {result}")

Running Tests

python test.py

Run specific test modules:

python test.py --test test_epub_picker
python test.py --test test_tts

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

ebooklib for EPUB parsing
FFmpeg for audio processing
spaCy for natural language processing

Support

For issues and questions:

Check existing GitHub issues
Create a new issue with detailed information
Include EPUB file samples if relevant (ensure no copyright restrictions)

Release history

0.0.10 (this version): Mar 26, 2026
0.0.9: Mar 25, 2026
0.0.8: Mar 3, 2026
0.0.7: Mar 3, 2026
0.0.6: Mar 3, 2026
0.0.5: Jan 19, 2026
0.0.4: Jan 13, 2026
0.0.3: Sep 25, 2025
0.0.2: Sep 25, 2025
0.0.1: Sep 25, 2025

Download files

Source Distribution

epub2speech-0.0.10.tar.gz (31.6 kB), uploaded Mar 26, 2026, Source

Built Distribution

epub2speech-0.0.10-py3-none-any.whl (35.6 kB), uploaded Mar 26, 2026, Python 3

File details

Details for the file epub2speech-0.0.10.tar.gz.

File metadata

Download URL: epub2speech-0.0.10.tar.gz
Upload date: Mar 26, 2026
Size: 31.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.13.4 Darwin/25.3.0

File hashes

Hashes for epub2speech-0.0.10.tar.gz
Algorithm	Hash digest
SHA256	`837251d4e86e9967c03aefa151e722a41066ee6582f7e4f50f27b6a30eb3ed66`
MD5	`405af647afa68f1119248275a48d5b54`
BLAKE2b-256	`82ad6619171420e7accc917908732f1662b978a7c723650465a7aea5cdbe2cbd`

File details

Details for the file epub2speech-0.0.10-py3-none-any.whl.

File metadata

Download URL: epub2speech-0.0.10-py3-none-any.whl
Upload date: Mar 26, 2026
Size: 35.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.13.4 Darwin/25.3.0

File hashes

Hashes for epub2speech-0.0.10-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b0f6ba660ee534ae0a9730c042fc3425748a94d24851b763fcddee09ce7524b2`
MD5	`d566d88dfadbe7006f61a4911f13b603`
BLAKE2b-256	`1c516f2c80b5e2f85d34382ecf21350c9c5b527a9eb95770a06bdca5e2f5ee64`
