Convert EPUB ebooks to OGG audiobooks with chapter markers
EPUB to Audiobook Converter
A command-line tool and library that converts EPUB ebooks to OGG audiobooks with chapter markers, using local text-to-speech processing with Kokoro.
Features
- Converts EPUB 3.0 files to OGG audiobooks
- Local text-to-speech processing using Kokoro
- Support for 9 languages with multiple voices:
  - 🇺🇸 American English (11F, 9M voices)
  - 🇬🇧 British English (4F, 4M voices)
  - 🇯🇵 Japanese (4F, 1M voices)
  - 🇨🇳 Mandarin Chinese (4F, 4M voices)
  - 🇪🇸 Spanish (1F, 2M voices)
  - 🇫🇷 French (1F voice)
  - 🇮🇳 Hindi (2F, 2M voices)
  - 🇮🇹 Italian (1F, 1M voices)
  - 🇧🇷 Brazilian Portuguese (1F, 2M voices)
- Chapter markers in output files
- Configurable voice selection and speech rate
- Progress reporting with optional quiet mode
- Metadata preservation from EPUB to audio file
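The feature list does not say how chapter markers are stored. One widely used convention for Ogg files is the Vorbis comment chapter extension, which pairs a CHAPTERxxx timestamp tag with a CHAPTERxxxNAME title tag. The following is a minimal sketch under the assumption that a similar scheme is used; the function names are hypothetical and are not taken from the epub2audio source.

```python
def format_timestamp(seconds: float) -> str:
    """Format a position in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def chapter_tags(chapters: list[tuple[str, float]]) -> dict[str, str]:
    """Build chapter tags from (title, duration_in_seconds) pairs.

    Each chapter starts where the previous one ended, so the start time
    is the running sum of the earlier durations.
    """
    tags: dict[str, str] = {}
    start = 0.0
    for index, (title, duration) in enumerate(chapters, start=1):
        key = f"CHAPTER{index:03d}"
        tags[key] = format_timestamp(start)
        tags[f"{key}NAME"] = title
        start += duration
    return tags
```

A tagging library such as mutagen could then write these key/value pairs into the file's comment header.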
Requirements
- Python 3.10 or higher
- Dependencies listed in pyproject.toml
Installation
- Install from PyPI with pip:
pip install epub2audio
- Or, for development, clone this repository:
git clone https://github.com/clayrosenthal/epub2audio.git
cd epub2audio
- Install dev setup using mise (recommended):
mise install
Or manually with a virtual environment:
python -m venv .venv
source .venv/bin/activate
pip install -e .
Usage
Basic Usage
Convert an EPUB file to an audiobook with default settings:
epub2audio input.epub
By default the audiobook is saved as Book_Title.ogg, named after the ebook's title. Use --output to choose a different path.
Voice Selection
The tool supports multiple voices across different languages. Here are some notable voices:
American English
- af_bella (Female, Grade A-) - High quality with extended training
- af_heart (Female, Grade A) - Best overall quality
- af_nicole (Female, Grade B-) - Good quality with extended training
- am_fenrir (Male, Grade C+) - Best male voice option
British English
- bf_emma (Female, Grade B-) - Best British female voice
- bm_fable (Male, Grade C) - Best British male voice
Other Languages
- ff_siwis (French Female, Grade B-)
- jf_alpha (Japanese Female, Grade C+)
- if_sara (Italian Female, Grade C)
- hf_alpha (Hindi Female, Grade C)
To use a specific voice:
epub2audio input.epub --voice af_bella
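The voice names listed above appear to follow a pattern: the first letter of the prefix encodes the language and the second encodes the gender. The sketch below decodes a voice ID on that assumption. The mapping is inferred only from the voices named in this section, so prefixes for the other supported languages are left out rather than guessed, and `describe_voice` is a hypothetical helper, not part of the package.

```python
# Prefix letters inferred from the example voices in this README (assumption).
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "f": "French",
    "j": "Japanese",
    "i": "Italian",
    "h": "Hindi",
}
GENDERS = {"f": "Female", "m": "Male"}


def describe_voice(voice_id: str) -> tuple[str, str, str]:
    """Split an ID like 'af_bella' into (language, gender, name)."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"Unexpected voice ID format: {voice_id!r}")
    lang_code, gender_code = prefix
    if lang_code not in LANGUAGES or gender_code not in GENDERS:
        raise ValueError(f"Unknown voice prefix: {prefix!r}")
    return LANGUAGES[lang_code], GENDERS[gender_code], name
```

For example, `describe_voice("bm_fable")` yields the language, gender, and name of the British male voice listed above.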
Advanced Options
epub2audio input.epub \
--output output.ogg \
--voice af_bella \
--speech-rate 1.0 \
--quiet
Command Line Options
- input.epub: Path to input EPUB file
- --output, -o: Path of output audiobook file, defaults to title of the ebook.
- --voice, -v: Name of the voice to use (default: af_heart)
- --speech-rate, -r: Speech rate multiplier (default: 1.0)
- --quiet, -q: Suppress progress reporting
- --verbose, -v: Output more verbose logs
- --cache, -c: Cache generated audio files for reuse
- --max-chapters, -m: Max number of chapters to generate, or -1 for unlimited
- --format, -f: Output container format
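The -1 sentinel for --max-chapters can be read as "no limit". A small sketch of that interpretation follows; `limit_chapters` is a hypothetical name, and how the tool itself treats 0 or other negative values is not documented, so this version simply rejects other negatives.

```python
def limit_chapters(chapters: list[str], max_chapters: int) -> list[str]:
    """Return at most max_chapters chapters, or all of them when it is -1."""
    if max_chapters == -1:
        return list(chapters)
    if max_chapters < 0:
        # Assumption: any negative value other than -1 is a usage error.
        raise ValueError("max_chapters must be -1 or a non-negative integer")
    return chapters[:max_chapters]
```

Limiting the chapter count this way is handy for a quick trial run of a voice before converting a whole book.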
Voice Quality Grades
Voices are graded based on quality and training data:
- A: Exceptional quality, extensive training
- B: Good quality, suitable for most uses
- C: Average quality, may have minor issues
- D: Basic quality, may have noticeable issues
- F: Limited quality, recommended only if necessary
Modifiers (+/-) indicate slight variations within each grade.
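To compare voices by grade programmatically, the letter and modifier can be mapped onto a single integer scale. This is an illustrative sketch only: the scoring scheme and the `grade_score` name are assumptions, and the grades in the example are the ones quoted in the Voice Selection section.

```python
LETTERS = "FDCBA"  # ascending quality, so the index is the base rank
MODIFIERS = {"": 0, "+": 1, "-": -1}


def grade_score(grade: str) -> int:
    """Map a grade such as 'B-' to an integer; higher means better."""
    letter, modifier = grade[:1], grade[1:]
    if letter not in LETTERS or not letter or modifier not in MODIFIERS:
        raise ValueError(f"Unrecognized grade: {grade!r}")
    # Three steps per letter leave room for the +/- variants in between.
    return LETTERS.index(letter) * 3 + MODIFIERS[modifier]


# Grades as listed for the American English voices in this README.
american_voices = {
    "af_bella": "A-",
    "af_heart": "A",
    "af_nicole": "B-",
    "am_fenrir": "C+",
}
ranked = sorted(
    american_voices,
    key=lambda name: grade_score(american_voices[name]),
    reverse=True,
)
```

Here `ranked` lists the voice IDs from highest grade to lowest.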
Development
Project Structure
src/
├── __init__.py
├── epub_processor.py # EPUB parsing and text extraction
├── audio_converter.py # TTS conversion using Kokoro
├── audio_handler.py # OGG creation, chapter markers, metadata
├── epub2audio.py # Main class, command line interface
├── voices.py # Voice definitions and management
├── helpers.py # Utility functions
└── config.py # Configuration settings
Running Tests
Using mise:
# Run all tests
mise run test
# Run integration tests
mise run test-integration
# Run with coverage
mise run test-coverage
Or manually:
# Run all tests
pytest
# Run integration tests
pytest --run-integration
# Run with coverage
pytest --cov=src tests/
Code Quality
The project uses:
- Ruff for formatting and linting
- MyPy for type checking
- Pytest for testing
To format and lint code:
mise run format # Format code
mise run lint # Check code
mise run fix # Auto-fix issues
Error Handling
Critical Errors (Exit with error code)
- Invalid/corrupted EPUB file
- Invalid voice model selection
- File system errors (read/write permissions)
- Insufficient disk space
Non-Critical Errors (Warning and continue)
- Non-text elements in EPUB
- Unsupported metadata fields
- Minor formatting issues
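The split between the two classes of errors can be sketched as follows: critical problems raise an exception that ends the run with a non-zero exit code, while non-critical ones emit a warning and processing continues. The class and function names are hypothetical, the element dictionaries are a stand-in for real EPUB items, and the exit code value 1 is an assumption, since the exact code is not stated here.

```python
import sys
import warnings


class ConversionError(Exception):
    """Hypothetical critical error: the conversion cannot continue."""


def extract_text(elements: list[dict]) -> list[str]:
    """Collect text content, warning about and skipping non-text elements."""
    texts = []
    for element in elements:
        if element.get("type") != "text":
            warnings.warn(f"Skipping non-text element: {element.get('type')}")
            continue
        texts.append(element["content"])
    return texts


def run(elements: list[dict], voice: str, known_voices: set[str]) -> int:
    """Return a process exit code: 0 on success, 1 on a critical error."""
    try:
        if voice not in known_voices:
            raise ConversionError(f"Invalid voice: {voice}")
        extract_text(elements)
    except ConversionError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0
```

A command-line entry point would typically finish with `sys.exit(run(...))` so that shell scripts can detect failures.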
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests (mise run test)
- Format code (mise run format)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
TO-DO:
- Add support for epub by link
- Add support for taking in multiple EPUBs on the command line
- Add a webserver so you could host this
- Add support for more audio output formats
- Add support for images and vector graphics
  - Either read their alt text, or generate it with AI
- Better integration tests
- Add support for ONNX runtime
- Add support for other AI models
Development Guidelines
- Follow the Google Python Style Guide
- Add tests for new features
- Update documentation as needed
- Keep commits focused and atomic
License
AGPL-3.0-or-later - See LICENSE file for details
Acknowledgments
- Kokoro for text-to-speech processing
- Huggingface for model weights
- ebooklib for EPUB handling
- mutagen for audio metadata
- Voice training data contributors (see individual voice attributions)