TEPUB - Tools for EPUB: A comprehensive toolkit for EPUB processing including translation, audiobook generation, and export

TEPUB - Tools for EPUB

Transform EPUB books into translations, audiobooks, and web pages – automatically.

TEPUB is a comprehensive toolkit for processing EPUB files. Translate books into any language, create professional audiobooks with natural voices, export to markdown, or publish as interactive websites.


Features

📖 Translation

  • Multi-language support: Translate to/from any language
  • AI-powered: OpenAI GPT-4, Anthropic Claude, Google Gemini, xAI Grok, DeepL, or Ollama
  • Dual output modes:
    • Bilingual: Original and translation side-by-side (perfect for learning)
    • Translation-only: Professional translated edition
  • Smart processing: Auto-skip front/back matter, parallel translation, resume capability

🎧 Audiobook Creation

  • Dual TTS providers:
    • Edge TTS (Free): 57+ voices in multiple languages, no API key required
    • OpenAI TTS (Premium): 6 high-quality voices with superior naturalness
  • Professional output: M4A format with chapter markers and embedded cover art
  • Chapter management: Export, edit, and update chapter titles and timestamps
  • Flexible control: Adjustable speed, voice selection, resume support
  • Cost: Free with Edge TTS, or ~$11-22 per 300-page book with OpenAI TTS

📱 Export Formats

  • Web: Interactive HTML viewer with live translation toggle
  • Markdown: Plain text with preserved formatting and images
  • EPUB: Bilingual or translation-only editions

Quick Start

Installation

Automatic (Mac/Linux)

git clone https://github.com/xiaolai/tepub.git
cd tepub
./install.sh
source .venv/bin/activate

Manual (All platforms)

git clone https://github.com/xiaolai/tepub.git
cd tepub
python -m venv .venv
source .venv/bin/activate    # Windows: .venv\Scripts\activate
pip install -e ".[dev]"      # Quotes keep zsh from treating [dev] as a glob pattern

See INSTALL.md for detailed platform-specific instructions.

Translation Setup

1. Get an API key from your preferred provider:

  • OpenAI (Recommended: GPT-4, ~$0.50-2.00/book)
  • Anthropic (Claude, great for literature)
  • Ollama (Free, runs locally)

2. Configure TEPUB:

# Create .env file with your API key
echo 'OPENAI_API_KEY=sk-your-key-here' > .env
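Before starting a long run, it can help to confirm that the key is actually visible. The sketch below is a minimal illustration, not TEPUB's own loader: `parse_env` and `has_api_key` are hypothetical helper names, and the parser only handles simple `KEY=VALUE` lines like the one written above.

```python
import os


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_api_key(env_text: str, name: str = "OPENAI_API_KEY") -> bool:
    """True if the key is set in the process environment or in the .env text."""
    return bool(os.environ.get(name) or parse_env(env_text).get(name))
```

For example, `has_api_key(open(".env").read())` reports whether the key can be found in either place. Note that `echo ... > .env` overwrites an existing `.env` file; use `>>` to append instead.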

Basic Usage

Translate a book:

tepub extract mybook.epub
tepub translate mybook.epub --to "Simplified Chinese"
tepub export mybook.epub --epub

Create audiobook (Free Edge TTS):

tepub extract mybook.epub
tepub audiobook generate mybook.epub
# Interactive voice selection will appear

Create audiobook (Premium OpenAI TTS):

tepub audiobook generate mybook.epub --tts-provider openai --voice nova
# Requires OPENAI_API_KEY in environment

All-in-one pipeline:

tepub pipeline mybook.epub --to Spanish --epub

Common Tasks

Translation

Translate to different languages:

tepub pipeline book.epub --to "Simplified Chinese" --epub
tepub pipeline book.epub --to Spanish --epub
tepub pipeline book.epub --to French --epub
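To run the same pipeline over a folder of books, the commands above can be generated from Python. This is a sketch that only builds argument lists from the flags shown in this README (`pipeline`, `--to`, `--epub`); `pipeline_command` and `batch_commands` are hypothetical helper names, not part of TEPUB.

```python
from pathlib import Path


def pipeline_command(epub: Path, language: str) -> list[str]:
    """Build the argument list for one `tepub pipeline` run."""
    return ["tepub", "pipeline", str(epub), "--to", language, "--epub"]


def batch_commands(folder: Path, language: str) -> list[list[str]]:
    """One command per .epub file in the folder, in sorted order."""
    return [pipeline_command(path, language) for path in sorted(folder.glob("*.epub"))]
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Passing arguments as a list avoids shell quoting problems with language names that contain spaces, such as "Simplified Chinese".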

Choose translation provider:

tepub translate book.epub --to Spanish --provider anthropic
tepub translate book.epub --to Spanish --provider ollama

Translation-only output (smaller file):

tepub export book.epub --epub --output-mode translated-only

Audiobooks

Edge TTS (Free, 57+ voices):

# Interactive voice selection
tepub audiobook generate book.epub

# Specify voice directly
tepub audiobook generate book.epub --voice en-US-GuyNeural    # Male
tepub audiobook generate book.epub --voice en-US-JennyNeural  # Female
tepub audiobook generate book.epub --voice en-GB-RyanNeural   # British

# See all voices
edge-tts --list-voices

OpenAI TTS (Premium, 6 voices):

# Standard quality (tts-1)
tepub audiobook generate book.epub --tts-provider openai --voice nova

# Higher quality (tts-1-hd)
tepub audiobook generate book.epub --tts-provider openai --tts-model tts-1-hd --voice nova

# Adjust speed
tepub audiobook generate book.epub --tts-provider openai --voice nova --tts-speed 1.2

# Available OpenAI voices:
# - alloy: Neutral, balanced
# - echo: Male, authoritative
# - fable: British, expressive
# - onyx: Deep male, professional
# - nova: Female, friendly
# - shimmer: Female, warm

Custom cover image:

tepub audiobook generate book.epub --cover-path ~/Pictures/mycover.jpg

Chapter management:

# Preview chapter structure before generating audiobook
tepub audiobook export-chapters book.epub
# Edit chapters.yaml to customize chapter titles
tepub audiobook generate book.epub  # Uses custom titles from chapters.yaml

# Extract chapters from existing audiobook
tepub audiobook export-chapters audiobook.m4a

# Update audiobook with edited chapter markers
tepub audiobook update-chapters audiobook.m4a chapters.yaml

Export

Create web version:

tepub export book.epub --web
# Opens browser with interactive viewer

Export to markdown:

tepub extract book.epub
# Markdown files created automatically in: book/markdown/

Configuration

TEPUB uses a two-level configuration system:

Global Config: ~/.tepub/config.yaml

Apply settings to all books:

# Translation
source_language: auto
target_language: Simplified Chinese
translation_workers: 3

primary_provider:
  name: openai
  model: gpt-4o

# Audiobook
audiobook_tts_provider: edge    # or: openai
audiobook_workers: 3

# Skip rules
skip_rules:
  - keyword: index
  - keyword: appendix
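The exact matching semantics of `skip_rules` are not spelled out here. As a rough mental model only, the sketch below assumes a case-insensitive substring match against a section title; `should_skip` is a hypothetical function, and TEPUB's real matcher may behave differently.

```python
def should_skip(title: str, skip_rules: list[dict]) -> bool:
    """Assumed behavior: skip a section when any keyword occurs in its title."""
    lowered = title.lower()
    return any(rule["keyword"].lower() in lowered for rule in skip_rules)


# Same rules as the YAML above, written as Python data.
rules = [{"keyword": "index"}, {"keyword": "appendix"}]
```

Under this assumption `should_skip("Appendix A", rules)` is true and `should_skip("Chapter 1", rules)` is false. A substring match would also catch a title such as "Indexing Strategies", so short keywords deserve some care.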

Per-Book Config: book/config.yaml

Created automatically when you run tepub extract book.epub. Override global settings:

# Choose TTS provider
audiobook_tts_provider: openai
audiobook_tts_model: tts-1-hd
audiobook_voice: nova

# Or use Edge TTS instead (set only one provider; a YAML file should not repeat a key)
# audiobook_tts_provider: edge
# audiobook_voice: en-US-AriaNeural

# Custom cover
cover_image_path: ~/Pictures/mycover.jpg

# Output mode
output_mode: translated_only

# Skip specific sections
skip_rules:
  - keyword: prologue
  - keyword: epilogue

See config.example.yaml for all available options with detailed explanations.
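The two-level lookup can be pictured as a dictionary merge in which per-book values win. This is a simplified sketch with a shallow merge; how TEPUB combines nested keys or lists such as `skip_rules` is not stated here, and `merge_config` is a hypothetical name.

```python
def merge_config(global_cfg: dict, book_cfg: dict) -> dict:
    """Shallow merge: any key set in the per-book config overrides the global one."""
    merged = dict(global_cfg)
    merged.update(book_cfg)
    return merged


global_cfg = {
    "target_language": "Simplified Chinese",
    "translation_workers": 3,
    "audiobook_tts_provider": "edge",
}
book_cfg = {"audiobook_tts_provider": "openai", "audiobook_voice": "nova"}

effective = merge_config(global_cfg, book_cfg)
```

Here `effective` keeps `translation_workers: 3` from the global file while the TTS provider becomes `openai` for this one book.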


Output Structure

Translation

mybook.epub                      # Original
mybook/                          # Workspace
├── config.yaml                  # Per-book settings
├── segments.json                # Extracted content
├── state.json                   # Translation progress
└── markdown/                    # Markdown export
    ├── 001_chapter-1.md
    └── images/
mybook_bilingual.epub            # Output: both languages
mybook_translated.epub           # Output: translation only
mybook_web/                      # Web viewer

Audiobooks

mybook/
├── audiobook@edgetts/           # Edge TTS audiobooks
│   ├── mybook.m4b               # Final audiobook
│   └── segments/                # Cached audio segments
└── audiobook@openaitts/         # OpenAI TTS audiobooks
    ├── mybook.m4b
    └── segments/

Provider-specific folders let you create both versions for comparison.
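If you post-process the results in a script, the layout above can be expressed with `pathlib`. The folder names are copied from the tree; `audiobook_path` is a hypothetical helper, and it assumes the workspace sits next to the EPUB under the same base name.

```python
from pathlib import Path

# Provider value (as used in config.yaml) -> output folder shown in the tree above.
PROVIDER_DIRS = {"edge": "audiobook@edgetts", "openai": "audiobook@openaitts"}


def audiobook_path(epub: Path, provider: str) -> Path:
    """Expected location of the finished audiobook for one provider."""
    workspace = epub.with_suffix("")  # mybook.epub -> mybook/
    return workspace / PROVIDER_DIRS[provider] / f"{epub.stem}.m4b"
```

For instance, `audiobook_path(Path("mybook.epub"), "openai")` points at `mybook/audiobook@openaitts/mybook.m4b`.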


Advanced Features

Resume Interrupted Work

TEPUB automatically saves progress. To resume:

# Just run the same command again
tepub translate book.epub --to Spanish
tepub audiobook generate book.epub
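Resuming generally works by recording finished units and skipping them on the next run. The sketch below illustrates that idea only; it does not reproduce the format of TEPUB's `state.json`, and `load_done` and `process_all` are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def load_done(state_file: Path) -> set[str]:
    """Read the IDs of finished segments, or an empty set on the first run."""
    if state_file.exists():
        return set(json.loads(state_file.read_text(encoding="utf-8")))
    return set()


def process_all(segment_ids, work, state_file: Path) -> list[str]:
    """Run `work` on each unfinished segment, saving progress after every one."""
    done = load_done(state_file)
    processed = []
    for seg_id in segment_ids:
        if seg_id in done:
            continue  # finished in an earlier run
        work(seg_id)
        done.add(seg_id)
        state_file.write_text(json.dumps(sorted(done)), encoding="utf-8")
        processed.append(seg_id)
    return processed


# Demo: the second run only handles the segment that is new.
with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "state.json"
    first = process_all(["s1", "s2"], lambda s: None, state)
    second = process_all(["s1", "s2", "s3"], lambda s: None, state)
```

Saving after every segment costs a small write each time, but an interruption then loses at most the segment that was in flight.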

Parallel Processing

Speed up translation and audiobook generation by running more workers at once (uses more API credits):

# In config.yaml
translation_workers: 5    # Default: 3
audiobook_workers: 5      # Default: 3
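A worker count like this usually corresponds to the size of a worker pool. The following sketch shows the general pattern with Python's standard `ThreadPoolExecutor`; it is not TEPUB's implementation, and `translate_stub` stands in for a real API call.

```python
from concurrent.futures import ThreadPoolExecutor


def translate_stub(segment: str) -> str:
    """Placeholder for a network call to a translation provider."""
    return segment.upper()


def run_parallel(segments, worker, max_workers: int = 3) -> list:
    """Process segments concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, segments))


results = run_parallel(["one", "two", "three"], translate_stub, max_workers=3)
```

Threads suit this kind of work because each call mostly waits on the network. Raising the worker count shortens wall-clock time until the provider's rate limit becomes the bottleneck.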

Custom Translation Style

# In config.yaml
prompt_preamble: |
  You are a literary translator specializing in preserving artistic voice.
  {language_instruction}
  {mode_instruction}
  Maintain the author's style, tone, metaphors, and cultural nuances.
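The two placeholders in braces are filled in when the prompt is built. How TEPUB substitutes them internally is an assumption here; the sketch uses plain `str.format`, and the instruction strings are made-up examples.

```python
PREAMBLE = (
    "You are a literary translator specializing in preserving artistic voice.\n"
    "{language_instruction}\n"
    "{mode_instruction}\n"
    "Maintain the author's style, tone, metaphors, and cultural nuances.\n"
)


def render_preamble(template: str, language_instruction: str, mode_instruction: str) -> str:
    """Fill the two placeholders; all other text passes through unchanged."""
    return template.format(
        language_instruction=language_instruction,
        mode_instruction=mode_instruction,
    )


rendered = render_preamble(
    PREAMBLE,
    "Translate from English into Spanish.",  # hypothetical wording
    "Return only the translated text.",      # hypothetical wording
)
```

Keep both placeholders in a custom preamble so the target language and output mode still reach the model.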

Selective File Processing

After extraction, edit book/config.yaml:

# Only translate specific files
translation_files:
  - Text/chapter-001.xhtml
  - Text/chapter-002.xhtml
  # - Text/appendix.xhtml    # Commented = skipped

# Different files for audiobook
audiobook_files:
  - Text/chapter-001.xhtml
  # - Text/copyright.xhtml   # Skip copyright in audiobook

Debug Commands

tepub debug workspace book.epub    # Show workspace info
tepub debug pending                 # What's left to translate
tepub debug show-skip-list          # What was skipped

Cost Estimates

Translation (300-page book)

  • OpenAI GPT-4o: ~$0.50-2.00
  • Anthropic Claude: ~$0.30-1.50
  • Ollama (local): Free (requires powerful computer)

Audiobook (300-page book, ~750,000 characters)

  • Edge TTS: Free
  • OpenAI tts-1: ~$11.25
  • OpenAI tts-1-hd: ~$22.50
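The audiobook figures follow from a per-character rate. The rates in this sketch are back-calculated from the numbers above ($11.25 and $22.50 for 750,000 characters), so treat them as assumptions and check the provider's current pricing before budgeting.

```python
# USD per one million characters, implied by the estimates above (assumption).
RATE_PER_MILLION_CHARS = {"tts-1": 15.00, "tts-1-hd": 30.00}


def estimate_tts_cost(characters: int, model: str) -> float:
    """Estimated cost in USD for synthesizing the given number of characters."""
    return round(characters / 1_000_000 * RATE_PER_MILLION_CHARS[model], 2)
```

With these rates, `estimate_tts_cost(750_000, "tts-1")` gives 11.25 and `estimate_tts_cost(750_000, "tts-1-hd")` gives 22.5, matching the list above.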

Recommendations

  • Best Quality: OpenAI GPT-4 + OpenAI TTS-1-HD (~$25 total)
  • Best Value: OpenAI GPT-4 + Edge TTS (~$1.50 total)
  • Free: Ollama + Edge TTS (requires local GPU)

Troubleshooting

"API key not found"

# Set environment variable
export OPENAI_API_KEY="sk-your-key-here"

# Or create .env file
echo 'OPENAI_API_KEY=sk-your-key-here' > .env

"ModuleNotFoundError: No module named 'openai'"

pip install -e ".[dev]"
# Or specifically: pip install openai

Audiobook has no sound

# Install FFmpeg
brew install ffmpeg           # Mac
sudo apt install ffmpeg       # Linux
# Windows: download from ffmpeg.org
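To confirm that FFmpeg is reachable from the same environment that runs TEPUB, a quick check with the standard library is enough. `find_tool` is a hypothetical helper; `shutil.which` searches the directories on `PATH`.

```python
import shutil


def find_tool(name: str = "ffmpeg"):
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


location = find_tool()
print(location or "ffmpeg not found on PATH")
```

If this prints a path but audiobooks are still silent, the problem is likely elsewhere, for example in the TTS provider settings.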

Translation fails

# Check status
tepub debug pending

# Reset errors and retry
# Note: state.json holds translation progress, so deleting it restarts from scratch
rm book/state.json
tepub translate book.epub --to Spanish

More solutions: GitHub Issues


Privacy & Security

  • Local processing: Books stay on your computer (except API calls)
  • No telemetry: TEPUB collects no usage data
  • Provider privacy: Translation APIs see the text you send; retention depends on each provider's policy
  • Maximum privacy: Use Ollama for fully local operation

Requirements

  • Python: 3.10 or newer (3.11+ recommended)
  • OS: macOS, Linux, or Windows 10+
  • Disk: ~500 MB
  • RAM: 2-4 GB
  • FFmpeg: Required for audiobooks (auto-installed on Mac/Linux)

Support


Credits

Built with:


License

MIT License - see LICENSE for details.


For Developers

Development Setup

git clone https://github.com/xiaolai/tepub.git
cd tepub
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Run tests:

pytest
pytest --cov=src --cov-report=html

Code quality:

ruff check src tests
black src tests

Project structure:

src/
├── cli/              # Command-line interface
├── extraction/       # EPUB extraction
├── translation/      # Translation pipeline
├── audiobook/        # TTS and audiobook creation
├── injection/        # Insert translations into EPUB
├── web_export/       # Web viewer generation
├── epub_io/          # EPUB reading/writing
├── config/           # Configuration management
└── state/            # Progress tracking

Made with ❤️ for language learners, audiobook enthusiasts, and book lovers everywhere.

Version 0.2.0 | Changelog | Issues
