Translate EPUB books using an LLM. The translated book retains the original text and shows each translation side by side with it.
Translate EPUB books using Large Language Models while preserving the original text. The translated content is displayed side-by-side with the original, creating bilingual books perfect for language learning and cross-reference reading.
Features
- Bilingual Output: Preserves original text alongside translations for easy comparison
- LLM-Powered: Leverages large language models for high-quality, context-aware translations
- Format Preservation: Maintains EPUB structure, styles, images, and formatting
- Complete Translation: Translates chapter content, table of contents, and metadata
- Progress Tracking: Monitor translation progress with built-in callbacks
- Flexible LLM Support: Works with any OpenAI-compatible API endpoint
- Caching: Built-in caching for progress recovery when translation fails
Installation
pip install epub-translator
Requirements: Python 3.11, 3.12, or 3.13
Quick Start
Using OOMOL Studio (Recommended)
The easiest way to use EPUB Translator is through OOMOL Studio, which provides a visual interface.
Using Python API
from pathlib import Path
from epub_translator import LLM, translate, language

# Initialize LLM with your API credentials
llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)

# Translate EPUB file using language constants
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
)
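To translate several books with the same call, it can help to plan the source and target paths first. The following is a minimal sketch, not part of epub-translator: plan_batch is a hypothetical helper, and the ".bilingual.epub" output suffix is an arbitrary naming choice made for this example.

```python
from pathlib import Path
from typing import List, Tuple


def plan_batch(source_dir: Path, output_dir: Path) -> List[Tuple[Path, Path]]:
    """Pair every .epub in source_dir with an output path, skipping finished books."""
    jobs = []
    for source in sorted(source_dir.glob("*.epub")):
        # Hypothetical naming scheme: book.epub -> book.bilingual.epub
        target = output_dir / f"{source.stem}.bilingual.epub"
        if not target.exists():
            jobs.append((source, target))
    return jobs
```

Each returned pair can then be passed to translate as source_path and target_path, reusing the llm instance created above.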
With Progress Tracking
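from pathlib import Path
from tqdm import tqdm
from epub_translator import translate

# `llm` is the LLM instance created in the Quick Start example
with tqdm(total=100, desc="Translating", unit="%") as pbar:

    def on_progress(progress: float):
        # progress is a fraction in 0.0-1.0; pbar.n holds the bar's current
        # position, so no `nonlocal` state is needed (`nonlocal` is a
        # SyntaxError when this snippet runs at module level)
        pbar.update(progress * 100 - pbar.n)

    translate(
        source_path=Path("source.epub"),
        target_path=Path("translated.epub"),
        target_language="English",
        llm=llm,
        on_progress=on_progress,
    )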
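If tqdm is not installed, the same callback contract (a float between 0.0 and 1.0) can drive plain log lines. The sketch below is a hypothetical helper, not part of epub-translator: make_progress_reporter returns a callback that emits a line only when progress has advanced by at least step. It assumes the callback receives non-decreasing values.

```python
from typing import Callable


def make_progress_reporter(
    step: float = 0.1,
    emit: Callable[[str], None] = print,
) -> Callable[[float], None]:
    """Return an on_progress-style callback that reports at most once per `step`."""
    last_reported = -step  # ensures the very first call (even 0.0) is reported

    def on_progress(progress: float) -> None:
        nonlocal last_reported  # valid here: the enclosing scope is a function
        advanced = progress - last_reported >= step
        just_finished = progress >= 1.0 > last_reported
        if advanced or just_finished:
            emit(f"Translating: {progress:.0%}")
            last_reported = progress

    return on_progress
```

Passing on_progress=make_progress_reporter() to translate would then print roughly one line per ten percent.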
API Reference
LLM Class
Initialize the LLM client for translation:
LLM(
    key: str,  # API key
    url: str,  # API endpoint URL
    model: str,  # Model name (e.g., "gpt-4")
    token_encoding: str,  # Token encoding (e.g., "o200k_base")
    cache_path: PathLike | None = None,  # Cache directory path
    timeout: float | None = None,  # Request timeout in seconds
    top_p: float | tuple[float, float] | None = None,
    temperature: float | tuple[float, float] | None = None,
    retry_times: int = 5,  # Number of retries on failure
    retry_interval_seconds: float = 6.0,  # Interval between retries
    log_dir_path: PathLike | None = None,  # Log directory path
)
translate Function
Translate an EPUB file:
translate(
    source_path: PathLike | str,  # Source EPUB file path
    target_path: PathLike | str,  # Output EPUB file path
    target_language: str,  # Target language (e.g., "English", "Chinese")
    user_prompt: str | None = None,  # Custom translation instructions
    max_retries: int = 5,  # Maximum retries for failed translations
    max_group_tokens: int = 1200,  # Maximum tokens per translation group
    llm: LLM | None = None,  # Single LLM instance for both translation and filling
    translation_llm: LLM | None = None,  # LLM instance for translation (overrides llm)
    fill_llm: LLM | None = None,  # LLM instance for XML filling (overrides llm)
    on_progress: Callable[[float], None] | None = None,  # Progress callback (0.0-1.0)
    on_fill_failed: Callable[[FillFailedEvent], None] | None = None,  # Error callback
)
Note: Either llm or both translation_llm and fill_llm must be provided. Using separate LLMs allows for task-specific optimization.
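The rule in this note can be restated as a small check. This is an illustrative sketch of the documented behaviour, not the library's actual implementation: resolve_llms is a hypothetical name, and it assumes a task-specific instance takes precedence over llm when both are given, as the "overrides llm" comments suggest.

```python
from typing import Optional, Tuple, TypeVar

T = TypeVar("T")


def resolve_llms(
    llm: Optional[T] = None,
    translation_llm: Optional[T] = None,
    fill_llm: Optional[T] = None,
) -> Tuple[T, T]:
    """Return (translation_llm, fill_llm), falling back to `llm` for either slot."""
    translator = translation_llm if translation_llm is not None else llm
    filler = fill_llm if fill_llm is not None else llm
    if translator is None or filler is None:
        raise ValueError("Provide llm, or both translation_llm and fill_llm")
    return translator, filler
```

Under this reading, passing only translation_llm is an error, while passing llm together with fill_llm uses llm for translation and fill_llm for XML filling.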
Language Constants
EPUB Translator provides predefined language constants for convenience. You can use these constants instead of writing language names as strings:
from epub_translator import language

# Usage example:
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
)

# You can also use custom language strings:
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language="Icelandic",  # For languages not in the constants
    llm=llm,
)
Error Handling with on_fill_failed
Monitor and handle translation errors using the on_fill_failed callback:
from epub_translator import FillFailedEvent

def handle_fill_error(event: FillFailedEvent):
    print(f"Translation error (attempt {event.retried_count}):")
    print(f"  {event.error_message}")
    if event.over_maximum_retries:
        print("  Maximum retries exceeded!")

translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    llm=llm,
    on_fill_failed=handle_fill_error,
)
The FillFailedEvent contains:
- error_message: str - Description of the error
- retried_count: int - Current retry attempt number
- over_maximum_retries: bool - Whether the maximum number of retries has been exceeded
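Instead of printing each event, a callback can collect the failures that exhausted their retries so they can be reviewed once translation finishes. The sketch below is a hypothetical helper, not part of epub-translator. To stay runnable without the library it uses a stand-in dataclass, StandInEvent, carrying the three documented fields; real code would receive FillFailedEvent objects instead.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StandInEvent:
    """Stand-in with the three documented FillFailedEvent fields (demo only)."""
    error_message: str
    retried_count: int
    over_maximum_retries: bool


def make_failure_collector() -> Tuple[Callable[[StandInEvent], None], List[str]]:
    """Return a callback plus the list it appends exhausted-retry messages to."""
    exhausted: List[str] = []

    def on_fill_failed(event: StandInEvent) -> None:
        # Record only the failures where the retry budget was used up
        if event.over_maximum_retries:
            exhausted.append(event.error_message)

    return on_fill_failed, exhausted
```

The returned callback can be passed as on_fill_failed, and the list inspected after translate returns.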
Dual-LLM Architecture
Use separate LLM instances for translation and XML structure filling with different optimization parameters:
# Create two LLM instances with different temperatures
translation_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.8,  # Higher temperature for creative translation
)

fill_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.3,  # Lower temperature for structure preservation
)

translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language=language.ENGLISH,
    translation_llm=translation_llm,
    fill_llm=fill_llm,
)
Configuration Examples
OpenAI
llm = LLM(
    key="sk-...",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)
Azure OpenAI
llm = LLM(
    key="your-azure-key",
    url="https://your-resource.openai.azure.com/openai/deployments/your-deployment",
    model="gpt-4",
    token_encoding="o200k_base",
)
Other OpenAI-Compatible Services
Any service with an OpenAI-compatible API can be used:
llm = LLM(
    key="your-api-key",
    url="https://your-service.com/v1",
    model="your-model",
    token_encoding="o200k_base",  # Match your model's encoding
)
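Rather than hard-coding credentials, the four required LLM arguments can be read from the environment. This is a minimal sketch under stated assumptions: llm_kwargs_from_env is a hypothetical helper, and the EPUB_TRANSLATOR_* variable names are invented for this example; epub-translator itself does not define them.

```python
import os
from typing import Dict, Mapping


def llm_kwargs_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the four required LLM(...) keyword arguments from environment variables."""
    try:
        key = env["EPUB_TRANSLATOR_KEY"]  # hypothetical variable name
    except KeyError:
        raise RuntimeError("Set EPUB_TRANSLATOR_KEY before running") from None
    return {
        "key": key,
        # Defaults mirror the OpenAI example above; override them per service
        "url": env.get("EPUB_TRANSLATOR_URL", "https://api.openai.com/v1"),
        "model": env.get("EPUB_TRANSLATOR_MODEL", "gpt-4"),
        "token_encoding": env.get("EPUB_TRANSLATOR_ENCODING", "o200k_base"),
    }
```

The result can be unpacked directly, for example LLM(**llm_kwargs_from_env()).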
Use Cases
- Language Learning: Read books in their original language with side-by-side translations
- Academic Research: Access foreign literature with bilingual references
- Content Localization: Prepare books for international audiences
- Cross-Cultural Reading: Enjoy literature while understanding cultural nuances
Advanced Features
Custom Translation Prompts
Provide specific translation instructions:
translate(
    source_path=Path("source.epub"),
    target_path=Path("translated.epub"),
    target_language="English",
    llm=llm,
    user_prompt="Use formal language and preserve technical terminology",
)
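Because user_prompt is a plain string, it can be assembled from structured data such as a glossary. The helper below is a hypothetical sketch, not part of epub-translator; it only composes text, and how closely the model follows the glossary depends on the model.

```python
from typing import Mapping


def build_user_prompt(style: str, glossary: Mapping[str, str]) -> str:
    """Combine a style instruction with fixed term translations into one prompt."""
    lines = [style.strip()]
    if glossary:
        lines.append("Always translate these terms as follows:")
        lines.extend(f"- {source} -> {target}" for source, target in glossary.items())
    return "\n".join(lines)
```

The returned string can be passed to translate as user_prompt.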
Caching for Progress Recovery
Enable caching to resume translation progress after failures:
llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    cache_path="./translation_cache",  # Translations are cached here
)
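With caching enabled, recovering from a failure amounts to calling translate again with the same LLM configuration. A retry loop along these lines could automate that. It is a sketch only: run_with_resume is a hypothetical helper, and it assumes a rerun reuses the results already stored under cache_path, as this section describes.

```python
import time
from typing import Callable


def run_with_resume(
    run_translation: Callable[[], None],
    attempts: int = 3,
    wait_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call run_translation until it succeeds; return the number of attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            run_translation()
            return attempt
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(wait_seconds)
    raise ValueError("attempts must be at least 1")
```

Here run_translation would be a zero-argument function (for example a lambda) that calls translate with an llm created using cache_path.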
Related Projects
PDF Craft
PDF Craft converts PDF files into EPUB and other formats, with a focus on scanned books. Combine PDF Craft with EPUB Translator to convert and translate scanned PDF books into bilingual EPUB format.
Workflow: Scanned PDF → [PDF Craft] → EPUB → [EPUB Translator] → Bilingual EPUB
For a complete tutorial, watch: Convert scanned PDF books to EPUB format and translate them into bilingual books
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- Issues: GitHub Issues
- OOMOL Studio: Open in OOMOL Studio