Translate EPUB books with an LLM. The translated book keeps the original text and places each translation next to it.

EPUB Translator


Translate EPUB books using Large Language Models while preserving the original text. The translated content is displayed side-by-side with the original, creating bilingual books perfect for language learning and cross-reference reading.

Features

  • Bilingual Output: Preserves original text alongside translations for easy comparison
  • LLM-Powered: Leverages large language models for high-quality, context-aware translations
  • Format Preservation: Maintains EPUB structure, styles, images, and formatting
  • Complete Translation: Translates chapter content, table of contents, and metadata
  • Progress Tracking: Monitor translation progress with built-in callbacks
  • Flexible LLM Support: Works with any OpenAI-compatible API endpoint
  • Caching: Built-in caching for progress recovery when translation fails

Installation

pip install epub-translator

Requirements: Python 3.11, 3.12, or 3.13
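
A quick way to confirm that an interpreter matches the versions listed above is sketched below. The helper name `is_supported` is made up for this illustration and is not part of the package.

```python
import sys

# (major, minor) pairs copied from the Requirements line above.
SUPPORTED_MINORS = {(3, 11), (3, 12), (3, 13)}


def is_supported(version: tuple[int, int]) -> bool:
    """Return True if the (major, minor) pair is one of the listed versions."""
    return tuple(version[:2]) in SUPPORTED_MINORS


if __name__ == "__main__":
    current = sys.version_info[:2]
    status = "listed as supported" if is_supported(current) else "not listed as supported"
    print(f"Python {current[0]}.{current[1]} is {status}")
```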

Quick Start

Using OOMOL Studio (Recommended)

The easiest way to use EPUB Translator is through OOMOL Studio, which provides a visual interface. A video tutorial is available.

Using Python API

from epub_translator import LLM, translate, language, SubmitKind

# Initialize LLM with your API credentials
llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)

# Translate EPUB file using language constants
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
)

With Progress Tracking

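from tqdm import tqdm

from epub_translator import SubmitKind, translate

# `llm` is the LLM instance created in the previous example.
with tqdm(total=100, desc="Translating", unit="%") as pbar:
    # Keep the last value in a mutable holder: `nonlocal` only works inside
    # a nested function and raises a SyntaxError at module level.
    state = {"last_progress": 0.0}

    def on_progress(progress: float) -> None:
        # `progress` is cumulative (0.0-1.0); tqdm expects increments.
        pbar.update((progress - state["last_progress"]) * 100)
        state["last_progress"] = progress

    translate(
        source_path="source.epub",
        target_path="translated.epub",
        target_language="English",
        submit=SubmitKind.APPEND_BLOCK,
        llm=llm,
        on_progress=on_progress,
    )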
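
If tqdm is not available, the same callback slot can drive a plain reporter. The sketch below assumes only what the example above shows, namely that on_progress receives a cumulative fraction between 0.0 and 1.0. The helper `make_throttled_reporter` is a hypothetical name; it forwards a percentage only when progress crosses a new step.

```python
from typing import Callable


def make_throttled_reporter(
    report: Callable[[int], None],
    step_percent: int = 10,
) -> Callable[[float], None]:
    """Wrap `report` so it fires only when progress crosses a new step."""
    last_reported = -1  # index of the highest step reported so far

    def on_progress(progress: float) -> None:
        nonlocal last_reported  # valid here: the binding lives in the enclosing function
        clamped = min(max(progress, 0.0), 1.0)  # guard against rounding noise
        step_index = int(clamped * 100) // step_percent
        if step_index > last_reported:
            last_reported = step_index
            report(step_index * step_percent)

    return on_progress


# Demo: feed a few cumulative values and collect what gets reported.
seen: list[int] = []
reporter = make_throttled_reporter(seen.append, step_percent=25)
for value in (0.0, 0.1, 0.26, 0.3, 0.5, 0.99, 1.0):
    reporter(value)
print(seen)
```

To use it with a real run, pass the returned function as on_progress, for example `on_progress=make_throttled_reporter(lambda pct: print(f"{pct}%"))`.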

API Reference

LLM Class

Initialize the LLM client for translation:

LLM(
    key: str,                          # API key
    url: str,                          # API endpoint URL
    model: str,                        # Model name (e.g., "gpt-4")
    token_encoding: str,               # Token encoding (e.g., "o200k_base")
    cache_path: PathLike | None = None,           # Cache directory path
    timeout: float | None = None,                  # Request timeout in seconds
    top_p: float | tuple[float, float] | None = None,
    temperature: float | tuple[float, float] | None = None,
    retry_times: int = 5,                         # Number of retries on failure
    retry_interval_seconds: float = 6.0,          # Interval between retries
    log_dir_path: PathLike | None = None,         # Log directory path
)
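
To make the two retry parameters concrete, here is a minimal sketch of a fixed-interval retry loop. It is an illustration only: this README does not say which errors the library retries or whether the interval is fixed, so the loop below is an assumption, and `call_with_retries` is a hypothetical helper rather than part of the package.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[], T],
    retry_times: int = 5,
    retry_interval_seconds: float = 6.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying up to `retry_times` times after a failure."""
    attempt = 0
    while True:
        try:
            return request()
        except Exception:
            if attempt >= retry_times:
                raise  # retries exhausted: surface the last error
            attempt += 1
            sleep(retry_interval_seconds)


# Demo with a fake request that fails twice, then succeeds.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


waits: list[float] = []  # records the waits instead of actually sleeping
result = call_with_retries(flaky, retry_times=5, retry_interval_seconds=6.0, sleep=waits.append)
print(result, calls["count"], waits)
```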

translate Function

Translate an EPUB file:

translate(
    source_path: PathLike | str,       # Source EPUB file path
    target_path: PathLike | str,       # Output EPUB file path
    target_language: str,              # Target language (e.g., "English", "Chinese")
    submit: SubmitKind,                # How to insert translations (REPLACE, APPEND_TEXT, or APPEND_BLOCK)
    user_prompt: str | None = None,    # Custom translation instructions
    max_retries: int = 5,              # Maximum retries for failed translations
    max_group_tokens: int = 1200,      # Maximum tokens per translation group
    llm: LLM | None = None,            # Single LLM instance for both translation and filling
    translation_llm: LLM | None = None,  # LLM instance for translation (overrides llm)
    fill_llm: LLM | None = None,       # LLM instance for XML filling (overrides llm)
    on_progress: Callable[[float], None] | None = None,  # Progress callback (0.0-1.0)
    on_fill_failed: Callable[[FillFailedEvent], None] | None = None,  # Error callback
)

Note: Either llm or both translation_llm and fill_llm must be provided. Using separate LLMs allows for task-specific optimization.
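
The rule in the note can be written out as a small resolver. This is a sketch of the documented behavior (the specific arguments override llm), not the library's actual code; `resolve_llms` is a hypothetical name, and plain strings stand in for LLM instances.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_llms(
    llm: Optional[T] = None,
    translation_llm: Optional[T] = None,
    fill_llm: Optional[T] = None,
) -> tuple[T, T]:
    """Return (translation, fill), falling back to `llm` for whichever is missing."""
    chosen_translation = translation_llm if translation_llm is not None else llm
    chosen_fill = fill_llm if fill_llm is not None else llm
    if chosen_translation is None or chosen_fill is None:
        raise ValueError("Provide llm, or both translation_llm and fill_llm")
    return chosen_translation, chosen_fill


print(resolve_llms(llm="shared"))
print(resolve_llms(llm="shared", fill_llm="strict"))
```

Passing only translation_llm, with neither llm nor fill_llm, raises ValueError in this sketch because nothing is left to handle the filling step.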

Submit Modes

The submit parameter controls how translated content is inserted into the document. Use SubmitKind enum to specify the insertion mode:

from epub_translator import SubmitKind

# Three available modes:
# - SubmitKind.REPLACE: Replace original content with translation (single-language output)
# - SubmitKind.APPEND_TEXT: Append translation as inline text (bilingual output)
# - SubmitKind.APPEND_BLOCK: Append translation as block elements (bilingual output, recommended)

Mode Comparison:

  • SubmitKind.REPLACE: Creates a single-language translation by replacing original text with translated content. Useful for creating books in the target language only.

  • SubmitKind.APPEND_TEXT: Appends translations as inline text immediately after the original content. Both languages appear in the same paragraph, creating a continuous reading flow.

  • SubmitKind.APPEND_BLOCK (Recommended): Appends translations as separate block elements (paragraphs) after the original. This creates clear visual separation between languages, making it ideal for side-by-side bilingual reading.
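
The difference between the three modes is easiest to see on a single paragraph. The sketch below imitates the described outcomes with the standard library's ElementTree; it does not reproduce how the library edits EPUB content. The `Mode` enum is a hypothetical stand-in for SubmitKind, and the single-space separator in the inline mode is an assumption.

```python
import copy
import xml.etree.ElementTree as ET
from enum import Enum, auto


class Mode(Enum):  # hypothetical stand-in for SubmitKind
    REPLACE = auto()
    APPEND_TEXT = auto()
    APPEND_BLOCK = auto()


def apply_mode(body: ET.Element, translations: list[str], mode: Mode) -> ET.Element:
    """Build a new body in which each <p> is combined with its translation."""
    result = ET.Element(body.tag)
    for paragraph, translated in zip(body.findall("p"), translations):
        if mode is Mode.REPLACE:
            # Only the translation survives.
            ET.SubElement(result, "p").text = translated
        elif mode is Mode.APPEND_TEXT:
            # Original and translation share one paragraph.
            ET.SubElement(result, "p").text = f"{paragraph.text} {translated}"
        else:
            # Original paragraph, then the translation as its own block.
            result.append(copy.deepcopy(paragraph))
            ET.SubElement(result, "p").text = translated
    return result


body = ET.fromstring("<body><p>Bonjour.</p></body>")
outputs = {
    mode.name: ET.tostring(apply_mode(body, ["Hello."], mode), encoding="unicode")
    for mode in Mode
}
for name, markup in outputs.items():
    print(name, markup)
```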

Example:

# For bilingual books (recommended)
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
)

# For single-language translation
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.REPLACE,
    llm=llm,
)

Language Constants

EPUB Translator provides predefined language constants for convenience. You can use these constants instead of writing language names as strings:

from epub_translator import language

# Usage example:
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
)

# You can also use custom language strings:
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language="Icelandic",  # For languages not in the constants
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
)

Error Handling with on_fill_failed

Monitor and handle translation errors using the on_fill_failed callback:

from epub_translator import FillFailedEvent

def handle_fill_error(event: FillFailedEvent):
    print(f"Translation error (attempt {event.retried_count}):")
    print(f"  {event.error_message}")
    if event.over_maximum_retries:
        print("  Maximum retries exceeded!")

translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
    on_fill_failed=handle_fill_error,
)

The FillFailedEvent contains:

  • error_message: str - Description of the error
  • retried_count: int - Current retry attempt number
  • over_maximum_retries: bool - Whether max retries has been exceeded
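
Instead of printing, a callback can collect failures for a summary at the end of the run. The sketch below relies only on the three fields listed above. `FakeFillFailedEvent` is a stand-in so the example runs without the package, and `FillFailureLog` is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class FakeFillFailedEvent:  # stand-in mirroring the three documented fields
    error_message: str
    retried_count: int
    over_maximum_retries: bool


@dataclass
class FillFailureLog:
    """Callable that counts retried failures and keeps the fatal ones."""

    retries: int = 0
    fatal_messages: list[str] = field(default_factory=list)

    def __call__(self, event) -> None:
        if event.over_maximum_retries:
            self.fatal_messages.append(event.error_message)
        else:
            self.retries += 1


log = FillFailureLog()
log(FakeFillFailedEvent("tag mismatch", 1, False))
log(FakeFillFailedEvent("tag mismatch", 2, False))
log(FakeFillFailedEvent("gave up on paragraph", 5, True))
print(log.retries, log.fatal_messages)
```

With the package installed, an instance can be passed directly as `on_fill_failed=log` and inspected after translate returns.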

Dual-LLM Architecture

Use separate LLM instances for translation and XML structure filling with different optimization parameters:

# Create two LLM instances with different temperatures
translation_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.8,  # Higher temperature for creative translation
)

fill_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.3,  # Lower temperature for structure preservation
)

translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    translation_llm=translation_llm,
    fill_llm=fill_llm,
)

Configuration Examples

OpenAI

llm = LLM(
    key="sk-...",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)

Azure OpenAI

llm = LLM(
    key="your-azure-key",
    url="https://your-resource.openai.azure.com/openai/deployments/your-deployment",
    model="gpt-4",
    token_encoding="o200k_base",
)

Other OpenAI-Compatible Services

Any service with an OpenAI-compatible API can be used:

llm = LLM(
    key="your-api-key",
    url="https://your-service.com/v1",
    model="your-model",
    token_encoding="o200k_base",  # Match your model's encoding
)
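
To keep credentials out of source files, the four required arguments can be read from the environment. This is one possible convention, not something the package defines: the `EPUB_TRANSLATOR_*` variable names and the `llm_settings_from_env` helper are invented for this sketch, and the defaults repeat the OpenAI example above.

```python
import os
from typing import Mapping


def llm_settings_from_env(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the required LLM(...) keyword arguments from environment variables."""
    try:
        key = env["EPUB_TRANSLATOR_API_KEY"]  # hypothetical variable name
    except KeyError:
        raise RuntimeError("Set EPUB_TRANSLATOR_API_KEY before running") from None
    return {
        "key": key,
        "url": env.get("EPUB_TRANSLATOR_URL", "https://api.openai.com/v1"),
        "model": env.get("EPUB_TRANSLATOR_MODEL", "gpt-4"),
        "token_encoding": env.get("EPUB_TRANSLATOR_ENCODING", "o200k_base"),
    }


# Demo with an explicit mapping instead of the real environment.
settings = llm_settings_from_env({"EPUB_TRANSLATOR_API_KEY": "sk-test"})
print(settings["url"], settings["model"])
# With the package installed: llm = LLM(**settings)
```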

Use Cases

  • Language Learning: Read books in their original language with side-by-side translations
  • Academic Research: Access foreign literature with bilingual references
  • Content Localization: Prepare books for international audiences
  • Cross-Cultural Reading: Enjoy literature while understanding cultural nuances

Advanced Features

Custom Translation Prompts

Provide specific translation instructions:

translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language="English",
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
    user_prompt="Use formal language and preserve technical terminology",
)

Caching for Progress Recovery

Enable caching to resume translation progress after failures:

llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    cache_path="./translation_cache",  # Translations are cached here
)
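
The idea behind progress recovery can be sketched with a file cache keyed by a hash of the request: a rerun finds completed work on disk and skips the call. The library's actual cache layout is not described here, so the file naming and the `cached_call` helper below are assumptions made for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def cached_call(cache_dir: Path, request: dict, compute: Callable[[dict], str]) -> str:
    """Return the stored response for `request`, computing and saving it on a miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    payload = json.dumps(request, sort_keys=True).encode("utf-8")
    entry = cache_dir / f"{hashlib.sha256(payload).hexdigest()}.txt"
    if entry.exists():
        return entry.read_text(encoding="utf-8")
    response = compute(request)
    entry.write_text(response, encoding="utf-8")
    return response


# Demo: the fake model is only called once for two identical requests.
model_calls: list[dict] = []


def fake_llm(request: dict) -> str:
    model_calls.append(request)
    return request["text"].upper()


with tempfile.TemporaryDirectory() as tmp:
    first = cached_call(Path(tmp), {"text": "hello"}, fake_llm)
    second = cached_call(Path(tmp), {"text": "hello"}, fake_llm)
print(first, second, len(model_calls))
```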

Related Projects

PDF Craft

PDF Craft converts PDF files into EPUB and other formats, with a focus on scanned books. Combine PDF Craft with EPUB Translator to convert and translate scanned PDF books into bilingual EPUB format.

Workflow: Scanned PDF → [PDF Craft] → EPUB → [EPUB Translator] → Bilingual EPUB

For a complete tutorial, watch: Convert scanned PDF books to EPUB format and translate them into bilingual books

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
