A CLI tool for automatically translating manga pages from Japanese to English. Detects speech bubbles, extracts Japanese text using OCR, translates to English, and renders the translated text back onto images with proper alignment.

Manga Translation CLI

Fully automated and offline manga translation pipeline. Intelligently detects speech bubbles, extracts Japanese text with OCR, translates to English, and seamlessly renders translated text back onto pages with proper alignment and customizable fonts. Supports both single images and batch folder processing with GPU acceleration.

Showcase

Example 1: Complete Pipeline

[Figure: one page shown at four stages: Original, Detection, Cleaned, Translated]

Source: Magus of the Library

Example 2: Translation Result

[Figures: three pages, each shown as an Original and Translated pair]

Sources: Witch Hat Atelier; Ajin: Demi-Human; Frieren: Beyond Journey's End

Pipeline stages:

Original: Input manga page with Japanese text
Detection: YOLO model identifies speech bubble locations (green boxes)
Cleaned: Bubble interiors filled with base color, text removed
Translated: English text rendered within bubble shapes
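
As a rough illustration of the Cleaned stage, the sketch below fills a bubble interior with a single base color. It assumes the base color is the most frequent pixel value inside the bubble and represents a grayscale image as plain nested lists; clean_bubble and its arguments are hypothetical names, not part of manga-translator-cli.

```python
from collections import Counter


def clean_bubble(image, mask):
    """Fill masked pixels with the most common value found under the mask.

    image: list of rows of grayscale ints (0-255).
    mask:  list of rows of bools, True where the pixel is inside the bubble.
    Returns a new image; the input is left untouched.
    """
    inside = [
        px
        for row, mask_row in zip(image, mask)
        for px, m in zip(row, mask_row)
        if m
    ]
    if not inside:
        # Nothing to clean: return an unchanged copy.
        return [row[:] for row in image]

    # Assumption: the bubble background is the dominant value in the interior.
    base = Counter(inside).most_common(1)[0][0]
    return [
        [base if m else px for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


page = [[255, 255, 0], [255, 0, 255], [10, 20, 30]]
bubble = [[True, True, True], [True, True, True], [False, False, False]]
cleaned = clean_bubble(page, bubble)
```

A single fill color is also why textured or gradient bubble backgrounds tend to clean poorly.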

Disclaimer: Example images are from published manga and used for demonstration purposes only. All rights belong to their respective copyright holders. This tool is intended for personal use with legally obtained content.

Features

Automatic speech bubble detection using YOLO (YOLOv8m)
Japanese text extraction using PaddleOCR-VL transformer model
High-quality translation using Sugoi-v4 (specialized for Japanese→English)
Smart text rendering with automatic font sizing and alignment within bubble shapes
Custom font support for personalized text styling
Batch processing for entire folders with optimized GPU utilization
GPU acceleration with CUDA support for faster processing
Configurable detection with adjustable confidence and IoU thresholds
Intermediate outputs for debugging (bubble masks, cleaned images, detections)
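
Automatic font sizing can be pictured as a search for the largest size at which the wrapped text still fits. The sketch below is one possible version under simplifying assumptions: the bubble is treated as a rectangle, every glyph is assumed to be 0.6 × the font size wide, and lines are 1.2 × the font size tall. The names fits and largest_font_size are hypothetical; a real renderer would measure text with the actual font.

```python
import textwrap


def fits(text, size, box_w, box_h, char_w=0.6, line_h=1.2):
    """Return True if text, wrapped at this font size, fits in the box.

    Uses a crude fixed-width model instead of real font metrics.
    """
    max_chars = int(box_w // (char_w * size))
    if max_chars < 1:
        return False
    lines = textwrap.wrap(text, width=max_chars)
    return len(lines) * line_h * size <= box_h


def largest_font_size(text, box_w, box_h, lo=6, hi=72):
    """Binary search for the largest integer size in [lo, hi] that fits.

    Returns None when even the smallest size does not fit.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if fits(text, mid, box_w, box_h):
            best = mid      # mid works; try something larger
            lo = mid + 1
        else:
            hi = mid - 1    # mid is too big; try something smaller
    return best


size = largest_font_size("Hello there", box_w=100, box_h=40)
```

Binary search applies because fitting is monotone here: if a size does not fit, no larger size will.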

Installation

Requirements

Python 3.13 or higher
UV package manager (recommended) or pip (currently untested)

For uv installation, visit: https://github.com/astral-sh/uv

Install from PyPI

GPU Installation (CUDA 12.8)

Installs PyTorch with CUDA support for GPU acceleration. Requires a CUDA-compatible NVIDIA GPU.

Using uv (recommended):

uv tool install "manga-translator-cli[cuda]" --index https://download.pytorch.org/whl/cu128 --index-strategy unsafe-best-match

Using pip:

pip install "manga-translator-cli[cuda]" --extra-index-url https://download.pytorch.org/whl/cu128

CPU-Only Installation

For systems without a GPU or to save disk space.

Using uv (recommended):

uv tool install "manga-translator-cli[cpu]" --index https://download.pytorch.org/whl/cpu --index-strategy unsafe-best-match

Using pip:

pip install "manga-translator-cli[cpu]" --extra-index-url https://download.pytorch.org/whl/cpu

Install from Source

For development or to use the latest unreleased changes.

Clone the repository:

git clone https://github.com/zanbowie138/manga-translator-cli.git
cd manga-translator-cli

Install with your preferred backend:

GPU (CUDA 12.8):

uv sync --extra cuda

CPU-only:

uv sync --extra cpu

Models

Models will be automatically downloaded on first use:

YOLO model for bubble detection
PaddleOCR-VL model for text extraction
Sugoi-v4 model for translation

Usage

Single Image Translation

manga-translate input/page1.png

Folder Translation (batch mode recommended)

manga-translate input/ --batch

Common Options

Change output folder:

manga-translate input/page1.png --output folder --save-all

Save all intermediate outputs:

manga-translate input/page1.png --save-all

Use custom font:

manga-translate input/page1.png --font "fonts/CC Astro City Int Regular.ttf"

Adjust detection sensitivity:

manga-translate input/page1.png --conf-threshold 0.3 --iou-threshold 0.5
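
To make the two thresholds concrete, here is a generic sketch of confidence filtering followed by non-maximum suppression (NMS). It illustrates what the thresholds control and is not the detector's actual code; boxes are assumed to be (x1, y1, x2, y2) tuples, and iou and nms are hypothetical helper names.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, conf_threshold=0.25, iou_threshold=0.45):
    """Keep confident boxes, dropping ones that overlap a stronger box.

    detections: list of (box, score) pairs.
    """
    # 1. Discard low-confidence detections, then rank strongest first.
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    # 2. Greedily keep a box only if it does not overlap a kept box too much.
    kept = []
    for box, score in candidates:
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),    # overlaps the first box heavily
    ((50, 50, 60, 60), 0.2),  # below the default confidence threshold
    ((20, 20, 30, 30), 0.5),
]
default_result = nms(detections)
loose_result = nms(detections, conf_threshold=0.1, iou_threshold=0.7)
```

Lowering the confidence threshold admits more candidate bubbles; raising the IoU threshold lets more overlapping boxes survive.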

Force CPU mode (GPU is used by default if available):

manga-translate input/page1.png --device cpu
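
The auto-detect behaviour can be approximated as below. This is a sketch, not the tool's code: pick_device is a hypothetical name, and the fallback to "cpu" when PyTorch is not importable is an assumption made so the snippet runs anywhere.

```python
def pick_device(requested=None):
    """Return "cpu" or "cuda", honouring an explicit request if given."""
    if requested in ("cpu", "cuda"):
        return requested
    try:
        import torch  # optional here; assumed present in a real install
    except ImportError:
        return "cpu"
    # torch.cuda.is_available() reports whether a usable CUDA device exists.
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
```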

Quiet mode:

manga-translate input/page1.png --quiet

Available Options

--output, -o: Output folder path (default: output)
--folder, -f: Process entire folder instead of single file
--conf-threshold: Confidence threshold for bubble detection (0-1, default: 0.25)
--iou-threshold: IoU threshold for NMS (0-1, default: 0.45)
--font: Path to font file for translated text
--device: Device used for both OCR (text extraction) and translation (cpu or cuda; default: auto-detect, using cuda if available)
--save-all: Save all intermediate outputs
--save-speech-bubbles: Save annotated detection images
--save-bubble-interiors: Save bubble interior visualizations
--save-cleaned: Save cleaned images before text drawing
--quiet, -q: Suppress progress messages
--stop-on-error: Stop processing on first error (folder mode)

For complete list of options:

manga-translate --help

Output Structure

When processing files, outputs are organized in subdirectories:

translated/: Final translated images (always saved)
speech_bubbles/: Annotated images with detected bubbles (enabled with --save-speech-bubbles)
bubble_interiors/: Visualization of bubble interiors (enabled with --save-bubble-interiors)
cleaned/: Images with bubbles filled before text rendering (enabled with --save-cleaned)

Use --save-all to enable all intermediate outputs at once.

Example output structure:

output/
├── translated/
│   ├── page1.png
│   └── page2.png
├── speech_bubbles/      # if --save-speech-bubbles or --save-all
│   ├── page1.png
│   └── page2.png
├── bubble_interiors/    # if --save-bubble-interiors or --save-all
│   ├── page1.png
│   └── page2.png
└── cleaned/             # if --save-cleaned or --save-all
    ├── page1.png
    └── page2.png
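
The mapping from flags to output files can be summarised in a few lines. The sketch below mirrors the layout documented above; output_paths is a hypothetical helper, and it assumes each output keeps the input file's name, as in the example tree.

```python
from pathlib import Path


def output_paths(image, output="output", save_speech_bubbles=False,
                 save_bubble_interiors=False, save_cleaned=False,
                 save_all=False):
    """Return {subdirectory: path} for the files one input image produces."""
    name = Path(image).name
    root = Path(output)
    # translated/ is always written.
    paths = {"translated": root / "translated" / name}
    optional = {
        "speech_bubbles": save_speech_bubbles,
        "bubble_interiors": save_bubble_interiors,
        "cleaned": save_cleaned,
    }
    for subdir, enabled in optional.items():
        if save_all or enabled:
            paths[subdir] = root / subdir / name
    return paths


only_cleaned = output_paths("input/page1.png", save_cleaned=True)
everything = output_paths("input/page1.png", output="out", save_all=True)
```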

Dependencies

ultralytics: YOLO model for bubble detection
transformers: PaddleOCR-VL for text extraction
ctranslate2: Fast translation inference
sentencepiece: Text tokenization
torch: Deep learning framework
opencv-python: Image processing
pillow: Image manipulation

How It Works

Loads a YOLO model to detect speech bubbles in the image
Filters out parent boxes that contain smaller child boxes
For each bubble, extracts Japanese text using PaddleOCR-VL
Checks whether the extracted text contains Japanese characters
Translates Japanese text to English using Sugoi-v4
Cleans bubble interiors by filling with base color
Renders translated text within bubble shapes using binary search for optimal font size
Saves the final translated image
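
Two of the steps above lend themselves to short sketches: dropping parent boxes that contain smaller child boxes, and checking whether OCR output contains Japanese characters. The code below is a guess at how such checks could look, not the project's implementation. Boxes are assumed to be (x1, y1, x2, y2) tuples, the Unicode ranges cover hiragana, katakana, and the main CJK ideograph block, and drop_parent_boxes and has_japanese are hypothetical names.

```python
import re


def area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def contains(outer, inner):
    """True if inner lies entirely within outer (edges may touch)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def drop_parent_boxes(boxes):
    """Remove any box that fully contains a strictly smaller box."""
    return [
        box
        for i, box in enumerate(boxes)
        if not any(
            i != j and contains(box, other) and area(other) < area(box)
            for j, other in enumerate(boxes)
        )
    ]


# Hiragana, katakana, and CJK Unified Ideographs (kanji, shared with Chinese).
JAPANESE = re.compile(r"[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF]")


def has_japanese(text):
    """True if text contains at least one kana or CJK ideograph."""
    return bool(JAPANESE.search(text))


boxes = [(0, 0, 100, 100), (10, 10, 40, 40), (50, 50, 90, 90), (200, 200, 250, 250)]
children_only = drop_parent_boxes(boxes)
```

The strict area comparison keeps exact duplicate boxes from eliminating each other.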

Limitations

Bubble detection is unreliable on dense panels with overlapping bubbles
Text outside of bubbles won't be translated
Complex bubble backgrounds may not fill cleanly
Currently Japanese→English only; other languages not supported

Known Issues

Bubble detection sometimes fails on compound bubbles, so they are not processed properly.
Translation of text outside bubbles is not currently supported.

Please open an issue if you encounter problems.

Contributing

Contributions welcome! To contribute:

Fork the repository
Create a feature branch (git checkout -b feature/improvement)
Make your changes
Run tests if applicable
Commit with clear messages (git commit -m "Add feature")
Push to your fork (git push origin feature/improvement)
Open a Pull Request

Development setup:

git clone <your-fork>
cd manga-translator-cli
uv sync --extra cuda        # GPU (use --extra cpu for CPU-only)
uv tool install ".[cuda]"   # GPU (use ".[cpu]" for CPU-only)

Areas for contribution:

Improved bubble detection algorithms
Fix bubble detection for compound bubbles
Improved translation accuracy
Translation for text outside of bubbles
Support for additional languages
UI/web interface
Performance optimizations
Documentation improvements

Credits

Models:

YOLOv8m Manga Bubbles by Oguzhan61 - Speech bubble detection
PaddleOCR-VL For Manga by jzhang533 - Japanese text extraction
Sugoi-v4 JA-EN by entai2965 - Japanese to English translation

Libraries:

Ultralytics - YOLO implementation
Transformers - HuggingFace transformers library
CTranslate2 - Fast inference engine
PyTorch - Deep learning framework
UV - Fast Python package manager

Notes

First run downloads models (can take several minutes)
Translation quality depends on text clarity and font style
GPU requires CUDA-compatible NVIDIA GPU + drivers
--device controls both OCR and translation device
Supported formats: PNG, JPG, JPEG, WEBP
To switch the PyTorch backend, reinstall with the [cuda] or [cpu] extra as shown in the Installation section
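
For folder processing, the supported formats amount to a filter on file extensions. A minimal sketch, assuming the check is case-insensitive (the notes do not say) and using the hypothetical names is_supported and find_images:

```python
from pathlib import Path

# Extensions listed under "Supported formats" above.
SUPPORTED = {".png", ".jpg", ".jpeg", ".webp"}


def is_supported(path):
    """True if the file extension is one of the supported image formats."""
    return Path(path).suffix.lower() in SUPPORTED


def find_images(folder):
    """Return supported image files directly inside folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir() if p.is_file() and is_supported(p)
    )
```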

Project details

Release history

1.1.1

Feb 1, 2026

This version

1.1.0

Feb 1, 2026

1.0.4

Feb 1, 2026

1.0.3

Feb 1, 2026

1.0.2

Feb 1, 2026

1.0.1

Feb 1, 2026

1.0.0

Feb 1, 2026

Download files

Source Distribution

manga_translator_cli-1.1.0.tar.gz (8.8 MB)

Uploaded Feb 1, 2026 Source

Built Distribution

manga_translator_cli-1.1.0-py3-none-any.whl (84.1 kB)

Uploaded Feb 1, 2026 Python 3

File details

Details for the file manga_translator_cli-1.1.0.tar.gz.

File metadata

Download URL: manga_translator_cli-1.1.0.tar.gz
Upload date: Feb 1, 2026
Size: 8.8 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.28

File hashes

Hashes for manga_translator_cli-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`003500ca34f9a884ddcefdfe56b94d73e263a238dba60e077c3143b354c79243`
MD5	`f49ffecc2296baccd8616d3385208638`
BLAKE2b-256	`6a74e1ee7eacddedd18a64dbb3d34bd851c7f609865934f7940cc162b7e8c4ab`

File details

Details for the file manga_translator_cli-1.1.0-py3-none-any.whl.

File metadata

Download URL: manga_translator_cli-1.1.0-py3-none-any.whl
Upload date: Feb 1, 2026
Size: 84.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.28

File hashes

Hashes for manga_translator_cli-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bcb9f27b7bd7be24f106e20ae676792a117ea3ecaa88806b947cc74d5191d0e0`
MD5	`e2e5163f44115dda9c9d5e746c3fe531`
BLAKE2b-256	`d9ac32340a888da4be5d3c8adb337628d3b045a1ea88ac91a56976f07786319d`

