Local OCR extraction + LLM text correction. PaddleOCR for recognition, Ollama for correction, fully offline.

RenZi 认字

Local OCR extraction + LLM text correction. PaddleOCR recognizes text in images, a local Ollama model corrects OCR mistakes, and an optional Flask web UI ties it together. Everything runs offline — no cloud APIs.

What It Does

RenZi takes images containing Chinese/English text and produces clean, machine-readable text in three stages:

OCR — PaddleOCR (PP-OCRv5) extracts text plus per-region bounding boxes and confidence scores.
Visualization — renders an annotated image (green boxes) and a reconstructed text bitmap that preserves the original layout on a white canvas.
Correction — sends the raw OCR text to a local Ollama model (gemma3:1b by default) that fixes typos and misrecognized characters while keeping the original language and line structure.
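To make the layout-preserving step above more concrete, here is a minimal sketch of how OCR regions could be grouped into reading order. It is not RenZi's actual implementation: the function name `reading_order`, the `line_tolerance` parameter, and the `[x_min, y_min, x_max, y_max]` bbox format are assumptions made for illustration.

```python
from typing import Dict, List


def reading_order(items: List[Dict], line_tolerance: float = 10.0) -> List[str]:
    """Group OCR items into text lines, top to bottom and left to right.

    Assumes each item has "text" and a "bbox" of [x_min, y_min, x_max, y_max];
    the real bbox format used by RenZi may differ.
    """
    # Sort by the top edge so rows are visited from top to bottom.
    ordered = sorted(items, key=lambda it: it["bbox"][1])
    rows: List[List[Dict]] = []
    for item in ordered:
        top = item["bbox"][1]
        # Join the current row if the top edge is close to the row's first box.
        if rows and abs(top - rows[-1][0]["bbox"][1]) <= line_tolerance:
            rows[-1].append(item)
        else:
            rows.append([item])
    # Within each row, order boxes by their left edge and join the text.
    return [
        " ".join(it["text"] for it in sorted(row, key=lambda it: it["bbox"][0]))
        for row in rows
    ]


items = [
    {"text": "world", "confidence": 0.97, "bbox": [120, 12, 200, 40]},
    {"text": "Hello", "confidence": 0.99, "bbox": [10, 10, 100, 40]},
    {"text": "RenZi", "confidence": 0.95, "bbox": [10, 60, 90, 90]},
]
lines = reading_order(items)
print(lines)
```

A fixed pixel tolerance is the simplest choice; scaling it by the box height would likely work better on images with mixed font sizes.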

Supported formats: .png, .jpg, .jpeg, .bmp, .tiff, .tif, .webp.
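If you collect input files yourself before calling RenZi, a small filter over the suffixes listed above may help. This is a sketch only: `is_supported` and `select_images` are hypothetical helper names, and the case-insensitive match is an assumption about how RenZi treats extensions.

```python
from pathlib import Path
from typing import Iterable, List

# Suffixes taken from the supported formats list above.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".tif", ".webp"}


def is_supported(path: str) -> bool:
    """Return True if the file suffix is in the supported set (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def select_images(paths: Iterable[str]) -> List[str]:
    """Keep only supported image paths, sorted for a stable processing order."""
    return sorted(p for p in paths if is_supported(p))


candidates = ["scan.PNG", "notes.txt", "page.tif", "photo.jpeg", "archive.zip"]
selected = select_images(candidates)
print(selected)
```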

What it does not do: streaming OCR, mobile capture, cloud OCR, translation between languages.

Features

Single image or whole-directory batch OCR
PaddleOCR 3.6 + PP-OCRv5 server detection/recognition models
Optional annotated image and reconstructed text bitmap PNGs
Local Ollama LLM correction (offline, no API keys)
Flask web UI with drag-and-drop upload and side-by-side raw/corrected text
Unified Python API with ToolResult dataclass
OpenAI function-calling tools schema for agent integration
CLI with unified flags (-V -v -q --json -o)

Requirements

Python >= 3.9
PaddleOCR 3.6.0 + PaddlePaddle 3.0.0 (see install notes below)
Ollama running locally with a model pulled (e.g. ollama pull gemma3:1b)
A CJK TrueType font for bitmap rendering (auto-detected on Windows/macOS/Linux)

PaddlePaddle version pin

The known-good combination is PaddlePaddle 3.0.0 + PaddleOCR 3.6.0. Newer PaddlePaddle releases (3.1 and later) break loading of older models through the PIR executor, so pin the version and install it from the Baidu mirror:

pip install paddlepaddle==3.0.0 -i https://mirror.baidu.com/pypi/simple
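One way to confirm that the pinned combination is actually installed is to compare the installed distribution versions against the known-good pair. The sketch below uses the standard `importlib.metadata` module; the helper names `installed_version` and `version_mismatches` are hypothetical and not part of RenZi.

```python
from importlib import metadata
from typing import Dict, Optional

# Known-good pair stated in this README.
KNOWN_GOOD = {"paddlepaddle": "3.0.0", "paddleocr": "3.6.0"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(installed: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Describe every package whose version differs from the known-good pin."""
    problems = {}
    for package, expected in KNOWN_GOOD.items():
        found = installed.get(package)
        if found != expected:
            problems[package] = f"expected {expected}, found {found or 'not installed'}"
    return problems


current = {name: installed_version(name) for name in KNOWN_GOOD}
report = version_mismatches(current)
print(report or "versions match the known-good combination")
```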

Installation

pip install -e .

For development dependencies:

pip install -e ".[dev]"

Ollama setup (for correction)

ollama pull gemma3:1b
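Once the model is pulled, correction amounts to a POST against the local Ollama HTTP API. The sketch below builds such a request for Ollama's `/api/generate` endpoint on its default port 11434. The prompt wording and the helper names `build_correction_request` and `send` are assumptions for illustration; RenZi's own prompt and client code may look different.

```python
import json
import urllib.request


def build_correction_request(text, model="gemma3:1b", base_url="http://localhost:11434"):
    """Build (but do not send) a non-streaming generate request for Ollama."""
    # Hypothetical prompt; the prompt RenZi actually uses is not documented here.
    prompt = (
        "Fix OCR typos and misrecognized characters in the text below. "
        "Keep the original language and line breaks. "
        "Return only the corrected text.\n\n" + text
    )
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60.0):
    """Send the request and return the generated text (needs a running Ollama)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]


request = build_correction_request("He1lo wor1d")
print(request.full_url)
```

Calling `send(request)` only works while the Ollama server is running locally; building the request is separated from sending it so that the payload can be inspected offline.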

Quick Start

Recognize one image and print text:

renzi photo.jpg

Batch a directory and write .txt transcripts:

renzi images/ -l ch -o transcripts/

Full pipeline with annotated image + text bitmap + corrected text file:

renzi scan.png --pipeline --bbox annotated.png --bitmap bitmap.png -o fixed.txt

Correct OCR text piped on stdin:

echo "我在吉林省、长春市" | renzi --correct -

Launch the web UI:

python -m renzi.web
# open http://127.0.0.1:8765

CLI Usage

renzi [-V] [-v] [-q] [--json] [-l LANG] [--confidence N] [-o PATH]
     [--bbox PATH] [--bitmap PATH] [--ollama-base URL] [--ollama-model NAME]
     [--timeout S] [--correct] [--pipeline] [PATH]

Flag	Meaning
`-V`, `--version`	Print version and exit
`-v`, `--verbose`	Show per-item confidence and geometry
`-q`, `--quiet`	Suppress non-essential output
`--json`	Output results as JSON
`-l`, `--lang`	OCR language hint (default `ch`)
`--confidence`	Minimum OCR confidence (default `0.5`)
`-o`, `--output`	Output file (single) or directory (batch)
`--bbox`	Write annotated image PNG to this path
`--bitmap`	Write reconstructed text bitmap PNG to this path
`--ollama-base`	Ollama API base URL
`--ollama-model`	Ollama model name
`--timeout`	LLM request timeout in seconds
`--correct`	Correct text from stdin instead of running OCR
`--pipeline`	Run OCR + render + LLM correction
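When driving the CLI from a script, it can be convenient to assemble the argument list from the flags in the table above. The following is a sketch under that assumption; `renzi_command` is a hypothetical helper, not something shipped with RenZi, and it only uses flags documented in this section.

```python
import shlex
from typing import List, Optional


def renzi_command(
    path: str,
    *,
    lang: str = "ch",
    confidence: float = 0.5,
    output: Optional[str] = None,
    bbox: Optional[str] = None,
    bitmap: Optional[str] = None,
    pipeline: bool = False,
    as_json: bool = False,
) -> List[str]:
    """Assemble an argument list for the renzi CLI from documented flags."""
    cmd = ["renzi", "-l", lang, "--confidence", str(confidence)]
    if as_json:
        cmd.append("--json")
    if pipeline:
        cmd.append("--pipeline")
    # Optional path-valued flags are only added when a value is given.
    for flag, value in (("--bbox", bbox), ("--bitmap", bitmap), ("-o", output)):
        if value is not None:
            cmd += [flag, value]
    cmd.append(path)
    return cmd


command = renzi_command("scan.png", pipeline=True, bbox="annotated.png", output="fixed.txt")
printable = shlex.join(command)
print(printable)
```

The resulting list can be passed to `subprocess.run(command, check=True)`; a list avoids shell quoting problems with paths that contain spaces.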

Python API

from renzi import extract, extract_dir, correct, pipeline, ToolResult

# Single image
result = extract(image_path="photo.jpg", lang="ch")
print(result.success)        # True
print(result.data["text"])   # raw OCR text
for it in result.data["items"]:
    print(it["text"], it["confidence"], it["bbox"])

# Directory batch with text files
result = extract_dir(directory="images/", output="transcripts/")
for filename, info in result.data.items():
    print(filename, info["text"][:50])

# Correct raw OCR text
result = correct(text="我在吉林省、长春市")
print(result.data["text"])      # corrected
print(result.data["used_llm"])  # True/False

# Full pipeline with visualizations
result = pipeline(
    image_path="scan.png",
    bbox_image="annotated.png",
    bitmap_image="bitmap.png",
    output="fixed.txt",
)
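The per-item dictionaries shown above can be post-processed with plain Python. As an illustration, the sketch below drops low-confidence regions and computes an average score. The sample data is made up, the bbox values are placeholders, and treating the threshold as inclusive is an assumption; `keep_confident` and `mean_confidence` are hypothetical helpers.

```python
from typing import Dict, List


def keep_confident(items: List[Dict], threshold: float = 0.5) -> List[Dict]:
    """Keep items whose confidence is at or above the threshold."""
    return [it for it in items if it["confidence"] >= threshold]


def mean_confidence(items: List[Dict]) -> float:
    """Average confidence over items, or 0.0 for an empty list."""
    if not items:
        return 0.0
    return sum(it["confidence"] for it in items) / len(items)


# Made-up items in the shape printed by the API example (text, confidence, bbox).
sample_items = [
    {"text": "Invoice", "confidence": 0.98, "bbox": [10, 10, 90, 40]},
    {"text": "N0.", "confidence": 0.42, "bbox": [100, 10, 180, 40]},
    {"text": "2026", "confidence": 0.91, "bbox": [10, 50, 70, 80]},
]
kept = keep_confident(sample_items, threshold=0.5)
print([it["text"] for it in kept], round(mean_confidence(kept), 3))
```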

Agent Integration

from renzi.tools import TOOLS, dispatch

# Register TOOLS in your OpenAI function-calling client, then:
result = dispatch("renzi_pipeline", {
    "image_path": "scan.png",
    "bbox_image": "annotated.png",
    "bitmap_image": "bitmap.png",
})
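For readers unfamiliar with the OpenAI function-calling format, the sketch below shows what one tool entry and a name-based dispatcher could look like. It is an assumption about the general shape, not a copy of `renzi.tools`: the description strings, the `required` list, the stub handler `fake_pipeline`, and `dispatch_sketch` are all hypothetical.

```python
import json
from typing import Any, Callable, Dict

# One tool entry in the OpenAI function-calling format (contents are illustrative).
PIPELINE_TOOL = {
    "type": "function",
    "function": {
        "name": "renzi_pipeline",
        "description": "Run OCR, render visualizations, and correct the text with a local LLM.",
        "parameters": {
            "type": "object",
            "properties": {
                "image_path": {"type": "string", "description": "Input image path"},
                "bbox_image": {"type": "string", "description": "Annotated image output path"},
                "bitmap_image": {"type": "string", "description": "Text bitmap output path"},
            },
            "required": ["image_path"],
        },
    },
}


def fake_pipeline(image_path: str, bbox_image: str = None, bitmap_image: str = None) -> Dict[str, Any]:
    """Stand-in handler so the sketch runs without RenZi installed."""
    return {"success": True, "data": {"image_path": image_path}}


HANDLERS: Dict[str, Callable[..., Dict[str, Any]]] = {"renzi_pipeline": fake_pipeline}


def dispatch_sketch(name: str, arguments_json: str) -> Dict[str, Any]:
    """Decode the JSON argument string from a tool call and invoke the handler."""
    arguments = json.loads(arguments_json)
    return HANDLERS[name](**arguments)


result = dispatch_sketch("renzi_pipeline", '{"image_path": "scan.png"}')
print(result)
```

In the OpenAI API the model returns tool arguments as a JSON string, which is why the sketch decodes them first; the `dispatch` function shown above takes an already-decoded dict.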

Web UI

from renzi.web import run
run(port=8765)

The web UI offers drag-and-drop upload, a three-up image comparison (original / annotated / text bitmap), and side-by-side raw vs. corrected text with one-click copy and re-correct.

Model Cache

PaddleOCR downloads models on first run from ModelScope into ~/.paddlex/official_models/. To bundle models offline, place them in a directory and point PADDLEX_HOME (or RENZI_MODEL_DIR) at it:

export RENZI_MODEL_DIR=/path/to/official_models
renzi photo.jpg
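The lookup described above could be sketched as follows. The precedence (RENZI_MODEL_DIR first, then PADDLEX_HOME, then the default cache) and the idea that both variables point at the same directory level are assumptions; `resolve_model_dir` is a hypothetical helper, so check the RenZi source for the real behavior.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_model_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the model directory from environment variables or the default cache."""
    env = os.environ if env is None else env
    # Assumed precedence: the RenZi-specific variable wins over PADDLEX_HOME.
    for name in ("RENZI_MODEL_DIR", "PADDLEX_HOME"):
        value = env.get(name)
        if value:
            return Path(value).expanduser()
    # Default download location mentioned in this README.
    return Path.home() / ".paddlex" / "official_models"


model_dir = resolve_model_dir({"RENZI_MODEL_DIR": "/path/to/official_models"})
print(model_dir)
```

Passing the environment as a mapping keeps the function easy to test without modifying `os.environ`.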

Development

pip install -e ".[dev]"
ruff format . && ruff check . && pytest

License

GPL-3.0

Project details

Version 0.1.0, released Jun 24, 2026

Download files

Source Distribution

renzi-0.1.0.tar.gz (24.9 kB)

Uploaded Jun 24, 2026 Source

Built Distribution

renzi-0.1.0-py3-none-any.whl (23.2 kB)

Uploaded Jun 24, 2026 Python 3

File details

Details for the file renzi-0.1.0.tar.gz.

File metadata

Download URL: renzi-0.1.0.tar.gz
Upload date: Jun 24, 2026
Size: 24.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for renzi-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`a8f80262926efe77c0032ed4948dc7b38cebddc7a8e8f3827e6c9326067437b5`
MD5	`18fcad8a18002e97b7abc09711812aa1`
BLAKE2b-256	`14a6c435d38cc7c548cb04f75e09bb9ba91097991f350a4d48c9621c809705cd`

File details

Details for the file renzi-0.1.0-py3-none-any.whl.

File metadata

Download URL: renzi-0.1.0-py3-none-any.whl
Upload date: Jun 24, 2026
Size: 23.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for renzi-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`eaa54a68f5277107580178158faad147fd67b1dcdd9aee6a37a311d304deb15f`
MD5	`74bbe8279279aaf689a85a9a703e231d`
BLAKE2b-256	`8873c87f0feddb59caf144e5793f94cb72a18944aaa7305218ded0af61f466d9`
