A lightweight, unified OCR toolkit with a one-liner API. Supports Surya, EasyOCR, PaddleOCR, Tesseract, and Vision LLMs through a single interface.

anyocr

One unified API over every major OCR engine — from Tesseract to vision LLMs.


anyocr gives you a single anyocr.read() call that extracts text from images and documents using whichever OCR engine you have installed. It auto-selects the highest-priority installed backend (Surya > EasyOCR > PaddleOCR > Tesseract > Vision-LLM), applies a preprocessing pipeline (auto-rotate, deskew, contrast enhancement, and optional binarization), and returns structured results with bounding boxes, confidence scores, and text joined in reading order.

Built by Viet-Anh Nguyen at NRL.ai.

Why anyocr?

One-liner API — anyocr.read("scan.png") just works with any installed backend
Plugin architecture — Register new OCR engines via @register_backend
Local-first — Surya, EasyOCR, Paddle, Tesseract all run on your machine
Minimal core deps — Only pillow and numpy; every OCR engine is an optional extra
Production-ready — Auto-preprocessing, structured dataclass results, batch inference

Installation

pip install anyocr

For optional backends:

# Quote the extras so shells such as zsh do not treat the brackets as a glob
pip install "anyocr[surya]"        # Surya OCR: SOTA open source, 90+ languages
pip install "anyocr[easyocr]"      # EasyOCR: CRNN models, 80+ languages
pip install "anyocr[paddle]"       # PaddleOCR: strong Asian languages
pip install "anyocr[tesseract]"    # Tesseract via pytesseract (needs tesseract binary)
pip install "anyocr[vlm]"          # Vision-LLM via anyllm (GPT-4V, Claude, Gemini)
pip install "anyocr[all]"          # everything

Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

Quick Start

import anyocr

# 1. Auto-selects the best installed backend (Surya > EasyOCR > Paddle > Tesseract > VLM)
result = anyocr.read("receipt.png")
print(result.text)                       # full extracted text
for line in result.lines:
    print(line.text, line.confidence, line.bbox)

# 2. Force a specific backend
text = anyocr.read("chinese.jpg", backend="paddle", lang="ch").text

# 3. Use a vision LLM for hard cases (tables, handwriting)
text = anyocr.read("handwritten.jpg", backend="vlm", model="gpt-4o").text

Models & Methods

Supported backends (auto-selected by priority)

Priority	Backend	Model family	Languages	Install
1	Surya OCR	Transformer-based detection + recognition (DETR + Donut-style)	90+	`anyocr[surya]`
2	EasyOCR	CRAFT detector + CRNN recognizer	80+	`anyocr[easyocr]`
3	PaddleOCR	PP-OCRv4 (DBNet + SVTR)	80+, strong CJK	`anyocr[paddle]`
4	Tesseract	LSTM-based (Tesseract 4+)	100+	`anyocr[tesseract]`
5	Vision-LLM	Any multi-modal LLM via anyllm (GPT-4V, Claude 3.5 Sonnet, Gemini, LLaVA)	Any	`anyocr[vlm]`

You can force a backend for a single call with anyocr.read(..., backend="easyocr"), or change the auto-selection order with anyocr.set_priority(["paddle", "surya"]).
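
To make the selection rule concrete, here is a minimal sketch of priority-based backend selection. It is not anyocr's implementation: the module names in BACKEND_MODULES and the helpers is_installed and pick_backend are assumptions made for illustration only.

```python
from importlib.util import find_spec

# Hypothetical mapping from backend name to the module that must be importable.
# These module names are assumptions for illustration, not anyocr's internals.
BACKEND_MODULES = {
    "surya": "surya",
    "easyocr": "easyocr",
    "paddle": "paddleocr",
    "tesseract": "pytesseract",
    "vlm": "anyllm",
}

DEFAULT_PRIORITY = ("surya", "easyocr", "paddle", "tesseract", "vlm")


def is_installed(module_name):
    """Return True if the top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def pick_backend(priority=DEFAULT_PRIORITY, available=is_installed):
    """Return the first backend in `priority` whose module is available."""
    for name in priority:
        if available(BACKEND_MODULES[name]):
            return name
    raise RuntimeError("no OCR backend is installed")
```

Passing a different `priority` sequence plays the role that set_priority plays in the text above, and the `available` hook makes the rule easy to test without installing any engine.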

Preprocessing pipeline

Applied automatically unless marked opt-in (each step can be disabled per call):

EXIF orientation fix — rotate based on metadata
Auto-rotate — detect 90/180/270 rotation via text-line angle histogram
Deskew — Hough-transform-based angle correction (<= 15 degrees)
Contrast enhancement — CLAHE (adaptive histogram equalization)
Binarization — adaptive threshold for low-quality scans (opt-in)
Denoise — bilateral filter for scanned documents (opt-in)
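
As an illustration of the binarization step, the following is a minimal pure-Python sketch of adaptive (local mean) thresholding. It only shows the general technique; anyocr's actual implementation, window size, and parameters are not documented here, so the function name and the `radius` and `offset` defaults are assumptions.

```python
def adaptive_threshold(gray, radius=1, offset=5):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    A pixel becomes 255 (background) if it is brighter than the mean of its
    (2 * radius + 1) square neighbourhood minus `offset`, otherwise 0 (ink).
    """
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clip the window at the image borders.
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            window = [gray[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            row.append(255 if gray[y][x] > local_mean - offset else 0)
        out.append(row)
    return out
```

Because each pixel is compared with its own neighbourhood rather than one global cutoff, uneven lighting across a scan has less effect on the result than it would with a single fixed threshold.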

Result dataclasses

from __future__ import annotations  # lets the annotations below parse on Python 3.8+

from dataclasses import dataclass

@dataclass
class OCRLine:
    text: str
    confidence: float
    bbox: tuple[float, float, float, float]   # x1, y1, x2, y2
    polygon: list[tuple[float, float]] | None  # 4-point quad if supported

@dataclass
class OCRResult:
    text: str                  # joined full text in reading order
    lines: list[OCRLine]
    backend: str               # which backend produced this result
    language: str | None
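
The sketch below shows one way line results shaped like OCRLine could be put into reading order and summarized. The `Line` class is a stand-in with the same text, confidence, and bbox fields, and `reading_order`, `mean_confidence`, and the `row_tolerance` heuristic are hypothetical helpers, not part of anyocr.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Line:
    """Stand-in with the same fields as OCRLine above (minus polygon)."""
    text: str
    confidence: float
    bbox: Tuple[float, float, float, float]  # x1, y1, x2, y2


def reading_order(lines, row_tolerance=10.0):
    """Sort top-to-bottom, then left-to-right within a visual row.

    A line joins the current row when its top edge is within `row_tolerance`
    pixels of the top edge of that row's first line.
    """
    rows = []
    for line in sorted(lines, key=lambda l: l.bbox[1]):
        if rows and abs(line.bbox[1] - rows[-1][0].bbox[1]) < row_tolerance:
            rows[-1].append(line)
        else:
            rows.append([line])
    return [l for row in rows for l in sorted(row, key=lambda l: l.bbox[0])]


def mean_confidence(lines):
    """Average per-line confidence, or 0.0 when no lines were detected."""
    return sum(l.confidence for l in lines) / len(lines) if lines else 0.0
```

Grouping by row before sorting by x keeps two boxes on the same visual line together even when their top edges differ by a few pixels.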

API Reference

Function	Purpose
`anyocr.read(image, backend="auto", lang=None)`	Run OCR, returns `OCRResult`
`anyocr.read_pdf(pdf_path)`	OCR every page of a PDF
`anyocr.list_backends()`	Show installed backends
`anyocr.set_priority([...])`	Override auto-selection order
`anyocr.preprocess(image, ops=[...])`	Run preprocessing pipeline only
`anyocr.register_backend(name, cls)`	Add a custom backend

CLI Usage

anyocr read receipt.png
anyocr read scan.jpg --backend paddle --lang ch
anyocr read-pdf document.pdf --out text.txt
anyocr list-backends

Examples

OCR an entire PDF and save as text

import anyocr

# Rasterizes each page and runs the auto-selected backend
result = anyocr.read_pdf("report.pdf")
with open("report.txt", "w", encoding="utf-8") as f:  # OCR text is often non-ASCII
    for page_num, page in enumerate(result.pages, 1):
        f.write(f"=== Page {page_num} ===\n{page.text}\n\n")

Combine preprocessing with a specific backend

import anyocr

# Run the preprocessing pipeline explicitly before OCR
cleaned = anyocr.preprocess("noisy_scan.jpg", ops=["deskew", "clahe", "binarize"])
result  = anyocr.read(cleaned, backend="tesseract", lang="eng")
print(result.text)
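
The `ops` list above is applied in order, and order matters. The sketch below illustrates that idea with a tiny name-to-function pipeline over a flat list of grayscale values. The op names ("stretch", "binarize"), the functions, and `run_pipeline` are hypothetical stand-ins and are much simpler than the deskew and CLAHE steps named in the example.

```python
def stretch_contrast(pixels):
    """Linearly rescale values so the darkest becomes 0 and the brightest 255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return list(pixels)
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]


def binarize(pixels, threshold=128):
    """Global threshold: 255 for values at or above `threshold`, else 0."""
    return [255 if p >= threshold else 0 for p in pixels]


# Hypothetical op table; the names are illustrative, not anyocr's op names.
OPS = {"stretch": stretch_contrast, "binarize": binarize}


def run_pipeline(pixels, ops):
    """Apply the named ops left to right, rejecting unknown names up front."""
    unknown = [name for name in ops if name not in OPS]
    if unknown:
        raise ValueError(f"unknown preprocessing ops: {unknown}")
    for name in ops:
        pixels = OPS[name](pixels)
    return pixels
```

On a low-contrast input, thresholding alone maps every value to 0, while stretching first lets the brightest values survive the threshold, which is why contrast enhancement usually precedes binarization.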

Compare several backends on the same image

import anyocr

for backend in ["surya", "easyocr", "paddle"]:
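    r = anyocr.read("test.jpg", backend=backend)
    # Average the per-line confidences documented on OCRLine (0.0 if nothing was detected)
    avg_conf = sum(l.confidence for l in r.lines) / len(r.lines) if r.lines else 0.0
    print(f"{backend}: {r.text[:80]}... (avg conf {avg_conf:.2f})")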
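
Beyond eyeballing the printed text, backend outputs can be compared numerically. The sketch below scores pairwise agreement between recognized strings using the standard library's difflib. The `agreement` helper and the sample strings in the usage note are hypothetical; real input would be the `.text` of each backend's result.

```python
from difflib import SequenceMatcher
from itertools import combinations


def agreement(outputs):
    """Return {(backend_a, backend_b): similarity in [0, 1]} for every pair.

    `outputs` maps a backend name to the text it recognized.
    """
    scores = {}
    for (name_a, text_a), (name_b, text_b) in combinations(sorted(outputs.items()), 2):
        # Collapse whitespace so layout differences do not count as disagreements.
        a, b = " ".join(text_a.split()), " ".join(text_b.split())
        scores[(name_a, name_b)] = SequenceMatcher(None, a, b).ratio()
    return scores
```

For example, `agreement({"surya": "Total: 42.00", "easyocr": "Total:  42.00"})` gives a score of 1.0 for the pair, since the two strings differ only in whitespace. Low agreement between otherwise reliable backends is a useful signal that an image needs preprocessing or a vision LLM.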

License

MIT (c) Viet-Anh Nguyen

Release history

Version	Released
0.2.4	Apr 9, 2026
0.2.3	Apr 9, 2026 (this version)
0.2.2	Apr 9, 2026
0.2.1	Apr 9, 2026
0.2.0	Apr 9, 2026
0.0.1	Jun 10, 2023

Download files

Source Distribution

anyocr-0.2.3.tar.gz (38.3 kB view details)

Uploaded Apr 9, 2026 Source

Built Distribution

anyocr-0.2.3-py3-none-any.whl (33.3 kB view details)

Uploaded Apr 9, 2026 Python 3

File details

Details for the file anyocr-0.2.3.tar.gz.

File metadata

Download URL: anyocr-0.2.3.tar.gz
Upload date: Apr 9, 2026
Size: 38.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for anyocr-0.2.3.tar.gz
Algorithm	Hash digest
SHA256	`faedac38f4c8798c28800eed6a6161230b963509cba19b1fa79b67ac85a55681`
MD5	`5562bcac1452ae85f007ac97b2c8eb3f`
BLAKE2b-256	`379ae7aba21b51070267024ac85f261120dd952ae05635c1f540d1a4dcb58a83`

File details

Details for the file anyocr-0.2.3-py3-none-any.whl.

File metadata

Download URL: anyocr-0.2.3-py3-none-any.whl
Upload date: Apr 9, 2026
Size: 33.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for anyocr-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6e0baeddc7c4ba444551f0b892b3015c9970dda9c4f42cd54971755531c02b70`
MD5	`0369745fb39bac63eedd3d23abf2ff4d`
BLAKE2b-256	`694225243725ad8cec5d0acd758e2deaacdd8e1b09eb94f98eccc53ce9f0a75e`
