A lightweight, unified OCR toolkit with a one-liner API. Supports Surya, EasyOCR, PaddleOCR, Tesseract, and Vision LLMs through a single interface.
anyocr
One unified API over every major OCR engine — from Tesseract to vision LLMs.
anyocr gives you a single anyocr.read() call that extracts text from images and documents using whichever OCR engine you have installed. It auto-selects the best available backend by priority (Surya > EasyOCR > PaddleOCR > Tesseract > Vision-LLM), applies a smart preprocessing pipeline (auto-rotate, deskew, contrast enhancement, binarization), and returns structured results with bounding boxes, confidence scores, and reading order.
Built by Viet-Anh Nguyen at NRL.ai.
Why anyocr?
- One-liner API: anyocr.read("scan.png") just works with any installed backend
- Plugin architecture: register new OCR engines via @register_backend
- Local-first: Surya, EasyOCR, Paddle, and Tesseract all run on your machine
- Minimal core deps: only pillow and numpy; every OCR engine is an optional extra
- Production-ready: auto-preprocessing, structured dataclass results, batch inference
Installation
pip install anyocr
For optional backends:
pip install anyocr[surya] # Surya OCR — SOTA open source, 90+ languages
pip install anyocr[easyocr] # EasyOCR — CRNN models, 80+ languages
pip install anyocr[paddle] # PaddleOCR — strong Asian languages
pip install anyocr[tesseract] # Tesseract via pytesseract (needs tesseract binary)
pip install anyocr[vlm] # Vision-LLM via anyllm (GPT-4V, Claude, Gemini)
pip install anyocr[all] # everything
Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)
Quick Start
import anyocr
# 1. Auto-selects the best installed backend (Surya > EasyOCR > Paddle > Tesseract > VLM)
result = anyocr.read("receipt.png")
print(result.text) # full extracted text
for line in result.lines:
    print(line.text, line.confidence, line.bbox)
# 2. Force a specific backend
text = anyocr.read("chinese.jpg", backend="paddle", lang="ch").text
# 3. Use a vision LLM for hard cases (tables, handwriting)
text = anyocr.read("handwritten.jpg", backend="vlm", model="gpt-4o").text
Models & Methods
Supported backends (auto-selected by priority)
| Priority | Backend | Model family | Languages | Install |
|---|---|---|---|---|
| 1 | Surya OCR | Transformer-based detection + recognition (DETR + Donut-style) | 90+ | anyocr[surya] |
| 2 | EasyOCR | CRAFT detector + CRNN recognizer | 80+ | anyocr[easyocr] |
| 3 | PaddleOCR | PP-OCRv4 (DBNet + SVTR) | 80+, strong CJK | anyocr[paddle] |
| 4 | Tesseract | LSTM-based (Tesseract 4+) | 100+ | anyocr[tesseract] |
| 5 | Vision-LLM | Any multi-modal LLM via anyllm (GPT-4V, Claude 3.5 Sonnet, Gemini, LLaVA) | Any | anyocr[vlm] |
Force a backend for a single call with anyocr.read(..., backend="easyocr"), or change the auto-selection order with anyocr.set_priority(["paddle", "surya"]).
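Priority-based auto-selection can be pictured as walking an ordered list and taking the first backend whose package is importable. The sketch below illustrates that idea only; it is not anyocr's actual implementation. The `pick_backend` and `is_installed` helpers and the mapping from backend names to module names are assumptions made for this example.

```python
from importlib.util import find_spec
from typing import Callable, Dict, Optional, Sequence

# Assumed mapping from backend name to the import name of its package.
# These module names are illustrative guesses, not taken from anyocr's source.
BACKEND_MODULES: Dict[str, str] = {
    "surya": "surya",
    "easyocr": "easyocr",
    "paddle": "paddleocr",
    "tesseract": "pytesseract",
    "vlm": "anyllm",
}

# Default order as documented: Surya > EasyOCR > Paddle > Tesseract > VLM.
DEFAULT_PRIORITY = ("surya", "easyocr", "paddle", "tesseract", "vlm")


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def pick_backend(
    priority: Sequence[str] = DEFAULT_PRIORITY,
    probe: Callable[[str], bool] = is_installed,
) -> Optional[str]:
    """Return the first backend in `priority` whose module passes `probe`."""
    for name in priority:
        module = BACKEND_MODULES.get(name)
        if module is not None and probe(module):
            return name
    return None  # nothing usable is installed


if __name__ == "__main__":
    print("selected backend:", pick_backend())
```

Passing a custom `probe` makes the selection logic easy to exercise without installing any OCR engine.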
Preprocessing pipeline
The following steps run automatically unless marked opt-in, and can be disabled per call:
- EXIF orientation fix — rotate based on metadata
- Auto-rotate — detect 90/180/270 rotation via text-line angle histogram
- Deskew — Hough-transform-based angle correction (<= 15 degrees)
- Contrast enhancement — CLAHE (adaptive histogram equalization)
- Binarization — adaptive threshold for low-quality scans (opt-in)
- Denoise — bilateral filter for scanned documents (opt-in)
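To make the binarization step concrete, here is a minimal pure-Python sketch of mean-based adaptive thresholding: each pixel is compared against the average of its local window rather than a single global cutoff. This is an illustration under stated assumptions, not the code anyocr runs; the `adaptive_threshold` function, its `window` and `offset` parameters, and the nested-list image format are all invented for the example. Image libraries such as OpenCV provide an optimized equivalent (`cv2.adaptiveThreshold`).

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values


def adaptive_threshold(img: Gray, window: int = 3, offset: int = 5) -> Gray:
    """Set a pixel to 255 if it is brighter than (local mean - offset), else 0."""
    h, w = len(img), len(img[0])

    # Summed-area table with a zero row/column of padding:
    # sat[y][x] is the sum of all pixels above and to the left of (y, x).
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum

    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)  # window clipped at the borders
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            local_mean = total / ((y1 - y0) * (x1 - x0))
            out[y][x] = 255 if img[y][x] > local_mean - offset else 0
    return out


if __name__ == "__main__":
    page = [
        [200, 200, 200],
        [200, 50, 200],  # one dark "ink" pixel on a bright background
        [200, 200, 200],
    ]
    for row in adaptive_threshold(page):
        print(row)
```

Because the cutoff follows the local mean, uneven lighting across a scan shifts the threshold with it, which is why this family of methods suits low-quality scans better than one global threshold.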
Result dataclasses
from __future__ import annotations  # keeps the annotations below valid on Python 3.8/3.9

from dataclasses import dataclass

@dataclass
class OCRLine:
    text: str
    confidence: float
    bbox: tuple[float, float, float, float]    # x1, y1, x2, y2
    polygon: list[tuple[float, float]] | None  # 4-point quad if supported

@dataclass
class OCRResult:
    text: str            # joined full text in reading order
    lines: list[OCRLine]
    backend: str         # which backend produced this result
    language: str | None
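As a rough illustration of how these fields can be consumed, the sketch below redefines a trimmed-down `OCRLine` locally (so it runs without anyocr installed) and derives two things from it: an average confidence and a simple reading order. The `mean_confidence` and `reading_order` helpers and the `row_tolerance` parameter are hypothetical, and the row-grouping heuristic is an assumption, not necessarily how anyocr orders lines.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple


@dataclass
class OCRLine:
    # Local stand-in mirroring the fields shown above (polygon omitted).
    text: str
    confidence: float
    bbox: Tuple[float, float, float, float]  # x1, y1, x2, y2


def mean_confidence(lines: List[OCRLine]) -> float:
    """Average line confidence, or 0.0 when there are no lines."""
    return mean(line.confidence for line in lines) if lines else 0.0


def reading_order(lines: List[OCRLine], row_tolerance: float = 10.0) -> List[OCRLine]:
    """Order lines top-to-bottom, then left-to-right within each visual row.

    Lines whose top edge lies within `row_tolerance` pixels of the first line
    in a row are treated as belonging to that row.
    """
    rows: List[List[OCRLine]] = []
    for line in sorted(lines, key=lambda l: l.bbox[1]):
        if rows and line.bbox[1] - rows[-1][0].bbox[1] <= row_tolerance:
            rows[-1].append(line)
        else:
            rows.append([line])
    return [l for row in rows for l in sorted(row, key=lambda l: l.bbox[0])]


if __name__ == "__main__":
    detected = [
        OCRLine("world", 0.8, (120, 12, 200, 30)),
        OCRLine("Total", 0.9, (10, 50, 80, 70)),
        OCRLine("Hello", 1.0, (10, 10, 100, 30)),
    ]
    print(" ".join(l.text for l in reading_order(detected)))
    print(round(mean_confidence(detected), 2))
```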
API Reference
| Function | Purpose |
|---|---|
| anyocr.read(image, backend="auto", lang=None) | Run OCR; returns an OCRResult |
| anyocr.read_pdf(pdf_path) | OCR every page of a PDF |
| anyocr.list_backends() | Show installed backends |
| anyocr.set_priority([...]) | Override auto-selection order |
| anyocr.preprocess(image, ops=[...]) | Run the preprocessing pipeline only |
| anyocr.register_backend(name, cls) | Add a custom backend |
CLI Usage
anyocr read receipt.png
anyocr read scan.jpg --backend paddle --lang ch
anyocr read-pdf document.pdf --out text.txt
anyocr list-backends
Examples
OCR an entire PDF and save as text
import anyocr
# Rasterizes each page and runs the auto-selected backend
result = anyocr.read_pdf("report.pdf")
with open("report.txt", "w", encoding="utf-8") as f:
    for page_num, page in enumerate(result.pages, 1):
        f.write(f"=== Page {page_num} ===\n{page.text}\n\n")
Combine preprocessing with a specific backend
import anyocr
# Run the preprocessing pipeline explicitly before OCR
cleaned = anyocr.preprocess("noisy_scan.jpg", ops=["deskew", "clahe", "binarize"])
result = anyocr.read(cleaned, backend="tesseract", lang="eng")
print(result.text)
Compare two backends on the same image
import anyocr
for backend in ["surya", "easyocr", "paddle"]:
    r = anyocr.read("test.jpg", backend=backend)
    print(f"{backend}: {r.text[:80]}... (avg conf {r.mean_confidence():.2f})")
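When comparing backends, a numeric agreement score is often easier to scan than raw text. One possible approach, sketched here with the standard library's difflib, is to normalize whitespace and compute a similarity ratio between two outputs. The `text_similarity` helper is hypothetical and the sample strings stand in for real OCR results; nothing here is part of anyocr.

```python
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio after collapsing runs of whitespace."""
    norm_a = " ".join(a.split())
    norm_b = " ".join(b.split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


if __name__ == "__main__":
    # Stand-ins for the .text of two backends run on the same image.
    out_a = "Total:  42.00"
    out_b = "Total: 42.OO"  # a typical digit/letter confusion
    print(f"agreement: {text_similarity(out_a, out_b):.2f}")
```

A low score between two otherwise reliable backends is a useful signal that the image deserves a closer look or a different preprocessing setup.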
License
MIT (c) Viet-Anh Nguyen