A lightweight, unified OCR toolkit with a one-liner API. Supports Surya, EasyOCR, PaddleOCR, Tesseract, and Vision LLMs through a single interface.

anyocr

One unified API over every major OCR engine — from Tesseract to vision LLMs.

anyocr gives you a single anyocr.read() call that extracts text from images and documents using whichever OCR engine you have installed. It auto-selects the best available backend by priority (Surya > EasyOCR > PaddleOCR > Tesseract > Vision-LLM), applies a smart preprocessing pipeline (auto-rotate, deskew, contrast enhancement, binarization), and returns structured results with bounding boxes, confidence scores, and reading order.

Built by Viet-Anh Nguyen at NRL.ai.

Why anyocr?

  • One-liner API: anyocr.read("scan.png") just works with any installed backend
  • Plugin architecture: register new OCR engines via @register_backend
  • Local-first: Surya, EasyOCR, PaddleOCR, and Tesseract all run on your machine
  • Minimal core deps: only pillow and numpy; every OCR engine is an optional extra
  • Production-ready: auto-preprocessing, structured dataclass results, batch inference

Installation

pip install anyocr

For optional backends:

pip install anyocr[surya]        # Surya OCR — SOTA open source, 90+ languages
pip install anyocr[easyocr]      # EasyOCR — CRNN models, 80+ languages
pip install anyocr[paddle]       # PaddleOCR — strong Asian languages
pip install anyocr[tesseract]    # Tesseract via pytesseract (needs tesseract binary)
pip install anyocr[vlm]          # Vision-LLM via anyllm (GPT-4V, Claude, Gemini)
pip install anyocr[all]          # everything

Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

Quick Start

import anyocr

# 1. Auto-selects the best installed backend (Surya > EasyOCR > Paddle > Tesseract > VLM)
result = anyocr.read("receipt.png")
print(result.text)                       # full extracted text
for line in result.lines:
    print(line.text, line.confidence, line.bbox)

# 2. Force a specific backend
text = anyocr.read("chinese.jpg", backend="paddle", lang="ch").text

# 3. Use a vision LLM for hard cases (tables, handwriting)
text = anyocr.read("handwritten.jpg", backend="vlm", model="gpt-4o").text

Models & Methods

Supported backends (auto-selected by priority)

| Priority | Backend | Model family | Languages | Install |
| --- | --- | --- | --- | --- |
| 1 | Surya OCR | Transformer-based detection + recognition (DETR + Donut-style) | 90+ | anyocr[surya] |
| 2 | EasyOCR | CRAFT detector + CRNN recognizer | 80+ | anyocr[easyocr] |
| 3 | PaddleOCR | PP-OCRv4 (DBNet + SVTR) | 80+, strong CJK | anyocr[paddle] |
| 4 | Tesseract | LSTM-based (Tesseract 4+) | 100+ | anyocr[tesseract] |
| 5 | Vision-LLM | Any multi-modal LLM via anyllm (GPT-4V, Claude 3.5 Sonnet, Gemini, LLaVA) | Any | anyocr[vlm] |

You can force a backend for a single call with anyocr.read(..., backend="easyocr"), or change the auto-selection order globally with anyocr.set_priority(["paddle", "surya"]).
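
Priority-based selection can be pictured as a walk over an ordered list that stops at the first engine that is importable. The following is a minimal sketch under stated assumptions, not anyocr's actual implementation: `MODULE_FOR_BACKEND`, `module_available`, `pick_backend`, and the `is_installed` hook are hypothetical names, and the module names are guesses at what each extra installs.

```python
from importlib.util import find_spec
from typing import Callable, Dict, Optional, Sequence

# Assumed import name for each backend; the real package may probe differently.
MODULE_FOR_BACKEND: Dict[str, str] = {
    "surya": "surya",
    "easyocr": "easyocr",
    "paddle": "paddleocr",
    "tesseract": "pytesseract",
    "vlm": "anyllm",
}

# Default order taken from the priority table above.
DEFAULT_PRIORITY = ["surya", "easyocr", "paddle", "tesseract", "vlm"]


def module_available(backend: str) -> bool:
    """Return True if the module assumed to back `backend` can be found."""
    module = MODULE_FOR_BACKEND.get(backend)
    return module is not None and find_spec(module) is not None


def pick_backend(
    priority: Sequence[str] = DEFAULT_PRIORITY,
    is_installed: Callable[[str], bool] = module_available,
) -> Optional[str]:
    """Return the first backend in `priority` that is installed, else None."""
    for name in priority:
        if is_installed(name):
            return name
    return None


# Simulate a machine that only has Tesseract and EasyOCR installed.
installed = {"tesseract", "easyocr"}
default_choice = pick_backend(is_installed=installed.__contains__)
custom_choice = pick_backend(["paddle", "tesseract"], is_installed=installed.__contains__)
print(default_choice, custom_choice)
```

Passing the installation check in as a function keeps the selection logic testable without installing any OCR engine.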

Preprocessing pipeline

Steps 1-4 are applied automatically and can be disabled per call; steps 5 and 6 run only when requested:

  1. EXIF orientation fix — rotate based on metadata
  2. Auto-rotate — detect 90/180/270 rotation via text-line angle histogram
  3. Deskew — Hough-transform-based angle correction (<= 15 degrees)
  4. Contrast enhancement — CLAHE (adaptive histogram equalization)
  5. Binarization — adaptive threshold for low-quality scans (opt-in)
  6. Denoise — bilateral filter for scanned documents (opt-in)
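
One way to think about the split between default and opt-in steps is as a small resolver that always returns the active steps in pipeline order. This is an illustrative sketch only: `resolve_ops` and its `enable`/`disable` parameters are hypothetical, and apart from "deskew", "clahe", and "binarize" (which appear in this README's examples) the step names are assumed labels.

```python
from typing import Iterable, List

# Labels for the six stages listed above, in execution order.
PIPELINE_ORDER = ["exif", "auto_rotate", "deskew", "clahe", "binarize", "denoise"]
DEFAULT_ON = {"exif", "auto_rotate", "deskew", "clahe"}  # applied automatically
OPT_IN = {"binarize", "denoise"}                         # only when requested


def resolve_ops(enable: Iterable[str] = (), disable: Iterable[str] = ()) -> List[str]:
    """Return the steps to run, always in pipeline order."""
    enable, disable = set(enable), set(disable)
    unknown = (enable | disable) - set(PIPELINE_ORDER)
    if unknown:
        raise ValueError(f"unknown preprocessing step(s): {sorted(unknown)}")
    active = (DEFAULT_ON | enable) - disable
    return [op for op in PIPELINE_ORDER if op in active]


# Defaults only, then a low-quality scan that needs binarization but no CLAHE.
print(resolve_ops())
print(resolve_ops(enable=["binarize"], disable=["clahe"]))
```

Keeping the order fixed in one place means callers can pass step names in any order without changing how the image is processed.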

Result dataclasses

from __future__ import annotations  # keeps the `X | None` hints valid on Python 3.8/3.9

from dataclasses import dataclass

@dataclass
class OCRLine:
    text: str
    confidence: float
    bbox: tuple[float, float, float, float]   # x1, y1, x2, y2
    polygon: list[tuple[float, float]] | None  # 4-point quad if supported

@dataclass
class OCRResult:
    text: str                  # joined full text in reading order
    lines: list[OCRLine]
    backend: str               # which backend produced this result
    language: str | None
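
To make "joined full text in reading order" concrete, here is a hedged sketch of how line results could be ordered and joined. It redefines simplified versions of the dataclasses so it runs on its own; `sort_reading_order`, `build_result`, `mean_confidence`, and the `row_tolerance` parameter are hypothetical helpers, and the row-grouping heuristic is an assumption rather than the library's documented behavior.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class OCRLine:
    text: str
    confidence: float
    bbox: Tuple[float, float, float, float]  # x1, y1, x2, y2


@dataclass
class OCRResult:
    text: str
    lines: List[OCRLine]
    backend: str
    language: Optional[str] = None


def sort_reading_order(lines: List[OCRLine], row_tolerance: float = 10.0) -> List[OCRLine]:
    """Order lines top-to-bottom, then left-to-right within a row.

    A line joins the current row when its top edge is within `row_tolerance`
    pixels of the top edge of that row's first (topmost) line.
    """
    rows: List[List[OCRLine]] = []
    for line in sorted(lines, key=lambda l: l.bbox[1]):
        if rows and abs(line.bbox[1] - rows[-1][0].bbox[1]) < row_tolerance:
            rows[-1].append(line)
        else:
            rows.append([line])
    return [l for row in rows for l in sorted(row, key=lambda l: l.bbox[0])]


def build_result(lines: List[OCRLine], backend: str, language: Optional[str] = None) -> OCRResult:
    """Sort the lines and join their text with newlines."""
    ordered = sort_reading_order(lines)
    return OCRResult("\n".join(l.text for l in ordered), ordered, backend, language)


def mean_confidence(result: OCRResult) -> float:
    """Average per-line confidence, or 0.0 for an empty result."""
    if not result.lines:
        return 0.0
    return sum(l.confidence for l in result.lines) / len(result.lines)


# Lines arrive out of order; "9.99" sits slightly higher than "Total" on the same row.
raw = [
    OCRLine("9.99", 0.8, (200, 48, 240, 70)),
    OCRLine("Total", 0.9, (10, 50, 60, 70)),
    OCRLine("Receipt", 1.0, (10, 5, 100, 25)),
]
result = build_result(raw, backend="demo")
print(result.text)
```

The tolerance keeps two words on a slightly skewed row from being split into separate rows just because their top edges differ by a few pixels.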

API Reference

| Function | Purpose |
| --- | --- |
| anyocr.read(image, backend="auto", lang=None) | Run OCR; returns an OCRResult |
| anyocr.read_pdf(pdf_path) | OCR every page of a PDF |
| anyocr.list_backends() | Show installed backends |
| anyocr.set_priority([...]) | Override auto-selection order |
| anyocr.preprocess(image, ops=[...]) | Run preprocessing pipeline only |
| anyocr.register_backend(name, cls) | Add a custom backend |

CLI Usage

anyocr read receipt.png
anyocr read scan.jpg --backend paddle --lang ch
anyocr read-pdf document.pdf --out text.txt
anyocr list-backends
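
For readers wrapping these commands in scripts, the sketch below shows how the four invocations above would parse with the standard library's argparse. It mirrors only the flags shown in this README; `build_parser` is a hypothetical name, the defaults are assumptions, and the real CLI may be built differently and accept more options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the subcommands shown above."""
    parser = argparse.ArgumentParser(prog="anyocr")
    sub = parser.add_subparsers(dest="command", required=True)

    read = sub.add_parser("read", help="OCR a single image")
    read.add_argument("image")
    read.add_argument("--backend", default="auto")  # assumed default
    read.add_argument("--lang", default=None)

    read_pdf = sub.add_parser("read-pdf", help="OCR every page of a PDF")
    read_pdf.add_argument("pdf")
    read_pdf.add_argument("--out", default=None)    # assumed: None means stdout

    sub.add_parser("list-backends", help="show installed backends")
    return parser


parser = build_parser()
args = parser.parse_args(["read", "scan.jpg", "--backend", "paddle", "--lang", "ch"])
print(args.command, args.image, args.backend, args.lang)
```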

Examples

OCR an entire PDF and save as text

import anyocr

# Rasterizes each page and runs the auto-selected backend
result = anyocr.read_pdf("report.pdf")
with open("report.txt", "w", encoding="utf-8") as f:  # explicit encoding: OCR text is often non-ASCII
    for page_num, page in enumerate(result.pages, 1):
        f.write(f"=== Page {page_num} ===\n{page.text}\n\n")

Combine preprocessing with a specific backend

import anyocr

# Run the preprocessing pipeline explicitly before OCR
cleaned = anyocr.preprocess("noisy_scan.jpg", ops=["deskew", "clahe", "binarize"])
result  = anyocr.read(cleaned, backend="tesseract", lang="eng")
print(result.text)

Compare several backends on the same image

import anyocr

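for backend in ["surya", "easyocr", "paddle"]:
    r = anyocr.read("test.jpg", backend=backend)
    # Average the per-line confidences documented on OCRLine
    confs = [line.confidence for line in r.lines]
    avg = sum(confs) / len(confs) if confs else 0.0
    print(f"{backend}: {r.text[:80]}... (avg conf {avg:.2f})")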
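
Confidence scores from different engines are not necessarily on the same scale, so it can also help to measure how closely the extracted texts agree with each other. The sketch below does this with the standard library's difflib on plain strings, so it runs without any OCR engine installed; `normalize` and `pairwise_agreement` are hypothetical helpers, and the sample outputs are made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List, Tuple


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so layout differences are ignored."""
    return " ".join(text.lower().split())


def pairwise_agreement(outputs: Dict[str, str]) -> List[Tuple[str, str, float]]:
    """Return (backend_a, backend_b, similarity) for every pair of backends.

    Similarity is difflib's ratio in [0, 1]; 1.0 means identical after normalization.
    """
    scores = []
    for a, b in combinations(sorted(outputs), 2):
        ratio = SequenceMatcher(None, normalize(outputs[a]), normalize(outputs[b])).ratio()
        scores.append((a, b, ratio))
    return scores


# Made-up outputs standing in for r.text from three backends.
outputs = {
    "easyocr": "Invoice  No. 1234",
    "paddle": "invoice no. 1234",
    "surya": "Invoice No. I234",
}
for a, b, score in pairwise_agreement(outputs):
    print(f"{a} vs {b}: {score:.2f}")
```

Low agreement between engines on the same image is a useful signal that the page needs preprocessing or a manual check.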

License

MIT (c) Viet-Anh Nguyen
