A lightweight, unified OCR toolkit with a one-liner API. Supports Surya, EasyOCR, PaddleOCR, Tesseract, and Vision LLMs through a single interface.

anyocr

One unified API over every major OCR engine — from Tesseract to vision LLMs.

anyocr gives you a single anyocr.read() call that extracts text from images and documents using whichever OCR engine you have installed. It auto-selects the best available backend by priority (Surya > EasyOCR > PaddleOCR > Tesseract > Vision-LLM) and applies a preprocessing pipeline (auto-rotate, deskew, contrast enhancement, and optional binarization). Results are structured: each line carries its text, bounding box, and confidence score, and lines are returned in reading order.

Built by Viet-Anh Nguyen at NRL.ai.

Why anyocr?

  • One-liner API: anyocr.read("scan.png") works with any installed backend
  • Plugin architecture — Register new OCR engines via @register_backend
  • Local-first — Surya, EasyOCR, Paddle, Tesseract all run on your machine
  • Minimal core deps — Only pillow and numpy; every OCR engine is an optional extra
  • Production-ready — Auto-preprocessing, structured dataclass results, batch inference

Installation

pip install anyocr

For optional backends:

# Quote the extras so shells such as zsh do not expand the square brackets
pip install "anyocr[surya]"        # Surya OCR: SOTA open source, 90+ languages
pip install "anyocr[easyocr]"      # EasyOCR: CRNN models, 80+ languages
pip install "anyocr[paddle]"       # PaddleOCR: strong Asian languages
pip install "anyocr[tesseract]"    # Tesseract via pytesseract (needs tesseract binary)
pip install "anyocr[vlm]"          # Vision-LLM via anyllm (GPT-4V, Claude, Gemini)
pip install "anyocr[all]"          # everything

Python 3.8+ supported (tested on 3.8, 3.9, 3.10, 3.11, 3.12, 3.13)

Quick Start

import anyocr

# 1. Auto-selects the best installed backend (Surya > EasyOCR > Paddle > Tesseract > VLM)
result = anyocr.read("receipt.png")
print(result.text)                       # full extracted text
for line in result.lines:
    print(line.text, line.confidence, line.bbox)

# 2. Force a specific backend
text = anyocr.read("chinese.jpg", backend="paddle", lang="ch").text

# 3. Use a vision LLM for hard cases (tables, handwriting)
text = anyocr.read("handwritten.jpg", backend="vlm", model="gpt-4o").text

Models & Methods

Supported backends (auto-selected by priority)

| Priority | Backend | Model family | Languages | Install |
|---|---|---|---|---|
| 1 | Surya OCR | Transformer-based detection + recognition (DETR + Donut-style) | 90+ | anyocr[surya] |
| 2 | EasyOCR | CRAFT detector + CRNN recognizer | 80+ | anyocr[easyocr] |
| 3 | PaddleOCR | PP-OCRv4 (DBNet + SVTR) | 80+, strong CJK | anyocr[paddle] |
| 4 | Tesseract | LSTM-based (Tesseract 4+) | 100+ | anyocr[tesseract] |
| 5 | Vision-LLM | Any multi-modal LLM via anyllm (GPT-4V, Claude 3.5 Sonnet, Gemini, LLaVA) | Any | anyocr[vlm] |

You can change the priority or force a backend via anyocr.read(..., backend="easyocr") or anyocr.set_priority(["paddle", "surya"]).
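
One way to picture auto-selection is to walk the priority list and pick the first backend whose underlying package can be found. The sketch below is not anyocr's actual implementation: the module names in `BACKEND_MODULES` and the helpers `is_installed` and `pick_backend` are assumptions made for illustration.

```python
from importlib.util import find_spec

# Hypothetical mapping from backend name to the Python module it needs.
# These module names are assumptions, not taken from anyocr's source.
BACKEND_MODULES = {
    "surya": "surya",
    "easyocr": "easyocr",
    "paddle": "paddleocr",
    "tesseract": "pytesseract",
    "vlm": "anyllm",
}

DEFAULT_PRIORITY = ["surya", "easyocr", "paddle", "tesseract", "vlm"]


def is_installed(module_name):
    """Return True if the module can be located without importing it."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def pick_backend(priority=DEFAULT_PRIORITY, available=is_installed):
    """Return the first backend in `priority` whose module is available."""
    for name in priority:
        if available(BACKEND_MODULES[name]):
            return name
    raise RuntimeError("no OCR backend installed")
```

Passing the availability check as a parameter keeps the selection logic testable without installing any OCR engine.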

Preprocessing pipeline

Steps 1 to 4 are applied automatically and can be disabled per call; steps 5 and 6 run only when requested:

  1. EXIF orientation fix — rotate based on metadata
  2. Auto-rotate — detect 90/180/270 rotation via text-line angle histogram
  3. Deskew — Hough-transform-based angle correction (<= 15 degrees)
  4. Contrast enhancement — CLAHE (contrast-limited adaptive histogram equalization)
  5. Binarization — adaptive threshold for low-quality scans (opt-in)
  6. Denoise — bilateral filter for scanned documents (opt-in)
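
To make the binarization step concrete, here is a minimal sketch of adaptive (local-mean) thresholding in pure Python. It illustrates the general technique only and is not anyocr's code: the function name `adaptive_binarize`, the window `radius`, and the `offset` are assumptions, and a real pipeline would more likely use a vectorized numpy or OpenCV routine.

```python
def adaptive_binarize(gray, radius=1, offset=10):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    A pixel becomes 0 (ink) when it is darker than the mean of its
    (2*radius+1) x (2*radius+1) neighbourhood minus `offset`;
    otherwise it becomes 255 (paper).
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clip the window at the image borders.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            row.append(0 if gray[y][x] < local_mean - offset else 255)
        out.append(row)
    return out
```

Because the threshold follows the local mean, a dark stroke is detected both on a bright page and in a dimly lit corner of the same scan, which is where a single global threshold tends to fail.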

Result dataclasses

# Postponed evaluation keeps the `X | None` and `tuple[...]` annotations valid on Python 3.8
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class OCRLine:
    text: str
    confidence: float
    bbox: tuple[float, float, float, float]   # x1, y1, x2, y2
    polygon: list[tuple[float, float]] | None  # 4-point quad if supported

@dataclass
class OCRResult:
    text: str                  # joined full text in reading order
    lines: list[OCRLine]
    backend: str               # which backend produced this result
    language: str | None
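
As a small illustration of working with these fields, the sketch below redefines the two dataclasses locally (so it runs without anyocr installed) and adds a hypothetical helper, `keep_confident`, that drops low-confidence lines. The helper, the default values on `polygon` and `language`, and the newline joiner are assumptions, not part of the documented API.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass
class OCRLine:
    text: str
    confidence: float
    bbox: Tuple[float, float, float, float]            # x1, y1, x2, y2
    polygon: Optional[List[Tuple[float, float]]] = None


@dataclass
class OCRResult:
    text: str
    lines: List[OCRLine]
    backend: str
    language: Optional[str] = None


def keep_confident(result: OCRResult, threshold: float = 0.5) -> OCRResult:
    """Return a copy of `result` without lines scoring below `threshold`."""
    kept = [line for line in result.lines if line.confidence >= threshold]
    # Joining with newlines is an assumption about how `text` is built.
    return replace(result, text="\n".join(l.text for l in kept), lines=kept)
```

`dataclasses.replace` builds a new object, so the original result stays available for comparison.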

API Reference

| Function | Purpose |
|---|---|
| anyocr.read(image, backend="auto", lang=None) | Run OCR, returns OCRResult |
| anyocr.read_pdf(pdf_path) | OCR every page of a PDF |
| anyocr.list_backends() | Show installed backends |
| anyocr.set_priority([...]) | Override auto-selection order |
| anyocr.preprocess(image, ops=[...]) | Run preprocessing pipeline only |
| anyocr.register_backend(name, cls) | Add a custom backend |

CLI Usage

anyocr read receipt.png
anyocr read scan.jpg --backend paddle --lang ch
anyocr read-pdf document.pdf --out text.txt
anyocr list-backends

Examples

OCR an entire PDF and save as text

import anyocr

# Rasterizes each page and runs the auto-selected backend
result = anyocr.read_pdf("report.pdf")
with open("report.txt", "w", encoding="utf-8") as f:  # OCR output is often non-ASCII
    for page_num, page in enumerate(result.pages, 1):
        f.write(f"=== Page {page_num} ===\n{page.text}\n\n")

Combine preprocessing with a specific backend

import anyocr

# Run the preprocessing pipeline explicitly before OCR
cleaned = anyocr.preprocess("noisy_scan.jpg", ops=["deskew", "clahe", "binarize"])
result  = anyocr.read(cleaned, backend="tesseract", lang="eng")
print(result.text)

Compare two backends on the same image

import anyocr

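for backend in ["surya", "easyocr", "paddle"]:
    r = anyocr.read("test.jpg", backend=backend)
    # Average the per-line confidences from the documented OCRLine.confidence field
    avg_conf = sum(line.confidence for line in r.lines) / max(len(r.lines), 1)
    print(f"{backend}: {r.text[:80]}... (avg conf {avg_conf:.2f})")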
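
When no ground truth is available, agreement between backends can serve as a rough quality signal. The sketch below compares already-extracted strings with the standard library's `difflib.SequenceMatcher`. The helpers `normalize` and `agreement` are hypothetical, and the sample strings are made up for illustration rather than taken from real OCR runs.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Collapse whitespace and lowercase so layout differences are ignored."""
    return " ".join(text.lower().split())


def agreement(outputs):
    """Return {(backend_a, backend_b): similarity in [0, 1]} for each pair."""
    scores = {}
    for (a, text_a), (b, text_b) in combinations(sorted(outputs.items()), 2):
        matcher = SequenceMatcher(None, normalize(text_a), normalize(text_b))
        scores[(a, b)] = matcher.ratio()
    return scores


# Made-up strings standing in for r.text from each backend.
outputs = {
    "surya": "Total: 42.00 EUR",
    "easyocr": "Total:  42.00 EUR",
    "paddle": "Tota1: 42.O0 EUR",
}
for pair, score in agreement(outputs).items():
    print(pair, f"{score:.2f}")
```

A pair that scores well below the others points at the image regions worth checking by hand or re-running through the vlm backend.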

License

MIT (c) Viet-Anh Nguyen
