Accurate and Efficient General OCR System

OpenOCR: An Open-Source Toolkit for General-OCR Research and Applications

For More Information

Visit: https://github.com/Topdu/OpenOCR

Recent Updates

0.1.4: Support PDF files as input; recognize document elements in parallel; add skill document.
0.1.3: Use a unified interface for OCR, Document Parsing, and UniRec.
0.0.10: Remove OpenCV version restrictions.
0.0.9: Fix torch inference bug.
0.0.8: Download ONNX models automatically.
0.0.7: Add ONNX model export for wider compatibility.

Quick Start Guide

Installation

# Install from PyPI (recommended)
pip install openocr-python==0.1.5

# Or install from source
git clone https://github.com/Topdu/OpenOCR.git
cd OpenOCR
python build_package.py
pip install ./build/dist/openocr_python-*.whl
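
After installing, it can help to confirm which version actually ended up in the environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of OpenOCR.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "openocr-python") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"openocr-python: {found or 'not installed'}")
```

Returning `None` rather than raising keeps the check usable in setup scripts that decide whether to run `pip install`.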

Command Line Usage

1. Text Detection + Recognition (OCR)

End-to-end OCR for Chinese/English text detection and recognition:

# Basic usage
openocr --task ocr --input_path path/to/img

# With visualization
openocr --task ocr --input_path path/to/img --is_vis

# Process directory with custom output
openocr --task ocr --input_path ./images --output_path ./results --is_vis

# Use server mode (higher accuracy; the torch backend requires PyTorch)
pip install torch torchvision
openocr --task ocr --input_path path/to/img --mode server --backend torch

2. Text Detection Only

Detect text regions without recognition:

# Basic detection
openocr --task det --input_path path/to/img

# With visualization
openocr --task det --input_path path/to/img --is_vis

# Use polygon detection (more accurate for curved text)
openocr --task det --input_path path/to/img --det_box_type poly

3. Text Recognition Only

Recognize text from cropped word/line images:

# Basic recognition
openocr --task rec --input_path path/to/img

# Use server mode (higher accuracy; the torch backend requires PyTorch)
pip install torch torchvision
openocr --task rec --input_path path/to/img --mode server --backend torch

# Batch processing
openocr --task rec --input_path ./word_images --rec_batch_num 16

4. Universal Recognition (UniRec)

Recognize text, formulas, and tables using a vision-language model:

# Basic usage
openocr --task unirec --input_path path/to/img

# Process directory
openocr --task unirec --input_path ./images --output_path ./results

5. Document Parsing (OpenDoc)

Parse documents with layout analysis and text/formula/table recognition:

# Full document parsing with all outputs
openocr --task doc --input_path path/to/img --use_layout_detection --save_vis --save_json --save_markdown

# Parse PDF document
openocr --task doc --input_path document.pdf --use_layout_detection --save_vis --save_json --save_markdown

# Custom layout threshold
openocr --task doc --input_path path/to/img --use_layout_detection --save_vis --save_json --save_markdown --layout_threshold 0.5

Launch Interactive Demos

# Install gradio
pip install gradio

OCR Demo

Launch Gradio web interface for OCR tasks:

# Local access only
openocr --task launch_openocr_demo --server_port 7860

# Public share link
openocr --task launch_openocr_demo --server_port 7860 --share

UniRec Demo

Launch Gradio web interface for universal recognition:

openocr --task launch_unirec_demo --server_port 7861 --share

OpenDoc Demo

Launch Gradio web interface for document parsing:

openocr --task launch_opendoc_demo --server_port 7862 --share

Python API Usage

OCR Task

import json
from openocr import OpenOCR

# Initialize OCR engine
ocr = OpenOCR(task='ocr', mode='mobile')

# Process single image
results, time_dicts = ocr(
    image_path='path/to/image.jpg',
    save_dir='./output',
    is_visualize=True
)

# Access results
for result in results:
    image_name, ocr_result = result.split('\t')
    ocr_result = json.loads(ocr_result)
    print(f"✅ OCR: {image_name} results: {ocr_result}")
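
In the loop above, each entry in `results` is a tab-separated line: the image name, then a JSON payload. The parsing step can be isolated as a small helper, sketched below. The helper name `parse_ocr_line` and the payload fields in the sample (`transcription`, `score`) are hypothetical placeholders; the real payload structure may differ.

```python
import json
from typing import Any, Tuple


def parse_ocr_line(line: str) -> Tuple[str, Any]:
    """Split one 'image_name<TAB>json' line into its name and decoded payload."""
    # Split on the first tab only, so unpacking always yields exactly two parts.
    image_name, payload = line.rstrip("\n").split("\t", 1)
    return image_name, json.loads(payload)


# Hypothetical sample line; the real payload fields may differ.
sample = 'demo.jpg\t[{"transcription": "hello", "score": 0.98}]'
name, regions = parse_ocr_line(sample)
print(name, len(regions))
```

Limiting the split to one tab avoids a `ValueError` from tuple unpacking if the payload ever contains a literal tab.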

Detection Task

from openocr import OpenOCR

# Initialize detector
detector = OpenOCR(task='det')

# Detect text regions
results = detector(image_path='path/to/image.jpg')

# Access detection boxes
boxes = results[0]['boxes']
print(f"Found {len(boxes)} text regions")
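
Downstream code often needs an axis-aligned rectangle for each detected region, for example to crop it. The sketch below assumes each box is a sequence of (x, y) points, which the source does not confirm; the helper name `bounding_rect` is hypothetical.

```python
from typing import Iterable, Sequence, Tuple

Point = Sequence[float]


def bounding_rect(box: Iterable[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) enclosing all points of one box."""
    xs, ys = zip(*((p[0], p[1]) for p in box))
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical quadrilateral; real boxes would come from results[0]['boxes'].
quad = [(10, 20), (110, 18), (112, 60), (12, 62)]
print(bounding_rect(quad))
```

Because it only takes minima and maxima, the same helper works for four-point boxes and for polygons with more points.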

Recognition Task

from openocr import OpenOCR

# Initialize recognizer
recognizer = OpenOCR(task='rec', mode='server', backend='torch') # pip install torch torchvision

# Recognize text
results = recognizer(image_path='path/to/word.jpg')

# Access recognition result
text = results[0]['text']
score = results[0]['score']
print(f"Text: {text}, Confidence: {score}")
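
Since each recognition result carries a `text` and a `score`, low-confidence entries can be filtered out before further processing. This is a minimal sketch with made-up sample data; the helper name `keep_confident` and the 0.5 threshold are assumptions, not OpenOCR defaults.

```python
from typing import Dict, List, Union

RecResult = Dict[str, Union[str, float]]


def keep_confident(results: List[RecResult], min_score: float = 0.5) -> List[RecResult]:
    """Return only the entries whose 'score' is at least min_score."""
    return [r for r in results if float(r["score"]) >= min_score]


# Hypothetical results, shaped like the dicts accessed above.
sample = [
    {"text": "OpenOCR", "score": 0.97},
    {"text": "0penOCR", "score": 0.31},
]
for item in keep_confident(sample, min_score=0.5):
    print(item["text"], item["score"])
```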

UniRec Task

from openocr import OpenOCR

# Initialize UniRec
unirec = OpenOCR(task='unirec')

# Recognize text/formula/table
result_text, generated_ids = unirec(
    image_path='path/to/image.jpg',
    max_length=2048
)
print(f"Result: {result_text}")

Document Parsing Task

from openocr import OpenOCR

# Initialize OpenDoc
doc_parser = OpenOCR(
    task='doc',
    use_layout_detection=True,
)

# Parse document
result = doc_parser(image_path='path/to/document.jpg')

# Save results
doc_parser.save_to_markdown(result, './output')
doc_parser.save_to_json(result, './output')
doc_parser.save_visualization(result, './output')

Common Parameters

--task: Task type (ocr, det, rec, unirec, doc, launch_*_demo)
--input_path: Input image/PDF path or directory
--output_path: Output directory (default: openocr_output/{task})
--use_gpu: GPU usage (auto, true, false)
--mode: Model mode (mobile, server) - server mode has higher accuracy
--is_vis: Visualize results
--save_vis: Save visualization (doc task)
--save_json: Save JSON results (doc task)
--save_markdown: Save Markdown results (doc task)
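
When the CLI is driven from a script, the flags above can be assembled into an argument list instead of a hand-written string. The sketch below uses only flags listed in this section; the function `build_openocr_command` is a hypothetical helper, not part of the package.

```python
import shlex
from typing import List, Optional


def build_openocr_command(
    task: str,
    input_path: str,
    output_path: Optional[str] = None,
    mode: Optional[str] = None,
    is_vis: bool = False,
) -> List[str]:
    """Assemble an argument list for the openocr CLI from the flags listed above."""
    cmd = ["openocr", "--task", task, "--input_path", input_path]
    if output_path is not None:
        cmd += ["--output_path", output_path]
    if mode is not None:
        cmd += ["--mode", mode]
    if is_vis:
        cmd.append("--is_vis")
    return cmd


cmd = build_openocr_command("ocr", "./images", output_path="./results", is_vis=True)
# shlex.join quotes arguments safely for display or logging.
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.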

Output Structure

Results are saved to openocr_output/{task}/ by default:

OCR task: ocr_results.txt + visualization images (if --is_vis)
Detection task: det_results.txt + visualization images (if --is_vis)
Recognition task: rec_results.txt
UniRec task: unirec_results.txt
Doc task: JSON files, Markdown files, visualization images (based on flags)
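
The layout above can be captured in a small lookup for scripts that read results back. This is a sketch under two assumptions: the helper names are hypothetical, and a custom `--output_path` is assumed to hold the results file directly, which the source does not state.

```python
from pathlib import Path
from typing import Optional

# Result file names per task, as listed above. The doc task writes several
# files that depend on the flags used, so it has no single results file here.
RESULT_FILES = {
    "ocr": "ocr_results.txt",
    "det": "det_results.txt",
    "rec": "rec_results.txt",
    "unirec": "unirec_results.txt",
}


def default_output_dir(task: str, root: str = "openocr_output") -> Path:
    """Return the default output directory for a task."""
    return Path(root) / task


def results_file(task: str, output_path: Optional[str] = None) -> Optional[Path]:
    """Return the expected results file, or None for tasks without one."""
    name = RESULT_FILES.get(task)
    if name is None:
        return None
    base = Path(output_path) if output_path else default_output_dir(task)
    return base / name


print(results_file("ocr").as_posix())
print(results_file("doc"))
```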

Release history

0.1.5 (this version): Feb 12, 2026
0.1.4: Feb 12, 2026
0.1.3: Feb 8, 2026
0.1.2: Feb 8, 2026
0.1.0.dev5 (pre-release): Feb 7, 2026
0.1.0.dev4 (pre-release): Feb 7, 2026
0.1.0.dev3 (pre-release): Feb 7, 2026
0.1.0.dev2 (pre-release): Feb 7, 2026
0.1.0.dev1 (pre-release): Feb 7, 2026
0.1.0.dev0 (pre-release): Feb 7, 2026
0.0.10: Jul 10, 2025
0.0.9: Mar 28, 2025
0.0.8: Mar 24, 2025
0.0.7: Mar 23, 2025
0.0.6: Dec 4, 2024
0.0.5: Dec 4, 2024
0.0.4: Nov 25, 2024
0.0.3: Nov 25, 2024
0.0.2: Nov 25, 2024

Download files

Source Distribution

openocr_python-0.1.5.tar.gz (460.1 kB)
Upload date: Feb 12, 2026
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.0
SHA256: `734f42bf6ac832d5339e153cb13c4b5e75b369d4b435e819d97554ad8dec7397`
MD5: `1559ba8ac7099d31136a74524e0c2f1a`
BLAKE2b-256: `7f21c484d1f31b0001bdc9d040dea1eabb165158eed57ec3280c19956e4d4279`

Built Distribution

openocr_python-0.1.5-py3-none-any.whl (710.8 kB)
Upload date: Feb 12, 2026
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.0
SHA256: `903c4ddf1aeb2e632ce2240d7189d75ba08ab4a151a2e22f025ad7454ad9d336`
MD5: `37c32d6d4311e482f7673d9edcd44726`
BLAKE2b-256: `faec08007259fd5d9608f5868cafe1126ca79100efed884158beb4e9d3c3fe66`
