MyOCR - Advanced OCR Pipeline Builder

MyOCR is a highly extensible and customizable framework for building OCR systems. Engineers can train deep learning models and integrate them into custom OCR pipelines for real-world applications.

Try the online demo on HuggingFace or ModelScope

🌟 Key Features:

⚡️ End-to-End OCR Development Framework – Designed for developers to build and integrate detection, recognition, and custom OCR models in a unified and flexible pipeline.

🛠️ Modular & Extensible – Mix and match components: swap models, predictors, or input/output processors with minimal changes.

🔌 Developer-Friendly by Design – Clean Python APIs, prebuilt pipelines and processors, and straightforward customization for training and inference.

🚀 Production-Ready Performance – ONNX runtime support for fast CPU/GPU inference, with support for several deployment options.

📣 Updates

  • 🔥2025.05.17 MyOCR v0.1.1 released

🛠️ Installation

📦 Requirements

  • Python 3.11+
  • CUDA: Version 12.6 or higher is recommended for GPU acceleration. CPU-only mode is also supported.
  • Operating System: Linux, macOS, or Windows.

📥 Install Dependencies

# Clone the code from GitHub
git clone https://github.com/robbyzhaox/myocr.git
cd myocr

# You can create your own venv before the following steps
# Install dependencies
pip install -e .

# Development environment installation
pip install -e ".[dev]"

# Download pre-trained model weights to models
# for Linux, macOS
mkdir -p ~/.MyOCR/models/
# for Windows, the "models" directory can be created in the current path
# Download the weights from: https://drive.google.com/drive/folders/1RXppgx4XA_pBX9Ll4HFgWyhECh5JtHnY
# Alternative download link: https://pan.baidu.com/s/122p9zqepWfbEmZPKqkzGBA?pwd=yq6j

🚀 Quick Start

🖥️ Local Inference

Basic OCR Recognition

from myocr.pipelines import CommonOCRPipeline

# Initialize common OCR pipeline (using GPU)
pipeline = CommonOCRPipeline("cuda:0")  # Use "cpu" for CPU mode

# Perform OCR recognition on an image
result = pipeline("path/to/your/image.jpg")
print(result)

Structured OCR Output (Example: Invoice Information Extraction)

Configure the chat_bot section in myocr.pipelines.config.structured_output_pipeline.yaml:

chat_bot:
  model: qwen2.5:14b
  base_url: http://127.0.0.1:11434/v1
  api_key: 'key'

Note: the chat bot currently supports:

  • Ollama API
  • OpenAI API

from pydantic import BaseModel, Field
from myocr.pipelines import StructuredOutputOCRPipeline

# Define the output data model; for a reference model, see InvoiceModel:
from myocr.pipelines.response_format import InvoiceModel

# Initialize structured OCR pipeline
pipeline = StructuredOutputOCRPipeline("cuda:0", InvoiceModel)

# Process image and get structured data
result = pipeline("path/to/invoice.jpg")
print(result.to_dict())
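
The `BaseModel` and `Field` imports above suggest that custom output schemas are ordinary pydantic models. As a hedged sketch of what such a schema might look like: `ReceiptModel` and its fields are hypothetical and not part of MyOCR, and whether the pipeline expects extra methods on the model (the `InvoiceModel` result above exposes `to_dict()`) is not confirmed by this README, so check `myocr.pipelines.response_format` before relying on it.

```python
from typing import Optional

from pydantic import BaseModel, Field


class ReceiptModel(BaseModel):
    """Hypothetical output schema for illustration; not shipped with MyOCR."""

    merchant: str = Field(description="Name of the merchant printed on the receipt")
    total: float = Field(description="Total amount paid")
    date: Optional[str] = Field(default=None, description="Purchase date, if present")
```

Assuming the pipeline accepts any pydantic model, such a class would be passed in place of `InvoiceModel` when constructing `StructuredOutputOCRPipeline`.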

🐳 Docker Deployment

The framework supports Docker deployment. The published image can be run with the following commands:

Run the Docker Container

docker run -d -p 8000:8000 robbyzhaox/myocr:latest

# To use the StructuredOutputOCRPipeline, set the following environment variables with the -e option of docker run
docker run -d \
  -p 8000:8000 \
  -e CHAT_BOT_MODEL="qwen2.5:14b" \
  -e CHAT_BOT_BASEURL="http://127.0.0.1:11434/v1" \
  -e CHAT_BOT_APIKEY="key" \
  robbyzhaox/myocr:latest
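
The three variables correspond to the `chat_bot` keys in the YAML file shown earlier. How the container reads them internally is not documented here, so the sketch below only illustrates that mapping. The function name `chat_bot_config_from_env` and the `ENV_TO_KEY` table are hypothetical and not part of MyOCR.

```python
import os
from typing import Mapping

# Environment variable -> chat_bot config key, following the docker run example.
ENV_TO_KEY = {
    "CHAT_BOT_MODEL": "model",
    "CHAT_BOT_BASEURL": "base_url",
    "CHAT_BOT_APIKEY": "api_key",
}


def chat_bot_config_from_env(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the chat bot settings that are present in env.

    Hypothetical helper; variables that are unset are simply omitted.
    """
    return {key: env[name] for name, key in ENV_TO_KEY.items() if name in env}
```

Passing a plain dictionary instead of `os.environ` makes the mapping easy to inspect without touching the real environment.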

Accessing API Endpoints (Docker)

IMAGE_PATH="your_image.jpg"

BASE64_IMAGE=$(base64 -w 0 "$IMAGE_PATH")  # Linux
#BASE64_IMAGE=$(base64 -i "$IMAGE_PATH" | tr -d '\n') # macOS

curl -X POST \
  -H "Content-Type: application/json" \
  -d "{\"image\": \"${BASE64_IMAGE}\"}" \
  http://localhost:8000/ocr
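
The same request can be made from Python using only the standard library. This is a minimal sketch that mirrors the curl call above: the JSON body with a base64-encoded `image` field and the URL come from that example, while the function names `build_ocr_payload` and `post_image` are hypothetical. The response format is not documented in this README, so the raw response text is returned unparsed.

```python
import base64
import json
import urllib.request


def build_ocr_payload(image_bytes: bytes) -> bytes:
    """Encode raw image bytes as the JSON body used by the curl example."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded}).encode("utf-8")


def post_image(image_path: str, url: str = "http://localhost:8000/ocr") -> str:
    """POST the image to the OCR endpoint and return the raw response text."""
    with open(image_path, "rb") as f:
        payload = build_ocr_payload(f.read())
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the container running, `post_image("your_image.jpg")` would send the same request as the curl command.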

🔗 Using the REST API

The framework provides a simple Flask API service that can be called over HTTP:

# Start the service (default port: 5000)
python main.py

API endpoints:

  • GET /ping: Check if the service is running properly
  • POST /ocr: Basic OCR recognition
  • POST /ocr-json: Structured OCR output
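
As a small illustration of the endpoints listed above, the sketch below checks the health endpoint from Python with the standard library. The helper names `endpoint_url` and `is_service_up` are hypothetical, and treating an HTTP 200 from `/ping` as "healthy" is an assumption, since the response body is not described here. The default base URL uses port 5000, as started by `python main.py`.

```python
import urllib.request


def endpoint_url(base_url: str, path: str) -> str:
    """Join a base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def is_service_up(base_url: str = "http://localhost:5000", timeout: float = 2.0) -> bool:
    """Return True if GET /ping answers with HTTP 200.

    Returns False if the connection fails, times out, or the server
    responds with an HTTP error status.
    """
    try:
        with urllib.request.urlopen(endpoint_url(base_url, "ping"), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # urllib.error.URLError and HTTPError are subclasses of OSError.
        return False
```

The same `endpoint_url` helper can build the `/ocr` and `/ocr-json` URLs for POST requests.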

A UI for these endpoints is also available; see doc-insight-ui.

🎖 Contribution Guidelines

We welcome any form of contribution, including but not limited to:

  • Submitting bug reports
  • Adding new features
  • Improving documentation
  • Optimizing performance

📄 License

This project is open-sourced under the Apache 2.0 License. See the LICENSE file for details.
