Implements Sinapsis templates to perform optical character recognition on images

Sinapsis OCR

Templates for Optical Character Recognition (OCR) in images or PDFs

🐍 Installation | 📦 Packages | 🚀 Features | 📚 Usage example | 🌐 Webapp | 📙 Documentation | 🔍 License

Sinapsis OCR provides templates for extracting text from images with several OCR engines (DeepSeek OCR, DocTR, and EasyOCR). It lets users configure and run OCR tasks with minimal setup.

🐍 Installation

This monorepo contains the following OCR packages:

  • sinapsis-deepseek-ocr
  • sinapsis-doctr
  • sinapsis-easyocr

Install using your package manager of choice. We encourage the use of uv.

Example with uv:

  uv pip install sinapsis-doctr --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-doctr --extra-index-url https://pypi.sinapsis.tech

Replace the package name with the one you want to install.

[!IMPORTANT] Templates in each package may require extra dependencies. For development, we recommend installing the package with all the optional dependencies:

with uv:

  uv pip install sinapsis-doctr[all] --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-doctr[all] --extra-index-url https://pypi.sinapsis.tech

[!TIP] You can also install all the packages within this project:

  uv pip install sinapsis-ocr[all] --extra-index-url https://pypi.sinapsis.tech
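
After installing, it can be useful to confirm which of the OCR packages are actually present in the active environment. The following is a minimal sketch that uses only the Python standard library; the distribution names are the ones listed above, and the helper name `installed_versions` is hypothetical, not part of Sinapsis.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed in the installation section above.
OCR_PACKAGES = ("sinapsis-deepseek-ocr", "sinapsis-doctr", "sinapsis-easyocr")


def installed_versions(packages=OCR_PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

A package reported as "not installed" can be added with the commands above.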

📦 Packages

Packages summary
  • Sinapsis DeepSeek OCR

    • Uses the DeepSeek OCR model for high-quality OCR
    • Supports optional grounding for bounding box extraction
    • Multiple inference modes (tiny, small, base, large, gundam)
  • Sinapsis DocTR

    • Uses the DocTR library for high-quality OCR with modern deep learning models
    • Supports multiple detection and recognition architectures
    • Provides detailed text extraction with bounding boxes and confidence scores
  • Sinapsis EasyOCR

    • Leverages the EasyOCR library for simple yet effective OCR
    • Supports multiple languages
    • Extracts text with bounding boxes and confidence scores

[!TIP] Use CLI command sinapsis info --all-template-names to show a list with all the available Template names installed with Sinapsis OCR.

[!TIP] Use CLI command sinapsis info --example-template-config TEMPLATE_NAME to produce an example Agent config for the Template specified in TEMPLATE_NAME.

For example, for DocTROCRPrediction use sinapsis info --example-template-config DocTROCRPrediction to produce an example config.
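
To collect example configs for several templates at once, the CLI call above can be wrapped in a short script. This is only a sketch: it assumes the `sinapsis` executable is on the PATH, it uses the template names that appear in the usage examples of this README, and the helper names are hypothetical.

```python
import shutil
import subprocess

# Template names taken from the usage examples in this README.
OCR_TEMPLATES = ("DocTROCRPrediction", "EasyOCR", "DeepSeekOCRInference")


def example_config_command(template_name):
    """Build the argument list for the CLI call described above."""
    return ["sinapsis", "info", "--example-template-config", template_name]


def dump_example_config(template_name):
    """Return the CLI output as text, or None if `sinapsis` is not on PATH."""
    if shutil.which("sinapsis") is None:
        return None
    result = subprocess.run(
        example_config_command(template_name),
        capture_output=True,
        text=True,
        check=False,  # leave error handling to the caller
    )
    return result.stdout


if __name__ == "__main__":
    for template in OCR_TEMPLATES:
        output = dump_example_config(template)
        print(output if output is not None else "sinapsis CLI not found")
```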

📚 Usage example

DocTR Example

agent:
  name: doctr_prediction
  description: agent to run inference with DocTR; reads images, runs recognition, and saves the results

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: FolderImageDatasetCV2
  class_name: FolderImageDatasetCV2
  template_input: InputTemplate
  attributes:
    data_dir: dataset/input

- template_name: DocTROCRPrediction
  class_name: DocTROCRPrediction
  template_input: FolderImageDatasetCV2
  attributes:
    recognized_characters_as_labels: True

- template_name: BBoxDrawer
  class_name: BBoxDrawer
  template_input: DocTROCRPrediction
  attributes:
    draw_confidence: True
    draw_extra_labels: True

- template_name: ImageSaver
  class_name: ImageSaver
  template_input: BBoxDrawer
  attributes:
    save_dir: output
    root_dir: dataset

EasyOCR Example

agent:
  name: easyocr_inference
  description: agent to run inference with EasyOCR; reads images, runs recognition, and saves the results

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: FolderImageDatasetCV2
  class_name: FolderImageDatasetCV2
  template_input: InputTemplate
  attributes:
    data_dir: dataset/input

- template_name: EasyOCR
  class_name: EasyOCR
  template_input: FolderImageDatasetCV2
  attributes: {}

- template_name: BBoxDrawer
  class_name: BBoxDrawer
  template_input: EasyOCR
  attributes:
    draw_confidence: True
    draw_extra_labels: True

- template_name: ImageSaver
  class_name: ImageSaver
  template_input: BBoxDrawer
  attributes:
    save_dir: output
    root_dir: dataset

DeepSeek OCR Example

agent:
  name: deepseek_ocr_inference
  description: agent to run inference with DeepSeek OCR

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: FolderImageDatasetCV2
  class_name: FolderImageDatasetCV2
  template_input: InputTemplate
  attributes:
    data_dir: dataset/input

- template_name: DeepSeekOCRInference
  class_name: DeepSeekOCRInference
  template_input: FolderImageDatasetCV2
  attributes:
    prompt: "Convert the document to markdown."
    enable_grounding: true
    mode: base

- template_name: BBoxDrawer
  class_name: BBoxDrawer
  template_input: DeepSeekOCRInference
  attributes:
    draw_confidence: True
    draw_extra_labels: True

- template_name: ImageSaver
  class_name: ImageSaver
  template_input: BBoxDrawer
  attributes:
    save_dir: output
    root_dir: dataset

To run, simply use:

sinapsis run name_of_the_config.yml
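
All three example configs follow the same linear chain: each template names the previous one in `template_input`. A typo in one of those names is an easy mistake when swapping OCR engines, so a quick sanity check before running can help. The sketch below is not part of Sinapsis; it assumes the config has already been loaded into a plain dict (for example with a YAML loader) and that templates are declared before they are referenced, as in the examples above. Sinapsis itself may not require that ordering.

```python
# The DocTR example from above, written as the dict a YAML loader would produce.
DOCTR_CONFIG = {
    "agent": {"name": "doctr_prediction"},
    "templates": [
        {"template_name": "InputTemplate", "class_name": "InputTemplate",
         "attributes": {}},
        {"template_name": "FolderImageDatasetCV2", "class_name": "FolderImageDatasetCV2",
         "template_input": "InputTemplate",
         "attributes": {"data_dir": "dataset/input"}},
        {"template_name": "DocTROCRPrediction", "class_name": "DocTROCRPrediction",
         "template_input": "FolderImageDatasetCV2",
         "attributes": {"recognized_characters_as_labels": True}},
        {"template_name": "BBoxDrawer", "class_name": "BBoxDrawer",
         "template_input": "DocTROCRPrediction",
         "attributes": {"draw_confidence": True, "draw_extra_labels": True}},
        {"template_name": "ImageSaver", "class_name": "ImageSaver",
         "template_input": "BBoxDrawer",
         "attributes": {"save_dir": "output", "root_dir": "dataset"}},
    ],
}


def check_template_chain(config):
    """Return a list of problems found in the template_input references."""
    problems = []
    seen = set()
    for entry in config.get("templates", []):
        name = entry.get("template_name")
        source = entry.get("template_input")
        # Every referenced input should be a template declared earlier.
        if source is not None and source not in seen:
            problems.append(f"{name}: template_input {source!r} is not declared earlier")
        if name in seen:
            problems.append(f"{name}: duplicate template_name")
        seen.add(name)
    return problems


if __name__ == "__main__":
    issues = check_template_chain(DOCTR_CONFIG)
    print("chain OK" if not issues else "\n".join(issues))
```

An empty list means every `template_input` points at a template declared earlier in the file.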

🌐 Webapp

The webapp provides a simple interface to extract text from images using OCR. Upload your image, and the app will process it and display the detected text with bounding boxes.

[!IMPORTANT] To run the app you first need to clone this repository:

git clone https://github.com/Sinapsis-ai/sinapsis-ocr.git
cd sinapsis-ocr

[!NOTE] If you'd like to enable external app sharing in Gradio, export GRADIO_SHARE_APP=True

[!TIP] The agent configuration can be changed with the AGENT_CONFIG_PATH environment variable. By default the app uses the EasyOCR config, but this can be changed with: AGENT_CONFIG_PATH=/app/packages/sinapsis_doctr/src/sinapsis_doctr/configs/doctr_demo.yaml
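
For readers unfamiliar with this pattern, the sketch below shows one way an application could read these two environment variables. It is an illustration only, not the webapp's actual code: the function name, the set of accepted truthy strings, and the default path argument are all assumptions.

```python
import os


def read_webapp_settings(default_config_path, environ=None):
    """Return (config_path, share) from the environment.

    `default_config_path` is a placeholder for whatever config the app
    ships as its default; the real default path is not stated in this README.
    """
    env = os.environ if environ is None else environ
    config_path = env.get("AGENT_CONFIG_PATH", default_config_path)
    # Assumption: treat a few common spellings as "true"; the real app may differ.
    share = env.get("GRADIO_SHARE_APP", "False").strip().lower() in {"1", "true", "yes"}
    return config_path, share


if __name__ == "__main__":
    path, share = read_webapp_settings("path/to/default_config.yaml")
    print(f"config: {path}, share: {share}")
```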

🐳 Docker

[!IMPORTANT] This Docker image depends on the sinapsis:base image. Please refer to the official sinapsis instructions to Build with Docker.

  1. Build the sinapsis-ocr image:
     docker compose -f docker/compose.yaml build
  2. Start the app container:
     docker compose -f docker/compose_app.yaml up
  3. Check the status:
     docker logs -f sinapsis-ocr-app
  4. The logs will display the URL to access the webapp, e.g.:
     Running on local URL:  http://127.0.0.1:7860
     NOTE: The URL can be different; check the output of the logs.
  5. To stop the app:
     docker compose -f docker/compose_app.yaml down

💻 UV

To run the webapp using the uv package manager:

  1. Create the virtual environment and sync the dependencies:
     uv sync --frozen
  2. Install packages:
     uv pip install sinapsis-ocr[all] --extra-index-url https://pypi.sinapsis.tech
  3. Run the webapp:
     uv run webapps/gradio_ocr.py
  4. The terminal will display the URL to access the webapp, e.g.:
     Running on local URL:  http://127.0.0.1:7860
     NOTE: The URL can be different; check the output of the terminal.
  5. To stop the app, press Control + C in the terminal.
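
Because the URL can differ between runs, a script that needs it (for example, to open a browser) can pull it out of the captured log or terminal output. The sketch below assumes the output contains a line in the "Running on local URL:" format shown above; the helper name is hypothetical.

```python
import re

# Matches the log line format shown in the steps above.
LOCAL_URL_PATTERN = re.compile(r"Running on local URL:\s*(\S+)")


def find_local_url(log_text):
    """Return the first local URL announced in the log text, or None."""
    match = LOCAL_URL_PATTERN.search(log_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = "Running on local URL:  http://127.0.0.1:7860"
    print(find_local_url(sample))  # prints http://127.0.0.1:7860
```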

📙 Documentation

Documentation for this and other sinapsis packages is available on the sinapsis website.

Tutorials for different projects within sinapsis are available on the sinapsis tutorials page.

🔍 License

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.

For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.
