Perform optical character recognition using the DocTR library

Sinapsis DocTR

DocTR-based Optical Character Recognition (OCR) for images

🐍 Installation • 🚀 Features • 📚 Usage example • 🌐 Webapp • 📙 Documentation • 🔍 License

Sinapsis DocTR provides a powerful and flexible implementation for extracting text from images using the DocTR OCR engine. It enables users to easily configure and run OCR tasks with minimal setup.

🐍 Installation

Install using your package manager of choice. We encourage the use of uv.

Example with uv:

  uv pip install sinapsis-doctr --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-doctr --extra-index-url https://pypi.sinapsis.tech

[!IMPORTANT] Templates may require extra dependencies. For development, we recommend installing the package with all the optional dependencies:

with uv:

  uv pip install sinapsis-doctr[all] --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-doctr[all] --extra-index-url https://pypi.sinapsis.tech

[!TIP] Use CLI command sinapsis info --all-template-names to show a list with all the available Template names installed with Sinapsis OCR.

[!TIP] Use CLI command sinapsis info --example-template-config DocTROCRPrediction to produce an example Agent config for the DocTROCRPrediction template.

🚀 Features

Templates Supported

This module includes a template tailored for the DocTR OCR engine:

  • DocTROCRPrediction: Uses DocTR's OCR model to extract text, bounding boxes, and confidence scores from images.

DocTROCRPrediction Attributes

  • recognized_characters_as_labels (bool): Whether to use recognized characters as labels. Defaults to False.
  • artefact_type_as_labels (bool): Whether to use artefact type as labels. Defaults to False.
  • det_arch (str): Detection architecture to use. Defaults to "fast_base".
  • reco_arch (str): Recognition architecture to use. Defaults to "crnn_vgg16_bn".
  • pretrained (bool): Whether to use pretrained models. Defaults to True.
  • pretrained_backbone (bool): Whether to use pretrained backbone. Defaults to True.
  • assume_straight_pages (bool): Whether to assume pages are straight. Defaults to True.
  • preserve_aspect_ratio (bool): Whether to preserve aspect ratio. Defaults to True.
  • symmetric_pad (bool): Whether to use symmetric padding. Defaults to True.
  • export_as_straight_boxes (bool): Whether to export as straight boxes. Defaults to False.
  • detect_orientation (bool): Whether to detect orientation. Defaults to False.
  • straighten_pages (bool): Whether to straighten pages. Defaults to False.
  • detect_language (bool): Whether to detect language. Defaults to False.
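
The attribute list above can be mirrored as a small data structure when preparing configs programmatically. The following is a minimal sketch, not the package's own code: the `DocTRAttributes` class, the `LABEL_FLAGS` tuple, and the `predictor_kwargs` helper are hypothetical names, and the split between labelling flags and predictor options is an assumption based on the attribute descriptions.

```python
from dataclasses import asdict, dataclass


@dataclass
class DocTRAttributes:
    """Hypothetical mirror of the DocTROCRPrediction attributes and defaults."""

    recognized_characters_as_labels: bool = False
    artefact_type_as_labels: bool = False
    det_arch: str = "fast_base"
    reco_arch: str = "crnn_vgg16_bn"
    pretrained: bool = True
    pretrained_backbone: bool = True
    assume_straight_pages: bool = True
    preserve_aspect_ratio: bool = True
    symmetric_pad: bool = True
    export_as_straight_boxes: bool = False
    detect_orientation: bool = False
    straighten_pages: bool = False
    detect_language: bool = False


# Assumption: these two flags only control how results are labelled and are
# handled by the template itself rather than by the DocTR predictor.
LABEL_FLAGS = ("recognized_characters_as_labels", "artefact_type_as_labels")


def predictor_kwargs(attrs: DocTRAttributes) -> dict:
    """Return every attribute except the labelling flags as a plain dict."""
    return {k: v for k, v in asdict(attrs).items() if k not in LABEL_FLAGS}


# Override one default and inspect the resulting options.
kwargs = predictor_kwargs(DocTRAttributes(detect_orientation=True))
print(kwargs["det_arch"], kwargs["detect_orientation"])
```

The remaining keys appear to correspond to the parameter names of DocTR's `ocr_predictor` function, so they could plausibly be passed to it as keyword arguments. Whether the template forwards them exactly this way is not confirmed here.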

📚 Usage example

DocTR Example
agent:
  name: doctr_prediction
  description: agent that reads images, runs OCR inference with DocTR, and saves the annotated results

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: FolderImageDatasetCV2
  class_name: FolderImageDatasetCV2
  template_input: InputTemplate
  attributes:
    data_dir: dataset/input

- template_name: DocTROCRPrediction
  class_name: DocTROCRPrediction
  template_input: FolderImageDatasetCV2
  attributes:
    recognized_characters_as_labels: True

- template_name: BBoxDrawer
  class_name: BBoxDrawer
  template_input: DocTROCRPrediction
  attributes:
    draw_confidence: True
    draw_extra_labels: True

- template_name: ImageSaver
  class_name: ImageSaver
  template_input: BBoxDrawer
  attributes:
    save_dir: output
    root_dir: dataset
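
Each template in the config above names its upstream template through template_input, which forms a chain from InputTemplate to ImageSaver. As a rough illustration of that wiring, the sketch below walks the links and reports the resulting order. The `execution_order` helper is hypothetical and assumes a single linear chain as in this example; Sinapsis itself may resolve templates differently or allow other topologies.

```python
# Template entries copied from the config above as (template_name, template_input).
TEMPLATES = [
    ("InputTemplate", None),
    ("FolderImageDatasetCV2", "InputTemplate"),
    ("DocTROCRPrediction", "FolderImageDatasetCV2"),
    ("BBoxDrawer", "DocTROCRPrediction"),
    ("ImageSaver", "BBoxDrawer"),
]


def execution_order(templates):
    """Walk the template_input links and return names from source to sink.

    Hypothetical helper for illustration. Raises ValueError on a dangling
    input or when the templates do not form a single linear chain.
    """
    names = {name for name, _ in templates}
    next_of = {}
    for name, parent in templates:
        if parent is None:
            continue
        if parent not in names:
            raise ValueError(f"{name} reads from unknown template {parent}")
        if parent in next_of:
            raise ValueError(f"{parent} feeds more than one template")
        next_of[parent] = name
    roots = [name for name, parent in templates if parent is None]
    if len(roots) != 1:
        raise ValueError("expected exactly one template without template_input")
    order = [roots[0]]
    while order[-1] in next_of:
        order.append(next_of[order[-1]])
    if len(order) != len(templates):
        raise ValueError("templates do not form a single chain")
    return order


print(" -> ".join(execution_order(TEMPLATES)))
```

A check like this can catch a misspelled template_input before the agent is run.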

To run, simply use:

sinapsis run name_of_the_config.yml

🌐 Webapp

The webapp provides a simple interface to extract text from images using DocTR OCR. Upload your image, and the app will process it and display the detected text with bounding boxes.

[!IMPORTANT] To run the app you first need to clone the sinapsis-ocr repository:

git clone https://github.com/Sinapsis-ai/sinapsis-ocr.git
cd sinapsis-ocr

[!NOTE] If you'd like to enable external app sharing in Gradio, export GRADIO_SHARE_APP=True

[!IMPORTANT] To use DocTR in the webapp, set the environment variable: AGENT_CONFIG_PATH=/app/packages/sinapsis_doctr/src/sinapsis_doctr/configs/doctr_demo.yaml
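
The two environment variables above can be inspected before launching the app. This is a minimal sketch under stated assumptions: the `webapp_settings` helper is hypothetical, and the rule that only the string "true" (case-insensitive) enables sharing is a guess at how the webapp interprets GRADIO_SHARE_APP, not its documented behavior.

```python
import os


def webapp_settings(env=None):
    """Read the two variables named above from an environment mapping.

    Hypothetical helper; the real webapp may parse these values differently.
    """
    env = os.environ if env is None else env
    # Assumption: sharing is enabled only when the value is "true" (any case).
    share = env.get("GRADIO_SHARE_APP", "False").strip().lower() == "true"
    config_path = env.get("AGENT_CONFIG_PATH")  # None when the variable is unset
    return {"share": share, "config_path": config_path}


# Example mapping using the values quoted in this section.
example_env = {
    "GRADIO_SHARE_APP": "True",
    "AGENT_CONFIG_PATH": "/app/packages/sinapsis_doctr/src/sinapsis_doctr/configs/doctr_demo.yaml",
}
settings = webapp_settings(example_env)
print(settings)
```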

🐳 Docker

[!IMPORTANT] This Docker image depends on the sinapsis:base image. Please refer to the official sinapsis instructions to Build with Docker.

  1. Build the sinapsis-ocr image:
docker compose -f docker/compose.yaml build
  2. Start the app container:
docker compose -f docker/compose_app.yaml up
  3. Check the status:
docker logs -f sinapsis-ocr-app
  4. The logs will display the URL to access the webapp, e.g.:
Running on local URL:  http://127.0.0.1:7860

NOTE: The URL can be different, check the output of the logs.

  5. To stop the app:
docker compose -f docker/compose_app.yaml down

💻 UV

To run the webapp using the uv package manager, please:

  1. Create the virtual environment and sync the dependencies:
uv sync --frozen
  2. Install packages:
uv pip install sinapsis-doctr[all] --extra-index-url https://pypi.sinapsis.tech
  3. Run the webapp:
uv run webapps/gradio_ocr.py
  4. The terminal will display the URL to access the webapp, e.g.:
Running on local URL:  http://127.0.0.1:7860

NOTE: The URL can be different, check the output of the terminal.

  5. To stop the app, press Control + C in the terminal.
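
Because the URL can differ between runs, it may help to pull it out of the captured output rather than assume the default. The sketch below does this with a regular expression matched against the "Running on local URL:" line shown in the steps. The `find_local_url` helper is hypothetical, and the first line of the sample log is made up for illustration.

```python
import re

# Matches the line shown in the steps, e.g. "Running on local URL:  http://127.0.0.1:7860".
URL_LINE = re.compile(r"Running on local URL:\s*(\S+)")


def find_local_url(log_text):
    """Return the first local URL announced in the log text, or None."""
    match = URL_LINE.search(log_text)
    return match.group(1) if match else None


# Sample output; the first line is invented, the second is the line quoted above.
sample_log = "Starting webapp\nRunning on local URL:  http://127.0.0.1:7860\n"
url = find_local_url(sample_log)
print(url)
```

The same helper works on text captured from docker logs or from the terminal running the uv command.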

📙 Documentation

Documentation for this and other sinapsis packages is available on the sinapsis website.

Tutorials for different projects within sinapsis are available on the sinapsis tutorials page.

🔍 License

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.

For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.
