
OpenCV with ONNX Runtime Inference Toolkit.

Project description


Capybara

Introduction

This project is an image processing and deep learning toolkit, mainly consisting of the following parts:

  • Vision: Provides functionalities related to computer vision, such as image and video processing.
  • Structures: Modules for handling structured data, such as BoundingBox and Polygon.
  • ONNXEngine: Provides ONNX inference functionalities, supporting ONNX format models.
  • Utils: Contains utility functions that do not belong to other modules.
  • Tests: Includes test code for various functions to verify their correctness.

Technical Documentation

For more detailed information on installation and usage, please refer to the Capybara Documents.

The documentation explains this project in detail and answers frequently asked questions.

Installation

Before starting the installation of Capybara, ensure that your system meets the following requirements:

Python Version

  • Python 3.10 or later is required.
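
The version requirement can be checked from the interpreter itself. The following is a minimal sketch using only the standard library; the helper name python_is_supported is made up for this example and is not part of Capybara.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the requirement above


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter is at least the minimum version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old for Capybara"
    print(f"Python {current}: {status}")
```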

Dependency Packages

Please install the necessary system packages according to your operating system:

  • Ubuntu

    sudo apt install libturbojpeg exiftool ffmpeg libheif-dev
    
  • macOS

    brew install jpeg-turbo exiftool ffmpeg
    
    • Special note: Our testing found several known issues when using libheif on macOS:

      1. Generated HEIC files cannot be opened: On macOS, HEIC files generated by libheif may not open with certain applications. This may be related to image dimensions, particularly when the image width or height is odd, causing compatibility issues.

      2. Compilation errors: When compiling libheif on macOS, you may encounter undefined symbol errors related to ffmpeg decoders. This could be caused by incorrect compilation options or dependency settings.

      3. Example programs do not run: On macOS Sonoma, the example programs of libheif might fail with dynamic link errors, indicating that libheif.1.dylib is missing. This might be related to dynamic library path settings.

      Due to these issues, we currently only run libheif on Ubuntu, and macOS support will be addressed in future versions.
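
Since the first issue may be tied to odd image dimensions, one possible workaround is to crop an image to even dimensions before encoding it as HEIC. The sketch below only computes the target size; even_size is a hypothetical helper, not a Capybara function, and whether this avoids the problem on macOS is an untested assumption.

```python
def even_size(width, height):
    """Round each dimension down to the nearest even number."""
    if width < 2 or height < 2:
        raise ValueError("image must be at least 2x2 pixels")
    return width - (width % 2), height - (height % 2)


if __name__ == "__main__":
    # A 641x481 image would be cropped to 640x480 before encoding.
    print(even_size(641, 481))
```

With a NumPy-style image array, the crop itself would be img[:new_height, :new_width].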

pdf2image Dependency

pdf2image is a Python module used to convert PDF documents to images. Make sure the following tools are installed on your system:

  • macOS: Install poppler

    brew install poppler
    
  • Linux: Most distributions already include pdftoppm and pdftocairo. If not, install them using:

    sudo apt install poppler-utils
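
To confirm that the command-line tools mentioned so far are reachable, a small check against PATH can help. This is a sketch using shutil.which from the standard library; the tool list is taken from the commands above, and find_missing_tools is a name chosen for this example. libturbojpeg and libheif are libraries rather than executables, so they are not covered here.

```python
import shutil

# Executables named in the installation notes above; adjust the list as needed.
REQUIRED_TOOLS = ("exiftool", "ffmpeg", "pdftoppm", "pdftocairo")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing executables:", ", ".join(missing))
    else:
        print("All required executables were found.")
```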
    

ONNXRuntime GPU Dependencies

To use ONNXRuntime for GPU-accelerated inference, ensure that you have an appropriate version of CUDA installed. Here's an example:

sudo apt install cuda-12-4
# Add to .bashrc
echo 'export PATH=/usr/local/cuda-12.4/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
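
After installing CUDA, it can be useful to check whether ONNX Runtime actually reports the CUDA execution provider. The sketch below calls ONNX Runtime directly through onnxruntime.get_available_providers(); it does not use Capybara's ONNXEngine, whose interface is not described here, and choose_providers is a helper invented for this example.

```python
def choose_providers(available):
    """Keep the preferred providers that ONNX Runtime reports, CUDA first."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [name for name in preferred if name in available]


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed in this environment.")
    else:
        available = ort.get_available_providers()
        print("Available providers:", available)
        print("Providers to request:", choose_providers(available))
```

If CUDAExecutionProvider is missing from the output, the CUDA paths added to .bashrc are a reasonable first thing to check.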

Installation via PyPI

  1. Install the package from PyPI:

    pip install capybara-docsaid
    
  2. Verify the installation:

    python -c "import capybara; print(capybara.__version__)"
    
  3. If the version number is displayed, the installation was successful.
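
The installed version can also be read from the package metadata without importing capybara. This is a minimal sketch based on importlib.metadata from the standard library; installed_version is a name chosen for this example.

```python
from importlib import metadata


def installed_version(distribution="capybara-docsaid"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "capybara-docsaid is not installed")
```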

Installation via Git Clone

  1. Clone this repository:

    git clone https://github.com/DocsaidLab/Capybara.git
    
  2. Install the wheel package:

    pip install wheel
    
  3. Build the wheel file:

    cd Capybara
    python setup.py bdist_wheel
    
  4. Install the built wheel file:

    pip install dist/capybara_docsaid-*-py3-none-any.whl
    

Installation via Docker

To avoid environment conflicts during deployment or collaborative development, it's recommended to use Docker. Here's a brief guide:

  1. Clone this repository:

    git clone https://github.com/DocsaidLab/Capybara.git
    
  2. Enter the project directory and run the build script:

    cd Capybara
    bash docker/build.bash
    

    This will build an image using the Dockerfile in the project. The image is based on nvcr.io/nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04 by default, providing the CUDA environment required for ONNXRuntime inference.

  3. After the build is complete, mount the working directory and run the program:

    docker run -v ${PWD}:/code -it capybara_infer_image your_scripts.py
    

    To enable GPU acceleration, add --gpus all when running the command.

gosu Permissions Issues

Files written by scripts inside the container are owned by root by default, which can cause permission problems on the host. To avoid this, use gosu in the Dockerfile to switch users, and specify USER_ID and GROUP_ID when starting the container so that collaborators do not have to keep adjusting permissions.

For details, refer to the technical documentation: Integrating gosu Configuration

  1. Install gosu:

    RUN apt-get update && apt-get install -y gosu
    
  2. Use gosu in the container start command to switch to a non-root user for file read/write operations.

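    # Path of the entrypoint script (matches the ENTRYPOINT below)
    ENV ENTRYPOINT_SCRIPT=/entrypoint.sh

    # Create the entrypoint script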
    RUN printf '#!/bin/bash\n\
        if [ ! -z "$USER_ID" ] && [ ! -z "$GROUP_ID" ]; then\n\
            groupadd -g "$GROUP_ID" -o usergroup\n\
            useradd --shell /bin/bash -u "$USER_ID" -g "$GROUP_ID" -o -c "" -m user\n\
            export HOME=/home/user\n\
            chown -R "$USER_ID":"$GROUP_ID" /home/user\n\
            chown -R "$USER_ID":"$GROUP_ID" /code\n\
        fi\n\
        \n\
        # Check for parameters\n\
        if [ $# -gt 0 ]; then\n\
            exec gosu ${USER_ID:-0}:${GROUP_ID:-0} python "$@"\n\
        else\n\
            exec gosu ${USER_ID:-0}:${GROUP_ID:-0} bash\n\
        fi' > "$ENTRYPOINT_SCRIPT"
    
    RUN chmod +x "$ENTRYPOINT_SCRIPT"
    
    ENTRYPOINT ["/bin/bash", "/entrypoint.sh"]
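
The entrypoint script reads USER_ID and GROUP_ID from the environment, so they need to be passed when the container starts. The sketch below assembles such a docker run command with the standard -e flag, using the current user's IDs by default. The function name docker_run_command is made up for this example; the image name and the /code mount point are taken from the docker run example earlier. os.getuid and os.getgid are only available on Unix-like systems.

```python
import os
import shlex


def docker_run_command(script, image="capybara_infer_image", workdir=None,
                       user_id=None, group_id=None, gpus=False):
    """Build the `docker run` argument list for a script in the mounted directory."""
    workdir = workdir or os.getcwd()
    user_id = os.getuid() if user_id is None else user_id
    group_id = os.getgid() if group_id is None else group_id
    command = [
        "docker", "run", "-it",
        "-e", f"USER_ID={user_id}",    # read by the entrypoint script
        "-e", f"GROUP_ID={group_id}",
        "-v", f"{workdir}:/code",      # same mount point as the earlier example
    ]
    if gpus:
        command += ["--gpus", "all"]
    return command + [image, script]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(docker_run_command("your_scripts.py", gpus=True)))
```

Returning a list rather than a single string keeps the arguments safe to pass to subprocess.run without shell quoting.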
    

For more advanced configuration, refer to NVIDIA Container Toolkit and the official docker documentation.

Testing

This project uses pytest for unit testing. You can run the tests yourself to verify that each feature works correctly. To install pytest and run the tests, use the following commands:

pip install pytest
python -m pytest -vv tests

Once the run completes, the test report shows whether all modules are functioning properly. If any tests fail, first check the environment settings and package versions.

If the problem persists, please report it in the Issue section.
