An Image Processing and Deep Learning Toolkit.

Capybara

An Integrated Python Package for Image Processing and Deep Learning.

Introduction

This project is an image processing and deep learning toolkit, mainly consisting of the following parts:

  • Vision: Provides functionalities related to computer vision, such as image and video processing.
  • Structures: Modules for handling structured data, such as BoundingBox and Polygon.
  • ONNXEngine: Provides inference for models in ONNX format.
  • Utils: Contains utility functions that do not belong to other modules.
  • Tests: Includes test code for various functions to verify their correctness.

Technical Documentation

For more detailed information on installation and usage, please refer to the Capybara Documents.

The documentation explains the project in detail and answers frequently asked questions.

Prerequisites

Before installing Capybara, ensure that your system meets the following requirements:

Python Version

3.10+
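
A quick way to confirm that the interpreter meets this requirement is a version check like the sketch below. The function name `meets_minimum` is illustrative only and is not part of Capybara.

```python
import sys

MIN_VERSION = (3, 10)  # minimum Python version stated above


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True if `version` (major, minor, ...) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```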

Dependency Packages

Please install the necessary system packages according to your operating system:

Ubuntu

sudo apt install libturbojpeg exiftool ffmpeg libheif-dev poppler-utils

GPU Dependencies

To use ONNX Runtime with GPU acceleration, install CUDA and cuDNN versions that are compatible with your ONNX Runtime release. The supported combinations are listed on the official ONNX Runtime CUDA Execution Provider requirements page.

For example, to install CUDA 12.8 on Ubuntu 24.04:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-8
# After installation, add the CUDA paths to your shell startup file (.bashrc or .zshrc)
shellrc="$HOME/.zshrc"  # "~" does not expand inside quotes, so use $HOME
echo 'export PATH=/usr/local/cuda-12.8/bin${PATH:+:${PATH}}' >> "$shellrc"
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> "$shellrc"

For more details, please see the NVIDIA CUDA documentation.
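
Once CUDA and a GPU build of ONNX Runtime are installed, one way to check that the GPU is actually visible is to ask ONNX Runtime which execution providers it reports. The sketch below assumes the `onnxruntime` package is importable; the helper `choose_providers` is a hypothetical name used for illustration and is not part of Capybara.

```python
def choose_providers(available):
    """Return CUDA first (if reported), then CPU (if reported)."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [name for name in preferred if name in available]


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        available = ort.get_available_providers()
        print("Available:", available)
        print("Would use:", choose_providers(available))
```

If `CUDAExecutionProvider` is missing from the reported list, the installed CUDA or cuDNN version most likely does not match what the ONNX Runtime build expects.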

macOS

brew install jpeg-turbo exiftool ffmpeg libheif poppler
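
After installing the system packages, a short check can confirm that the command-line tools are on `PATH`. This is only a sketch: it assumes the packages above provide the executables `exiftool`, `ffmpeg`, and `pdftoppm` (part of Poppler), and it cannot detect shared libraries such as libturbojpeg or libheif. The name `missing_tools` is illustrative.

```python
import shutil

# Executable names assumed to be installed by the packages listed above.
REQUIRED_TOOLS = ("exiftool", "ffmpeg", "pdftoppm")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("Missing tools:", ", ".join(missing) if missing else "none")
```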

Installation

PyPI

pip install capybara-docsaid

Git

pip install git+https://github.com/DocsaidLab/Capybara.git
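
To confirm that either installation method succeeded, you can query the installed distribution's metadata. The sketch below uses only the standard library and the distribution name `capybara-docsaid` from the PyPI command above; `installed_version` is an illustrative helper, not part of the package.

```python
from importlib import metadata


def installed_version(dist_name="capybara-docsaid"):
    """Return the installed version of `dist_name`, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"capybara-docsaid: {version or 'not installed'}")
```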

Docker for Deployment

We provide a Docker script for convenient deployment, ensuring a consistent environment. Below are the steps to build the image with Capybara installed.

  1. Clone this repository:

    git clone https://github.com/DocsaidLab/Capybara.git
    
  2. Enter the project directory and run the build script:

    cd Capybara
    bash docker/build.bash
    

    This builds an image from the Dockerfile in the project. By default, the image is based on nvidia/cuda:12.8.1-cudnn-runtime-ubuntu24.04, which provides the CUDA environment required for ONNX Runtime inference.

  3. After the build is complete, start a container from the image. The command below does not mount anything; add a -v option if you need access to a local working directory:

    docker run --gpus all -it --rm capybara_docsaid:latest bash
    

Note: If you need to compile CUDA or cuDNN code during development, change the base image to nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04.

gosu Permissions Issues

If files created by scripts inside the container end up owned by root and cause permission problems, you can use gosu in the Dockerfile to switch to a non-root user. Specifying USER_ID and GROUP_ID when starting the container avoids frequent permission adjustments in collaborative development.

For details, refer to the technical documentation: Integrating gosu Configuration

  1. Install gosu:

    RUN apt-get update && apt-get install -y gosu
    
  2. In the container's entrypoint script, use gosu to switch to a non-root user for file read/write operations:

    # Create the entrypoint script (the path must match ENTRYPOINT below)
    ENV ENTRYPOINT_SCRIPT=/entrypoint.sh
    RUN printf '#!/bin/bash\n\
        if [ ! -z "$USER_ID" ] && [ ! -z "$GROUP_ID" ]; then\n\
            groupadd -g "$GROUP_ID" -o usergroup\n\
            useradd --shell /bin/bash -u "$USER_ID" -g "$GROUP_ID" -o -c "" -m user\n\
            export HOME=/home/user\n\
            chown -R "$USER_ID":"$GROUP_ID" /home/user\n\
            chown -R "$USER_ID":"$GROUP_ID" /code\n\
        fi\n\
        \n\
        # Check for parameters\n\
        if [ $# -gt 0 ]; then\n\
            exec gosu ${USER_ID:-0}:${GROUP_ID:-0} python "$@"\n\
        else\n\
            exec gosu ${USER_ID:-0}:${GROUP_ID:-0} bash\n\
        fi' > "$ENTRYPOINT_SCRIPT"
    
    RUN chmod +x "$ENTRYPOINT_SCRIPT"
    
    ENTRYPOINT ["/bin/bash", "/entrypoint.sh"]
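
As a rough illustration of passing USER_ID and GROUP_ID at start-up, the sketch below builds a `docker run` command that forwards the current user's IDs. It assumes a POSIX host, the image name `capybara_docsaid:latest` from the build step, and a working directory mounted at `/code` (the directory the entrypoint script takes ownership of). The function `docker_run_command` is a hypothetical helper, not something the project ships.

```python
import os
import shlex


def docker_run_command(workdir, image="capybara_docsaid:latest", uid=None, gid=None):
    """Build a `docker run` argument list that passes the caller's UID and GID."""
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--gpus", "all", "-it", "--rm",
        "-e", f"USER_ID={uid}",
        "-e", f"GROUP_ID={gid}",
        # Mount the working directory at /code, which the entrypoint chowns.
        "-v", f"{os.path.abspath(workdir)}:/code",
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(docker_run_command(".")))
```

With the entrypoint script above, any arguments placed after the image name are passed to python, and running without arguments opens a bash shell as the requested user.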
    

For more advanced configuration, refer to NVIDIA Container Toolkit and the official docker documentation.

Testing

This project uses pytest for unit testing. To verify that the functions behave correctly in your environment, install pytest and run the test suite from the root of a cloned repository:

pip install pytest
python -m pytest -vv tests
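
If you want to run the same command from a script (for example in a CI step), the sketch below wraps it with `subprocess` and translates pytest's documented exit codes into a short message. The helper names `run_tests` and `describe_exit` are illustrative, and the default path assumes the `tests` directory used above.

```python
import subprocess
import sys

# Meanings of the pytest exit codes most often seen in practice.
EXIT_MEANINGS = {
    0: "all tests passed",
    1: "some tests failed",
    2: "test run was interrupted",
    5: "no tests were collected",
}


def describe_exit(code):
    """Translate a pytest exit code into a short human-readable message."""
    return EXIT_MEANINGS.get(code, f"pytest error (exit code {code})")


def run_tests(path="tests"):
    """Run pytest on `path` with the current interpreter and return its exit code."""
    command = [sys.executable, "-m", "pytest", "-vv", path]
    return subprocess.run(command).returncode
```

Calling `describe_exit(run_tests())` from the repository root runs the suite and returns the matching message.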

When the run finishes, the pytest summary shows whether every module passes. If any tests fail, first check your environment settings and package versions.

If the problem persists, please report it on the project's GitHub Issues page.

Citation

@misc{lin2025capybara,
  author       = {Kun-Hsiang Lin* and Ze Yuan*},
  title        = {Capybara: An Integrated Python Package for Image Processing and Deep Learning},
  year         = {2025},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/DocsaidLab/Capybara}},
  note         = {* equal contribution}
}
