
A library to preprocess image data.


Paidiverpy


Paidiverpy is a Python package designed to create pipelines for preprocessing image data for biodiversity analysis.

IMPORTANT: This package is still in active development, and frequent updates and changes are expected. The API and features may evolve as we continue improving it.

Documentation

The official documentation is hosted on ReadTheDocs.org: https://paidiverpy.readthedocs.io/

IMPORTANT: Comprehensive documentation is under construction.

Installation

To install paidiverpy, run:

pip install paidiverpy

Build from Source

  1. Clone the repository:

    # https
    git clone https://github.com/paidiver/paidiverpy.git
    cd paidiverpy
  2. (Optional) Create a Python virtual environment to manage dependencies separately from other projects. For example, using conda:

    conda env create -f environment.yml
    conda activate Paidiverpy
  3. Install the paidiverpy package:

    pip install -e .

Usage

You can run your preprocessing pipeline using Paidiverpy in several ways, typically requiring just one to three lines of code:

Python Package

Install the package and utilize it in your Python scripts.

# Import the Pipeline class
from paidiverpy.pipeline import Pipeline

# Instantiate the Pipeline class with the configuration file path
# Please refer to the documentation for the configuration file format
pipeline = Pipeline(config_file_path="../examples/config_files/config_simple2.yml")

# Run the pipeline
pipeline.run()

You can export the output images to the specified output directory:

pipeline.save_images(image_format="png")
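
The calls above can be wrapped in a small helper. This is a minimal sketch, not part of the Paidiverpy API: the function name run_and_save and the upfront file check are assumptions made for illustration. Only Pipeline, run() and save_images(image_format=...) come from the usage shown above.

```python
from pathlib import Path


def run_and_save(config_file_path, image_format="png"):
    """Run a pipeline from a config file and export its images.

    Hypothetical helper: only Pipeline, run() and save_images() are taken
    from the usage shown in this README.
    """
    config = Path(config_file_path)
    # Fail early with a clear message instead of deep inside the pipeline.
    if not config.is_file():
        raise FileNotFoundError(f"Configuration file not found: {config}")

    # Imported lazily so the path check works even without the package.
    from paidiverpy.pipeline import Pipeline

    pipeline = Pipeline(config_file_path=str(config))
    pipeline.run()
    pipeline.save_images(image_format=image_format)
    return pipeline
```

Calling run_and_save("../examples/config_files/config_simple2.yml") would then run the same steps as the snippet above, provided the file exists at that relative path.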

Command Line Interface (CLI)

Pipelines can be executed via command-line arguments. For example:

paidiverpy -c "examples/config_files/config_simple.yml"

This runs the pipeline described in the configuration file and saves the output images to the directory set by output_path in that file.
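
If you need to launch the CLI from another Python program, the command above can be invoked with the standard library. The sketch below assumes only the paidiverpy -c invocation shown above. The helper names build_cli_command and run_cli are hypothetical.

```python
import shutil
import subprocess


def build_cli_command(config_file):
    """Return the argument list for `paidiverpy -c <config_file>`."""
    return ["paidiverpy", "-c", str(config_file)]


def run_cli(config_file):
    """Run the CLI and raise if it is missing or exits with an error."""
    if shutil.which("paidiverpy") is None:
        raise RuntimeError(
            "The paidiverpy executable is not on PATH; is the package installed?"
        )
    # check=True raises CalledProcessError on a non-zero exit code.
    return subprocess.run(build_cli_command(config_file), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the configuration path contains spaces.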

Docker

You can run Paidiverpy with Docker by pulling a pre-built image from GitHub Container Registry (GHCR) or Docker Hub. For example, to pull from GHCR and tag the image locally as paidiverpy:

docker pull ghcr.io/paidiver/paidiverpy:latest
docker tag ghcr.io/paidiver/paidiverpy:latest paidiverpy:latest

To run the container, use the following command:

docker run --rm \
  -v <INPUT_PATH>:/app/input/ \
  -v <OUTPUT_PATH>:/app/output/ \
  -v <METADATA_PATH>:/app/metadata/ \
  -v <CONFIG_DIR>:/app/config_files/ \
  paidiverpy -c /app/examples/config_files/<CONFIG_FILE>
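
Filling in the four placeholders by hand is error-prone, so the command can also be assembled programmatically. The following is a sketch under stated assumptions: the function build_docker_command and its parameter names are hypothetical, and the container-side paths are copied unchanged from the command above.

```python
import os


def build_docker_command(input_path, output_path, metadata_path, config_dir,
                         config_file, image="paidiverpy"):
    """Assemble the `docker run` argument list shown in this README."""
    mounts = [
        (input_path, "/app/input/"),
        (output_path, "/app/output/"),
        (metadata_path, "/app/metadata/"),
        (config_dir, "/app/config_files/"),
    ]
    command = ["docker", "run", "--rm"]
    for host_dir, container_dir in mounts:
        # Bind mounts need absolute host paths.
        command += ["-v", f"{os.path.abspath(host_dir)}:{container_dir}"]
    # Container-side config path copied from the command above.
    command += [image, "-c", f"/app/examples/config_files/{config_file}"]
    return command
```

The resulting list can be passed to subprocess.run(command, check=True) to start the container.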

Example Data

If you’d like to manually download example data for testing, you can use the following Python code, where DATASET_NAME is one of the dataset names listed below:

from paidiverpy.utils.data import PaidiverpyData

PaidiverpyData().load(DATASET_NAME)

Available datasets:

  • plankton_csv: Plankton dataset with CSV file metadata

  • benthic_csv: Benthic dataset with CSV file metadata

  • benthic_ifdo: Benthic dataset with iFDO metadata

  • nef_raw: Sample images in NEF format (raw images) with CSV file metadata

  • benthic_raw_images: Benthic dataset in raw format with CSV file metadata

Example data will be automatically downloaded when running the example notebooks.
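
To guard against typos in the dataset name, the download can be wrapped in a small validating helper. This is a minimal sketch: the tuple AVAILABLE_DATASETS and the function load_example_dataset are hypothetical names, and passing the dataset name as a string is an assumption. PaidiverpyData().load(...) is the only call taken from this README, and the helper returns whatever load() returns.

```python
# Dataset names copied from the list above.
AVAILABLE_DATASETS = (
    "plankton_csv",
    "benthic_csv",
    "benthic_ifdo",
    "nef_raw",
    "benthic_raw_images",
)


def load_example_dataset(dataset_name):
    """Download one example dataset after validating its name."""
    if dataset_name not in AVAILABLE_DATASETS:
        raise ValueError(
            f"Unknown dataset {dataset_name!r}; "
            f"choose one of: {', '.join(AVAILABLE_DATASETS)}"
        )
    # Imported lazily so the name check works even without the package.
    from paidiverpy.utils.data import PaidiverpyData

    # Assumption: load() accepts the dataset name as a string.
    return PaidiverpyData().load(dataset_name)
```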

IMPORTANT: Please check the documentation for more information about Paidiverpy: https://paidiverpy.readthedocs.io/

Contributing to paidiverpy

Want to support or improve paidiverpy? Check out our contribution guide to learn how to get started.

Acknowledgements

This project was supported by the UK Natural Environment Research Council (NERC) through the Tools for automating image analysis for biodiversity monitoring (AIAB) Funding Opportunity, reference code UKRI052.
