Library for easier access and research of wildlife re-identification datasets
Wildlife Re-Identification (Re-ID) Datasets
The aim of the project is to provide a comprehensive overview of datasets for wildlife individual re-identification and an easy-to-use package for developers of machine learning methods. The core functionality includes:
- overview of 31 publicly available wildlife re-identification datasets.
- utilities to mass download and convert them into a unified format.
- default splits for several machine learning tasks and the ability to create additional splits.
- evaluation metrics for closed-set and open-set classification.
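The difference between closed-set and open-set tasks can be illustrated with a minimal sketch in plain Python. This is not the package's split API: the helpers `closed_set_split` and `open_set_split` are hypothetical, and the identities are made up. In a closed-set split every individual in the test set also appears in the training set, while in an open-set split some individuals are seen only at test time.

```python
import random
from collections import defaultdict


def closed_set_split(identities, test_ratio=0.25, seed=0):
    """Split sample indices so that every identity appears in the training set.

    Hypothetical helper for illustration; not part of wildlife-datasets.
    """
    rng = random.Random(seed)
    by_identity = defaultdict(list)
    for index, identity in enumerate(identities):
        by_identity[identity].append(index)

    train, test = [], []
    for indices in by_identity.values():
        rng.shuffle(indices)
        # Keep at least one sample per identity in the training set.
        n_test = min(int(len(indices) * test_ratio), len(indices) - 1)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def open_set_split(identities, unknown_identities, test_ratio=0.25, seed=0):
    """Like closed_set_split, but the given identities go only to the test set."""
    unknown = set(unknown_identities)
    known_idx = [i for i, name in enumerate(identities) if name not in unknown]
    unknown_idx = [i for i, name in enumerate(identities) if name in unknown]
    # Split the known individuals, then map local indices back to global ones.
    train_local, test_local = closed_set_split(
        [identities[i] for i in known_idx], test_ratio, seed
    )
    train = [known_idx[i] for i in train_local]
    test = [known_idx[i] for i in test_local] + unknown_idx
    return sorted(train), sorted(test)


# Made-up identities: three individuals with four images each.
identities = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
train, test = closed_set_split(identities)
open_train, open_test = open_set_split(identities, unknown_identities=["c"])
```

Here individual "c" never enters `open_train`, so a model evaluated on `open_test` has to recognise it as a new individual rather than assign it to "a" or "b".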
Summary of datasets
The package is able to handle the following datasets. We include basic characteristics such as publication year, number of images, number of individuals and dataset time span (the difference between the last and the first image taken), as well as additional information such as source, number of poses, inclusion of timestamps, whether the animals were captured in the wild and whether the dataset contains multiple species.
A graphical summary of the datasets is available in a Jupyter notebook. Due to its size, it may be necessary to view it via nbviewer.
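The time span characteristic described above is the difference between the last and the first image taken. A minimal sketch of that computation, using only the standard library and made-up timestamps (the helper `time_span_days` is hypothetical and not part of the package):

```python
from datetime import datetime


def time_span_days(timestamps):
    """Return the number of whole days between the earliest and latest timestamp."""
    parsed = [datetime.fromisoformat(t) for t in timestamps]
    return (max(parsed) - min(parsed)).days


# Made-up timestamps for illustration only; the order does not matter.
timestamps = ["2014-03-01 10:15:00", "2014-01-20 08:00:00", "2015-02-10 17:30:00"]
span = time_span_days(timestamps)
```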
Installation
The package can be installed with pip:
pip install wildlife-datasets
Basic functionality
We show an example of downloading, extracting and processing the MacaqueFaces dataset.
from wildlife_datasets import analysis, datasets
datasets.MacaqueFaces.get_data('data/MacaqueFaces')
dataset = datasets.MacaqueFaces('data/MacaqueFaces')
The dataset object contains a summary of the dataset, whose content depends on the dataset. Each dataset contains the identities and the paths to images. This particular dataset also contains information about the date taken and contrast. Other datasets store information about bounding boxes, segmentation masks, the position from which the image was taken, keypoints or various other information such as age or gender.
dataset.df
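Once the summary is loaded, ordinary pandas operations can be used to explore it. The sketch below assumes that the summary is a pandas DataFrame with `identity` and `path` columns; these column names are an assumption based on the description above, and the data is a synthetic stand-in so that the example runs without downloading anything.

```python
import pandas as pd

# Synthetic stand-in for dataset.df; the column names are assumptions.
df = pd.DataFrame(
    {
        "identity": ["Dan", "Dan", "Kay", "Kay", "Kay", "Max"],
        "path": [f"images/{i}.jpg" for i in range(6)],
    }
)

# Number of images per individual, most frequent first.
images_per_identity = df["identity"].value_counts()

# Individuals with a single image cannot appear in both a training and a test set.
singletons = images_per_identity[images_per_identity == 1].index.tolist()
```

Checking for such single-image individuals before splitting is useful because they can only serve as training samples or as previously unseen individuals.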
The dataset also contains basic metadata, including the number of individuals, time span, licences and publication year.
dataset.metadata
This particular dataset already contains cropped images of faces. Other datasets may contain uncropped images with bounding boxes or even segmentation masks.
analysis.plot_grid(dataset.df, 'data/MacaqueFaces')
Additional functionality
For additional functionality, including mass loading, dataset splitting and evaluation metrics, see the documentation.
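To give a rough idea of what closed-set and open-set evaluation measure, here is a minimal sketch in plain Python. It does not reproduce the package's metric functions: the helpers `accuracy` and `open_set_accuracies` and the `"unknown"` label are hypothetical choices made for this illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted identity matches the true one."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def open_set_accuracies(y_true, y_pred, known_identities, unknown_label="unknown"):
    """Accuracy computed separately on known and on new individuals.

    A new individual counts as correct only if it is predicted as unknown_label.
    Returns None for a group that has no samples.
    """
    known = set(known_identities)
    known_hits = [t == p for t, p in zip(y_true, y_pred) if t in known]
    new_hits = [p == unknown_label for t, p in zip(y_true, y_pred) if t not in known]
    return {
        "known": sum(known_hits) / len(known_hits) if known_hits else None,
        "new": sum(new_hits) / len(new_hits) if new_hits else None,
    }


# Made-up labels: individual "c" was not present in the training set.
y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "unknown"]

closed = accuracy(y_true, y_pred)
result = open_set_accuracies(y_true, y_pred, known_identities=["a", "b"])
```

Plain accuracy penalises the prediction for "c" because the label "unknown" never matches a true identity, whereas the open-set variant treats flagging a new individual as a correct outcome.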
Hashes for wildlife_datasets-0.3.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 4c7fad48a71d4ac4801df978a8f594adffde91f5904b68a4cd7d66386215bb07
MD5 | ae7e9fc4e5069776525d7943b6a189a9
BLAKE2b-256 | 2812cae4c92cded1afb1c42c640a907bfa78132020128ab300c07b1f725a0360