DataEval provides a simple interface to characterize image data and its impact on model performance across classification and object-detection tasks.

Maintainers

aria-ml

DataEval

DataEval analyzes datasets and models to give users the ability to train and test performant, unbiased, and reliable AI models and monitor data for impactful shifts to deployed models.

The dataeval package provides a rigorous and reliable set of tools for developing and analyzing computer vision datasets and the resulting impact on models.

To view our extensive collection of tutorials, how-to guides, explanation guides, and reference material, please visit our documentation on Read the Docs.

Why DataEval?

DataEval addresses the critical need underlying every AI model -- the data. The difference between a great dataset and a poor dataset can have drastic consequences on AI model performance. Data collected in the wild is noisy, often imbalanced, and doesn't always cover the entire spectrum of conditions needed for deployment. DataEval provides AI practitioners with a library of rigorous, algorithm-backed metrics for performance estimation, bias analysis, dataset cleaning and assessment, and detection of data distribution shifts. Throughout all stages of the machine learning lifecycle -- from initial data collection through operational monitoring -- DataEval identifies data problems before they become model failures.

DataEval is easy to install, supports a wide range of Python versions, and is compatible with many of the most popular packages in the scientific and T&E communities.

Target Audience

DataEval is intended to help data scientists, developers, and T&E engineers who want to evaluate and enhance their datasets for optimum performance. For users of the JATI product suite, DataEval has native interoperability when using MAITE-compliant datasets and models.

Getting Started

Python versions: 3.10 - 3.14

Choose your preferred method of installation below or follow our installation guide.

Installing with pip
Installing with conda/mamba
Installing from GitHub

Installing with pip

You can install DataEval directly from pypi.org using the following command.

pip install dataeval
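To confirm that the install succeeded, a short check like the one below can help. It is a minimal sketch that uses only the Python standard library. The helper names python_supported and installed_version are illustrative and not part of DataEval, and the supported range is copied from the "Python versions" line above.

```python
import sys
from importlib import metadata
from typing import Optional

# Supported range copied from the "Python versions: 3.10 - 3.14" line above.
MIN_PY = (3, 10)
MAX_PY = (3, 14)


def python_supported(version: tuple) -> bool:
    """Return True if (major, minor) falls inside the documented range."""
    return MIN_PY <= tuple(version) <= MAX_PY


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported(sys.version_info[:2]))
    print("dataeval version:", installed_version("dataeval"))
```

If the second line prints None, the package is not visible to the interpreter you ran, which usually means a different environment is active.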

Installing with conda

DataEval can be installed in a Conda/Mamba environment using the environment.yaml file provided in the repository. As some dependencies are installed from the pytorch channel, the channel is specified in the example below.

micromamba create -f environment/environment.yaml -c pytorch

Installing from GitHub

To install DataEval from source locally on Ubuntu, pull the source down and change to the DataEval project directory.

git clone https://github.com/aria-ml/dataeval.git
cd dataeval

Using Poetry

Install DataEval.

poetry install

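Enable Poetry's virtual environment. In Poetry 2.x, poetry env activate prints the activation command instead of running it, so evaluate its output (bash/zsh shown).

eval $(poetry env activate)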

Using uv

Install DataEval with dependencies for development.

uv sync

Enable uv's virtual environment.

source .venv/bin/activate

Working with data

DataEval has two input paths depending on which part of the library you are using.

dataeval.core provides stateless functions that operate directly on NumPy arrays — embeddings, labels, image hashes, and statistics. No dataset object is required. Call these functions with arrays and get results back directly. Examples include compute_stats, label_errors, divergence_mst, and ber_knn.

dataeval.quality, dataeval.bias, dataeval.shift, and dataeval.performance provide stateful evaluator classes (Duplicates, Outliers, Prioritize, Balance, drift detectors, and so on). These accept either NumPy arrays or Modular AI Trustworthy Engineering (MAITE)-compliant datasets depending on the evaluator.

If your data is not yet in MAITE format, the sections below show what is required and how to wrap a common format, for both image classification and object detection tasks.

Image classification dataset

A MAITE-compliant image classification dataset implements __len__ and __getitem__, where each item is a tuple of (image, label, metadata). Images must be NumPy arrays of shape (H, W, C). Labels must be one-hot encoded arrays of shape (num_classes,). Metadata must be a DatumMetadata object with at minimum an id field.

import maite.protocols as mp
import maite.protocols.image_classification as ic
import numpy as np


class MyImageClassificationDataset(ic.Dataset):
    metadata: mp.DatasetMetadata

    def __init__(self, images: list[np.ndarray], labels: list[int], num_classes: int) -> None:
        # images: list of np.ndarray, each shape (H, W, C)
        # labels: list of int (class indices)
        self._images = images
        self._labels = labels
        self._num_classes = num_classes

        self.metadata = mp.DatasetMetadata(
            id="my_image_classification_dataset",
            index2label={i: f"class_{i}" for i in range(num_classes)},  # example mapping covering every class
        )

    def __len__(self) -> int:
        return len(self._images)

    def __getitem__(self, idx: int) -> tuple[ic.InputType, ic.TargetType, ic.DatumMetadataType]:
        return (
            self._images[idx],  # np.ndarray (H, W, C)
            np.eye(self._num_classes, dtype=np.float32)[self._labels[idx]],  # np.ndarray (num_classes,)
            ic.DatumMetadataType(id=idx),
        )
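Before handing a dataset to an evaluator, it can be useful to check one item against the requirements listed above. The following is a minimal sketch using only NumPy. The function check_classification_item is a hypothetical helper, not part of DataEval or MAITE, and it assumes the metadata element behaves like a dictionary.

```python
import numpy as np


def check_classification_item(item: tuple, num_classes: int) -> None:
    """Raise ValueError if an item breaks the (image, label, metadata) layout."""
    if len(item) != 3:
        raise ValueError(f"expected a 3-tuple, got {len(item)} elements")
    image, label, metadata = item

    # Image: a 3-D array, laid out as (H, W, C) per the description above.
    image = np.asarray(image)
    if image.ndim != 3:
        raise ValueError(f"image must have shape (H, W, C), got {image.shape}")

    # Label: one-hot vector with exactly one 1 and zeros elsewhere.
    label = np.asarray(label)
    if label.shape != (num_classes,):
        raise ValueError(f"label must have shape ({num_classes},), got {label.shape}")
    if not (np.isin(label, (0, 1)).all() and label.sum() == 1):
        raise ValueError("label must be one-hot: a single 1 and zeros elsewhere")

    # Metadata: assumed to be dictionary-like with at least an "id" key.
    if "id" not in metadata:
        raise ValueError("metadata must contain an 'id' field")


# Example item built the same way as __getitem__ above.
item = (
    np.zeros((32, 32, 3), dtype=np.uint8),
    np.eye(10, dtype=np.float32)[3],
    {"id": 0},
)
check_classification_item(item, num_classes=10)  # passes silently
```

Running the check on the first few items of a new wrapper catches most shape and encoding mistakes early.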

Object detection dataset

A MAITE-compliant object detection dataset follows the same three-tuple structure, but the label element is replaced by a detection target object carrying per-box labels, bounding boxes, and scores. Bounding boxes use (x0, y0, x1, y1) format. Labels and scores are per-box, not per-image.

import maite.protocols as mp
import maite.protocols.object_detection as od
import numpy as np


class DetectionTarget(od.TargetType):
    """Holds per-box labels, boxes, and one-hot scores for one image."""

    def __init__(self, labels: list[int], boxes: list[list[float]], num_classes: int):
        # labels: list of int, one per box
        # boxes:  list of [x0, y0, x1, y1], one per box
        self._labels = labels
        self._boxes = boxes
        self._scores = np.eye(num_classes)[labels]

    @property
    def labels(self) -> mp.ArrayLike:
        return self._labels

    @property
    def boxes(self) -> mp.ArrayLike:
        return self._boxes

    @property
    def scores(self) -> mp.ArrayLike:
        return self._scores


class MyObjectDetectionDataset(od.Dataset):
    def __init__(
        self, images: list[np.ndarray], labels: list[list[int]], boxes: list[list[list[float]]], num_classes: int
    ) -> None:
        # images: list of np.ndarray, each shape (H, W, C)
        # labels: list of list[int] — per-box class indices, one list per image
        # boxes:  list of list[[x0,y0,x1,y1]] — one list per image
        self._images = images
        self._labels = labels
        self._boxes = boxes
        self._num_classes = num_classes

        self.metadata = mp.DatasetMetadata(
            id="my_object_detection_dataset",
            index2label={i: f"class_{i}" for i in range(num_classes)},  # example mapping; labels is ragged, so np.unique(labels) would fail
        )

    def __len__(self) -> int:
        return len(self._images)

    def __getitem__(self, idx: int) -> tuple[od.InputType, od.TargetType, od.DatumMetadataType]:
        return (
            self._images[idx],  # np.ndarray (H, W, C)
            DetectionTarget(self._labels[idx], self._boxes[idx], self._num_classes),
            od.DatumMetadataType(id=idx),
        )
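Annotations are often stored as a top-left corner plus width and height (as in COCO) rather than the (x0, y0, x1, y1) corners described above. The sketch below shows one way to convert and sanity-check boxes with NumPy. The function names are illustrative, and the input format is an assumption about your source data, not something DataEval requires.

```python
import numpy as np


def xywh_to_xyxy(boxes) -> np.ndarray:
    """Convert (x, y, width, height) boxes to (x0, y0, x1, y1)."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    out = boxes.copy()
    out[:, 2] = boxes[:, 0] + boxes[:, 2]  # x1 = x + width
    out[:, 3] = boxes[:, 1] + boxes[:, 3]  # y1 = y + height
    return out


def check_xyxy(boxes) -> bool:
    """Return True if every box has x1 >= x0 and y1 >= y0."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    return bool(np.all(boxes[:, 2] >= boxes[:, 0]) and np.all(boxes[:, 3] >= boxes[:, 1]))


coco_style = np.array([[10, 20, 30, 40], [0, 0, 5, 5]])
xyxy = xywh_to_xyxy(coco_style)
print(xyxy.tolist())  # [[10.0, 20.0, 40.0, 60.0], [0.0, 0.0, 5.0, 5.0]]
print(check_xyxy(xyxy))  # True
```

The converted rows can be passed as the boxes argument of DetectionTarget above.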

Wrapping a PyTorch dataset

If your data is in a PyTorch Dataset, wrap it to conform to the MAITE protocol. Note that torchvision tensors are (C, H, W) — permute to (H, W, C) before passing to DataEval.

import maite.protocols as mp
import maite.protocols.image_classification as ic
import numpy as np
import torch
from torchvision import transforms
from torchvision.datasets import CIFAR10

tv_cifar10 = CIFAR10(root="./data", train=True, download=True, transform=transforms.ToTensor())


class MyCIFAR10Wrapper(ic.Dataset):
    def __init__(self, source: CIFAR10) -> None:
        self._source = source
        self.metadata = mp.DatasetMetadata(
            id="tv_cifar10",
            index2label={
                0: "airplane",
                1: "automobile",
                2: "bird",
                3: "cat",
                4: "deer",
                5: "dog",
                6: "frog",
                7: "horse",
                8: "ship",
                9: "truck",
            },
        )

    def __len__(self) -> int:
        # Use the wrapped dataset rather than the module-level variable.
        return len(self._source)

    def __getitem__(self, idx: int) -> tuple[ic.InputType, ic.TargetType, ic.DatumMetadataType]:
        tv_datum: tuple[torch.Tensor, int] = self._source[idx]
        image = tv_datum[0].permute(1, 2, 0).numpy()  # Permute image from (C, H, W) to (H, W, C)
        label = np.eye(10, dtype=np.float32)[tv_datum[1]]  # Convert label to one-hot encoding
        return image, label, mp.DatumMetadata(id=idx)


dataset: ic.Dataset = MyCIFAR10Wrapper(tv_cifar10)
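If your images are already NumPy arrays in channel-first order, the same axis change can be made without PyTorch. This is a small sketch using np.transpose; chw_to_hwc is an illustrative helper name, not a DataEval function.

```python
import numpy as np


def chw_to_hwc(image: np.ndarray) -> np.ndarray:
    """Move the channel axis from first to last: (C, H, W) -> (H, W, C)."""
    if image.ndim != 3:
        raise ValueError(f"expected a 3-D array, got shape {image.shape}")
    return np.transpose(image, (1, 2, 0))


chw = np.arange(3 * 4 * 5).reshape(3, 4, 5)  # C=3, H=4, W=5
hwc = chw_to_hwc(chw)
print(hwc.shape)  # (4, 5, 3)

# The value at channel c, row h, column w is unchanged; only the index order differs.
assert hwc[1, 2, 0] == chw[0, 1, 2]
```

np.transpose returns a view, so wrap the result in np.ascontiguousarray if downstream code needs contiguous memory.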

Run your first evaluation

The example below uses Duplicates from dataeval.quality to detect near-duplicate images by grouping images whose embeddings lie close together. Duplicates inflate benchmark scores and can cause models to overfit to repeated collection events rather than generalize to new conditions.

from torch.nn import Flatten

from dataeval.extractors import TorchExtractor
from dataeval.flags import ImageStats
from dataeval.quality import Duplicates

# Configure a feature extractor that wraps a PyTorch module.
# Here we use a simple Flatten layer for demonstration, but in practice
# you would use a more powerful model like a pre-trained ResNet or ViT.
extractor = TorchExtractor(Flatten())

# Find near-duplicates using only embedding-based clustering.
# An aggressive cluster_threshold of 1.5 should produce detections
# of near duplicates even with a simple Flatten extractor.
evaluator = Duplicates(
    flags=ImageStats.NONE,
    cluster_algorithm="hdbscan",
    cluster_threshold=1.5,
    extractor=extractor,
    batch_size=64,
)
result = evaluator.evaluate(dataset)

# Near duplicates are grouped into sets of indices that are within
# the specified cluster_threshold in embedding space.
print(result)

shape: (3, 5)
┌──────────┬───────┬──────────┬────────────────┬─────────────┐
│ group_id ┆ level ┆ dup_type ┆ item_indices   ┆ methods     │
│ ---      ┆ ---   ┆ ---      ┆ ---            ┆ ---         │
│ i64      ┆ str   ┆ str      ┆ list[i64]      ┆ list[str]   │
╞══════════╪═══════╪══════════╪════════════════╪═════════════╡
│ 0        ┆ item  ┆ near     ┆ [18586, 39942] ┆ ["cluster"] │
│ 1        ┆ item  ┆ near     ┆ [23157, 31426] ┆ ["cluster"] │
│ 2        ┆ item  ┆ near     ┆ [32024, 49135] ┆ ["cluster"] │
└──────────┴───────┴──────────┴────────────────┴─────────────┘

A result with many large groups is a signal that your dataset contains repeated collection events. Before training, remove all but one sample from each group. See the deduplication how-to guide for a complete walkthrough, including how to choose which sample to keep.
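As a rough illustration of removing all but one sample per group, the sketch below works on plain lists of indices. It assumes you have already pulled the item_indices column out of the result into Python lists, and it keeps the smallest index in each group, which is an arbitrary policy chosen for the example. The helper name indices_to_drop is hypothetical and not part of DataEval.

```python
def indices_to_drop(groups: list) -> list:
    """Keep the smallest index in each group and return the rest, sorted."""
    drop = set()
    for group in groups:
        if not group:
            continue  # nothing to deduplicate in an empty group
        keep = min(group)
        drop.update(i for i in group if i != keep)
    return sorted(drop)


# Groups copied from the item_indices column of the example output above.
groups = [[18586, 39942], [23157, 31426], [32024, 49135]]
drop = indices_to_drop(groups)
print(drop)  # [31426, 39942, 49135]

# Indices that remain in a 50,000-image dataset such as the CIFAR-10 training split.
drop_set = set(drop)
keep = [i for i in range(50_000) if i not in drop_set]
print(len(keep))  # 49997
```

The keep list can then be used to index or subset the original dataset before training.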

Where to go next

Not sure what to evaluate first? Use the Which tool should I use? guide to find the right evaluator for your situation.

Already know which tool to use? Check out the Functional Overview for a quick-reference table of each algorithm's inputs, outputs, and task applicability.

Want to just explore the documentation? The Where to go next page allows you to jump around between the different areas of the documentation with small summaries of what each page covers.

Contact Us

If you have any questions, feel free to reach out to us!

Acknowledgement

CDAO Funding Acknowledgement

This material is based upon work supported by the Chief Digital and Artificial Intelligence Office under Contract No. W519TC-23-9-2033. The views and conclusions contained herein are those of the author(s) and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government.

Release history

1.1.0rc0 pre-release

Jul 2, 2026

This version

1.0.6

Apr 10, 2026

1.0.5

Apr 2, 2026

1.0.4

Mar 24, 2026

1.0.3

Mar 12, 2026

1.0.2

Mar 9, 2026

1.0.1

Mar 9, 2026

1.0.0rc4 pre-release

Feb 23, 2026

1.0.0rc3 pre-release

Feb 17, 2026

1.0.0rc2 pre-release

Feb 16, 2026

1.0.0rc1 pre-release

Feb 16, 2026

1.0.0rc0 pre-release

Feb 6, 2026

0.95.0

Jan 30, 2026

0.94.0

Jan 15, 2026

0.93.1

Oct 22, 2025

0.93.0

Oct 22, 2025

0.92.3

Oct 14, 2025

0.92.2

Oct 7, 2025

0.92.1

Sep 23, 2025

0.92.0

Sep 16, 2025

0.91.3

Sep 9, 2025

0.91.2

Aug 25, 2025

0.91.1

Aug 19, 2025

0.91.0

Aug 12, 2025

0.90.1

Jul 29, 2025

0.90.0

Jul 22, 2025

0.89.1

Jul 15, 2025

0.89.0

Jul 8, 2025

0.88.1

Jul 1, 2025

0.88.0

Jun 24, 2025

0.87.0

Jun 17, 2025

0.86.9

Jun 10, 2025

0.86.8

Jun 3, 2025

0.86.7

May 29, 2025

0.86.6

May 28, 2025

0.86.5

May 26, 2025

0.86.4

May 23, 2025

0.86.3

May 22, 2025

0.86.2

May 21, 2025

0.86.1

May 14, 2025

0.86.0

May 7, 2025

0.85.0

Apr 30, 2025

0.84.1

Apr 23, 2025

0.84.0

Apr 14, 2025

0.83.0

Apr 9, 2025

0.82.1

Apr 2, 2025

0.82.0

Mar 26, 2025

0.81.0

Mar 19, 2025

0.76.1

Jan 29, 2025

0.76.0

Jan 22, 2025

0.75.0

Jan 8, 2025

0.74.2

Dec 25, 2024

0.74.1

Dec 18, 2024

0.74.0

Dec 11, 2024

0.73.1

Dec 4, 2024

0.73.0

Nov 14, 2024

0.72.2

Nov 13, 2024

0.72.1

Nov 6, 2024

0.72.0

Oct 30, 2024

0.71.1

Oct 30, 2024

0.71.0

Oct 24, 2024

0.70.1

Oct 23, 2024

0.70.0

Oct 16, 2024

0.69.4

Oct 9, 2024

0.69.3

Oct 2, 2024

0.69.2

Sep 25, 2024

0.69.1

Sep 18, 2024

0.69.0

Sep 18, 2024

0.68.0

Sep 18, 2024

0.67.0

Sep 14, 2024

0.66.0

Sep 11, 2024

0.65.0

Sep 6, 2024

0.64.0

Aug 30, 2024

0.63.0

Aug 23, 2024

0.61.0

Jul 19, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataeval-1.0.6.tar.gz (258.0 kB)

Uploaded Apr 10, 2026 Source

Built Distribution

dataeval-1.0.6-py3-none-any.whl (317.5 kB)

Uploaded Apr 10, 2026 Python 3

File details

Details for the file dataeval-1.0.6.tar.gz.

File metadata

Download URL: dataeval-1.0.6.tar.gz
Upload date: Apr 10, 2026
Size: 258.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.6 {"installer":{"name":"uv","version":"0.11.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for dataeval-1.0.6.tar.gz
Algorithm	Hash digest
SHA256	`5816dc1cefbcb9b6934064e9db70725100108c64c9316b7f9a4e5e52fbf5b1a6`
MD5	`809b173e1658af8256dba00de823bfe9`
BLAKE2b-256	`83060d70c37e045164792d9682a6c38dd619d39c4d3fee60f124c7dd62163b50`

See more details on using hashes here.

File details

Details for the file dataeval-1.0.6-py3-none-any.whl.

File metadata

Download URL: dataeval-1.0.6-py3-none-any.whl
Upload date: Apr 10, 2026
Size: 317.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.6 {"installer":{"name":"uv","version":"0.11.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for dataeval-1.0.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`30eba1c68837b177897ab9fd861e37720dab0c686fbf46e9c5e4483c86fd573e`
MD5	`9bf326bb68df1043fb7fb766da0c42d2`
BLAKE2b-256	`17479505320e4122264d00b0431f8bd7e8c10679bf7a724e4b4dbdc2e03eb88c`

See more details on using hashes here.
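A downloaded file can be checked against the SHA256 values listed above with a few lines of standard-library Python. This is a minimal sketch: the helper names are illustrative, and it assumes the files were saved to the current working directory under their published names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published value."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 values copied from the hash tables above.
PUBLISHED = {
    "dataeval-1.0.6.tar.gz": "5816dc1cefbcb9b6934064e9db70725100108c64c9316b7f9a4e5e52fbf5b1a6",
    "dataeval-1.0.6-py3-none-any.whl": "30eba1c68837b177897ab9fd861e37720dab0c686fbf46e9c5e4483c86fd573e",
}

for name, expected in PUBLISHED.items():
    path = Path(name)  # assumption: the file sits in the current directory
    if path.exists():
        print(name, "OK" if matches(path, expected) else "MISMATCH")
```

Files that are not present are skipped silently, so the loop is safe to run after downloading only one of the two distributions.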
