
DataEval Workflows container for data evaluation


DataEval Workflows

Workflow orchestration for DataEval with GPU support.

Quick Start

# 1. Build CUDA 11.8 container
docker build -f docker/Dockerfile.cu118 -t dataeval:cu118 .

# 2. Show help
docker run dataeval:cu118

# 3. Run with data and output
docker run --gpus all \
  --mount type=bind,source=/path/to/data,target=/dataeval,readonly \
  --mount type=bind,source=/path/to/output,target=/output \
  dataeval:cu118

Requirements

| Requirement   | Version                  |
|---------------|--------------------------|
| Docker        | >= 20.10                 |
| NVIDIA GPU    | Any (for GPU mode)       |
| NVIDIA Driver | >= 520 (for GPU mode)    |
| CUDA          | 11.8.0 (for GPU mode)    |

Verify GPU Access

docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi

Volume Mounts

| Path      | Mode | Purpose                                              |
|-----------|------|------------------------------------------------------|
| /dataeval | ro   | Data directory: datasets, models, configs (required) |
| /output   | rw   | Results (required)                                   |
| /cache    | rw   | Computation cache (optional)                         |
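For runs that benefit from a persistent cache, the optional /cache mount can be added alongside the two required mounts (the source paths are placeholders):

```shell
docker run --gpus all \
  --mount type=bind,source=/path/to/data,target=/dataeval,readonly \
  --mount type=bind,source=/path/to/output,target=/output \
  --mount type=bind,source=/path/to/cache,target=/cache \
  dataeval:cu118
```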

File Permissions

The container runs as a non-root user (dataeval, UID 1000). The host directories mounted at /output and /cache must be writable by the container process. There are two approaches:

Option 1: Pass your host UID (recommended)

Use --user to run the container as your host user, so mounted directories are naturally writable:

docker run --gpus all \
  --user "$(id -u):$(id -g)" \
  --mount type=bind,source=/path/to/data,target=/dataeval,readonly \
  --mount type=bind,source=/path/to/output,target=/output \
  dataeval:cu118

Option 2: Open directory permissions

Make the output and cache directories world-writable on the host:

chmod 777 /path/to/output /path/to/cache

Then run without --user. This is simpler but less secure.

Custom Data Root

The data root path can be overridden via the DATAEVAL_DATA environment variable:

docker run --gpus all \
  -e DATAEVAL_DATA=/data \
  --mount type=bind,source=/path/to/data,target=/data,readonly \
  --mount type=bind,source=/path/to/output,target=/output \
  dataeval:cu118

Configuration

Config files (YAML or JSON) can be placed anywhere in your data directory. By default, all YAML/JSON files at the root of the data mount are auto-discovered and merged.
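As an illustration, a config file might look like the sketch below. The field names (`dataset`, `workflow`) are hypothetical placeholders, not the tool's documented schema; consult the config builder or TUI for the real fields:

```yaml
# Hypothetical sketch -- field names are illustrative only.
dataset: cifar10_test   # resolved relative to the data root
workflow: drift         # made-up workflow name for illustration
```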

To specify a config path explicitly:

# Config folder within data directory
docker run --gpus all \
  --mount type=bind,source=/path/to/data,target=/dataeval,readonly \
  --mount type=bind,source=/path/to/output,target=/output \
  dataeval:cu118 --config config/

# Single config file
docker run --gpus all \
  --mount type=bind,source=/path/to/data,target=/dataeval,readonly \
  --mount type=bind,source=/path/to/output,target=/output \
  dataeval:cu118 --config params.yaml

Dataset and model paths in config files are resolved relative to the data root (/dataeval by default).
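The resolution rule above can be sketched in a few lines. `resolve_data_path` is a hypothetical helper written for illustration, not part of the dataeval_flow API:

```python
from pathlib import Path

def resolve_data_path(value: str, data_root: Path = Path("/dataeval")) -> Path:
    """Resolve a path from a config file against the data root.

    Illustrates the documented behaviour: relative paths are joined
    to the data root, absolute paths pass through unchanged.
    """
    path = Path(value)
    return path if path.is_absolute() else data_root / path

print(resolve_data_path("cifar10_test"))   # relative -> joined under /dataeval
print(resolve_data_path("/mnt/other/ds"))  # absolute -> unchanged
```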

Dataset Formats

Currently supported dataset structures:

| Format      | Structure                                         | Example       |
|-------------|---------------------------------------------------|---------------|
| Dataset     | Single split, used directly                       | cifar10_test/ |
| DatasetDict | Multiple splits (dict), configured via config YAML | cifar10_full/ |
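Assuming these are Hugging Face datasets written with `save_to_disk` (an assumption based on the `datasets` dependency; the actual on-disk layout this tool expects may differ), the two structures look roughly like:

```
cifar10_test/                 # Dataset: one split, used directly
├── dataset_info.json
├── state.json
└── data-00000-of-00001.arrow

cifar10_full/                 # DatasetDict: one subdirectory per split
├── dataset_dict.json
├── train/
└── test/
```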

CPU Fallback

For machines without NVIDIA GPU:

docker build -f docker/Dockerfile.cpu -t dataeval:cpu .
docker run dataeval:cpu  # Shows help
docker run \
  --mount type=bind,source=/path/to/data,target=/dataeval,readonly \
  --mount type=bind,source=/path/to/output,target=/output \
  dataeval:cpu

CLI Modes

DataEval Flow has three modes:

| Command              | Purpose                                                          |
|----------------------|------------------------------------------------------------------|
| dataeval-flow [opts] | Headless execution, for automation and CI/CD pipelines           |
| dataeval-flow app    | Interactive TUI dashboard: configure, execute, and view results  |
| dataeval-flow config | Simple CLI config builder: create/edit configs without the TUI   |

Interactive TUI (app)

Installation:

uv sync --extra app          # or: pip install dataeval-flow[app]

Usage:

# Launch with a blank config
python -m dataeval_flow app

# Load an existing config for editing
python -m dataeval_flow app --config /path/to/params.yaml

The TUI provides a three-pane dashboard for config editing, task execution, and result viewing. It auto-discovers available torchvision transforms, dataeval selection classes, and workflow types, generating dynamic parameter forms from their schemas.
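The schema-to-form idea can be sketched as follows. This stdlib sketch uses a dataclass where the real tool reads pydantic schemas, and `ResizeParams` is a made-up stand-in for a discovered transform:

```python
import dataclasses

@dataclasses.dataclass
class ResizeParams:
    """Stand-in for a discovered transform's parameter schema."""
    size: int = 224
    antialias: bool = True

def form_fields(schema) -> list:
    """Turn a schema into (name, type, default) rows for a parameter form."""
    return [(f.name, f.type.__name__, f.default)
            for f in dataclasses.fields(schema)]

# Render a dynamic parameter form from the schema.
for name, typ, default in form_fields(ResizeParams):
    print(f"{name} ({typ}) = {default}")
```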

Simple CLI Config Builder (config)

For environments without the TUI dependency:

python -m dataeval_flow config
python -m dataeval_flow config --config /path/to/params.yaml

Configs can be saved as YAML or JSON.

Dependencies

  • dataeval - Core evaluation library
  • datasets - Hugging Face datasets library
  • maite-datasets - MAITE protocol adapter
  • maite - MAITE protocol library
  • pydantic - Structural typing and schema validation

Troubleshooting

Build appears stuck at uv sync

The Docker build may appear frozen during the uv sync step:

=> [builder 7/7] RUN uv sync --frozen --no-dev --no-install-project    1139.3s

This is normal. The step downloads ~2GB of dependencies (PyTorch, scipy, etc.) with no progress indicator.

| Network Speed | Expected Build Time |
|---------------|---------------------|
| 100 Mbps      | ~10 minutes         |
| 30 Mbps       | ~20 minutes         |
| 10 Mbps       | ~45 minutes         |

Tip: First build is slow; subsequent builds use Docker cache and complete in seconds.

Running Without Container

The dataeval_flow package can be used standalone without Docker.

Installation:

git clone https://gitlab.jatic.net/jatic/aria/dataeval-flow.git
cd dataeval-flow
uv sync

CLI Usage:

python -m dataeval_flow --config /path/to/config --output /path/to/output
python -m dataeval_flow --data /path/to/data --output /path/to/output

Python API Usage:

from pathlib import Path
from dataeval_flow import load_config, run_tasks

config = load_config(Path("/path/to/data/config.yaml"))
results = run_tasks(config, data_dir=Path("/path/to/data"))
print(results[0].report())

Development:

uv sync --group dev
nox

License

MIT
