DatasetOps for computer vision datasets
VisionPack
DatasetOps for Computer Vision — a Git/Docker-like CLI for the messy part of training models: turning scattered images and labels into a clean, versioned, leak-free, ready-to-train dataset.
vp init --name factory-defects --task detection
vp sync # pull images + labels from the sources in visionpack.yaml
vp validate # catch corrupt images, bad boxes, near-duplicate leakage
vp split create # deterministic, reproducible train/val/test
vp snapshot create -m "baseline"
vp export --format yolo --split # ready-to-train layout
Why VisionPack
Most of the pain in a CV project isn't the model — it's the dataset. VisionPack targets the failures that quietly cost you accuracy and reproducibility:
- Train/test leakage that inflates your metrics. Exact-duplicate detection is not enough: a re-encoded or resized copy of a training image landing in the test set makes your reported numbers a lie. VisionPack catches near-duplicate leakage with perceptual hashing.
- Splits you can't reproduce. "I shuffled with random.seed(42)" breaks the moment data is added or reordered. VisionPack splits are a function of image content, so they're identical across machines and stable as the dataset grows.
- Data scattered across places. Images in one bucket, labels in another repo, classes in a third. Declare them once and vp sync assembles the dataset.
- "Which dataset trained this model?" Content-addressed snapshots make that answerable instead of a guess.
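To make the "splits are a function of image content" idea concrete, here is a minimal Python sketch. It is an illustration under assumptions, not VisionPack's actual implementation: the helper name `assign_split`, the choice of SHA-256, and the default ratios are all hypothetical.

```python
import hashlib


def assign_split(image_bytes: bytes, train: float = 0.8, val: float = 0.1) -> str:
    """Map an image to a split using only its content (hypothetical helper)."""
    digest = hashlib.sha256(image_bytes).digest()
    # Interpret the first 8 bytes as an integer and scale it into [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    if bucket < train:
        return "train"
    if bucket < train + val:
        return "val"
    return "test"


# The assignment does not depend on file order or on which other files exist.
images = {"a.jpg": b"first image bytes", "b.jpg": b"second image bytes"}
splits = {name: assign_split(data) for name, data in images.items()}
```

Because the result depends only on each image's bytes, adding or reordering files never moves an existing image into a different split, which a seeded shuffle cannot guarantee.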
It's built to complement, not replace, CVAT, FiftyOne, DVC, Roboflow, and Label Studio: VisionPack is the DatasetOps layer that imports, validates, versions, splits, packs, and exports.
Install
VisionPack uses uv. From the repo root:
uv sync
uv run vp --help
Requires Python 3.11+.
Quickstart (60 seconds)
# 1. create a project (the manifest is visionpack.yaml)
uv run vp init --name factory-defects --task detection
# 2. bring in a YOLO dataset
uv run vp import ./raw --format yolo
# 3. check it for real problems
uv run vp validate
# 4. a deterministic, reproducible split
uv run vp split create --train 0.8 --val 0.1 --test 0.1 --strategy stratified
uv run vp split lock
# 5. freeze a reproducible version
uv run vp snapshot create -m "initial import"
# 6. comparable metrics as the dataset grows
uv run vp stats --by split
# 7. a ready-to-train layout
uv run vp export --format yolo --split
Works across the common CV tasks
VisionPack's annotation model carries a tagged geometry, so one tool covers the tasks you actually use:
| Task | Import | Geometry |
|---|---|---|
| Classification | ImageFolder (folder-per-class) | whole-image label |
| Detection | YOLO, COCO | bounding box |
| Instance segmentation | COCO | polygon |
| Keypoints / pose | COCO | keypoints |
# classification from a folder-per-class layout
uv run vp init --name product-grades --task classification
uv run vp import ./train --format imagefolder
uv run vp export --format imagefolder --split # train/val/test/<class>/…
# detection or instance segmentation from COCO
uv run vp init --name cells --task segmentation
uv run vp import ./instances.json --format coco --images ./images
Assemble a dataset from many sources
Images and labels rarely live together. Declare them in visionpack.yaml:
sources:
- name: camera-A
format: yolo
images: ./repoA/images # images here…
labels: ./repoB # …labels in another repo
classes: ./repoB/classes.txt
match: stem # pair by filename (or `relpath` for parallel trees)
copy: ingest
Then reconcile the dataset — idempotently, so re-running only pulls what's new:
uv run vp sync --dry-run # preview: found / matched / unmatched / classes
uv run vp sync # ingest; classes merge by name, provenance recorded
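The `match: stem` pairing and the matched / unmatched counts in the dry run can be pictured with a short Python sketch. This is an assumption about the general technique, not VisionPack's resolver; `pair_by_stem` is a hypothetical name.

```python
from pathlib import PurePosixPath


def pair_by_stem(image_paths, label_paths):
    """Pair images with labels that share a filename stem (hypothetical helper)."""
    labels = {PurePosixPath(p).stem: p for p in label_paths}
    matched, unmatched = [], []
    for img in image_paths:
        stem = PurePosixPath(img).stem
        if stem in labels:
            matched.append((img, labels[stem]))
        else:
            unmatched.append(img)
    return matched, unmatched


matched, unmatched = pair_by_stem(
    ["repoA/images/a.jpg", "repoA/images/b.jpg"],
    ["repoB/a.txt"],
)
# matched   -> [("repoA/images/a.jpg", "repoB/a.txt")]
# unmatched -> ["repoA/images/b.jpg"]
```

Stem matching ignores directories, so two files with the same name in different folders would collide; that is presumably the case the `relpath` mode is meant for.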
Classes from different sources merge by name (YOLO indices are mapped through each source's own class order, never positionally), so reordered class lists don't mislabel your data.
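A small Python sketch of merging by name rather than by position follows. It only illustrates the idea described above; the function `merge_classes` and the class names are hypothetical, and VisionPack's own merge logic may differ.

```python
def merge_classes(global_classes: list[str], source_classes: list[str]) -> dict[int, int]:
    """Return a map from a source's local class index to the merged global index."""
    remap = {}
    for local_idx, name in enumerate(source_classes):
        if name not in global_classes:
            global_classes.append(name)  # new class names are appended once
        remap[local_idx] = global_classes.index(name)
    return remap


merged: list[str] = []
remap_a = merge_classes(merged, ["scratch", "dent"])           # {0: 0, 1: 1}
remap_b = merge_classes(merged, ["dent", "scratch", "crack"])  # {0: 1, 1: 0, 2: 2}
# merged is now ["scratch", "dent", "crack"]
```

Source B lists `dent` first, yet its index 0 still resolves to the same merged class as source A's index 1, which is the property that keeps reordered class lists from mislabeling data.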
A one-off vp import also records what it imported as a source in
visionpack.yaml, so the manifest stays the single source of truth and the data
can be re-pulled later with vp sync (use --no-record for a throwaway import).
What's in the box
- Deterministic, lockable splits: stratified / random / hash (growth-stable), captured in snapshots.
- Near-duplicate & cross-split leakage detection: perceptual-hash tier, no extra dependencies, scale-proof via LSH bucketing; surfaced in vp validate.
- Multi-source sync: a declarative sources: block plus vp sync, with per-asset provenance and a resolver layer ready for remote backends.
- Content-addressed snapshots & diff: reproducible versions; compare any two.
- Strong validation: unreadable images, missing/orphan labels, unknown classes, invalid/out-of-bounds boxes, exact + near duplicates, split leakage.
- Comparable metrics: per-split stats so class balance stays auditable as data grows.
- Packing: archive (.tar.zst, self-contained) and training (split-aware WebDataset shards).
- Interoperable I/O: YOLO, COCO, ImageFolder in and out.
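The near-duplicate check in the list above relies on perceptual hashing. The sketch below shows one common variant, an average hash compared by Hamming distance, applied to a small grayscale grid. Which hash VisionPack uses, its threshold, and how its LSH bucketing works are not described here, so the function names and `max_distance=5` are assumptions.

```python
def average_hash(gray: list[list[int]]) -> int:
    """Average hash of a small grayscale grid (e.g. an image downscaled to 8x8)."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        # One bit per pixel: 1 if brighter than the mean, else 0.
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a, b, max_distance: int = 5) -> bool:
    return hamming(average_hash(a), average_hash(b)) <= max_distance


original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
brighter = [[p + 2 for p in row] for row in original]   # mimics a re-encode
inverted = [[252 - p for p in row] for row in original]  # a different image
```

A uniform brightness shift leaves every pixel on the same side of the mean, so `original` and `brighter` hash identically even though their bytes differ, which is exactly the case an exact-duplicate check misses.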
Full command reference and per-command options live in the usage guide.
Release process
Releases are prepared locally, reviewed as a GitHub Release draft, and published to PyPI only when the GitHub Release is published.
.\scripts\prepare-release.ps1 0.1.1
git push origin HEAD
git push origin v0.1.1
gh release create v0.1.1 --draft --title "v0.1.1" --notes-file CHANGELOG.md
Review the draft release notes in GitHub. Publishing the release triggers
.github/workflows/publish.yml, which builds the package, validates the
artifacts with twine check, and publishes to PyPI through Trusted Publishing.
Use -NoCommit -NoTag to update files and run the checks without creating the
release commit or tag:
.\scripts\prepare-release.ps1 0.1.1 -NoCommit -NoTag
How it works
VisionPack is manifest-driven and content-addressed: visionpack.yaml
declares the dataset, raw images are stored once by sha256 (immutable), and
annotations / splits / snapshots are the versioned layer on top. The truth is the
manifest + index, never "a folder with the right name".
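Storing raw images "once by sha256" is the usual content-addressed pattern. A minimal Python sketch follows, assuming a hypothetical `store_blob` helper and a two-character fan-out directory layout; VisionPack's real on-disk layout is not specified here.

```python
import hashlib
from pathlib import Path


def store_blob(store_dir: Path, data: bytes) -> str:
    """Write data under its SHA-256 digest and return the digest (hypothetical helper)."""
    digest = hashlib.sha256(data).hexdigest()
    target = store_dir / digest[:2] / digest  # assumed layout, e.g. ab/abcdef...
    if not target.exists():
        # Identical content maps to the same path, so it is written only once.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return digest
```

Since the path is derived from the bytes, a stored file never needs to change, and a snapshot can refer to images by digest instead of by folder or filename.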
For the design principles, module map, data model, subsystems, and the full roadmap, see ARCHITECTURE.md. For the original product vision, see docs/DESIGN.md.
Status
VisionPack is in active development (early but usable). The core workflow — multi-source ingestion → validation → deterministic splits → snapshots → ready-to-train export/packing — works end-to-end across classification, detection, instance segmentation, and keypoints, with 55 passing tests. APIs may still shift; feedback and contributions are welcome.
uv run python -m unittest discover -s tests -q