Vet your robot datasets: diagnose, repair, and quality-score LeRobot-format episode data before you waste a training run.
robovet
Vet your robot data. Diagnose, repair, and quality-score LeRobot-format datasets — before you waste a training run on broken episodes.
$ robovet doctor ./my_dataset
FAIL DATA-104 1 episode where metadata 'length' disagrees with the parquet
row count — the classic signature of a corrupted episode map.
FAIL STATS-302 1 stat block disagrees with the actual data — every training
run normalizes with these numbers.
WARN TIME-202 Loading this dataset requires tolerance_s ≥ 7.7e-03
(77× the default). Worst: episode 2, 7.29 ms off the grid.
3 fail · 4 warn · 22 pass
UNSAFE TO TRAIN — fix the FAILs first. (exit code 1 — CI-gate it)
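To make the STATS-302 line above concrete: a check of this kind boils down to recomputing statistics from the data and comparing them with what the metadata claims. The following is a minimal sketch under assumptions, not robovet's implementation. `stat_mismatches`, the dict layouts, and the tolerances are all hypothetical, and real LeRobot stats are per-feature arrays rather than the scalars used here.

```python
import math


def stat_mismatches(columns, stored, rel_tol=1e-3):
    """Compare stored mean/std against values recomputed from the data.

    columns: {name: [float, ...]}  (hypothetical, simplified layout)
    stored:  {name: {"mean": float, "std": float}}  (hypothetical layout)
    Returns a list of (name, stat, claimed, actual) tuples that disagree.
    """
    bad = []
    for name, values in columns.items():
        mean = sum(values) / len(values)
        # Population variance; a real tool must match the convention
        # the dataset writer used.
        var = sum((v - mean) ** 2 for v in values) / len(values)
        actual = {"mean": mean, "std": math.sqrt(var)}
        for key, value in actual.items():
            claimed = stored[name][key]
            if not math.isclose(claimed, value, rel_tol=rel_tol, abs_tol=1e-9):
                bad.append((name, key, claimed, value))
    return bad
```

An empty result means the stored block agrees with the data within tolerance; any entry is a candidate for the "normalization poison" class of failure.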
Why this exists
Robot learning's bottleneck moved from models to data, and the data is quietly
broken. An April 2026 audit of 10 popular open-source robot datasets found
floating-point drift that breaks video decoding after ~45 episodes, a
v2.1→v3.0 conversion bug that silently corrupts episode↔frame mapping
(your run "works" — the policy just learns from jumbled sequences), datasets
that only load with tolerance_s set to 100× the default, and no quality
metrics anywhere. Hugging Face's own community-dataset cleaning run tells the
same story: 111 of 240 datasets failed validation — and that pipeline is
internal, not something you can run on yours. Meanwhile the 2026 consensus is that a well-curated
500-demo fine-tune beats a poorly-curated one at 10× the scale — curation
tooling is the gap, not model size.
Every check in robovet maps to a documented, real-world failure. The receipts — issue numbers, audit findings, papers — live in PAIN.md.
Try it in 30 seconds (no robot required)
pip install robovet[video]
robovet demo ./demo # synthetic SO-100-style dataset, 10 real-world
# defect classes injected (each tagged with the
# GitHub issue it reproduces)
robovet demo ./demo3 --v3 # same idea in the v3.0 shared-file layout
robovet doctor ./demo # catches all of them; exit 1
robovet fix ./demo --apply # repairs the metadata class; .bak backups
robovet doctor ./demo # metadata FAILs gone
robovet report ./demo -o report.html # one self-contained, shareable page
robovet demo ./demo --clean builds the same dataset with zero defects, so
you can see what all-green looks like.
What it checks
| Group | Catches | Maps to |
|---|---|---|
| STRUCT-0xx | missing/invalid metadata, dangling episodes, orphan files | lerobot#761 (no validator for hand-rolled conversions) |
| DATA-1xx | episode↔frame mapping corruption, schema drift, NaN/Inf, dead dims | lerobot#2401 (silent v2.1→v3.0 corruption) |
| TIME-2xx | off-grid timestamps with the exact tolerance_s you'd need, non-monotonic time, cumulative FP drift | lerobot#933, lerobot#3177 |
| STATS-3xx | stored normalization stats that disagree with the data ("normalization poison"), broken quantile stats (q01/q99) | HF docs warning; phospho repair post; lerobot#2189 |
| VIDEO-4xx | video/parquet frame-count desync (including per-episode windows inside shared v3 files); codec-aware compatibility tiers (h264 ✓ / AV1 info, since it's lerobot's own default / mpeg4-hevc warn); fps mismatch | Correll-lab postmortem; phospho notes |
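The TIME-2xx checks report the tolerance_s a dataset would need in order to load. One plausible way to arrive at such a number is to measure how far each timestamp sits from the nearest point on the k / fps frame grid. This is a sketch under that assumption: `required_tolerance_s` is a hypothetical helper, and robovet or lerobot may measure the deviation differently (for example on consecutive-frame deltas).

```python
def required_tolerance_s(timestamps, fps):
    """Largest distance, in seconds, between a timestamp and its nearest
    frame-grid point k / fps. Hypothetical helper for illustration."""
    worst = 0.0
    for t in timestamps:
        nearest = round(t * fps) / fps  # closest grid point to t
        worst = max(worst, abs(t - nearest))
    return worst
```

Dividing the result by the loader's default tolerance gives a multiplier of the kind shown in the doctor output ("77× the default").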
robovet doctor exits 1 on any FAIL and takes --json, so it drops
straight into CI: gate dataset merges the way Codecov gates coverage.
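A CI gate can be as small as a wrapper around the exit code. The sketch below relies only on what is stated above (the `robovet doctor` command, its `--json` flag, and a non-zero exit on FAIL). `doctor_gate` is a hypothetical helper, and the JSON payload is passed through unparsed because its schema is not described here.

```python
import subprocess


def doctor_gate(dataset_path, run=subprocess.run):
    """Run `robovet doctor <path> --json` and return (ok, raw_json_text).

    `run` is injectable so the gate can be exercised without robovet
    installed; by default it is subprocess.run.
    """
    proc = run(
        ["robovet", "doctor", str(dataset_path), "--json"],
        capture_output=True,
        text=True,
    )
    # Any non-zero exit is treated as "do not merge".
    return proc.returncode == 0, proc.stdout
```

In a pipeline step, call `doctor_gate("./my_dataset")`, archive the returned text as a build artifact, and exit non-zero when `ok` is false.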
Quality scoring (triage, not truth)
robovet score ./my_dataset --csv scores.csv
Per-episode signals, all computed in one pass: jerk smoothness, idle ratio, gripper chatter, duration outliers, action saturation, exact duplicates. This is deliberately the cheap first pass — the smoothness-first approach the 2026 curation literature (rinse, Demo-SCORE, QoQ) argues should precede expensive policy-rollout or influence-function filtering. Scores put the worst episodes in front of a human in seconds; review before you delete. Statistical flags carry practical-significance guards, so homogeneous datasets don't self-flag.
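For intuition, two of these signals can be approximated in a few lines. This is an illustrative sketch, not robovet's scoring code: the function names, the finite-difference jerk estimate, and the `eps` threshold are assumptions, and it handles a single joint trajectory rather than a full action vector.

```python
def mean_squared_jerk(positions, fps):
    """Mean squared third finite difference of one joint trajectory.

    Lower values mean smoother motion. Returns 0.0 when there are
    fewer than four samples.
    """
    dt = 1.0 / fps
    jerks = [
        (positions[i + 3] - 3 * positions[i + 2] + 3 * positions[i + 1] - positions[i])
        / dt**3
        for i in range(len(positions) - 3)
    ]
    return sum(j * j for j in jerks) / len(jerks) if jerks else 0.0


def idle_ratio(positions, eps=1e-3):
    """Fraction of consecutive steps where the joint moves less than eps."""
    steps = [abs(b - a) for a, b in zip(positions, positions[1:])]
    return sum(s < eps for s in steps) / len(steps) if steps else 0.0
```

Both are cheap enough to compute for every episode in one pass, which is the point of a triage score: rank first, then have a human look at the tail.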
Repair contract
robovet fix is dry-run by default. With --apply it rewrites only
metadata (episode lengths, normalization stats, info.json counters), backs up
every touched file as .bak, never modifies parquet or video payloads, and
preserves everything it doesn't understand: quantile keys (q01/q99 — the
v3 QUANTILES-normalization era), image-stat blocks, and unknown episode fields
such as tags. A repair tool must never be the thing that deletes your data;
the test suite enforces these guarantees. Frame surgery
(tail-trimming desynced episodes, timestamp re-gridding) is on the
roadmap under the same contract.
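The shape of that contract (dry-run first, back up before writing, touch only the field being repaired) can be sketched as follows. This is not robovet's code: `repair_lengths` is a hypothetical helper, and the JSON Lines layout with `episode_index` and `length` fields is an assumption made for the example.

```python
import json
import shutil
from pathlib import Path


def repair_lengths(episodes_jsonl, true_lengths, apply=False):
    """Plan (and optionally write) fixes to per-episode 'length' fields.

    true_lengths: {episode_index: row count measured from the data}
    Returns a list of (episode_index, old_length, new_length) changes.
    Nothing is written unless apply=True.
    """
    path = Path(episodes_jsonl)
    records = [json.loads(line) for line in path.read_text().splitlines() if line.strip()]
    changes = []
    for rec in records:
        want = true_lengths.get(rec["episode_index"])
        if want is not None and rec.get("length") != want:
            changes.append((rec["episode_index"], rec.get("length"), want))
            rec["length"] = want  # only this field changes; unknown keys survive
    if apply and changes:
        # Back up the original next to it before overwriting.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        path.write_text("".join(json.dumps(r) + "\n" for r in records))
    return changes
```

Calling it without `apply=True` returns the planned changes and leaves the file untouched, which mirrors the dry-run default described above.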
Scope, honestly
- LeRobot v2.0 / v2.1 and v3.x are both first-class for diagnosis: each has its own synthetic fixture and test suite, and v3 gets per-episode video alignment inside shared files (VIDEO-405) plus per-episode stats checks parsed from the v3 metadata. robovet fix currently rewrites v2.x episode metadata and global stats; v3 per-episode stats regeneration is on the roadmap.
- robovet does not merge/split/delete episodes; lerobot ships that natively now. We do what the official stack doesn't: deep validation, metadata repair, and quality triage.
- Local-first by design. Your data never leaves your disk. Deployment-specific data is a competitive asset; treat it like one.
Library use
from robovet import load_dataset, run_doctor, score_dataset
ds = load_dataset("./my_dataset")
rep = run_doctor(ds) # rep.exit_code, rep.results, rep.counts
sc = score_dataset(ds, scan=rep.scan) # reuses the same single IO pass
Apache-2.0. Issues and broken-dataset war stories very welcome — if your dataset breaks in a way robovet doesn't catch, that's a bug report we want.