Vet your robot datasets: diagnose, repair, and quality-score LeRobot-format episode data before you waste a training run.
robovet
Vet your robot data. Diagnose, repair, and quality-score LeRobot-format datasets — before you waste a training run on broken episodes.
$ robovet doctor ./my_dataset
FAIL DATA-104 1 episode where metadata 'length' disagrees with the parquet
row count — the classic signature of a corrupted episode map.
FAIL STATS-302 1 stat block disagrees with the actual data — every training
run normalizes with these numbers.
WARN TIME-202 Loading this dataset requires tolerance_s ≥ 7.7e-03
(77× the default). Worst: episode 2, 7.29 ms off the grid.
FAIL META-502 Σ episode lengths = 1086 but info.json total_frames = 1037 —
the metadata contradicts itself before a single file is read.
5 fail · 4 warn · 23 pass
UNSAFE TO TRAIN — fix the FAILs first. (exit code 1 — CI-gate it)
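The META-502 line is the cheapest of these checks to reason about, because both numbers live in metadata. Below is a minimal sketch of that comparison, assuming episode records carry a `length` field and `info.json` carries `total_frames` (both named in the output above). The function name and the in-memory dict shapes are illustrative assumptions, not robovet's internals.

```python
def ledger_mismatch(episodes, info):
    """Return (sum_of_lengths, total_frames) if they disagree, else None."""
    summed = sum(ep["length"] for ep in episodes)
    declared = info["total_frames"]
    return None if summed == declared else (summed, declared)


# Hypothetical metadata, shaped to reproduce the numbers in the output above.
episodes = [
    {"episode_index": 0, "length": 500},
    {"episode_index": 1, "length": 537},
    {"episode_index": 2, "length": 49},
]
info = {"total_frames": 1037}

mismatch = ledger_mismatch(episodes, info)
if mismatch:
    print(f"sum of episode lengths = {mismatch[0]} but total_frames = {mismatch[1]}")
```

Nothing here opens a parquet or video file, which is why this class of defect can be caught before any payload is read.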
Why this exists
Robot learning's bottleneck moved from models to data, and the data is quietly
broken. An April 2026 audit of 10 popular open-source robot datasets found
floating-point drift that breaks video decoding after ~45 episodes, a
v2.1→v3.0 conversion bug that silently corrupts episode↔frame mapping
(your run "works" — the policy just learns from jumbled sequences), datasets
that only load with tolerance_s set to 100× the default, and no quality
metrics anywhere. Hugging Face's own community-dataset cleaning run tells the
same story: 111 of 240 datasets failed validation — and that pipeline is
internal, not something you can run on yours. Meanwhile the 2026 consensus is that a well-curated
500-demo fine-tune beats a poorly-curated one at 10× the scale — curation
tooling is the gap, not model size.
Every check in robovet maps to a documented, real-world failure. The receipts — issue numbers, audit findings, papers — live in PAIN.md.
Try it in 30 seconds (no robot required)
pip install "robovet[video]"
robovet demo ./demo # synthetic SO-100-style dataset, 10 real-world
# defect classes injected (each tagged with the
# GitHub issue it reproduces)
robovet demo ./demo3 --v3 # same idea in the v3.0 shared-file layout
robovet doctor ./demo # catches all of them; exit 1
robovet fix ./demo --apply # repairs the metadata class; .bak backups
robovet doctor ./demo # metadata FAILs gone
robovet report ./demo -o report.html # one self-contained, shareable page
robovet demo ./demo --clean builds the same dataset with zero defects, so
you can see what all-green looks like.
Vet a Hub dataset before downloading it
pip install "robovet[hub]"
robovet doctor hf://lerobot/svla_so100_pickplace
Fetches only meta/ (a few MB), then runs every metadata-level check:
structure, stats sanity, and the new META-5xx ledger cross-checks: episode↔frame
index math, Σlengths vs counters, per-episode stats freshness, and video
time windows. The #2401 corruption class is visible from metadata alone, so
you find out a 4 GB dataset is broken before spending the bandwidth. Honest
scope: the verdict says META CLEAN, never CLEAN — values, timestamps and
video decode still need the full local doctor. --meta-only works on local
paths too (instant pre-check on slow disks).
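To see roughly what a meta-only fetch involves, here is a sketch that does it by hand, assuming the `huggingface_hub` package is installed. Its `snapshot_download` function accepts `allow_patterns`, so only files under `meta/` are transferred. How robovet performs the fetch internally is not stated here; `parse_hf_uri` and `fetch_meta_only` are hypothetical helper names.

```python
def parse_hf_uri(uri: str) -> str:
    """Turn 'hf://org/name' into the Hub repo id 'org/name'."""
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// URI: {uri!r}")
    repo_id = uri[len(prefix):].strip("/")
    if repo_id.count("/") != 1:
        raise ValueError(f"expected hf://<org>/<name>, got {uri!r}")
    return repo_id


def fetch_meta_only(uri: str) -> str:
    """Download only meta/ from a Hub dataset repo; returns the local path."""
    # Imported lazily so the URI helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=parse_hf_uri(uri),
        repo_type="dataset",          # LeRobot datasets live in dataset repos
        allow_patterns=["meta/*"],    # skip parquet and video payloads
    )
```

Calling `fetch_meta_only("hf://lerobot/svla_so100_pickplace")` needs network access and returns a local snapshot directory that contains only the metadata files.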
CI gate in 15 lines
name: robovet
on: [push, pull_request]
jobs:
  vet:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: "3.12" }
      - run: pip install "robovet[video]"
      - run: robovet doctor ./datasets/my_task  # exit 1 on FAIL blocks the merge
Saw this error? Run this
| You hit | Run | You get |
|---|---|---|
| ValueError: timestamps … tolerance_s on load | robovet doctor → TIME-202 | the exact minimal tolerance_s, per worst episode |
| wrong frames / IndexError after a v2→v3 conversion | DATA-104/105 + META-501 | which episodes' ledgers lie, three-way cross-check |
| TorchCodec/AV1 decode errors | VIDEO-403 | per-camera codec tiers and what to re-encode |
| loss=NaN out of nowhere | DATA-107 + STATS-302 | NaN/Inf locations and stale-normalization blocks |
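The "exact minimal tolerance_s" in the first row can be pictured as the largest distance between any timestamp and its nearest frame-grid point. The sketch below assumes "off the grid" means distance from the nearest multiple of 1/fps; robovet's actual definition, and the comparison LeRobot's loader performs, may differ. `minimal_tolerance_s` is a hypothetical name.

```python
def minimal_tolerance_s(timestamps, fps):
    """Largest distance in seconds between a timestamp and its nearest grid point k/fps."""
    worst = 0.0
    for t in timestamps:
        nearest = round(t * fps) / fps  # closest ideal frame time
        worst = max(worst, abs(t - nearest))
    return worst


# Hypothetical 30 fps episode in which one frame arrives 7.29 ms late.
fps = 30
timestamps = [i / fps for i in range(5)]
timestamps[3] += 0.00729

tol = minimal_tolerance_s(timestamps, fps)
print(f"needs tolerance_s >= {tol:.2e}")
```

Reporting the number rather than a pass/fail lets you decide whether to loosen the loader's tolerance or re-grid the timestamps.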
What it checks
| Group | Catches | Maps to |
|---|---|---|
| STRUCT-0xx | missing/invalid metadata, dangling episodes, orphan files | lerobot#761 (no validator for hand-rolled conversions) |
| DATA-1xx | episode↔frame mapping corruption, schema drift, NaN/Inf, dead dims | lerobot#2401 (silent v2.1→v3.0 corruption) |
| TIME-2xx | off-grid timestamps with the exact tolerance_s you'd need, non-monotonic time, cumulative FP drift | lerobot#933, lerobot#3177 |
| STATS-3xx | stored normalization stats that disagree with the data ("normalization poison"), broken quantile stats (q01/q99) | HF docs warning; phospho repair post; lerobot#2189 |
| VIDEO-4xx | video/parquet frame-count desync (including per-episode windows inside shared v3 files), codec-aware compatibility tiers (h264 ✓ / AV1 info, since it's lerobot's own default / mpeg4-hevc warn), fps mismatch | Correll-lab postmortem; phospho notes |
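As an illustration of the STATS-3xx idea, the sketch below recomputes summary statistics for one feature dimension and lists the stored values that no longer match. The stat names, the use of population standard deviation, and the tolerances are all assumptions made for the example; robovet's own comparison rules are not documented here.

```python
import math
import statistics


def stale_stats(values, stored, rel_tol=1e-3, abs_tol=1e-6):
    """Return the names of stored stats that disagree with recomputed ones."""
    actual = {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # population std, an assumption
        "min": min(values),
        "max": max(values),
    }
    return [
        name
        for name, value in actual.items()
        if name in stored
        and not math.isclose(stored[name], value, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Hypothetical feature column whose stored std was never refreshed.
values = [0.0, 1.0, 2.0, 3.0]
stored = {"mean": 1.5, "std": 2.0, "min": 0.0, "max": 3.0}
print(stale_stats(values, stored))
```

A stale `std` like this one would silently rescale every normalized input, which is why the table calls it normalization poison.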
robovet doctor exits 1 on any FAIL and takes --json, so it drops
straight into CI: gate dataset merges the way Codecov gates coverage.
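The same exit-code contract can guard a training script as well as a CI job. This is a minimal sketch that relies only on the documented behavior (exit 1 on any FAIL); `doctor_passes` is a hypothetical helper, and the stand-in commands exist so the example runs without robovet installed.

```python
import subprocess
import sys


def doctor_passes(cmd):
    """Run a doctor command and return True only when it exits with code 0."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        # A missing binary is treated as "not verified", so the gate stays closed.
        return False
    return result.returncode == 0


# Stand-ins for a failing and a passing `robovet doctor` run.
fake_fail = [sys.executable, "-c", "raise SystemExit(1)"]
fake_pass = [sys.executable, "-c", "raise SystemExit(0)"]
print(doctor_passes(fake_fail), doctor_passes(fake_pass))
```

In a real script the command would be something like `["robovet", "doctor", "./datasets/my_task"]`, with training started only when the function returns True.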
Quality scoring (triage, not truth)
robovet score ./my_dataset --csv scores.csv
Per-episode signals, all computed in one pass: jerk smoothness, idle ratio, gripper chatter, duration outliers, action saturation, exact duplicates. This is deliberately the cheap first pass — the smoothness-first approach the 2026 curation literature (rinse, Demo-SCORE, QoQ) argues should precede expensive policy-rollout or influence-function filtering. Scores put the worst episodes in front of a human in seconds; review before you delete. Statistical flags carry practical-significance guards, so homogeneous datasets don't self-flag.
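To make two of these signals concrete, here is a rough sketch of an idle ratio and a jerk measure for a single one-dimensional trajectory. The formulas, the threshold, and the function names are illustrative assumptions; robovet's exact definitions may differ.

```python
def idle_ratio(actions, eps=1e-3):
    """Fraction of steps where the action changes by less than eps."""
    steps = [abs(b - a) for a, b in zip(actions, actions[1:])]
    if not steps:
        return 0.0
    return sum(s < eps for s in steps) / len(steps)


def mean_abs_jerk(positions, fps):
    """Mean absolute third finite difference, scaled by 1/dt^3; lower is smoother."""
    dt = 1.0 / fps
    jerks = [
        abs(positions[i + 3] - 3 * positions[i + 2] + 3 * positions[i + 1] - positions[i])
        / dt**3
        for i in range(len(positions) - 3)
    ]
    return sum(jerks) / len(jerks) if jerks else 0.0


# Hypothetical trajectories: steady motion vs. a stall followed by a jump.
smooth = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
stalled = [0.0, 0.1, 0.1, 0.1, 0.1, 0.5]

print(idle_ratio(smooth), idle_ratio(stalled))
print(mean_abs_jerk(stalled, fps=30) > mean_abs_jerk(smooth, fps=30))
```

Both measures need only the recorded trajectory, which is what keeps this pass cheap compared with rollout-based filtering.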
Repair contract
robovet fix is dry-run by default. With --apply it rewrites only
metadata (episode lengths, normalization stats, info.json counters), backs up
every touched file as .bak, never modifies parquet or video payloads, and
preserves everything it doesn't understand: quantile keys (q01/q99 — the
v3 QUANTILES-normalization era), image-stat blocks, and unknown episode fields
such as tags. A repair tool must never be the thing that deletes your data;
the test suite enforces these guarantees. Frame surgery
(tail-trimming desynced episodes, timestamp re-gridding) is on the
roadmap under the same contract.
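The contract above can be sketched for a single metadata field. This is an illustration of the pattern (dry-run by default, back up before writing, leave unknown keys alone) and not robovet's code; the function name and the `info.json.bak` backup naming are assumptions.

```python
import json
import shutil
import tempfile
from pathlib import Path


def repair_total_frames(info_path, correct_total, apply=False):
    """Fix total_frames in info.json; dry-run unless apply=True.

    Returns a description of the change, or None if nothing needs fixing.
    """
    info_path = Path(info_path)
    info = json.loads(info_path.read_text())
    if info.get("total_frames") == correct_total:
        return None
    change = f"total_frames: {info.get('total_frames')} -> {correct_total}"
    if apply:
        # Back up first, so the original is recoverable whatever happens next.
        shutil.copy2(info_path, info_path.with_name(info_path.name + ".bak"))
        info["total_frames"] = correct_total  # every other key is written back as-is
        info_path.write_text(json.dumps(info, indent=2))
    return change


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "info.json"
    path.write_text(json.dumps({"total_frames": 1037, "custom_tag": "keep me"}))
    print(repair_total_frames(path, 1086))              # dry run, file unchanged
    print(repair_total_frames(path, 1086, apply=True))  # writes info.json and a .bak
```

Loading the whole document and changing one key is what preserves fields the tool does not understand, such as `custom_tag` here.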
Scope, honestly
- LeRobot v2.0 / v2.1 and v3.x are both first-class for diagnosis: each has its own synthetic fixture and test suite, and v3 gets per-episode video alignment inside shared files (VIDEO-405) plus per-episode stats checks parsed from the v3 metadata. fix currently rewrites v2.x episode metadata and global stats; v3 per-episode stats regeneration is on the roadmap.
- robovet does not merge/split/delete episodes; lerobot ships that natively now. We do what the official stack doesn't: deep validation, metadata repair, and quality triage.
- Local-first by design. Your data never leaves your disk. Deployment-specific data is a competitive asset; treat it like one.
Library use
from robovet import load_dataset, run_doctor, score_dataset
ds = load_dataset("./my_dataset")
rep = run_doctor(ds) # rep.exit_code, rep.results, rep.counts
sc = score_dataset(ds, scan=rep.scan) # reuses the same single IO pass
Apache-2.0. Issues and broken-dataset war stories very welcome — if your dataset breaks in a way robovet doesn't catch, that's a bug report we want.