Data engineering copilot for robot imitation learning datasets
ORBIT — Data Quality Analysis for Robot Policy Training
Predict whether your robot learning dataset will actually train successfully — before burning GPU hours.
ORBIT analyzes your demonstration data for the issues that cause training failures: dead joints, action divergence, inconsistent demonstrations, poor workspace coverage, and more. Get a quality grade (A-F) and calibrated success prediction based on 82+ ground-truth training outcomes.
Quick Start
pip install orbit-robotics
orbit analyze lerobot/pusht
No GPU required. No API keys required. Works out of the box.
What It Checks
- Quality grade (A-F) calibrated against real training outcomes
- Success prediction — probability your data will train a working policy
- Dead joint detection — finds servos that aren't moving
- Action divergence — detects contradictory demonstrations
- Episode consistency — flags recording issues and length outliers
- Policy fit — rates compatibility with ACT, Diffusion Policy, SmolVLA, OpenVLA
- Workspace coverage — checks if demonstrations cover the task space
- Community comparison — benchmarks against other public datasets
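To make two of these checks concrete, here is a minimal sketch of what dead joint detection and episode length outlier flagging could look like. This is not ORBIT's implementation: the function names, the `min_range` and `max_ratio` thresholds, and the nested-list data layout are all assumptions chosen for illustration.

```python
from statistics import median


def find_dead_joints(episodes, min_range=1e-3):
    """Return indices of joints whose total range across the dataset is below min_range.

    episodes: list of episodes; each episode is a list of frames;
    each frame is a list of joint values (hypothetical layout).
    """
    n_joints = len(episodes[0][0])
    lows = [float("inf")] * n_joints
    highs = [float("-inf")] * n_joints
    for episode in episodes:
        for frame in episode:
            for j, value in enumerate(frame):
                lows[j] = min(lows[j], value)
                highs[j] = max(highs[j], value)
    # A joint that never moves has (almost) zero range over all frames.
    return [j for j in range(n_joints) if highs[j] - lows[j] < min_range]


def find_length_outliers(lengths, max_ratio=2.0):
    """Return indices of episodes much longer or shorter than the median length."""
    mid = median(lengths)
    return [
        i for i, n in enumerate(lengths)
        if n > mid * max_ratio or n < mid / max_ratio
    ]
```

In this sketch a joint counts as dead only if it is static across the whole dataset, so a joint that is idle in a few episodes is not flagged.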
Example Output
ORBIT Analysis: lerobot/pusht
206 episodes - DIFFUSION_POLICY
Grade: A (98/100) — Ready to train — expect strong results
Similar datasets trained at: 63%, 84%, 91%, 95%, 100% (5 nearest matches)
1 issue found:
! Jerk varies across episodes (CV=0.57)
Run with --proxy for training signal, --detail for full diagnostics
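The jerk warning above reports a coefficient of variation (CV). As a rough illustration of how such a number could be derived, the sketch below estimates jerk with a third finite difference and takes the CV of the per-episode means. How ORBIT actually defines jerk, and over which signals, is not stated here, so treat `mean_abs_jerk`, `jerk_cv`, and the single-joint input as assumptions.

```python
from statistics import mean, pstdev


def mean_abs_jerk(positions, dt):
    """Mean absolute jerk of a 1-D trajectory via a third finite difference.

    Needs at least 4 samples; shorter inputs raise statistics.StatisticsError.
    """
    jerks = [
        abs(positions[i + 3] - 3 * positions[i + 2] + 3 * positions[i + 1] - positions[i])
        / dt**3
        for i in range(len(positions) - 3)
    ]
    return mean(jerks)


def jerk_cv(episodes, dt):
    """Coefficient of variation (std / mean) of per-episode mean absolute jerk."""
    per_episode = [mean_abs_jerk(ep, dt) for ep in episodes]
    mu = mean(per_episode)
    return pstdev(per_episode) / mu if mu > 0 else 0.0
```

A CV near 0 means every episode is about equally smooth; larger values mean smoothness differs substantially from one demonstration to the next.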
Local AI — Free, Private, Zero Config
ORBIT v0.6 adds local AI support via Ollama. Deep analysis, AI grading, and the orbit assist chatbot all run locally with no API keys and no data leaving your machine.
# Install Ollama (one time)
curl -fsSL https://ollama.com/install.sh | sh
ollama pull gemma4
# Now AI features just work — ORBIT auto-detects Ollama
orbit analyze lerobot/your-dataset --deep # AI-powered deep analysis
orbit analyze lerobot/your-dataset --ai # AI quality judge
orbit assist # Interactive AI troubleshooter
ORBIT auto-detects the best available provider: Ollama (local) > Gemini (cloud) > OpenAI (cloud). No configuration needed — it just picks whatever's available.
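The priority order can be pictured as a simple fall-through. The sketch below is an assumption about how such detection might work, not ORBIT's code: it treats an `ollama` binary on `PATH` as "Ollama available" and a set API key variable as "cloud provider available". The function name `pick_provider` is hypothetical.

```python
import os
import shutil


def pick_provider(env=None, ollama_on_path=None):
    """Return the first available provider in the order Ollama > Gemini > OpenAI.

    env and ollama_on_path can be passed explicitly for testing; by default
    they are read from the current process environment and PATH.
    """
    env = os.environ if env is None else env
    if ollama_on_path is None:
        # Assumption: the presence of the binary is a good enough signal.
        ollama_on_path = shutil.which("ollama") is not None
    if ollama_on_path:
        return "ollama"
    if env.get("GOOGLE_API_KEY"):
        return "gemini"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None  # no AI provider found; core analysis does not need one
```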
Check your AI setup anytime:
orbit setup-ai
Provider Options
| Provider | Setup | Cost | Data Privacy |
|---|---|---|---|
| Ollama (recommended) | `ollama pull gemma4` | Free | Data stays local |
| Gemini | `export GOOGLE_API_KEY=...` | ~$0.001/analysis | Sent to Google |
| OpenAI | `export OPENAI_API_KEY=...` | ~$0.01/analysis | Sent to OpenAI |
Override the auto-detected provider:
orbit auth set ai-provider ollama # Force Ollama
orbit auth set ai-model gemma4 # Pick a specific model
Commands
| Command | What it does |
|---|---|
| `orbit analyze <dataset>` | Full quality analysis with grade and predictions |
| `orbit suggest <dataset>` | Training command with tuned hyperparameters |
| `orbit clean <dataset>` | Remove bad episodes automatically |
| `orbit fix <dataset>` | Analyze, clean, and suggest in one shot |
| `orbit gate <dataset>` | CI/CD quality gate with pass/fail for pipelines |
| `orbit compare <a> <b>` | Side-by-side dataset comparison |
| `orbit benchmark <task>` | Compare against published training benchmarks |
| `orbit assist` | AI troubleshooter for data and training issues |
| `orbit doctor` | Check environment health and AI providers |
| `orbit setup-ai` | Check and configure AI providers |
| `orbit badge <dataset>` | Generate a shields.io quality badge |
| `orbit explore` | Browse and discover LeRobot datasets on HuggingFace |
Common Options
orbit analyze lerobot/my-dataset --policy act # Check fit for specific policy
orbit analyze lerobot/my-dataset --deep # AI-powered deep analysis
orbit analyze lerobot/my-dataset --ai # AI quality judge for A grades
orbit analyze lerobot/my-dataset --json # Machine-readable output
orbit analyze lerobot/my-dataset --full # All episodes (no sampling)
orbit analyze lerobot/my-dataset --detail # Full diagnostic report
orbit -q analyze lerobot/my-dataset # Quiet mode (grade only)
orbit analyze ./local-data/ --format hdf5 # Local HDF5 files
CI/CD Integration
# Fail the pipeline if data quality is below B
orbit gate lerobot/my-dataset --min-grade B --policy act
# JSON output for scripts
orbit analyze lerobot/my-dataset --json | jq '.readiness.grade'
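For pipelines that prefer Python over `jq`, the same field can be read with the standard library. This is a minimal sketch: only the `readiness.grade` path is taken from the `jq` example above, while the function name, the grade ordering helper, and any other fields in the report are assumptions.

```python
import json

# Grades from worst to best, so a higher index means a better grade.
GRADE_ORDER = "FDCBA"


def grade_meets_minimum(report_json, min_grade="B"):
    """Return True if the report's readiness grade is at least min_grade.

    report_json: the text printed by `orbit analyze <dataset> --json`.
    """
    report = json.loads(report_json)
    grade = report["readiness"]["grade"]
    return GRADE_ORDER.index(grade) >= GRADE_ORDER.index(min_grade)
```

A script could feed it the captured output of `orbit analyze lerobot/my-dataset --json` and exit non-zero when it returns `False`. For a plain pass/fail step, `orbit gate` already covers this without extra code.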
Supported Formats
| Format | Source |
|---|---|
| LeRobot (Hub) | HuggingFace datasets (lerobot/...) |
| LeRobot (local) | Local LeRobot directories |
| HDF5 | RoboMimic, robosuite, custom .hdf5 files |
| RLDS | TFRecord-based datasets (pip install orbit-robotics[rlds]) |
| ROS bags | .bag and .mcap files (pip install orbit-robotics[rosbag]) |
Understanding Grades
| Grade | Score | Meaning |
|---|---|---|
| A | 85-100 | Ready to train — expect strong results |
| B | 72-84 | Good data — minor issues, should train well |
| C | 58-71 | Usable but has problems — clean first |
| D | 40-57 | Significant issues — collect more or better data |
| F | 0-39 | Critical problems — fix before training |
Grades are calibrated against 82 real datasets with known training outcomes.
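The score bands in the table translate directly into a lookup. The sketch below only encodes the thresholds listed above; the function name `score_to_grade` is hypothetical and is not part of the ORBIT API.

```python
def score_to_grade(score):
    """Map a 0-100 quality score to a letter grade using the bands in the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # Lower bound of each band, checked from best to worst.
    for grade, floor in (("A", 85), ("B", 72), ("C", 58), ("D", 40)):
        if score >= floor:
            return grade
    return "F"
```

For example, the score of 98 in the sample output falls in the A band.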
What's New in v0.6.0
- Local AI via Ollama: `--deep`, `--ai`, and `orbit assist` work locally with Gemma 4, no API keys needed
- Multi-provider AI: auto-detects Ollama > Gemini > OpenAI, configurable via `orbit auth set ai-provider`
- `orbit setup-ai`: check and configure AI providers in one command
- Grading accuracy overhaul: ground truth task difficulty adjustment, BC/BC-RNN divergence penalties, calibrated summaries for hard tasks
- Similar dataset display: shows nearest benchmark success rates instead of wide confidence intervals
- Sampling warnings: warns when analyzing <20 episodes that the grade may be unreliable
- `orbit doctor`: now checks Ollama, Gemini, and OpenAI availability
- Ollama backend for `orbit assist`: chat with local models about your data
Requirements
- Python 3.10+
- No GPU needed
- No API keys needed for core analysis
License
MIT
File details
Details for the file orbit_robotics-0.6.0.tar.gz.
File metadata
- Download URL: orbit_robotics-0.6.0.tar.gz
- Upload date:
- Size: 393.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `f74a55b7a8f06426d0cb11910c2b1b54646ca01c76bdbcc740308ec42650223e` |
| MD5 | `71d9d460aa6a951210b22ab660304e94` |
| BLAKE2b-256 | `12d469ca3c059ac8742917c32605951637c8c95b8d08ab5ea68b01ef19b6cb83` |
File details
Details for the file orbit_robotics-0.6.0-py3-none-any.whl.
File metadata
- Download URL: orbit_robotics-0.6.0-py3-none-any.whl
- Upload date:
- Size: 356.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `f249bafbd0d557d34c0e80463c9e60fe6c1f1628b6792889b51d5f3483d4842d` |
| MD5 | `444b1d8da811ad53a02476603cfa2b54` |
| BLAKE2b-256 | `2c5a32ea019853676eccf6d4be1524a7cbeccd655a0e7644f1f08b8596d20215` |