Data engineering copilot for robot imitation learning datasets

ORBIT — Data Quality Analysis for Robot Policy Training

Predict whether your robot learning dataset will actually train successfully — before burning GPU hours.

ORBIT analyzes your demonstration data for the issues that cause training failures: dead joints, action divergence, inconsistent demonstrations, poor workspace coverage, and more. Get a quality grade (A-F) and calibrated success prediction based on 82+ ground-truth training outcomes.

Quick Start

pip install orbit-robotics
orbit analyze lerobot/pusht

No GPU required. No API keys required. Works out of the box.

What It Checks

  • Quality grade (A-F) calibrated against real training outcomes
  • Success prediction — probability your data will train a working policy
  • Dead joint detection — finds servos that aren't moving
  • Action divergence — detects contradictory demonstrations
  • Episode consistency — flags recording issues and length outliers
  • Policy fit — rates compatibility with ACT, Diffusion Policy, SmolVLA, OpenVLA
  • Workspace coverage — checks if demonstrations cover the task space
  • Community comparison — benchmarks against other public datasets
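
To make the dead joint check concrete, here is a minimal sketch of one way such a check could work. It is not ORBIT's implementation: the function name `find_dead_joints` and the `min_range` threshold are hypothetical, and the sketch assumes each frame is a plain list of joint positions.

```python
from typing import List, Sequence


def find_dead_joints(
    frames: Sequence[Sequence[float]], min_range: float = 1e-3
) -> List[int]:
    """Return indices of joints whose value range over all frames is below min_range.

    min_range is a hypothetical threshold chosen for illustration only.
    """
    if not frames:
        return []
    num_joints = len(frames[0])
    dead = []
    for j in range(num_joints):
        values = [frame[j] for frame in frames]
        # A joint that never moves has (almost) identical min and max.
        if max(values) - min(values) < min_range:
            dead.append(j)
    return dead


# Three frames of a 3-joint arm; joint 1 never moves.
frames = [
    [0.00, 0.50, 1.00],
    [0.10, 0.50, 0.80],
    [0.25, 0.50, 0.60],
]
print(find_dead_joints(frames))  # [1]
```

A range test is the simplest option; a variance test or a per-episode check would be reasonable alternatives if a joint is expected to be idle in some episodes.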

Example Output

Dataset Readiness: B+ (score: 78/100)
Good data — minor issues, should train well

  ✓ High consistency (0.95)
  ✓ Sufficient episodes (200) for diffusion_policy
  ✓ Good coverage (0.84)
  ✗ 2 joints clipping (>10% of frames)

Top action: Fix joint clipping before training

YOUR DATA AT A GLANCE
────────────────────────────────────────
  Episodes:       200     (top 25%)
  Coverage:       0.84  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░
  Signal Health:  0.92  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░

(Illustrative — actual output depends on your dataset.)
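
The "joints clipping (>10% of frames)" line can be read as a per-joint fraction test. The sketch below shows that idea under stated assumptions: a joint "clips" in a frame when its value sits within `tol` of a known lower or upper limit, and it is flagged when that happens in more than 10% of frames. The function names, the `tol` parameter, and the limit arrays are hypothetical; ORBIT's own definition of clipping may differ.

```python
from typing import List, Sequence


def clipping_fractions(
    frames: Sequence[Sequence[float]],
    lower: Sequence[float],
    upper: Sequence[float],
    tol: float = 1e-6,
) -> List[float]:
    """For each joint, return the fraction of frames spent at a joint limit."""
    if not frames:
        return [0.0] * len(lower)
    fractions = []
    for j in range(len(lower)):
        at_limit = sum(
            1
            for frame in frames
            if frame[j] <= lower[j] + tol or frame[j] >= upper[j] - tol
        )
        fractions.append(at_limit / len(frames))
    return fractions


def clipping_joints(frames, lower, upper, max_fraction=0.10, tol=1e-6) -> List[int]:
    """Return indices of joints that sit at a limit in more than max_fraction of frames."""
    fractions = clipping_fractions(frames, lower, upper, tol)
    return [j for j, frac in enumerate(fractions) if frac > max_fraction]


# Ten frames of a 2-joint arm with limits [-1, 1]; joint 0 hits its upper limit twice.
frames = [
    [1.0, 0.0], [1.0, 0.2], [0.5, 0.4], [0.3, 0.1], [0.0, -0.2],
    [-0.2, 0.3], [0.1, 0.0], [0.4, 0.5], [0.6, -0.4], [0.2, 0.2],
]
lower = [-1.0, -1.0]
upper = [1.0, 1.0]
print(clipping_joints(frames, lower, upper))  # [0]
```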

Optional: AI-Powered Assessment

For deeper analysis using vision-language models:

pip install orbit-robotics[vlm]
export GOOGLE_API_KEY=your-key
orbit analyze lerobot/your-dataset --deep

Uses Gemini for VLM-based visual assessment. The core statistical analysis works fully without any API key.

You can also use Claude as an alternative AI provider:

pip install orbit-robotics[claude]
export ANTHROPIC_API_KEY=your-key

Commands

Command                  What it does
orbit analyze <dataset>  Full quality analysis with grade and predictions
orbit benchmark <task>   Compare against published training benchmarks
orbit assist             AI troubleshooter for data and training issues
orbit suggest <dataset>  Training command with tuned hyperparameters
orbit clean <dataset>    Remove bad episodes automatically
orbit fix <dataset>      Analyze, clean, and suggest in one shot

Common Options

orbit analyze lerobot/my-dataset --policy act       # Check fit for specific policy
orbit analyze lerobot/my-dataset --deep              # AI-powered deep analysis (needs GOOGLE_API_KEY)
orbit analyze lerobot/my-dataset --json              # Machine-readable output
orbit analyze lerobot/my-dataset --full              # All episodes (no sampling)
orbit analyze ./local-data/ --format hdf5            # Local HDF5 files
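
If you want to call the CLI from a script, the options above can be assembled programmatically. The following is a sketch, not an official ORBIT Python API: `build_analyze_command` and `run_analysis` are hypothetical helpers, and `run_analysis` assumes that `--json` writes a single JSON document to stdout. The JSON schema is not documented here, so the sketch returns the parsed value without interpreting it.

```python
import json
import shutil
import subprocess
from typing import List, Optional


def build_analyze_command(
    dataset: str,
    policy: Optional[str] = None,
    deep: bool = False,
    json_output: bool = False,
    full: bool = False,
    data_format: Optional[str] = None,
) -> List[str]:
    """Build an `orbit analyze` argument list from the flags listed above."""
    cmd = ["orbit", "analyze", dataset]
    if policy:
        cmd += ["--policy", policy]
    if deep:
        cmd.append("--deep")
    if json_output:
        cmd.append("--json")
    if full:
        cmd.append("--full")
    if data_format:
        cmd += ["--format", data_format]
    return cmd


def run_analysis(dataset: str, **options):
    """Run `orbit analyze --json` and return the parsed output.

    Assumption: the JSON report is printed to stdout as one document.
    """
    if shutil.which("orbit") is None:
        raise RuntimeError("orbit CLI not found; try: pip install orbit-robotics")
    options["json_output"] = True
    cmd = build_analyze_command(dataset, **options)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

For example, `run_analysis("lerobot/pusht", policy="act")` would invoke `orbit analyze lerobot/pusht --policy act --json`.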

Supported Formats

Format           Source
LeRobot (Hub)    HuggingFace datasets (lerobot/...)
LeRobot (local)  Local LeRobot directories
HDF5             RoboMimic, robosuite, custom .hdf5 files
RLDS             TFRecord-based datasets (pip install orbit-robotics[rlds])
ROS bags         .bag and .mcap files (pip install orbit-robotics[rosbag])
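
As a small illustration of the table, the sketch below maps a local file's extension to the install command it needs. The helper name `install_command_for` is hypothetical. It assumes HDF5 support ships with the core install (the table lists no extra for it), and it leaves RLDS out because TFRecord file naming varies.

```python
from pathlib import Path

# Extension -> optional pip extra, taken from the table above.
# None means the core install is assumed to be enough.
EXTRA_BY_SUFFIX = {
    ".hdf5": None,
    ".bag": "rosbag",
    ".mcap": "rosbag",
}


def install_command_for(path: str) -> str:
    """Return the pip command needed to analyze the given local file."""
    suffix = Path(path).suffix.lower()
    if suffix not in EXTRA_BY_SUFFIX:
        raise ValueError(f"Unrecognized extension: {suffix!r}")
    extra = EXTRA_BY_SUFFIX[suffix]
    if extra is None:
        return "pip install orbit-robotics"
    return f"pip install orbit-robotics[{extra}]"


print(install_command_for("recordings/run1.mcap"))
# pip install orbit-robotics[rosbag]
```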

Understanding Grades

Grade  Score   Meaning
A      85-100  Ready to train — expect strong results
B      72-84   Good data — minor issues, should train well
C      58-71   Usable but has problems — clean first
D      40-57   Significant issues — collect more or better data
F      0-39    Critical problems — fix before training

Grades are calibrated against 82 real datasets with known training outcomes.
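
The score bands above translate directly into a lookup. This is a sketch of that mapping only, with a hypothetical function name; it returns the base letter and does not model modifiers such as the "B+" in the example output, since the rule for those is not documented here.

```python
# (minimum score, grade) pairs from the table above, highest band first.
GRADE_BANDS = [(85, "A"), (72, "B"), (58, "C"), (40, "D"), (0, "F")]


def grade_for_score(score: float) -> str:
    """Map a 0-100 readiness score to its letter grade."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for floor, grade in GRADE_BANDS:
        if score >= floor:
            return grade
    return "F"  # Not reached: the last band starts at 0.


print(grade_for_score(78))  # B
```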

Requirements

  • Python 3.10+
  • No GPU needed
  • No API keys needed for core analysis

License

MIT
