Data engineering copilot for robot imitation learning datasets

ORBIT — Data Quality Analysis for Robot Policy Training

Predict whether your robot learning dataset will actually train successfully — before burning GPU hours.

ORBIT analyzes your demonstration data for the issues that cause training failures: dead joints, action divergence, inconsistent demonstrations, poor workspace coverage, and more. You get a quality grade (A-F) and a calibrated success prediction based on 82+ ground-truth training outcomes.

Quick Start

pip install orbit-robotics
orbit analyze lerobot/pusht

No GPU required. No API keys required. Works out of the box.

What It Checks

  • Quality grade (A-F) calibrated against real training outcomes
  • Success prediction — probability your data will train a working policy
  • Dead joint detection — finds servos that aren't moving
  • Action divergence — detects contradictory demonstrations
  • Episode consistency — flags recording issues and length outliers
  • Policy fit — rates compatibility with ACT, Diffusion Policy, SmolVLA, OpenVLA
  • Workspace coverage — checks if demonstrations cover the task space
  • Community comparison — benchmarks against other public datasets
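
As a rough illustration of two of the checks above, the sketch below flags joints whose position barely changes and episodes whose length is far from the median. It is not ORBIT's implementation: the function names, the `min_std` threshold, and the `max_ratio` rule are assumptions chosen for the example.

```python
from statistics import median, pstdev


def find_dead_joints(frames, min_std=1e-3):
    """Return indices of joints whose position barely changes.

    frames: list of per-frame joint position lists, all the same length.
    min_std: hypothetical threshold, not ORBIT's documented criterion.
    """
    if not frames:
        return []
    dead = []
    for j in range(len(frames[0])):
        series = [frame[j] for frame in frames]
        # A near-zero standard deviation means the joint never moved.
        if pstdev(series) < min_std:
            dead.append(j)
    return dead


def find_length_outliers(lengths, max_ratio=2.0):
    """Return indices of episodes much longer or shorter than the median.

    max_ratio: hypothetical rule; an episode is flagged when its length is
    more than max_ratio times the median, or less than median / max_ratio.
    """
    if not lengths:
        return []
    med = median(lengths)
    return [
        i for i, n in enumerate(lengths)
        if n > med * max_ratio or n < med / max_ratio
    ]


frames = [
    [0.00, 0.50, 1.0],
    [0.10, 0.50, 0.9],
    [0.20, 0.50, 0.8],
    [0.30, 0.50, 0.7],
]
print(find_dead_joints(frames))                              # [1]
print(find_length_outliers([100, 105, 98, 310, 102, 20]))    # [3, 5]
```

Real data would need per-robot thresholds; a gripper that is meant to stay closed for a whole task would be flagged by this simple rule.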

Example Output

Dataset Readiness: B+ (score: 78/100)
Good data — minor issues, should train well

  ✓ High consistency (0.95)
  ✓ Sufficient episodes (200) for diffusion_policy
  ✓ Good coverage (0.84)
  ✗ 2 joints clipping (>10% of frames)

Top action: Fix joint clipping before training

YOUR DATA AT A GLANCE
────────────────────────────────────────
  Episodes:       200     (top 25%)
  Coverage:       0.84  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░
  Signal Health:  0.92  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░

(Illustrative — actual output depends on your dataset.)
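
The "joints clipping (>10% of frames)" line in the sample can be read as: for each joint, count the share of frames spent at a limit and flag the joint when that share exceeds 10%. The sketch below works under that reading. The joint limits, the tolerance `tol`, and both function names are assumptions for illustration, not ORBIT's actual logic.

```python
def clipping_fraction(values, lower, upper, tol=1e-6):
    """Fraction of frames where a joint sits at (or beyond) either limit."""
    if not values:
        return 0.0
    clipped = sum(1 for v in values if v <= lower + tol or v >= upper - tol)
    return clipped / len(values)


def clipping_joints(frames, limits, max_fraction=0.10):
    """Return indices of joints that spend more than max_fraction of frames at a limit.

    frames: list of per-frame joint value lists.
    limits: list of (lower, upper) pairs, one per joint (assumed known).
    """
    flagged = []
    for j, (lower, upper) in enumerate(limits):
        series = [frame[j] for frame in frames]
        if clipping_fraction(series, lower, upper) > max_fraction:
            flagged.append(j)
    return flagged


# Joint 0 sits at its upper limit in 3 of 10 frames; joint 1 never does.
frames = [[1.0, 0.0]] * 3 + [[0.5, 0.1]] * 7
limits = [(-1.0, 1.0), (-1.0, 1.0)]
print(clipping_joints(frames, limits))  # [0]
```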

Optional: AI-Powered Assessment

For deeper analysis using vision-language models:

pip install "orbit-robotics[vlm]"
export GOOGLE_API_KEY=your-key
orbit analyze lerobot/your-dataset --deep

Uses Gemini for VLM-based visual assessment. The core statistical analysis works fully without any API key.

You can also use Claude as an alternative AI provider:

pip install "orbit-robotics[claude]"
export ANTHROPIC_API_KEY=your-key
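
Before starting a long run with --deep, it can help to confirm that one of the keys above is actually set in the environment. The following is a minimal pre-flight sketch. The provider labels and the `configured_providers` helper are hypothetical, and how ORBIT itself picks a provider is not covered here.

```python
import os

# Environment variables named in this README, keyed by a provider label.
# The labels are illustrative only, not ORBIT identifiers.
PROVIDER_KEYS = {
    "gemini": "GOOGLE_API_KEY",
    "claude": "ANTHROPIC_API_KEY",
}


def configured_providers(env=None):
    """Return provider labels whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]


print(configured_providers({"GOOGLE_API_KEY": "your-key"}))  # ['gemini']
```

Calling `configured_providers()` with no argument inspects the real environment; an empty list means only the core statistical analysis is available.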

Commands

Command                  What it does
orbit analyze <dataset>  Full quality analysis with grade and predictions
orbit benchmark <task>   Compare against published training benchmarks
orbit assist             AI troubleshooter for data and training issues
orbit suggest <dataset>  Training command with tuned hyperparameters
orbit clean <dataset>    Remove bad episodes automatically
orbit fix <dataset>      Analyze, clean, and suggest in one shot

Common Options

orbit analyze lerobot/my-dataset --policy act       # Check fit for specific policy
orbit analyze lerobot/my-dataset --deep              # AI-powered deep analysis (needs GOOGLE_API_KEY)
orbit analyze lerobot/my-dataset --json              # Machine-readable output
orbit analyze lerobot/my-dataset --full              # All episodes (no sampling)
orbit analyze ./local-data/ --format hdf5            # Local HDF5 files
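
For scripted use, the options above can be assembled and run from Python. The sketch below uses only the flags listed in this section. It assumes, without confirmation from this README, that the flags can be combined and that --json writes a single JSON document to stdout. `build_analyze_command` and `run_analysis` are hypothetical helper names, and the structure of the returned report is not documented here.

```python
import json
import shutil
import subprocess


def build_analyze_command(dataset, policy=None, fmt=None,
                          deep=False, full=False, as_json=True):
    """Build the argument list for `orbit analyze` from the documented flags."""
    cmd = ["orbit", "analyze", dataset]
    if policy:
        cmd += ["--policy", policy]
    if fmt:
        cmd += ["--format", fmt]
    if deep:
        cmd.append("--deep")   # needs GOOGLE_API_KEY, per the options above
    if full:
        cmd.append("--full")
    if as_json:
        cmd.append("--json")
    return cmd


def run_analysis(dataset, policy=None, fmt=None, deep=False, full=False):
    """Run `orbit analyze --json` and return the parsed output.

    Assumes stdout holds exactly one JSON document.
    """
    if shutil.which("orbit") is None:
        raise RuntimeError("orbit CLI not found; try: pip install orbit-robotics")
    cmd = build_analyze_command(dataset, policy, fmt, deep, full, as_json=True)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


print(build_analyze_command("lerobot/pusht", policy="act"))
# ['orbit', 'analyze', 'lerobot/pusht', '--policy', 'act', '--json']
```

Passing the arguments as a list rather than a shell string avoids quoting problems with dataset paths that contain spaces.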

Supported Formats

Format           Source
LeRobot (Hub)    Hugging Face datasets (lerobot/...)
LeRobot (local)  Local LeRobot directories
HDF5             RoboMimic, robosuite, custom .hdf5 files
RLDS             TFRecord-based datasets (pip install "orbit-robotics[rlds]")
ROS bags         .bag and .mcap files (pip install "orbit-robotics[rosbag]")

Understanding Grades

Grade  Score   Meaning
A      85-100  Ready to train: expect strong results
B      72-84   Good data: minor issues, should train well
C      58-71   Usable but has problems: clean first
D      40-57   Significant issues: collect more or better data
F      0-39    Critical problems: fix before training

Grades are calibrated against 82 real datasets with known training outcomes.
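
The score bands can be expressed as a small lookup, which is handy when post-processing scores in a script. This is a sketch built only from the table in this section, and `grade_for_score` is a hypothetical helper, not part of ORBIT. The sample output earlier shows "B+", so ORBIT evidently adds modifiers whose thresholds are not documented here; the sketch returns the base letter only and treats each band's lower bound as inclusive.

```python
# Lower bound of each band, copied from the grade table in this section.
GRADE_BANDS = [
    (85, "A"),
    (72, "B"),
    (58, "C"),
    (40, "D"),
    (0, "F"),
]


def grade_for_score(score):
    """Return the base letter grade whose band contains `score` (0-100)."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for lower_bound, letter in GRADE_BANDS:
        if score >= lower_bound:
            return letter
    raise AssertionError("unreachable: the F band starts at 0")


print(grade_for_score(78))  # B
```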

Requirements

  • Python 3.10+
  • No GPU needed
  • No API keys needed for core analysis

License

MIT
