Data engineering copilot for robot imitation learning datasets

ORBIT — Data Quality Analysis for Robot Policy Training

Predict whether your robot learning dataset will actually train successfully — before burning GPU hours.

ORBIT analyzes your demonstration data for the issues that cause training failures: dead joints, action divergence, inconsistent demonstrations, poor workspace coverage, and more. You get a quality grade (A-F) and a calibrated success prediction based on 82+ ground-truth training outcomes.

Quick Start

pip install orbit-robotics
orbit analyze lerobot/pusht

No GPU required. No API keys required. Works out of the box.

What It Checks

  • Quality grade (A-F) calibrated against real training outcomes
  • Success prediction — probability your data will train a working policy
  • Dead joint detection — finds servos that aren't moving
  • Action divergence — detects contradictory demonstrations
  • Episode consistency — flags recording issues and length outliers
  • Policy fit — rates compatibility with ACT, Diffusion Policy, SmolVLA, OpenVLA
  • Workspace coverage — checks if demonstrations cover the task space
  • Community comparison — benchmarks against other public datasets
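To make two of these checks concrete, here is a minimal sketch of dead joint detection and episode length outlier detection. It is not ORBIT's implementation: the function names, the nested-list data layout, and both thresholds (`min_range`, `tolerance`) are assumptions chosen for illustration.

```python
from statistics import median


def find_dead_joints(episodes, min_range=1e-3):
    """Return indices of joints whose value range over all frames is below min_range.

    episodes: list of episodes, each a list of frames, each frame a list of
    joint positions. This layout is an assumption for this sketch.
    """
    frames = [frame for episode in episodes for frame in episode]
    if not frames:
        return []
    dead = []
    for j in range(len(frames[0])):
        values = [frame[j] for frame in frames]
        # A joint that never moves has (almost) zero range across the dataset.
        if max(values) - min(values) < min_range:
            dead.append(j)
    return dead


def find_length_outliers(episodes, tolerance=0.5):
    """Return indices of episodes whose length deviates from the median
    length by more than tolerance * median."""
    lengths = [len(episode) for episode in episodes]
    if not lengths:
        return []
    typical = median(lengths)
    return [i for i, n in enumerate(lengths) if abs(n - typical) > tolerance * typical]


# Toy dataset: joint 1 is stuck at 0.5, and the third episode is much shorter.
episodes = [
    [[0.1 * t, 0.5, 0.2 * t] for t in range(10)],
    [[0.2 * t, 0.5, 0.1 * t] for t in range(10)],
    [[0.1 * t, 0.5, 0.2 * t] for t in range(3)],
]
print(find_dead_joints(episodes))
print(find_length_outliers(episodes))
```

A range-based test is the simplest possible signal for a stuck servo. A real tool would likely also account for sensor noise and per-joint scaling.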

Example Output

Dataset Readiness: B+ (score: 78/100)
Good data — minor issues, should train well

  ✓ High consistency (0.95)
  ✓ Sufficient episodes (200) for diffusion_policy
  ✓ Good coverage (0.84)
  ✗ 2 joints clipping (>10% of frames)

Top action: Fix joint clipping before training

YOUR DATA AT A GLANCE
────────────────────────────────────────
  Episodes:       200     (top 25%)
  Coverage:       0.84  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░
  Signal Health:  0.92  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░

(Illustrative — actual output depends on your dataset.)
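The "joints clipping (>10% of frames)" line in the example output suggests a per-joint check on how often a signal sits at its limit. The sketch below shows one way such a check could work. The joint limits, the `eps` tolerance, and the function names are assumptions for illustration. Only the 10% threshold is taken from the example output.

```python
def clipping_fraction(values, lower, upper, eps=1e-6):
    """Fraction of samples that sit at or beyond a joint limit."""
    if not values:
        return 0.0
    clipped = sum(1 for v in values if v <= lower + eps or v >= upper - eps)
    return clipped / len(values)


def find_clipping_joints(frames, limits, max_fraction=0.10):
    """Return indices of joints that clip in more than max_fraction of frames.

    frames: list of frames, each a list of joint values.
    limits: list of (lower, upper) pairs, one per joint (assumed known).
    """
    flagged = []
    for j, (lower, upper) in enumerate(limits):
        values = [frame[j] for frame in frames]
        if clipping_fraction(values, lower, upper) > max_fraction:
            flagged.append(j)
    return flagged


# Toy data: joint 0 is pinned at its upper limit in 3 of 5 frames.
frames = [[1.0, 0.2], [1.0, 0.3], [0.5, 0.4], [0.2, 0.1], [1.0, 0.0]]
limits = [(-1.0, 1.0), (-1.0, 1.0)]
print(find_clipping_joints(frames, limits))
```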

Optional: AI-Powered Assessment

For deeper analysis using vision-language models:

pip install "orbit-robotics[vlm]"   # quotes keep shells such as zsh from expanding the brackets
export GOOGLE_API_KEY=your-key
orbit analyze lerobot/your-dataset --deep

Uses Gemini for VLM-based visual assessment. The core statistical analysis works fully without any API key.

You can also use Claude as an alternative AI provider:

pip install "orbit-robotics[claude]"
export ANTHROPIC_API_KEY=your-key

Commands

Command                    What it does
orbit analyze <dataset>    Full quality analysis with grade and predictions
orbit benchmark <task>     Compare against published training benchmarks
orbit assist               AI troubleshooter for data and training issues
orbit suggest <dataset>    Training command with tuned hyperparameters
orbit clean <dataset>      Remove bad episodes automatically
orbit fix <dataset>        Analyze, clean, and suggest in one shot

Common Options

orbit analyze lerobot/my-dataset --policy act       # Check fit for specific policy
orbit analyze lerobot/my-dataset --deep              # AI-powered deep analysis (needs GOOGLE_API_KEY)
orbit analyze lerobot/my-dataset --json              # Machine-readable output
orbit analyze lerobot/my-dataset --full              # All episodes (no sampling)
orbit analyze ./local-data/ --format hdf5            # Local HDF5 files
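For scripted use, the `--json` flag can be driven from Python with the standard library. The sketch below uses only the flags listed above. It assumes that the flags can be combined and that `--json` prints a single JSON document to stdout. The output schema is not documented here, so the code makes no assumptions about field names. The helper names are hypothetical.

```python
import json
import subprocess


def build_analyze_command(dataset, policy=None, deep=False, full=False,
                          data_format=None, json_output=True):
    """Assemble an `orbit analyze` argument list from the documented flags."""
    command = ["orbit", "analyze", dataset]
    if policy is not None:
        command += ["--policy", policy]
    if deep:
        command.append("--deep")
    if full:
        command.append("--full")
    if data_format is not None:
        command += ["--format", data_format]
    if json_output:
        command.append("--json")
    return command


def run_analysis(dataset, **options):
    """Run the CLI and parse its stdout as JSON.

    Assumes `--json` emits one JSON document on stdout; raises
    subprocess.CalledProcessError if the command exits with an error.
    """
    options["json_output"] = True
    command = build_analyze_command(dataset, **options)
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

With the package installed, `run_analysis("lerobot/pusht", policy="act")` would return the parsed report. Inspect the result before relying on any particular field.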

Supported Formats

Format             Source
LeRobot (Hub)      HuggingFace datasets (lerobot/...)
LeRobot (local)    Local LeRobot directories
HDF5               RoboMimic, robosuite, custom .hdf5 files
RLDS               TFRecord-based datasets (pip install orbit-robotics[rlds])
ROS bags           .bag and .mcap files (pip install orbit-robotics[rosbag])

Understanding Grades

Grade    Score     Meaning
A        85-100    Ready to train — expect strong results
B        72-84     Good data — minor issues, should train well
C        58-71     Usable but has problems — clean first
D        40-57     Significant issues — collect more or better data
F        0-39      Critical problems — fix before training

Grades are calibrated against 82 real datasets with known training outcomes.
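The score bands above translate directly into a lookup. This is a small sketch of that mapping, not ORBIT's code: the function name is hypothetical, and the handling of "+" modifiers (as in the "B+" example output) is not specified in the table, so only the base letter is returned.

```python
# Lower bound of each band, taken from the grade table, highest first.
GRADE_BANDS = [(85, "A"), (72, "B"), (58, "C"), (40, "D"), (0, "F")]


def score_to_grade(score):
    """Map a 0-100 quality score to its base letter grade."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for floor, grade in GRADE_BANDS:
        if score >= floor:
            return grade


print(score_to_grade(78))
```

Fractional scores fall into the band whose lower bound they meet, which is an assumption, since the table lists integer ranges only.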

Requirements

  • Python 3.10+
  • No GPU needed
  • No API keys needed for core analysis

License

MIT

Download files

Source Distribution

orbit_robotics-0.5.2.tar.gz (371.7 kB)

Uploaded Source

Built Distribution

orbit_robotics-0.5.2-py3-none-any.whl (333.5 kB)

Uploaded Python 3

File details

Details for the file orbit_robotics-0.5.2.tar.gz.

File metadata

  • Download URL: orbit_robotics-0.5.2.tar.gz
  • Upload date:
  • Size: 371.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for orbit_robotics-0.5.2.tar.gz
Algorithm      Hash digest
SHA256         f0d2716b743cb3ef04d78dabc4fa00347e54801d8bde32e36428794ac831b2ab
MD5            20a66d6e5062fe8688f376a04079e57f
BLAKE2b-256    2d1fe14e3c3764b008fd87076542d84086eb1a2f832ad61c18b11cc4df41ca85
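
A downloaded archive can be checked against the SHA256 digest listed above using Python's standard `hashlib` module. The sketch below is one way to do that. The helper names are hypothetical, and the temporary file at the end is only a self-check on known input.

```python
import hashlib
import os
import tempfile

# SHA256 digest listed above for orbit_robotics-0.5.2.tar.gz.
EXPECTED_SDIST_SHA256 = "f0d2716b743cb3ef04d78dabc4fa00347e54801d8bde32e36428794ac831b2ab"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Self-check on a throwaway file with known contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_digest = sha256_of_file(tmp.name)
os.remove(tmp.name)
print(demo_digest == hashlib.sha256(b"abc").hexdigest())
```

After downloading the source distribution, `matches_published_digest("orbit_robotics-0.5.2.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an intact file.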

File details

Details for the file orbit_robotics-0.5.2-py3-none-any.whl.

File metadata

  • Download URL: orbit_robotics-0.5.2-py3-none-any.whl
  • Upload date:
  • Size: 333.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for orbit_robotics-0.5.2-py3-none-any.whl
Algorithm      Hash digest
SHA256         5b6857eb805d55a394c4102e500d5f38fa73c96d96126a83a5c0982dbe36d7c0
MD5            0e59196b5d1050a32f3085b1cb384046
BLAKE2b-256    82332b65dadaac51de5ffdda21f33b5c17b469a45ffb5e685733c687e03cd3ac
