Data engineering copilot for robot imitation learning datasets

ORBIT — Data Quality Analysis for Robot Policy Training

Predict whether your robot learning dataset will actually train successfully — before burning GPU hours.

ORBIT analyzes your demonstration data for the issues that cause training failures: dead joints, action divergence, inconsistent demonstrations, poor workspace coverage, and more. Get a quality grade (A-F) and calibrated success prediction based on 82+ ground-truth training outcomes.

Quick Start

pip install orbit-robotics
orbit analyze lerobot/pusht

No GPU required. No API keys required. Works out of the box.

What It Checks

  • Quality grade (A-F) calibrated against real training outcomes
  • Success prediction — probability your data will train a working policy
  • Dead joint detection — finds servos that aren't moving
  • Action divergence — detects contradictory demonstrations
  • Episode consistency — flags recording issues and length outliers
  • Policy fit — rates compatibility with ACT, Diffusion Policy, SmolVLA, OpenVLA
  • Workspace coverage — checks if demonstrations cover the task space
  • Community comparison — benchmarks against other public datasets
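To make the dead joint check concrete, here is a minimal sketch of one way such a check could work: flag any joint whose total range of motion across the recording stays below a small threshold. This is not ORBIT's implementation. The function name `find_dead_joints` and the `min_range` default are assumptions made for illustration.

```python
from typing import Sequence


def find_dead_joints(
    positions: Sequence[Sequence[float]],
    min_range: float = 1e-3,  # hypothetical threshold, not taken from ORBIT
) -> list[int]:
    """Return indices of joints whose range of motion is below min_range.

    positions is a sequence of frames; each frame holds one value per joint.
    """
    if not positions:
        return []
    num_joints = len(positions[0])
    dead = []
    for j in range(num_joints):
        values = [frame[j] for frame in positions]
        # A joint that never moves has (almost) identical min and max values.
        if max(values) - min(values) < min_range:
            dead.append(j)
    return dead


# Three frames, three joints; joint 1 holds the same value throughout.
frames = [
    [0.00, 0.50, 1.0],
    [0.10, 0.50, 0.8],
    [0.25, 0.50, 0.6],
]
print(find_dead_joints(frames))  # [1]
```

A range-based test is the simplest choice; a real check would likely also need to handle sensor noise and joints that are legitimately idle for a given task.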

Example Output

Dataset Readiness: B+ (score: 78/100)
Good data — minor issues, should train well

  ✓ High consistency (0.95)
  ✓ Sufficient episodes (200) for diffusion_policy
  ✓ Good coverage (0.84)
  ✗ 2 joints clipping (>10% of frames)

Top action: Fix joint clipping before training

YOUR DATA AT A GLANCE
────────────────────────────────────────
  Episodes:       200     (top 25%)
  Coverage:       0.84  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░
  Signal Health:  0.92  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░

(Illustrative — actual output depends on your dataset.)
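The "joints clipping (>10% of frames)" line can be read as: a joint is flagged when it sits at one of its limits in more than 10% of frames. The sketch below shows that reading under stated assumptions. The joint limits, the tolerance, and the function names are hypothetical, and ORBIT may compute clipping differently.

```python
from typing import Sequence


def clipping_fraction(
    values: Sequence[float], lower: float, upper: float, tol: float = 1e-6
) -> float:
    """Fraction of frames in which a joint is at (or beyond) a limit."""
    if not values:
        return 0.0
    clipped = sum(1 for v in values if v <= lower + tol or v >= upper - tol)
    return clipped / len(values)


def clipping_joints(
    frames: Sequence[Sequence[float]],
    limits: Sequence[tuple[float, float]],  # assumed known per joint
    max_fraction: float = 0.10,  # the 10% figure from the example output
) -> list[int]:
    """Return indices of joints clipped in more than max_fraction of frames."""
    flagged = []
    for j, (lower, upper) in enumerate(limits):
        values = [frame[j] for frame in frames]
        if clipping_fraction(values, lower, upper) > max_fraction:
            flagged.append(j)
    return flagged


# Ten frames, two joints, both limited to [-1, 1].
joint0 = [0.0, 0.2, 1.0, 1.0, 1.0, 0.5, 0.3, 0.1, 0.0, -0.2]  # at limit in 3/10
joint1 = [0.0, 0.1, 0.2, 0.3, -1.0, 0.2, 0.1, 0.0, 0.1, 0.2]  # at limit in 1/10
frames = list(zip(joint0, joint1))
print(clipping_joints(frames, [(-1.0, 1.0), (-1.0, 1.0)]))  # [0]
```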

Optional: AI-Powered Assessment

For deeper analysis using vision-language models:

pip install "orbit-robotics[vlm]"
export GOOGLE_API_KEY=your-key
orbit analyze lerobot/your-dataset --deep

Uses Gemini for VLM-based visual assessment. The core statistical analysis works fully without any API key.

You can also use Claude as an alternative AI provider:

pip install "orbit-robotics[claude]"
export ANTHROPIC_API_KEY=your-key

Commands

Command                   What it does
orbit analyze <dataset>   Full quality analysis with grade and predictions
orbit benchmark <task>    Compare against published training benchmarks
orbit assist              AI troubleshooter for data and training issues
orbit suggest <dataset>   Training command with tuned hyperparameters
orbit clean <dataset>     Remove bad episodes automatically
orbit fix <dataset>       Analyze, clean, and suggest in one shot

Common Options

orbit analyze lerobot/my-dataset --policy act        # Check fit for specific policy
orbit analyze lerobot/my-dataset --deep              # AI-powered deep analysis (needs GOOGLE_API_KEY)
orbit analyze lerobot/my-dataset --json              # Machine-readable output
orbit analyze lerobot/my-dataset --full              # All episodes (no sampling)
orbit analyze ./local-data/ --format hdf5            # Local HDF5 files
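If you want to call the CLI from a script, a thin wrapper around `--json` is one option. The sketch below only uses flags listed above. It assumes the flags can be combined and that `--json` writes a single JSON document to stdout, neither of which is confirmed here. The helper names are hypothetical, and the structure of the JSON output is not documented in this README, so the result is returned as-is.

```python
import json
import subprocess
from typing import Any, Optional


def build_analyze_command(
    dataset: str,
    policy: Optional[str] = None,
    full: bool = False,
    data_format: Optional[str] = None,
) -> list[str]:
    """Build an `orbit analyze` command line that requests JSON output."""
    cmd = ["orbit", "analyze", dataset, "--json"]
    if policy:
        cmd += ["--policy", policy]
    if full:
        cmd.append("--full")
    if data_format:
        cmd += ["--format", data_format]
    return cmd


def run_analysis(dataset: str, **options: Any) -> Any:
    """Run the CLI and parse stdout as JSON (assumes stdout is pure JSON)."""
    result = subprocess.run(
        build_analyze_command(dataset, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


# Show the command that would be run, without executing it.
print(" ".join(build_analyze_command("lerobot/pusht", policy="act")))
```

Calling `run_analysis("lerobot/pusht")` requires `orbit` to be installed and on your PATH.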

Supported Formats

Format            Source
LeRobot (Hub)     Hugging Face datasets (lerobot/...)
LeRobot (local)   Local LeRobot directories
HDF5              RoboMimic, robosuite, custom .hdf5 files
RLDS              TFRecord-based datasets (pip install "orbit-robotics[rlds]")
ROS bags          .bag and .mcap files (pip install "orbit-robotics[rosbag]")

Understanding Grades

Grade   Score    Meaning
A       85-100   Ready to train: expect strong results
B       72-84    Good data: minor issues, should train well
C       58-71    Usable but has problems: clean first
D       40-57    Significant issues: collect more or better data
F       0-39     Critical problems: fix before training

Grades are calibrated against 82 real datasets with known training outcomes.
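The score bands above translate directly into a lookup. The sketch below is only an illustration of the table, not ORBIT's code. The function name is hypothetical, and it ignores the plus/minus modifiers seen in the example output (such as B+), because their cutoffs are not documented here. Fractional scores are assumed to fall into the band whose lower bound they meet.

```python
# Lower bound of each band, from the grade table, highest first.
GRADE_BANDS = [(85, "A"), (72, "B"), (58, "C"), (40, "D"), (0, "F")]


def score_to_grade(score: float) -> str:
    """Map a 0-100 readiness score to its letter grade."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for lower_bound, grade in GRADE_BANDS:
        if score >= lower_bound:
            return grade
    raise AssertionError("unreachable: the final band starts at 0")


print(score_to_grade(78))  # B
```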

Requirements

  • Python 3.10+
  • No GPU needed
  • No API keys needed for core analysis

License

MIT
