Data engineering copilot for robot imitation learning datasets

Project description

ORBIT — Data Quality Analysis for Robot Policy Training

Predict whether your robot learning dataset will actually train successfully — before burning GPU hours.

ORBIT analyzes your demonstration data for the issues that cause training failures: dead joints, action divergence, inconsistent demonstrations, poor workspace coverage, and more. Get a quality grade (A-F) and calibrated success prediction based on 82+ ground-truth training outcomes.

Quick Start

pip install orbit-robotics
orbit analyze lerobot/pusht

No GPU required. No API keys required. Works out of the box.

What It Checks

  • Quality grade (A-F) calibrated against real training outcomes
  • Success prediction — probability your data will train a working policy
  • Dead joint detection — finds servos that aren't moving
  • Action divergence — detects contradictory demonstrations
  • Episode consistency — flags recording issues and length outliers
  • Policy fit — rates compatibility with ACT, Diffusion Policy, SmolVLA, OpenVLA
  • Workspace coverage — checks if demonstrations cover the task space
  • Community comparison — benchmarks against other public datasets
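To make the dead joint check concrete, here is a minimal sketch that flags joints whose recorded values barely change over an episode. It is not ORBIT's implementation: the function name `find_dead_joints`, the range-based criterion, and the `1e-3` threshold are all assumptions chosen for illustration.

```python
def find_dead_joints(positions, min_range=1e-3):
    """Return indices of joints whose value range stays below min_range.

    positions: list of timesteps, each a list of per-joint values.
    The range test and the default threshold are illustrative assumptions,
    not ORBIT's actual criterion.
    """
    if not positions:
        return []
    num_joints = len(positions[0])
    dead = []
    for j in range(num_joints):
        values = [step[j] for step in positions]
        # A joint that never moves has (near) zero spread over the episode.
        if max(values) - min(values) < min_range:
            dead.append(j)
    return dead


# Hypothetical 3-joint episode in which joint 1 never moves.
episode = [
    [0.00, 0.5, 0.10],
    [0.05, 0.5, 0.12],
    [0.10, 0.5, 0.15],
]
dead_joints = find_dead_joints(episode)
```

A real check would likely also need to handle sensor noise and joints that are legitimately idle for a given task, which a single fixed threshold does not capture.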

Example Output

ORBIT Analysis: lerobot/pusht
206 episodes - DIFFUSION_POLICY

  Grade: A (98/100) — Ready to train — expect strong results
  Similar datasets trained at: 63%, 84%, 91%, 95%, 100% (5 nearest matches)

  1 issue found:
    ! Jerk varies across episodes (CV=0.57)

  Run with --proxy for training signal, --detail for full diagnostics
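The `CV=0.57` in the issue line is most plausibly a coefficient of variation (standard deviation divided by mean) of a per-episode jerk statistic, although the output does not say so explicitly. Under that assumption, the sketch below shows how such a number could be computed. The helper names and the finite-difference jerk estimate are hypothetical, not ORBIT's method.

```python
import statistics


def mean_abs_jerk(positions, dt):
    """Mean absolute jerk of a 1-D trajectory via third finite differences.

    Requires at least four samples. Illustrative only.
    """
    jerks = [
        abs(positions[i + 3] - 3 * positions[i + 2]
            + 3 * positions[i + 1] - positions[i]) / dt**3
        for i in range(len(positions) - 3)
    ]
    return statistics.fmean(jerks)


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


# Hypothetical per-episode jerk values for three episodes.
episode_jerks = [1.0, 2.0, 3.0]
cv = coefficient_of_variation(episode_jerks)
```

A higher value means the smoothness of motion differs more from one demonstration to the next.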

Local AI — Free, Private, Zero Config

ORBIT v0.6 adds local AI support via Ollama. Deep analysis, AI grading, and the orbit assist chatbot all run locally with no API keys and no data leaving your machine.

# Install Ollama (one time)
curl -fsSL https://ollama.com/install.sh | sh
ollama pull gemma4

# Now AI features just work — ORBIT auto-detects Ollama
orbit analyze lerobot/your-dataset --deep    # AI-powered deep analysis
orbit analyze lerobot/your-dataset --ai      # AI quality judge
orbit assist                                 # Interactive AI troubleshooter

ORBIT auto-detects the best available provider in priority order: Ollama (local) > Gemini (cloud) > OpenAI (cloud). No configuration is needed; it uses the first provider in that order that is available.
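The priority order can be pictured as a simple fallback chain. The sketch below is an assumption about how such detection might work, not ORBIT's code: it treats Ollama as available when an `ollama` executable is on `PATH`, and a cloud provider as available when its API key variable is set. The function name `pick_provider` is hypothetical.

```python
import os
import shutil


def pick_provider(env=None, ollama_on_path=None):
    """Return the first available provider in the documented priority order.

    The availability checks are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    if ollama_on_path is None:
        ollama_on_path = shutil.which("ollama") is not None
    if ollama_on_path:
        return "ollama"
    if env.get("GOOGLE_API_KEY"):
        return "gemini"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None  # no AI provider found; core analysis does not need one


# Both cloud keys set, no local Ollama: Gemini wins over OpenAI.
choice = pick_provider(
    {"GOOGLE_API_KEY": "x", "OPENAI_API_KEY": "y"}, ollama_on_path=False
)
```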

Check your AI setup anytime:

orbit setup-ai

Provider Options

Provider              Setup                      Cost              Data Privacy
Ollama (recommended)  ollama pull gemma4         Free              Data stays local
Gemini                export GOOGLE_API_KEY=...  ~$0.001/analysis  Sent to Google
OpenAI                export OPENAI_API_KEY=...  ~$0.01/analysis   Sent to OpenAI

Override the auto-detected provider:

orbit auth set ai-provider ollama      # Force Ollama
orbit auth set ai-model gemma4         # Pick a specific model

Commands

Command                  What it does
orbit analyze <dataset>  Full quality analysis with grade and predictions
orbit suggest <dataset>  Training command with tuned hyperparameters
orbit clean <dataset>    Remove bad episodes automatically
orbit fix <dataset>      Analyze, clean, and suggest in one shot
orbit gate <dataset>     CI/CD quality gate — pass/fail for pipelines
orbit compare <a> <b>    Side-by-side dataset comparison
orbit benchmark <task>   Compare against published training benchmarks
orbit assist             AI troubleshooter for data and training issues
orbit doctor             Check environment health and AI providers
orbit setup-ai           Check and configure AI providers
orbit badge <dataset>    Generate a shields.io quality badge
orbit explore            Browse and discover LeRobot datasets on HuggingFace

Common Options

orbit analyze lerobot/my-dataset --policy act       # Check fit for specific policy
orbit analyze lerobot/my-dataset --deep              # AI-powered deep analysis
orbit analyze lerobot/my-dataset --ai                # AI quality judge for A grades
orbit analyze lerobot/my-dataset --json              # Machine-readable output
orbit analyze lerobot/my-dataset --full              # All episodes (no sampling)
orbit analyze lerobot/my-dataset --detail            # Full diagnostic report
orbit -q analyze lerobot/my-dataset                  # Quiet mode (grade only)
orbit analyze ./local-data/ --format hdf5            # Local HDF5 files

CI/CD Integration

# Fail the pipeline if data quality is below B
orbit gate lerobot/my-dataset --min-grade B --policy act

# JSON output for scripts
orbit analyze lerobot/my-dataset --json | jq '.readiness.grade'
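If a pipeline step is written in Python rather than shell, the same check can be done on the JSON report directly. The sketch below relies only on the `readiness.grade` field shown in the `jq` example; the rest of the report schema is not documented here, and the helper name `meets_min_grade` is hypothetical.

```python
import json

GRADE_ORDER = "FDCBA"  # worst to best


def meets_min_grade(report_json, min_grade="B"):
    """Return True if the report's readiness.grade is at least min_grade."""
    report = json.loads(report_json)
    grade = report["readiness"]["grade"]
    return GRADE_ORDER.index(grade) >= GRADE_ORDER.index(min_grade)


# Hypothetical report; only the readiness.grade field comes from the README.
sample = '{"readiness": {"grade": "A"}}'
passed = meets_min_grade(sample, "B")
```

In a real job the JSON string would come from the output of `orbit analyze ... --json`, and the script would exit non-zero when the check fails. For most pipelines `orbit gate` is the simpler choice.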

Supported Formats

Format           Source
LeRobot (Hub)    HuggingFace datasets (lerobot/...)
LeRobot (local)  Local LeRobot directories
HDF5             RoboMimic, robosuite, custom .hdf5 files
RLDS             TFRecord-based datasets (pip install orbit-robotics[rlds])
ROS bags         .bag and .mcap files (pip install orbit-robotics[rosbag])

Understanding Grades

Grade  Score   Meaning
A      85-100  Ready to train — expect strong results
B      72-84   Good data — minor issues, should train well
C      58-71   Usable but has problems — clean first
D      40-57   Significant issues — collect more or better data
F      0-39    Critical problems — fix before training

Grades are calibrated against 82 real datasets with known training outcomes.
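The score bands in the table translate directly into a lookup. This is a minimal sketch of that mapping only; how ORBIT computes the underlying score is not described here, and the function name `score_to_grade` is hypothetical.

```python
def score_to_grade(score):
    """Map a 0-100 quality score to a letter grade using the table's bands."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Lower bound of each band, checked from best to worst.
    bands = [(85, "A"), (72, "B"), (58, "C"), (40, "D"), (0, "F")]
    for lower_bound, grade in bands:
        if score >= lower_bound:
            return grade


# The example output above reports 98/100, which falls in the A band.
grade = score_to_grade(98)
```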

What's New in v0.6.0

  • Local AI via Ollama: --deep, --ai, and orbit assist work locally with Gemma 4, no API keys needed
  • Multi-provider AI — auto-detects Ollama > Gemini > OpenAI, configurable via orbit auth set ai-provider
  • orbit setup-ai — check and configure AI providers in one command
  • Grading accuracy overhaul — ground truth task difficulty adjustment, BC/BC-RNN divergence penalties, calibrated summaries for hard tasks
  • Similar dataset display — shows nearest benchmark success rates instead of wide confidence intervals
  • Sampling warnings — warns when analyzing <20 episodes that the grade may be unreliable
  • orbit doctor — now checks Ollama, Gemini, and OpenAI availability
  • Ollama backend for orbit assist — chat with local models about your data

Requirements

  • Python 3.10+
  • No GPU needed
  • No API keys needed for core analysis

License

MIT

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

orbit_robotics-0.6.0.tar.gz (393.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

orbit_robotics-0.6.0-py3-none-any.whl (356.3 kB view details)

Uploaded Python 3

File details

Details for the file orbit_robotics-0.6.0.tar.gz.

File metadata

  • Download URL: orbit_robotics-0.6.0.tar.gz
  • Upload date:
  • Size: 393.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for orbit_robotics-0.6.0.tar.gz
Algorithm Hash digest
SHA256 f74a55b7a8f06426d0cb11910c2b1b54646ca01c76bdbcc740308ec42650223e
MD5 71d9d460aa6a951210b22ab660304e94
BLAKE2b-256 12d469ca3c059ac8742917c32605951637c8c95b8d08ab5ea68b01ef19b6cb83
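A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is one way to do it; the helper names and the local file path passed to them are assumptions.

```python
import hashlib

# SHA256 digest listed above for orbit_robotics-0.6.0.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "f74a55b7a8f06426d0cb11910c2b1b54646ca01c76bdbcc740308ec42650223e"
)


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SDIST_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of_file(path) == expected
```

For example, `matches_expected("orbit_robotics-0.6.0.tar.gz")` would return `True` for an intact download saved under that name in the current directory.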

See more details on using hashes here.

File details

Details for the file orbit_robotics-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: orbit_robotics-0.6.0-py3-none-any.whl
  • Upload date:
  • Size: 356.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for orbit_robotics-0.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f249bafbd0d557d34c0e80463c9e60fe6c1f1628b6792889b51d5f3483d4842d
MD5 444b1d8da811ad53a02476603cfa2b54
BLAKE2b-256 2c5a32ea019853676eccf6d4be1524a7cbeccd655a0e7644f1f08b8596d20215

