Data engineering copilot for robot imitation learning datasets

ORBIT — Know If Your Robot Data Will Train Before You Burn GPU Hours

You collected 200 demonstrations. You trained for 12 hours. The robot doesn't move. Was it the data? The hyperparameters? A dead servo you didn't notice?

ORBIT tells you in 10 seconds.

pip install orbit-robotics
orbit analyze lerobot/pusht

ORBIT Analysis: lerobot/pusht
206 episodes · DIFFUSION_POLICY

  Grade: A (98/100) — Ready to train — expect strong results
  Similar datasets trained at: 63%, 84%, 91%, 95%, 100% (5 nearest matches)

  1 issue found:
    ! Jerk varies across episodes (CV=0.57) — some demos are much jerkier

  Run with --detail for full diagnostics

No GPU. No API keys. No setup. Just answers.


The Problem

Every robotics lab has hit this: you collect demos, start training, wait hours — and the policy fails. You don't know if you need more data, better data, or different hyperparameters. There's no tool that checks your data quality before training.

ORBIT is that tool. It catches dead joints, contradictory demonstrations, inconsistent episodes, poor coverage, and clipping — the silent killers of robot policy training. Grades are calibrated against 82 real training runs with known success rates, so when ORBIT says "A", it means datasets like yours actually worked.

What You Get

Quality Grade (A through F)

A single, calibrated score. Not a vague "looks okay" — a grade backed by 82 ground-truth training outcomes from published results across ACT, Diffusion Policy, BC, and more.

  • A (85+) — Ready to train. Datasets like yours succeed.
  • B (72-84) — Good data with minor issues. Should train well.
  • C (58-71) — Has problems. Clean your data first or expect poor results.
  • D (40-57) — Significant issues. Collect more or better demonstrations.
  • F (below 40) — Critical problems. Don't waste compute on this.

12 Diagnostic Checks

Every analysis runs these automatically:

Check                      What it catches
Dead joint detection       Servos that never move, wasting model capacity and masking hardware failures
Action divergence          Same state, different actions — directly confuses the policy
Joint clipping             Joints hitting mechanical limits in >10% of frames
Episode consistency        Wild variation in demo length, speed, or strategy
Outlier episodes           Demos that are statistically different from the rest
Workspace coverage         Whether demos actually cover the task space
Temporal alignment         State-action lag that causes the policy to learn the wrong timing
Directional bias           Joints that only move one way (ignoring grippers, where this is normal)
Smoothness analysis        Jerk and curvature variation across episodes
Policy fit scoring         How well your data matches ACT vs Diffusion Policy vs BC vs SmolVLA
Episode count validation   Whether you have enough demos for your chosen policy
Scaling advice             How many more episodes you actually need

Benchmark Comparison

ORBIT knows what worked. It compares your dataset against 82 validated training runs and shows you the nearest matches:

Similar datasets trained at: 63%, 84%, 91%, 95%, 100% (5 nearest matches)
Closest match: Push-T (state) — 206 episodes, 91% success (diffusion_policy)

Ready-to-Run Training Commands

Don't guess hyperparameters. ORBIT picks the best policy for your data and gives you a command you can copy-paste:

orbit suggest lerobot/my-dataset

Recommended: Diffusion Policy (fit: 0.90)

Copy and run:
lerobot-train \
  --dataset.repo_id=lerobot/my-dataset \
  --policy.type=diffusion_policy \
  --batch_size=32 \
  --steps=500000 \
  ...

Training tips:
  - Loss should drop below 0.1 by step 100000
  - If loss plateaus above 0.2: your demonstrations may be too inconsistent

Automatic Cleaning

Bad episodes drag your whole training down. ORBIT finds them and removes them:

orbit clean lerobot/my-dataset              # Remove bad episodes
orbit fix lerobot/my-dataset                # Analyze + clean + suggest, one shot
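
A typical loop with the commands above, sketched here (the dataset id is a placeholder and the grades in the comments are hypothetical): clean, then re-analyze to confirm the score actually went up.

orbit -q analyze lerobot/my-dataset    # e.g. "C (64/100)" before cleaning (hypothetical)
orbit clean lerobot/my-dataset         # drop the episodes dragging the score down
orbit -q analyze lerobot/my-dataset    # re-check; the grade should now be higher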

CI/CD Quality Gate

Block bad data from entering your training pipeline:

orbit gate lerobot/my-dataset --policy act --min-grade B
# Exit code 0 = pass, 1 = fail
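
Because the gate reports through its exit code, it slots directly into a shell CI step. A minimal sketch, reusing the command above:

if orbit gate lerobot/my-dataset --policy act --min-grade B; then
  echo "data gate passed; launching training"
else
  echo "data gate failed; fix the dataset before spending GPU hours" >&2
  exit 1
fi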

Full Command Reference

Core Workflow

Command                   What it does
orbit analyze <dataset>   Full quality analysis — grade, diagnostics, recommendations
orbit suggest <dataset>   Best policy + ready-to-run training command with tuned hyperparameters
orbit clean <dataset>     Find and remove bad episodes automatically
orbit fix <dataset>       Analyze, clean, and suggest — one command does it all
orbit compare <a> <b>     Side-by-side comparison of two datasets

Discovery & Benchmarking

Command                  What it does
orbit explore            Browse and discover LeRobot datasets on HuggingFace
orbit benchmark <task>   Search 82 validated results by task, policy, or hardware
orbit leaderboard        Quality leaderboard of scored robotics datasets

Training Pipeline

Command                 What it does
orbit gate <dataset>    CI/CD quality gate — pass/fail with exit codes
orbit train <dataset>   Full pipeline: gate, train, evaluate
orbit monitor           Watch a training run in real time
orbit debug             Diagnose a failed or underperforming training run
orbit verify            Compare training outcome against predicted quality
orbit report            Save training results locally and to the dashboard

Data Engineering

Command                   What it does
orbit curate <dataset>    Select the best episodes from a dataset
orbit improve <dataset>   Clean bad episodes and prove the score went up
orbit convert <dataset>   Convert from any format to LeRobot v3
orbit plan                Plan a data collection strategy
orbit coach               Real-time guidance during data collection
orbit badge <dataset>     Generate a shields.io quality badge for your dataset card

Setup & Config

Command            What it does
orbit doctor       Check environment health — Python, deps, AI providers
orbit setup-ai     Check and configure AI providers (Ollama, Gemini, OpenAI)
orbit quickstart   Get started in 30 seconds
orbit init         Scaffold a training project with Makefile and config
orbit assist       Interactive AI troubleshooter — ask questions about your data

Analyze Options

# Basics
orbit analyze lerobot/my-dataset                  # Quick analysis (samples 50 episodes)
orbit analyze lerobot/my-dataset --full           # Analyze every episode
orbit analyze lerobot/my-dataset --episodes 100   # Analyze exactly 100 episodes
orbit analyze lerobot/my-dataset --detail         # Full diagnostic report with all sections
orbit -q analyze lerobot/my-dataset               # Quiet: just "A (98/100)" — for scripts

# Policy-specific
orbit analyze lerobot/my-dataset --policy act     # Check fit for ACT
orbit analyze lerobot/my-dataset --policy diffusion_policy
orbit analyze lerobot/my-dataset --policy smolvla

# AI-powered (optional — works without, better with)
orbit analyze lerobot/my-dataset --deep           # LLM diagnosis with specific fix instructions
orbit analyze lerobot/my-dataset --ai             # AI second opinion on A grades
orbit analyze lerobot/my-dataset --vlm            # Vision-language model assessment of frames
orbit analyze lerobot/my-dataset --proxy          # Train a quick BC model as a ground-truth signal

# Output
orbit analyze lerobot/my-dataset --json           # Machine-readable JSON for pipelines
orbit analyze lerobot/my-dataset --json | jq '.readiness.grade'

# Local files
orbit analyze ./my-local-data/                    # Local LeRobot directory
orbit analyze ./data.hdf5 --format hdf5           # HDF5 file (RoboMimic, robosuite)
orbit analyze ./recording.bag --format rosbag     # ROS bag file
orbit analyze ./data/ --format rlds               # RLDS TFRecord dataset
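
For pipelines, the --json output pairs naturally with jq, as in the snippet above. A minimal sketch that stops a script on anything below B, assuming .readiness.grade holds the bare letter grade (the A/B threshold here is just an example):

grade=$(orbit analyze lerobot/my-dataset --json | jq -r '.readiness.grade')
case "$grade" in
  A|B) echo "grade $grade: proceeding to training" ;;
  *)   echo "grade $grade: stopping before training" >&2; exit 1 ;;
esac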

Supported Data Formats

Format                  How to use                              Common sources
LeRobot (HuggingFace)   orbit analyze lerobot/pusht             Any dataset on HuggingFace Hub
LeRobot (local)         orbit analyze ./my-data/                Local LeRobot recordings
HDF5                    orbit analyze data.hdf5 --format hdf5   RoboMimic, robosuite, custom
RLDS                    orbit analyze ./data/ --format rlds     TFRecord datasets (needs pip install orbit-robotics[rlds])
ROS bags                orbit analyze rec.bag --format rosbag   .bag and .mcap files (needs pip install orbit-robotics[rosbag])

ORBIT auto-detects the format. Use --format to override if needed.


AI Features (Optional)

The core analysis — grading, diagnostics, all 12 checks — works fully without any AI or API keys. AI adds deeper natural-language diagnosis on top.

Local AI with Ollama (Free, Private)

# One-time setup
curl -fsSL https://ollama.com/install.sh | sh
ollama pull gemma4

# Now these just work — ORBIT auto-detects Ollama
orbit analyze lerobot/my-dataset --deep     # "Your action divergence is high because..."
orbit assist                                # "Why is my policy not learning?" → specific answers

Your data never leaves your machine. No API keys. No cost.

Cloud AI (Gemini / OpenAI)

export GOOGLE_API_KEY=your-key              # Gemini (~$0.001 per analysis)
# or
export OPENAI_API_KEY=your-key              # OpenAI (~$0.01 per analysis)

ORBIT auto-detects: Ollama (local) > Gemini > OpenAI. Check your setup:

orbit setup-ai

Override:

orbit auth set ai-provider ollama           # Force a specific provider
orbit auth set ai-model gemma4              # Force a specific model

How Grading Works

ORBIT doesn't guess. Every grade is calibrated against real outcomes.

We collected 82 training runs from published papers and community results — datasets where we know the actual success rate of the trained policy. ORBIT's grading formula was tuned to match: datasets that trained successfully get A's, datasets that failed get D's and F's.

When ORBIT shows "Similar datasets trained at: 63%, 84%, 91%", those are real results from real papers. When it says "Grade B — should train well", that's based on what actually happened to similar data.

What each grade means in practice

  • A — Your data is solid. Policies trained on similar datasets succeeded. → Start training.
  • B — Minor issues detected, but datasets like this usually train fine. → Train, but review the flagged issues if results disappoint.
  • C — Real problems found. Training might work, but expect lower success rates. → Run orbit clean first, fix the flagged issues, then re-analyze.
  • D — Significant quality problems. Training is likely to fail or underperform. → Collect more data, fix hardware issues, or change your collection strategy.
  • F — Critical failures: dead joints, extreme divergence, broken data. → Don't train on this. Fix the root cause first.

Real-World Impact

We analyzed 55 popular datasets on HuggingFace. Findings:

  • Only 46% were ready to train (Grade A). 19% had significant problems.
  • 75% had action divergence — the #1 silent training killer.
  • stanford_kuka_multimodal has 3,000 episodes but scores Grade D (43/100) — 3 dead joints and 50% outlier episodes. More data doesn't mean better data.
  • 7 out of 15 community datasets failed to even load — 81,000+ downloads on broken data.

A 10-second orbit analyze would have caught every one of these issues.


Installation

pip install orbit-robotics

That's it. Everything works out of the box.

Optional extras

pip install orbit-robotics[vlm]       # Vision-language model features
pip install orbit-robotics[rlds]      # RLDS/TFRecord format support
pip install orbit-robotics[rosbag]    # ROS bag format support
pip install orbit-robotics[all]       # Everything

Requirements

  • Python 3.10+
  • No GPU needed
  • No API keys needed (AI features are optional enhancements)

What's New in v0.6.0

  • Local AI via Ollama — --deep, --ai, and orbit assist now work locally with Gemma 4. No API keys, no cost, no data leaves your machine.
  • Multi-provider AI — auto-detects Ollama > Gemini > OpenAI. Configure with orbit auth set ai-provider.
  • orbit setup-ai — check and configure AI providers in one command.
  • Grading accuracy overhaul — task difficulty adjustment from ground truth, policy-specific divergence penalties (BC penalized more than Diffusion Policy), calibrated summaries that tell you when a task is inherently hard vs when your data is bad.
  • Nearest-match display — shows actual success rates from similar datasets instead of abstract confidence intervals.
  • Sampling warnings — tells you when you're analyzing too few episodes for a reliable grade.
  • Smarter orbit doctor — checks Ollama, Gemini, and OpenAI availability alongside all dependencies.

License

MIT
