Open-source framework for defining, running, and benchmarking robot training missions.

Odyssey

Status: alpha (v0.1.0-alpha.1). The API, CLI, schemas, and wire protocols are still subject to change without notice. See docs/ for the design refs.

odyssey.dev ↗

Install

[!TIP] Linux only — install build dependencies before proceeding (needed by .[all]):
```
sudo apt update && sudo apt install build-essential python3-dev -y
```

```
git clone https://github.com/lovellai-dev/odyssey.git
cd odyssey
python3 -m venv .venv
source .venv/bin/activate
pip install -e .              # CLI, validate, mock runs (lightweight)
pip install -e ".[all]"       # real training + evaluation (torch, robosuite…)
pip install -e ".[all,dev]"   # + pytest, ruff, mypy
```

The base install pulls in pydantic, click, pyyaml, and aiosqlite — enough to run validate, list, status, and run --use-mock-runner against any mission spec without a GPU. .[all] adds everything needed for real training and evaluation runs.

Quick start (no GPU, no network)

```
# Validate the mission spec
$ odyssey validate examples/quickstart-openvla/mission.yaml
OK  examples/quickstart-openvla/mission.yaml
  spec version : 0.1
  tasks        : 1 training, 1 evaluation

# Run the full mission with a CPU mock (no GPU needed)
$ odyssey run examples/quickstart-openvla/mission.yaml --use-mock-runner
...
{"ts": "...", "event": "mission.completed", "overall_grade": 1.0}

COMPLETED  c1756bad855e45cc9a95b5b0566c948b
  overall_grade : 1.000
```

--use-mock-runner swaps in the CPU mock for every task, so this works on a laptop without a GPU. Inspect runs afterward with odyssey list and odyssey status <id>. State is persisted to ~/.odyssey/missions.db; artifacts under ~/.odyssey/runs/<mission-id>/<task-id>/.
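
The run output above mixes JSON event lines with plain-text banners. A minimal sketch of pulling the final grade out of captured output, assuming one JSON object per line and the field names shown in the sample (`event`, `overall_grade`); the helper name `final_grade` is hypothetical and not part of Odyssey:

```python
import json
from typing import Iterable, Optional


def final_grade(lines: Iterable[str]) -> Optional[float]:
    """Return overall_grade from the last mission.completed event, if any."""
    grade = None
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip banners such as "COMPLETED <id>" and blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or non-JSON output
        if record.get("event") == "mission.completed":
            grade = float(record["overall_grade"])
    return grade


# Sample lines copied from the quick-start output above.
sample = [
    '{"ts": "...", "event": "mission.completed", "overall_grade": 1.0}',
    "",
    "COMPLETED  c1756bad855e45cc9a95b5b0566c948b",
    "  overall_grade : 1.000",
]
print(final_grade(sample))
```

Since the wire protocols are still subject to change in this alpha, treat the field names as a snapshot rather than a contract.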

What it is

You train an agent by describing a mission in YAML — a robot, a model, a dataset to train on, an evaluation benchmark to score against — and odyssey run walks it through the full lifecycle: load → validate → execute training tasks → execute the evaluation task → persist results. Local-mode by default; the hosted Lovell services (leaderboard, learning graph, hosted runners) are optional layers that land in later releases.

Launching a training mission

Two training paths ship today: GR00T (NVIDIA Isaac GR00T) and OpenVLA. Both run through odyssey run <mission.yaml> — pick the quickstart that matches your model.

GR00T (Isaac-GR00T + Isaac Lab)

Fine-tunes nvidia/GR00T-N1.7-3B on the LeRobot-format demo set that ships inside the Isaac-GR00T repo (no separate download), evaluated in the Isaac Lab cube-lift environment.

Prerequisites:

Install the upstream Isaac-GR00T package, which carries the training entry point (gr00t.experiment.launch_finetune) and the demo dataset, and accept NVIDIA's weight license:

```
git clone https://github.com/NVIDIA/Isaac-GR00T.git /srv/Isaac-GR00T
pip install -e /srv/Isaac-GR00T
export ISAAC_GR00T_REPO_PATH=/srv/Isaac-GR00T   # resolves the demo dataset
```

For the Isaac Lab evaluation, install Isaac Lab and point Odyssey at its launcher:

```
export ISAACLAB_PATH=/srv/IsaacLab              # provides isaaclab.sh
```

Run:

```
odyssey run examples/quickstart-gr00t/mission.yaml
```

The mission routes its training task to the GR00T runner with config: { runner: gr00t } — OpenVLA and GR00T both serve wildcard training tasks, so the family is selected explicitly.

OpenVLA (Bridge V2 + Robosuite)

Prerequisites:

Install the training extras:

```
pip install -e ".[huggingface,openvla,robosuite]"
```

Clone the upstream OpenVLA repo and install its dependencies (needed for draccus and the fine-tuning script):

```
git clone https://github.com/openvla/openvla.git /srv/openvla
pip install -e /srv/openvla
export OPENVLA_REPO_PATH=/srv/openvla
```

Download the Bridge V2 dataset in RLDS format (~124 GB):

```
wget -r -nH --cut-dirs=4 --reject="index.html*" \
  https://rail.eecs.berkeley.edu/datasets/bridge_release/data/tfds/bridge_dataset/
mv bridge_dataset bridge_orig
```

Set config.data_root_dir in the mission spec (forwarded to finetune.py as --data_root_dir) to the parent directory containing bridge_orig/.

Run:

```
odyssey run examples/quickstart-openvla/mission.yaml
```

Hardware: 24 GB GPU (RTX 4090-class or better) for the OpenVLA LoRA fine-tune.

[!NOTE] GCP users: single-GPU VMs require export NCCL_NET=Socket before running, to bypass Google's NCCL plugin. See issue #5 for details.

[!NOTE] Evaluation: the Robosuite runner auto-wires an OpenVLA→robosuite-action adapter (make_openvla_policy in runners/models/openvla.py) when no custom policy_factory is injected — it loads either a LoRA adapter or a full merged checkpoint, so eval works without extra glue. Full episode-completion validation on a real GPU is still in progress.

Known-good OpenVLA stack

The fine-tune runs through the cloned OpenVLA repo, which carries its own dependency set — most onboarding friction comes from there, not from Odyssey. Mixing versions surfaces as protobuf / TensorFlow / tensorflow-metadata conflicts or draccus import errors. Known-good versions (from OpenVLA's own requirements — treat its repo as the source of truth):

```
Python        3.10
torch         2.2.0
torchvision   0.17.0
transformers  4.40.1
tokenizers    0.19.1
timm          0.9.10
flash-attn    2.5.5
```
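
To spot a mixed stack before a long run, the pins above can be compared against what is installed in the active venv. This is a rough sketch, not an Odyssey command: the helper names are hypothetical, the pins are copied from the list above, and local build tags such as `+cu121` are ignored on the assumption that they do not matter for compatibility.

```python
from importlib import metadata
from typing import Callable, Dict

# Pins copied from the known-good list above (PyPI distribution names).
KNOWN_GOOD = {
    "torch": "2.2.0",
    "torchvision": "0.17.0",
    "transformers": "4.40.1",
    "tokenizers": "0.19.1",
    "timm": "0.9.10",
    "flash-attn": "2.5.5",
}


def installed_version(name: str) -> str:
    """Return the installed version of a distribution, or '' if missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return ""


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], str] = installed_version,
) -> Dict[str, str]:
    """Map each package whose installed version differs from its pin."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        # Drop a local build tag such as "+cu121" before comparing.
        if found.split("+")[0] != wanted:
            mismatches[name] = found or "not installed"
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches(KNOWN_GOOD).items():
        print(f"{name}: want {KNOWN_GOOD[name]}, found {found}")
```

An empty result only means the listed packages match; OpenVLA's own requirements remain the source of truth for everything else.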

To avoid re-downloading the 7B base model on each run, set the model's path environment variable to a local copy. The variable name is the HF id upper-cased, with / and - replaced by _, and suffixed with _PATH:

```
export OPENVLA_OPENVLA_7B_PATH=/path/to/openvla-7b   # for base: openvla/openvla-7b
```
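
The naming rule can be written down as a one-liner. A small sketch, assuming only the three transformations stated above (upper-case, `/` and `-` to `_`, `_PATH` suffix); the function name is hypothetical, and how Odyssey treats other characters such as `.` is not covered here:

```python
import re


def model_path_env_var(hf_id: str) -> str:
    """Derive the env var name: '/' and '-' become '_', upper-case, add '_PATH'."""
    return re.sub(r"[/-]", "_", hf_id).upper() + "_PATH"


print(model_path_env_var("openvla/openvla-7b"))  # OPENVLA_OPENVLA_7B_PATH
```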

Dataset: how `source: oxe` / `ref: bridge_orig` resolves

Odyssey does not download the dataset — oxe is a pass-through. The runner forwards two values to OpenVLA's finetune.py, which loads via TFDS/RLDS:

| mission.yaml | becomes the flag | meaning |
| --- | --- | --- |
| `dataset.ref: bridge_orig` | `--dataset_name bridge_orig` | the OXE registry key OpenVLA looks up |
| `config.data_root_dir: <path>` | `--data_root_dir <path>` | the parent dir containing the RLDS dataset folder |

⚠️ Naming gotcha: the registry key and the on-disk folder name can differ. In validation, ref: bridge_orig resolved to data under ~/bridge_dataset/1.0.0/, so data_root_dir had to point at the parent of that folder — not the key name. Check where your download actually landed and set data_root_dir to its parent.
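
One way to check where a download landed is to list the folders under a candidate data_root_dir that look like TFDS builds. The sketch below assumes the layout seen in validation (a dataset folder containing a version directory such as 1.0.0/); the helper name `rlds_candidates` is hypothetical and the demo uses a temporary directory rather than real data:

```python
import tempfile
from pathlib import Path
from typing import List


def rlds_candidates(data_root_dir: str) -> List[str]:
    """List sub-folders of data_root_dir that contain a version directory."""
    root = Path(data_root_dir).expanduser()
    found = []
    for child in sorted(root.iterdir()):
        if not child.is_dir():
            continue
        # Assumption: a TFDS build sits in a version folder such as 1.0.0/.
        versions = [v.name for v in child.iterdir()
                    if v.is_dir() and v.name[:1].isdigit()]
        if versions:
            found.append(child.name)
    return found


# Demo on a throwaway directory tree that mimics the validated layout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "bridge_dataset" / "1.0.0").mkdir(parents=True)
    (Path(tmp) / "notes").mkdir()
    demo = rlds_candidates(tmp)
print(demo)
```

If the folder you expect is missing from the result, data_root_dir is probably pointing one level too high or too low.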

Weights & Biases (W&B)

OpenVLA's finetune.py calls wandb.init() unconditionally, so a run stalls or fails if W&B isn't reachable. Control it yourself:

```
# Disable for local / smoke runs:
export WANDB_MODE=disabled
# Or log to your account, then pass project/entity via mission config:
#   config: { wandb_project: my-project, wandb_entity: my-entity }
```

Any config: key Odyssey doesn't consume is forwarded verbatim as --<key> <value> to finetune.py.
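
The pass-through rule can be pictured as a simple mapping from config keys to command-line arguments. This is an illustrative sketch only: the set of keys Odyssey consumes itself is an assumption (only `runner` is taken from this README), and `passthrough_flags` is a hypothetical name, not the runner's actual code.

```python
from typing import Dict, List, Set

# Assumption: keys Odyssey consumes itself. The real list may differ.
CONSUMED_KEYS: Set[str] = {"runner"}


def passthrough_flags(config: Dict[str, object],
                      consumed: Set[str] = CONSUMED_KEYS) -> List[str]:
    """Turn unconsumed config keys into ['--key', 'value', ...] arguments."""
    argv: List[str] = []
    for key, value in config.items():
        if key in consumed:
            continue  # handled by Odyssey, not forwarded
        argv += [f"--{key}", str(value)]
    return argv


config = {
    "runner": "openvla",            # illustrative value
    "data_root_dir": "/data",
    "wandb_project": "my-project",
    "wandb_entity": "my-entity",
}
print(passthrough_flags(config))
```

A practical consequence is that a misspelled key is not rejected by Odyssey; it reaches finetune.py as an unknown flag.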

What to expect during a run

Timing varies widely with hardware, disk, and network — treat these as orientation, not promises:

Base model download — openvla-7b (~14 GB) on first run, unless OPENVLA_OPENVLA_7B_PATH is set.
Dataset load / indexing — Bridge V2 (~124 GB); RLDS indexing on a cold cache takes a while.
Training startup — model load + LoRA wrap, then steps begin.
Steady state — throughput logs as it/s (~1.49 it/s on an NVIDIA L4 for the quickstart config).

If a stage seems stuck, it's almost always a download in progress or a dataset-path / W&B issue rather than a training bug — check those first.

Multi-agent (PILOT + SPECIALIST)

HuggingFace login (gated models)

Some models pulled from the Hub may be gated: accept the model's license on its HuggingFace page, then authenticate on the machine before the first run, or the download fails with 401/403. The models used by the examples:

openvla/openvla-7b — the PILOT
google/gemma-4-E2B-it — the SPECIALIST in the multi-agent example (Apache-2.0, no gating)

```
huggingface-cli login          # paste a token from https://huggingface.co/settings/tokens
# or, non-interactive (CI / headless VM):
export HF_TOKEN=hf_xxx          # a read token on an account that accepted the licenses
```

A mission with a SPECIALIST agent (a task planner) in addition to the PILOT runs a plan-then-execute loop during eval: the SPECIALIST decomposes the instruction into sub-steps once per episode, and the PILOT executes each. Only the PILOT produces actions and only the PILOT is trained — the SPECIALIST is inference-only (it runs its base checkpoint to plan and has no training task).

```
robot:
  agents:
    - id: pilot
      role: PILOT
      model: { source: huggingface, base: openvla/openvla-7b }
    - id: task-planner
      role: SPECIALIST
      model:
        source: huggingface
        base: google/gemma-4-E2B-it
        quantization: int4
        modality: multimodal
```

The SPECIALIST is a vision-grounded multimodal Gemma 4 planner: it sees the first camera frame of each episode and grounds its plan in the scene. Gemma 4 needs a modern transformers + torchvision, which conflicts with OpenVLA's pinned transformers==4.40.1, so the SPECIALIST must run out of process in a separate venv. The PILOT stays in the main venv; the two talk over a JSON-lines subprocess protocol (the planner runs once per episode, off the per-step hot loop).
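
To make the out-of-process split concrete, here is a toy version of a JSON-lines exchange with a child process. It is a sketch under stated assumptions: the child is a stand-in that splits the instruction on " then ", and the message fields (`instruction`, `steps`) and the class name `LinePlanner` are hypothetical. The actual planner and its message schema are not described in this README.

```python
import json
import subprocess
import sys

# Stand-in child: reads one JSON request per line, writes one JSON reply.
# This is NOT the real planner; it only mimics the line-oriented framing.
CHILD_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    steps = [s.strip() for s in request["instruction"].split(" then ")]
    sys.stdout.write(json.dumps({"steps": steps}) + "\n")
    sys.stdout.flush()
"""


class LinePlanner:
    """Talk to a child process over newline-delimited JSON."""

    def __init__(self, python: str = sys.executable) -> None:
        # In the real setup the interpreter would come from another venv.
        self.proc = subprocess.Popen(
            [python, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def plan(self, instruction: str) -> list:
        self.proc.stdin.write(json.dumps({"instruction": instruction}) + "\n")
        self.proc.stdin.flush()
        reply = json.loads(self.proc.stdout.readline())
        return reply["steps"]

    def close(self) -> None:
        self.proc.stdin.close()  # EOF ends the child's read loop
        self.proc.wait(timeout=10)
        self.proc.stdout.close()


planner = LinePlanner()
try:
    # One planning call per episode; the per-step loop stays in-process.
    steps = planner.plan("pick up the cube then place it in the bin")
    for step in steps:
        print("PILOT executes:", step)
finally:
    planner.close()
```

Because each message is a single line, the two sides only need a shared JSON schema, not shared Python dependencies, which is what lets the two venvs diverge.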

Setting up the out-of-process SPECIALIST

Create the specialist venv (modern transformers + torchvision + Gemma deps):

```
python -m venv ~/specialist-venv
~/specialist-venv/bin/pip install -e ".[specialist]" \
  -c constraints/specialist-known-good.txt
```

Point Odyssey at that venv's python. It is read per-process from the environment, so export it in every shell that runs a mission — or add it to your shell profile / VM startup script so it persists:
```
export ODYSSEY_SPECIALIST_PYTHON=~/specialist-venv/bin/python
```

ODYSSEY_SPECIALIST_PYTHON is required for any mission with a SPECIALIST. The planner is launched in that venv (RemotePlanner → python -m odyssey.runners.agents.planner_server). If it is unset, multi-agent eval fails fast with a clear RuntimeError: the multimodal Gemma 4 planner cannot load in the main venv, which pins transformers==4.40.1 for OpenVLA.
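
The fail-fast behavior can be approximated in a few lines. A sketch only, assuming the variable is read from the process environment as described above; `specialist_python` is a hypothetical helper and the error text is illustrative, not Odyssey's actual message:

```python
import os
from pathlib import Path
from typing import Mapping


def specialist_python(env: Mapping[str, str] = os.environ) -> Path:
    """Return the specialist interpreter path, or fail fast if it is unset."""
    raw = env.get("ODYSSEY_SPECIALIST_PYTHON", "").strip()
    if not raw:
        raise RuntimeError(
            "ODYSSEY_SPECIALIST_PYTHON is not set; a mission with a "
            "SPECIALIST needs the python of a separate specialist venv."
        )
    # expanduser() covers values where the shell left a leading ~ unexpanded.
    return Path(raw).expanduser()


print(specialist_python({"ODYSSEY_SPECIALIST_PYTHON": "/opt/specialist/bin/python"}))
```

Running a check like this at the top of a launch script surfaces a missing export before any model is loaded.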

Quick check without a simulator (launches the planner in the specialist venv and prints a decomposition — no OpenVLA or simulator needed):

```
python tests/manual/smoke_remote_planner.py
```

Why Gemma 4, not Gemma 3, for multimodal. Gemma 3 4B emits NaN logits under int4 bitsandbytes on this stack (verified across eager/sdpa attention, text-only and with-image), so it can't run quantized here. Gemma 4 (Apache-2.0, ungated) loads cleanly in int4 and grounds plans in the scene image.

VRAM note. Both models still share the GPU — the venv split solves the dependency conflict, not VRAM. The SPECIALIST is pinned to GPU 0 (device_map={"": 0}) so bitsandbytes never silently offloads layers to CPU. Gemma 4 E4B-it int4 (~9.3 GB) alongside bf16 OpenVLA (~14 GB) peaks at ~23 GB — tight on a 24 GB L4; drop to E2B-it for headroom (this is what the multimodal example mission uses).
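
The headroom figure follows from simple addition of the approximate weights footprints quoted above. A back-of-the-envelope sketch, assuming those numbers and ignoring activations, KV cache, and allocator overhead, which the quoted peak already hints at:

```python
GPU_GB = 24.0              # L4-class card, as in the note above
OPENVLA_BF16_GB = 14.0     # approximate, from the note above
GEMMA4_E4B_INT4_GB = 9.3   # approximate, from the note above


def headroom_gb(total_gb: float, *model_gb: float) -> float:
    """Remaining memory after loading the given models, rounded to 0.1 GB."""
    return round(total_gb - sum(model_gb), 1)


print(headroom_gb(GPU_GB, OPENVLA_BF16_GB, GEMMA4_E4B_INT4_GB))
```

Well under 1 GB of slack is why the smaller E2B-it planner is the safer default on a 24 GB card.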

Two known-good stacks. The main venv pins OpenVLA's stack (constraints/openvla-known-good.txt: torch 2.2.0, transformers 4.40.1); the specialist venv pins a modern one with torchvision (constraints/specialist-known-good.txt). They no longer need to be mutually compatible.

CLI reference

| Command | What it does |
| --- | --- |
| `odyssey init [DIR]` | Scaffold a new mission directory. `--template openvla\|cpu_mock`. |
| `odyssey validate <mission.yaml>` | Parse + validate a spec. Exits 0 if clean. |
| `odyssey run <mission.yaml>` | Execute end-to-end. `--use-mock-runner` for no-GPU smoke. |
| `odyssey list` | Recent missions from the local SQLite DB. `--status` to filter. |
| `odyssey status <mission_id>` | One mission's detail. Accepts an id prefix. |

All commands respect --db and --working-dir to override the ~/.odyssey/ defaults.
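
Since `odyssey status` accepts an id prefix, it helps to know what a prefix lookup usually entails. The sketch below shows one common approach (unique match or error); how Odyssey itself handles ambiguous prefixes is not stated in this README, and `resolve_mission_id` is a hypothetical helper:

```python
from typing import Iterable


def resolve_mission_id(prefix: str, known_ids: Iterable[str]) -> str:
    """Return the single id starting with prefix, or raise LookupError."""
    matches = sorted(i for i in known_ids if i.startswith(prefix))
    if not matches:
        raise LookupError(f"no mission id starts with {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"prefix {prefix!r} matches {len(matches)} missions")
    return matches[0]


ids = ["c1756bad855e45cc9a95b5b0566c948b", "c2aa0000000000000000000000000000"]
print(resolve_mission_id("c17", ids))
```

With this approach a short prefix such as `c` would be rejected as ambiguous, so a few more characters are needed as the local database grows.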

Status snapshot (v0.1.0-alpha.1)

| Area | Done | Deferred |
| --- | --- | --- |
| Spec + validate | ✓ | — |
| Engine + lifecycle | ✓ | watchdog timers, materialized profiles |
| In-memory + SQLite persistence | ✓ | — |
| Provider ABCs + Local + HF | ✓ | OXE, Lovell-mode |
| CPU mock runner | ✓ | — |
| OpenVLA training runner | ✓ (validated on L4) | — |
| GR00T training runner | ✓ (validated on H100); task-level `runner: gr00t` routing | — |
| Robosuite eval runner | ✓ (auto-wired OpenVLA adapter) | full GPU end-to-end validation |
| Isaac Lab eval runner | ✓ skeleton + tests, subprocess launch + `ODYSSEY_*` stdout protocol | blessed eval script (GR00T/VLA recipe), real-Isaac smoke |
| Multi-agent eval (PILOT + SPECIALIST) | ✓ (out-of-process Gemma 4 planner) | full GPU end-to-end validation |
| `odyssey init / run / list / status / validate` | ✓ | `logs`, `publish` |
| Leaderboard publish, Learning Graph, Anonymizer, Auth | — | post-v0.1.0-alpha.1 |

License

Apache License 2.0. See LICENSE.

Contributing

See CONTRIBUTING.md. DCO sign-off required on every commit. Open an issue before non-trivial PRs — the API surface is moving weekly until v0.1.0-alpha.1 freezes.

This version: 0.1.0a1 (pre-release), Jun 21, 2026
