Oddish CLI
Run Harbor tasks in the cloud with scheduling, monitoring, and persistent state.
Run Harbor tasks on Oddish infrastructure.
oddish is a Python CLI for submitting Harbor tasks, running multi-trial sweeps,
monitoring experiments, and pulling logs and artifacts back to disk. If you
already use harbor run, Oddish adds persistent state, retries, queueing, and
better operational tooling around the same task format.
Python 3.14+ is required.
Quick Start
uv pip install oddish
export ODDISH_API_KEY="ok_..."
# Submit a run
oddish run -d swebench@1.0 -a codex -m openai/gpt-5.2 --n-trials 3
# Watch progress
oddish status
oddish status <task_id> --watch
# Pull logs and artifacts locally
oddish pull <task_id> --watch
The CLI targets Oddish Cloud by default. All API-backed commands require
ODDISH_API_KEY. For self-deployed instances, also set ODDISH_API_URL.
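In scripts, it can help to fail fast when the key is missing. A minimal sketch using plain POSIX parameter expansion (this is generic shell, not an oddish feature; the placeholder default exists only so the demo runs):

```shell
# Use a placeholder only for this demo; in real scripts, drop this line.
ODDISH_API_KEY="${ODDISH_API_KEY:-ok_placeholder}"
# Abort with a clear message if the key is unset or empty.
: "${ODDISH_API_KEY:?set ODDISH_API_KEY before running oddish commands}"
echo "API key configured"
```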
Installation
uv pip install oddish
Common environment variables:
export ODDISH_API_KEY="ok_..."
# Point at a self-deployed instance instead of Oddish Cloud
# export ODDISH_API_URL="https://<workspace>--api.modal.run"
# Optional dashboard override
# export ODDISH_DASHBOARD_URL="https://www.oddish.app"
Need to deploy your own stack? See ../SELF_HOSTING.md.
Need package internals, architecture, or development notes? See AGENTS.md.
Commands
The installed console script is:
oddish --help
Available commands:
- oddish run: uploads a local task or dataset, downloads a registry dataset, or expands a sweep config into trials
- oddish upload: registers task bundles (no trials) or uploads off-Oddish Harbor trial results (logs, rewards, tokens) onto an existing task
- oddish status: shows system, task, or experiment status
- oddish cancel: stops all in-flight runs for a task
- oddish pull: downloads logs, results, trajectories, and artifact files for a trial, task, or experiment
- oddish delete: deletes a task or experiment from a self-hosted deployment
oddish run
Use oddish run for:
- a single local Harbor task directory
- a local dataset directory containing multiple tasks
- a Harbor registry dataset via --dataset
- a YAML or JSON sweep config via --config
- appending trials to an existing task via --task
Examples:
# Local task
oddish run ./my-task -a claude-code -m anthropic/claude-sonnet-4-5
# Local dataset
oddish run ./my-dataset -a codex -m openai/gpt-5.2 --n-trials 3
# Harbor registry dataset
oddish run -d swebench@1.0 -a codex -m openai/gpt-5.2 --n-trials 3
# Filter a dataset
oddish run -d swebench@1.0 -t "django__*" -l 10 -a claude-code
# Append new trials to an existing task
oddish run --task task_123 -a gemini-cli -m google/gemini-3.1-pro-preview --n-trials 3
# Submit in the background
oddish run ./my-task -a claude-code --background
# Script-friendly JSON output (implies --background)
oddish run ./my-task -a claude-code --json
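The --json mode is intended for scripting. Here is a sketch of pulling an identifier out of that output; the payload below and the task_id field name are hypothetical stand-ins, not a documented schema, so check what your version actually emits:

```shell
# Hypothetical payload; in practice: out="$(oddish run ./my-task -a claude-code --json)"
out='{"task_id": "task_123", "status": "queued"}'
# Extract the id with sed ("task_id" is an assumed field name, not a documented one).
task_id="$(printf '%s' "$out" | sed -n 's/.*"task_id": *"\([^"]*\)".*/\1/p')"
echo "$task_id"
```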
Common flags:
- PATH or -p, --path: selects a local task or dataset directory
- -a, --agent: selects the agent
- -m, --model: selects the model
- --n-trials: runs multiple trials per task
- -d, --dataset: pulls tasks from the Harbor registry
- --task: appends trials to an existing task ID without re-uploading task files
- -c, --config: loads a YAML or JSON sweep config
- -t, --task-name, -x, --exclude-task-name, and -l, --n-tasks: filter datasets
- -e, --env: selects the execution environment
- -P, --priority, -E, --experiment, -u, --user, -G, --github-user, and --github-meta: attach scheduling and attribution metadata
- -w, --watch / --no-watch: watches single-task submissions until completion
- --background: submits and returns immediately
- --json: emits machine-readable output and implies --background
- -q, --quiet: suppresses nonessential output
- --run-analysis: runs post-trial analysis and task verdict generation
- --publish: publishes the experiment for public read-only access
- --disable-verification: skips task verification
- --override-cpus, --override-memory-mb, --override-gpus, --override-storage-mb, and --force-build: override environment settings
- --ae/--agent-env, --ak/--agent-kwarg, and --artifact: pass Harbor agent/env configuration through to every submitted config
- --api: overrides the API URL for a single invocation
Supported --env values:
docker, daytona, e2b, modal, runloop, gke
When --env is omitted:
- hosted Oddish (*.modal.run) defaults to modal
- other API URLs default to docker
- --task preserves the existing task's environment unless you override it
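The default resolution described above can be sketched as a small shell function. This illustrates the documented rule only; it is not the CLI's actual implementation:

```shell
# Sketch of the documented default: *.modal.run hosts get modal, everything else docker.
resolve_default_env() {
  case "$1" in
    *.modal.run*) echo "modal" ;;
    *)            echo "docker" ;;
  esac
}

resolve_default_env "https://acme--api.modal.run"   # modal
resolve_default_env "http://localhost:8000"         # docker
```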
Sweep Configs
oddish run -c sweep.yaml accepts YAML or JSON. A minimal config:
agents:
- name: claude-code
model_name: anthropic/claude-sonnet-4-5
n_trials: 3
- name: codex
model_name: openai/gpt-5.2
n_trials: 3
dataset: swebench@1.0
n_tasks: 10
priority: low
You can also set path, exclude_task_names, and experiment_id in the
config file. Per-agent overrides use env and kwargs. Timeouts and
per-provider concurrency are no longer configured in sweep files; declare task
timeouts in task.toml and API concurrency at server startup.
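A fuller config using those extra keys might look like the sketch below. The key names come from this document, but the values and the exact nesting of per-agent env and kwargs are assumptions, so verify against your version before relying on it:

```yaml
# Illustrative sweep config; values and per-agent override nesting are assumptions.
agents:
  - name: claude-code
    model_name: anthropic/claude-sonnet-4-5
    n_trials: 3
    env: daytona          # per-agent environment override
    kwargs:
      temperature: 0.2    # hypothetical agent kwarg
path: ./my-dataset
exclude_task_names:
  - "*-slow"
experiment_id: exp_demo
priority: low
```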
oddish upload
oddish upload covers two related flows. The mode is picked
automatically from the positional PATH you provide:
- Task upload: if PATH is a Harbor task directory (or a dataset directory of tasks), the task bundle is uploaded to storage and a task row is created in the DB so it shows up in the task browser. No trials are queued.
- Trial import: if PATH is a Harbor job_dir (or a parent jobs_dir with multiple job subdirectories), each trial in the job is registered against an existing task as if it had run on Oddish. Imported trials show up in the experiment view with their reward, tokens, cost, phase timing, and artifacts; the only difference is an origin = "imported" flag on the trial row.
Task uploads:
# Upload a single local task
oddish upload ./my-task
# Upload every task in a local dataset directory
oddish upload ./my-dataset
# Upload all tasks from a Harbor registry dataset
oddish upload -d swebench@1.0
# Filter which tasks to upload
oddish upload ./my-dataset -t "django__*" -l 10
oddish upload -d swebench@1.0 -x "*-slow"
Trial imports from a local harbor run:
# Import every trial in a single Harbor job dir into an existing task
oddish upload ./jobs/my-task.claude-code.abcd --task task_123
# Pin the imported trials to a named experiment (new or existing)
oddish upload ./jobs/my-task.claude-code.abcd --task task_123 \
--experiment my-local-sweep
# Import multiple Harbor jobs at once from a parent jobs directory
oddish upload ./jobs --task task_123 --experiment my-local-sweep
# One-shot: upload the task and import its trials in a single command
oddish upload ./jobs/my-task.claude-code.abcd --path ./my-task
# Register metadata only (no logs/trajectory uploads)
oddish upload ./jobs/my-task.claude-code.abcd --task task_123 \
--skip-artifacts
Common flags:
- PATH: selects the source (task dir, dataset dir, Harbor job dir, or Harbor jobs parent dir); -p, --path is an alias that also doubles as a one-shot task upload in trial-import mode
- -d, --dataset: pulls tasks from the Harbor registry (task-upload mode)
- -t, --task-name, -x, --exclude-task-name, and -l, --n-tasks: filter datasets (task-upload mode)
- -M, --message: attaches a description to the uploaded task version (task-upload mode)
- -u, --user: attributes the created task row to a user (defaults to the OS username)
- -P, --priority: sets the task priority (low or high) (task-upload mode)
- --task: pins imported trials to an existing task ID (trial-import mode)
- -E, --experiment: pins imported trials to a new or existing experiment; if omitted, each import creates a fresh experiment (trial-import mode)
- --skip-artifacts: registers imported trial metadata only, without logs/trajectory (trial-import mode)
- --api, --json, and -q, --quiet: match the other commands
Notes:
- Task rows uploaded this way appear in the task browser in pending state until their first trials run (or are imported). oddish run --task <task_id> ... attaches fresh trials to a previously uploaded task.
- The target task for a trial import must have been created without run_analysis enabled. Imports skip the worker queue and cannot feed the analysis pipeline.
- Experiments can be heterogeneous: one experiment can mix trials that ran on Oddish with trials that were imported.
oddish status
Without arguments, oddish status shows recent experiments and API health. Use
a task ID or --experiment to inspect a specific run, and --watch to resume
live monitoring later.
Examples:
# System overview
oddish status
# Task snapshot
oddish status <task_id>
# Watch a task
oddish status <task_id> --watch
# Watch an experiment
oddish status --experiment <experiment_id> --watch
oddish cancel
Cancel all in-flight runs for a task without deleting any data. Queued jobs are removed, running trials are cancelled, and active Modal workers are terminated when applicable. Completed trials and their results are preserved.
oddish cancel <task_id>
oddish cancel <task_id> --force # skip confirmation
oddish pull
oddish pull accepts a trial ID, task ID, or experiment ID and auto-detects
the target type by default.
Examples:
# Pull one trial
oddish pull <trial_id>
# Keep syncing a task while it runs
oddish pull <task_id> --watch --interval 5
# Pull an entire experiment, including task files
oddish pull <experiment_id> --include-task-files
By default, pull output is written to ./.oddish/<target> and includes a
manifest.json describing the fetch. Use --no-logs, --no-files,
--structured, --include-task-files, --out, and --type to control what
gets downloaded and where it lands. --type trial|task|experiment forces the
target type instead of auto-resolving it.
oddish delete
Examples:
# Delete a task and its trials
oddish delete <task_id>
# Delete an entire experiment
oddish delete --experiment <experiment_id>
Typical Workflow
# 1. Submit a run
oddish run -d swebench@1.0 -a claude-code -m anthropic/claude-sonnet-4-5
# 2. Inspect or watch it later
oddish status <task_id> --watch
# 3. Pull outputs when you want them locally
oddish pull <task_id> --watch
More Technical Docs
- Package internals and implementation notes: AGENTS.md
- Self-hosting and deployment: ../SELF_HOSTING.md
File details
Details for the file oddish-0.1.11.tar.gz.
- Download URL: oddish-0.1.11.tar.gz
- Upload date:
- Size: 261.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | f22ee644d43d05c57a5decd6cc5b06f2b433030fce9ed17062f7ad9bc9d9e01d |
| MD5 | 4b6a66395ccc9d4a8dc3e04ba0a2233e |
| BLAKE2b-256 | e2d37a3c4239c462778f00c87a8df8efe2d8306fa69bc5c42eed217932ec9a5a |
File details
Details for the file oddish-0.1.11-py3-none-any.whl.
- Download URL: oddish-0.1.11-py3-none-any.whl
- Upload date:
- Size: 235.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | dd5d2a87a6a567d69028cc5065b47a2b251de9ab116035b04f498d8c026a6d73 |
| MD5 | ed2faea99131a631061a37ccde214389 |
| BLAKE2b-256 | a0a11dd9625621394b956914ee2f8ef93efbf59f8dd378b0fb3735de09edcfd9 |