NVIDIA OSMO workflow orchestration for Strands Agents - submit, monitor, and debug Physical AI pipelines from natural language.

Strands OSMO

OSMO is NVIDIA's open-source, Kubernetes-native control plane for Physical AI. It runs heterogeneous compute (training GPUs, simulation GPUs, edge devices) from a single YAML spec and is used in production for GR00T, Isaac Lab, Isaac Dexterity, Isaac Sim, and Isaac ROS.

This package wraps the production osmo CLI as 22 Strands tools, plus 3 local helpers (25 tools in total), so an agent can drive the entire workflow lifecycle in plain English.

Sister project: strands-cosmos (Cosmos VLM provider + 21 generation/edge tools).


Install

pip install strands-osmo

You also need the OSMO CLI on PATH:

curl -fsSL https://raw.githubusercontent.com/NVIDIA/OSMO/main/install.sh | bash
osmo login                  # one-time OAuth
osmo version                # confirm

Verify everything from your agent:

from strands_osmo import osmo_doctor
osmo_doctor()
# → {"osmo_present": True, "version_ok": True, "logged_in": True, ...}
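Before letting an agent submit anything, it can help to turn that report into a short list of things to fix. The sketch below assumes the report is a flat dict of booleans, as in the example output above; the real return value may be wrapped in a ToolResult, and `failed_checks` is a hypothetical helper, not part of strands-osmo.

```python
from typing import Mapping


def failed_checks(report: Mapping[str, object]) -> list[str]:
    """Return the names of checks whose value is exactly False, sorted."""
    return sorted(name for name, ok in report.items() if ok is False)


# Hypothetical report, shaped like the example output above.
report = {"osmo_present": True, "version_ok": True, "logged_in": False}

problems = failed_checks(report)
if problems:
    print("Fix before submitting:", ", ".join(problems))
```

Non-boolean entries (for example a version string) are ignored, so extra keys in the report do not count as failures.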

Quick Start

Discover resources, submit a workflow

from strands import Agent
from strands_osmo import (
    osmo_doctor, osmo_pool_list, osmo_resources_list,
    osmo_workflow_validate, osmo_workflow_submit, osmo_workflow_status,
    osmo_workflow_logs, osmo_workflow_cancel,
    osmo_cookbook_fetch,
)

agent = Agent(tools=[
    osmo_doctor, osmo_pool_list, osmo_resources_list,
    osmo_workflow_validate, osmo_workflow_submit, osmo_workflow_status,
    osmo_workflow_logs, osmo_workflow_cancel,
    osmo_cookbook_fetch,
])

agent("""
Find me an idle H100 pool with at least 4 GPUs free,
fetch the GR00T fine-tune recipe from the cookbook,
validate it, then submit it. Report the workflow ID.
""")

One-shot pipeline

from strands_osmo import osmo_workflow_submit

osmo_workflow_submit(
    workflow_yaml="train.yaml",
    pool="h100-prod",
    set_vars={"num_gpu": 4, "epochs": 10},
    priority="LOW",   # bypass quota, run on idle capacity
)

Tools

| Category | Tools | Wraps |
|---|---|---|
| Auth | osmo_doctor, osmo_login, osmo_version | env / osmo login / osmo version |
| Resources | osmo_pool_list, osmo_resources_list, osmo_profile_list, osmo_bucket_list | osmo pool/resources/profile/bucket list |
| Workflow | osmo_workflow_submit, osmo_workflow_list, osmo_workflow_status, osmo_workflow_cancel, osmo_workflow_logs, osmo_workflow_exec | osmo workflow ... |
| Task | osmo_task_list, osmo_task_describe | osmo task ... |
| Data | osmo_data_upload, osmo_data_download, osmo_data_list | osmo data ... |
| Dataset | osmo_dataset_list, osmo_dataset_describe | osmo dataset ... |
| App | osmo_app_list, osmo_app_create | osmo app ... |
| Cookbook | osmo_cookbook_fetch | GitHub raw / local clone |
| Spec | osmo_workflow_validate, osmo_workflow_render | local YAML + Jinja |

Total: 25 tools (22 OSMO-CLI-backed + 3 local helpers).

from strands_osmo import osmo_pool_list, osmo_workflow_submit

osmo_pool_list(mode="free")
# → JSON: pools, free GPUs, quota state

osmo_workflow_submit("train.yaml", pool="h100-prod", set_vars={"epochs": 10})
# → workflow ID(s)
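As an illustration of what an agent might do with that JSON, here is a minimal sketch that picks a pool with enough free GPUs. The schema is an assumption: the `pools`, `name`, `gpu_type`, and `free_gpus` keys are hypothetical and may not match what `osmo_pool_list` actually returns, so adapt the field names to the real output.

```python
import json
from typing import Optional


def pick_pool(pools_json: str, gpu_type: str, min_free: int) -> Optional[str]:
    """Return the matching pool with the most free GPUs, or None if none fits."""
    data = json.loads(pools_json)
    candidates = [
        pool for pool in data.get("pools", [])
        if pool.get("gpu_type") == gpu_type and pool.get("free_gpus", 0) >= min_free
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda pool: pool["free_gpus"])["name"]


# Hypothetical payload; the key names are illustrative only.
sample = json.dumps({"pools": [
    {"name": "h100-prod", "gpu_type": "H100", "free_gpus": 6},
    {"name": "h100-dev", "gpu_type": "H100", "free_gpus": 2},
    {"name": "a100-prod", "gpu_type": "A100", "free_gpus": 8},
]})

print(pick_pool(sample, "H100", 4))
```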

Why use it inside an agent?

OSMO already has a great CLI. The point of strands-osmo is to let an agent chain OSMO operations together with reasoning:

| Without an agent | With an agent |
|---|---|
| osmo pool list → eyeball | "Find me an idle H100" |
| Edit YAML by hand to fit quota | "Cap memory to fit the smallest A100 node, then submit" |
| osmo workflow logs <id> → grep for stack trace | "Why did workflow X fail and how do I fix it?" |
| osmo workflow submit && watch status | "Submit, wait until done, summarize results, retrain on failure" |
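The "submit, wait until done" row boils down to a polling loop. The sketch below is one way to write it, with the status lookup injected as a callable so it does not depend on any particular tool. The terminal state names are assumptions; check which states OSMO actually reports before relying on the defaults.

```python
import time
from typing import Callable, Collection


def wait_until_done(
    get_status: Callable[[], str],
    terminal: Collection[str] = ("COMPLETED", "FAILED", "CANCELLED"),  # assumed names
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)


# A fake status source stands in for a real status call here.
fake_statuses = iter(["PENDING", "RUNNING", "COMPLETED"])
print(wait_until_done(lambda: next(fake_statuses), sleep=lambda _: None))
```

In real use, `get_status` would wrap a status query for one workflow ID and extract the state from its result.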

Architecture

strands_osmo/
├── __init__.py             # exports 25 tools
└── tools/
    ├── _common.py          # osmo CLI runner + ToolResult helpers
    ├── doctor.py           # env probe (osmo binary, login, profile)
    ├── login.py            # osmo login
    ├── version.py          # osmo version
    ├── pool_list.py        # osmo pool list [--mode free]
    ├── resources_list.py   # osmo resources list
    ├── profile_list.py     # osmo profile list
    ├── bucket_list.py      # osmo bucket list
    ├── workflow_submit.py  # osmo workflow submit
    ├── workflow_list.py    # osmo workflow list
    ├── workflow_status.py  # osmo workflow describe
    ├── workflow_cancel.py  # osmo workflow cancel
    ├── workflow_logs.py    # osmo workflow logs
    ├── workflow_exec.py    # osmo workflow exec
    ├── task_list.py        # osmo task list
    ├── task_describe.py    # osmo task describe
    ├── data_upload.py      # osmo data upload
    ├── data_download.py    # osmo data download
    ├── data_list.py        # osmo data list
    ├── dataset_list.py     # osmo dataset list
    ├── dataset_describe.py # osmo dataset describe
    ├── app_list.py         # osmo app list
    ├── app_create.py       # osmo app create
    ├── cookbook_fetch.py   # OSMO cookbook fetcher
    ├── workflow_validate.py # local YAML lint
    └── workflow_render.py  # local Jinja render

Design tenets:

  1. CLI is truth. Tools shell out to osmo, never re-implement its client SDK.
  2. Thin wrappers. Each tool ≈ 50 lines, normalizing JSON into Strands ToolResult.
  3. Graceful degradation. A missing osmo binary yields a ToolResult with exit code 127, not a crash.
  4. No upstream forking. Use OSMO as-is; it's a separate repo on disk.
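To make tenets 1 to 3 concrete, here is a minimal sketch of a thin CLI runner. It is not the code in `_common.py`: the function name `run_cli`, the `exit_code` field, and the exact result layout are assumptions, loosely modeled on a Strands-style result dict with `status` and `content`.

```python
import shutil
import subprocess
from typing import Any, Sequence


def run_cli(args: Sequence[str], binary: str = "osmo", timeout: float = 60.0) -> dict[str, Any]:
    """Run `binary args...` and return a ToolResult-like dict instead of raising."""
    path = shutil.which(binary)
    if path is None:
        # Same code a shell reports for "command not found".
        return {"status": "error", "exit_code": 127,
                "content": [{"text": f"{binary!r} not found on PATH"}]}
    try:
        proc = subprocess.run([path, *args], capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return {"status": "error", "exit_code": 124,
                "content": [{"text": f"timed out after {timeout}s"}]}
    ok = proc.returncode == 0
    return {"status": "success" if ok else "error", "exit_code": proc.returncode,
            "content": [{"text": proc.stdout if ok else proc.stderr}]}


# A binary name that should not exist, to show the degraded path.
print(run_cli(["version"], binary="osmo-binary-that-is-not-installed")["status"])
```

Keeping the runner this small is what lets each tool stay a thin wrapper: a tool only has to build its argument list and pass the result through.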

Workflow YAML 101 (cheat sheet)

workflow:
  name: my-pipeline
  resources:
    default: { cpu: 4, memory: 16Gi, storage: 40Gi, gpu: 1 }

  tasks:
  - name: train
    image: pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel
    command: ["python", "train.py"]
    args: ["--epochs", "{{epochs}}", "--num-gpu", "{{num_gpu}}"]
    resources: { gpu: "{{num_gpu}}" }
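A local lint pass over a spec like this can catch obvious mistakes before anything reaches the cluster. The sketch below checks an already-parsed spec (a plain dict, for example from a YAML loader). The rules are illustrative assumptions based only on the fields in this cheat sheet; they are not OSMO's actual schema, and `lint_spec` is a hypothetical name rather than the implementation of `osmo_workflow_validate`.

```python
from collections.abc import Mapping
from typing import Any


def lint_spec(spec: Mapping[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no findings."""
    workflow = spec.get("workflow")
    if not isinstance(workflow, Mapping):
        return ["missing top-level 'workflow' mapping"]

    errors: list[str] = []
    if not workflow.get("name"):
        errors.append("workflow.name is required")

    tasks = workflow.get("tasks")
    if not isinstance(tasks, list) or not tasks:
        errors.append("workflow.tasks must be a non-empty list")
        return errors

    seen: set[str] = set()
    for index, task in enumerate(tasks):
        if not isinstance(task, Mapping):
            errors.append(f"tasks[{index}] must be a mapping")
            continue
        name = task.get("name")
        if not name:
            errors.append(f"tasks[{index}].name is required")
        elif name in seen:
            errors.append(f"duplicate task name {name!r}")
        else:
            seen.add(name)
        if not task.get("image"):
            errors.append(f"tasks[{index}].image is required")
    return errors


spec = {"workflow": {"name": "my-pipeline", "tasks": [
    {"name": "train", "image": "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel"},
]}}
print(lint_spec(spec))
```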

Submit with:

osmo workflow submit my-pipeline.yaml --pool h100-prod \
    --set num_gpu=4 --set epochs=10

Or:

osmo_workflow_submit("my-pipeline.yaml", pool="h100-prod",
                     set_vars={"num_gpu": 4, "epochs": 10})
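The two forms above are meant to be equivalent. As a rough sketch of the mapping, the function below builds the CLI argument list from the Python arguments. The `--pool` and `--set` flags are taken from the command shown above; `build_submit_argv` itself is a hypothetical helper, and the real tool may assemble its arguments differently.

```python
from typing import Mapping, Optional


def build_submit_argv(
    workflow_yaml: str,
    pool: Optional[str] = None,
    set_vars: Optional[Mapping[str, object]] = None,
) -> list[str]:
    """Translate Python arguments into an `osmo workflow submit` argument list."""
    argv = ["osmo", "workflow", "submit", workflow_yaml]
    if pool:
        argv += ["--pool", pool]
    # One --set flag per variable, in the order the caller supplied them.
    for key, value in (set_vars or {}).items():
        argv += ["--set", f"{key}={value}"]
    return argv


print(build_submit_argv("my-pipeline.yaml", "h100-prod", {"num_gpu": 4, "epochs": 10}))
```

Passing the arguments as a list (rather than a joined string) avoids shell quoting issues when a value contains spaces.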

For the full reference (including groups, inputs, outputs.dataset, checkpointing, and platform constraints) see the OSMO User Guide.
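For intuition about the `{{epochs}}` and `{{num_gpu}}` placeholders in the cheat sheet, here is a minimal stand-in that substitutes them with a regular expression. The Tools table lists Jinja for local rendering; this sketch deliberately uses only the standard library and handles plain `{{name}}` placeholders, not Jinja's filters or control flow. The function name `render` is hypothetical.

```python
import re
from typing import Mapping

# Matches {{name}} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: Mapping[str, object]) -> str:
    """Replace each {{name}} with str(variables[name]); fail on unknown names."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for {name!r}")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, template)


line = 'args: ["--epochs", "{{epochs}}", "--num-gpu", "{{num_gpu}}"]'
print(render(line, {"epochs": 10, "num_gpu": 4}))
# → args: ["--epochs", "10", "--num-gpu", "4"]
```

Raising on a missing variable, instead of leaving the placeholder in place, surfaces a forgotten `--set` before the workflow is submitted.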


Development

git clone https://github.com/cagataycali/strands-osmo && cd strands-osmo
pip install -e ".[dev]"
pytest tests/                # smoke tests
ruff check .                  # lint
ruff format .                 # format

See Also


License

Apache 2.0 - same license as upstream OSMO.

OSMO is © NVIDIA Corporation. strands-osmo is an independent community project.
