Strands agent wrapping NVIDIA Cosmos for Jetson AGX Thor edge deployment

thor-cosmos

NVIDIA Cosmos on Jetson AGX Thor — one justfile, one Strands agent, full lifecycle.


thor-cosmos is a Strands agent + justfile that orchestrates the full NVIDIA Cosmos ecosystem and deploys it on Jetson AGX Thor for real-time robot perception.

Wraps Cosmos-Reason2 (VLM), Cosmos-Predict2.5 (world model), Cosmos-Transfer2.5 (ControlNet), and Cosmos-Xenna (data curation) — every pipeline is a just recipe that agents and operators share.

 Operator CLI         Strands Agent
      │                     │
      │  just <recipe>      │  cosmos_*()
      │                     │
      └─────────────────────┴─────────┐
                                      ▼
                              ┌───────────────┐
                              │   justfile    │  ← EVERYTHING lives here
                              │  (42 recipes) │
                              └───────────────┘
                                      │
                      ┌───────────────┼───────────────┐
                      ▼               ▼               ▼
               tensorrt-edgellm-*   torchrun     curl/gst/nats
               (quant/export)      (train/distill) (serve/io)

Why the justfile pattern?

  • Same muscle memory as upstream: every Cosmos repo (cosmos-predict2.5, cosmos-transfer2.5, cosmos-reason2, cosmos-cookbook) already ships a justfile. Ours blends in.
  • One source of truth: agents shell out to just <recipe>; operators run just <recipe> directly. Zero duplication.
  • Thin Python tools: each @tool is ~30 lines that invokes a recipe and maps output → Strands ToolResult.
  • Discoverable: just --list prints every pipeline step.
  • Composable: pipeline-edge-deploy chains download → quantize → export-llm → export-visual.
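
The "thin Python tool" idea above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the function name run_recipe, the just_cmd and timeout parameters, and the keys inside the "json" payload are all assumptions made for this example. In a real agent the function would additionally carry the Strands @tool decorator, which is omitted here so the snippet runs without the SDK.

```python
import subprocess
from typing import Any


def run_recipe(
    recipe: str,
    *args: str,
    just_cmd: tuple[str, ...] = ("just",),  # hypothetical knob, handy for testing
    timeout: float = 600.0,                 # hypothetical default
) -> dict[str, Any]:
    """Run `just <recipe> <args...>` and map the outcome to a ToolResult-shaped dict."""
    cmd = [*just_cmd, recipe, *args]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        return {"status": "error", "content": [{"text": f"command not found: {cmd[0]}"}]}
    except subprocess.TimeoutExpired:
        return {"status": "error", "content": [{"text": f"timed out after {timeout}s"}]}

    ok = proc.returncode == 0
    return {
        "status": "success" if ok else "error",
        "content": [
            # Human-readable part: stdout on success, stderr on failure.
            {"text": (proc.stdout if ok else proc.stderr).strip()},
            # Structured part: enough detail for the agent to retry or report.
            {"json": {"recipe": recipe, "args": list(args), "returncode": proc.returncode}},
        ],
    }
```

With this shape, an operator typing just sysinfo and an agent calling run_recipe("sysinfo") exercise the same recipe, which is the point of keeping the Python layer thin.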

Installation

brew install just                         # (or: curl -LsSf https://get.casey.rs | bash)
pipx install thor-cosmos
thor-cosmos                               # start the agent

From source:

git clone https://github.com/cagataycali/thor-cosmos
cd thor-cosmos
just install                              # venv + pip install -e .
just run                                  # start agent REPL

On Jetson Thor (rsync from laptop, run via tmux):

# From your laptop:
just deploy-thor cagatay@thor.local ~/thor-cosmos

# On Thor:
ssh cagatay@thor.local
tmux new -s thor 'cd ~/thor-cosmos && just run'

Tools ↔ Recipes

Tool (agent)                                        just recipe (CLI)
--------------------------------------------------  ------------------------------------------------------------
cosmos_model_download                               just download <name> / just download-dataset <name>
cosmos_quantize                                     just quantize <model_dir> <output_dir> <dtype> <quantization>
cosmos_export_onnx (llm)                            just export-llm <model_dir> <output_dir>
cosmos_export_onnx (visual)                         just export-visual <model_dir> <output_dir> <dtype> <quant>
cosmos_build_engine (llm)                           just build-llm-engine <onnx_dir> <engine_dir> ...
cosmos_build_engine (visual)                        just build-visual-engine <onnx_dir> <engine_dir>
cosmos_serve(start|stop|status|logs|restart)        serve-start / serve-stop / serve-status / serve-logs / serve-restart
cosmos_inference                                    just infer <image> <prompt>
cosmos_reason_hf                                    HF Transformers direct (no recipe)
rtp_capture_frame                                   just rtp-capture <port> <out> <w> <h> <timeout>
nats_publish                                        just nats-publish <subject> <payload_json>
cosmos_predict_generate                             just predict-generate <input.json>
cosmos_transfer_generate                            just transfer-generate <input.json> <control>
cosmos_post_train(reason2|predict2_5|transfer2_5)   post-train-reason2 / post-train-predict / post-train-transfer
cosmos_distill                                      just distill <teacher> <student> <method> <family>
cosmos_curate                                       just curate <input> <output> <stages>
cosmos_evaluate                                     just evaluate <metric> <pred> <gt>
system_info                                         just sysinfo
video_probe / video_extract_frames                  just video-probe / just video-frames
image_read                                          no recipe (reads and embeds JPEG bytes)
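
Two tools in the table, cosmos_serve and cosmos_post_train, fan out to several recipes depending on one argument. A minimal sketch of that dispatch is shown below. The recipe names come from the table; the helper names serve_recipe and post_train_recipe and the validation behaviour are assumptions for illustration.

```python
# Actions accepted by cosmos_serve, as listed in the table.
SERVE_ACTIONS = ("start", "stop", "status", "logs", "restart")

# Model family -> recipe name for cosmos_post_train, as listed in the table.
POST_TRAIN_RECIPES = {
    "reason2": "post-train-reason2",
    "predict2_5": "post-train-predict",
    "transfer2_5": "post-train-transfer",
}


def serve_recipe(action: str) -> str:
    """Return the just recipe name for a cosmos_serve action."""
    if action not in SERVE_ACTIONS:
        raise ValueError(f"unknown serve action {action!r}; expected one of {SERVE_ACTIONS}")
    return f"serve-{action}"


def post_train_recipe(family: str) -> str:
    """Return the just recipe name for a cosmos_post_train model family."""
    try:
        return POST_TRAIN_RECIPES[family]
    except KeyError:
        raise ValueError(f"unknown model family {family!r}") from None
```

Rejecting unknown values before shelling out keeps a mistyped argument from reaching just as an arbitrary recipe name.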

Pipelines (composed recipes)

# Full x86 model prep (download → quantize → export-llm → export-visual)
just prep-edge-model reason2-2b ./models/Cosmos-Reason2-2B-fp8

# Flagship edge deployment (intbot_edge_vlm)
just pipeline-edge-deploy reason2-2b ./models/Cosmos-Reason2-2B-fp8

# GR00T-Dreams synthetic trajectories
just pipeline-gr00t-dreams ./datasets/gr1 configs/gr00t-dreams.yaml

# Real-time perception (RTP → VLM → NATS, runs in tmux)
just perception-loop perception.vlm "describe the scene, count people"

# Smoke test
just smoke
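
The composed recipes chain individual steps in order. If an agent wanted to drive the same steps one at a time (for example to report progress between them), the stop-on-first-failure behaviour could be sketched as below. This is an illustration only: the helper run_pipeline, the injected run callable, and every argument value in EDGE_PREP_STEPS other than reason2-2b are hypothetical; only the recipe names and their argument order are taken from the table above.

```python
from typing import Callable, Sequence

Step = tuple[str, Sequence[str]]


def run_pipeline(steps: Sequence[Step], run: Callable[..., dict]) -> list[tuple[str, dict]]:
    """Run recipes in order and stop at the first one that does not succeed.

    `run(recipe, *args)` is any callable returning a ToolResult-shaped dict.
    """
    results: list[tuple[str, dict]] = []
    for recipe, args in steps:
        result = run(recipe, *args)
        results.append((recipe, result))
        if result.get("status") != "success":
            break  # later steps depend on this one's output
    return results


# Hypothetical directories and dtype/quantization values, for illustration only.
EDGE_PREP_STEPS: list[Step] = [
    ("download", ["reason2-2b"]),
    ("quantize", ["./models/src", "./models/quant", "fp16", "fp8"]),
    ("export-llm", ["./models/quant", "./models/onnx/llm"]),
    ("export-visual", ["./models/src", "./models/onnx/visual", "fp16", "fp8"]),
]
```

For operators, just pipeline-edge-deploy remains the simpler path; a step-wise driver like this only matters when something needs to happen between steps.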

The flagship recipe — intbot_edge_vlm

Deploy Cosmos-Reason2 to Thor for real-time robot perception.

# On x86 GPU host
just prep-edge-model reason2-2b ./models/R2-fp8
scp -r ./models/R2-fp8/onnx cagatay@thor.local:~/R2-fp8-onnx

# On Thor
just build-engines ~/R2-fp8-onnx ~/R2-fp8-engines
just serve-start ~/R2-fp8-engines/llm ~/R2-fp8-engines/visual
just infer /tmp/test.jpg "count people in the scene"

# Continuous loop
just perception-loop perception.vlm "describe the scene; count people"

Expected throughput on Jetson AGX Thor with Cosmos-Reason2-2B FP8: 3-5 FPS at 800×600 and 128-token output.
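
The continuous loop follows a capture → infer → publish cycle (RTP → VLM → NATS). A rough sketch of that control flow is shown below, with the three stages passed in as plain callables so the snippet runs without a camera, server, or broker. The function name perception_loop, the max_iterations parameter, and the JSON payload layout are assumptions for illustration, not the recipe's actual implementation.

```python
import json
from typing import Callable, Optional


def perception_loop(
    capture: Callable[[], Optional[bytes]],   # returns JPEG bytes, or None on timeout
    infer: Callable[[bytes, str], str],       # (frame, prompt) -> model answer
    publish: Callable[[str, str], None],      # (subject, payload_json)
    subject: str,
    prompt: str,
    max_iterations: Optional[int] = None,     # None means run until interrupted
) -> int:
    """Capture a frame, ask the VLM about it, publish the answer; return frames published."""
    attempts = 0
    published = 0
    while max_iterations is None or attempts < max_iterations:
        attempts += 1
        frame = capture()
        if frame is None:
            continue  # no frame arrived in time; try again
        answer = infer(frame, prompt)
        publish(subject, json.dumps({"prompt": prompt, "answer": answer}))
        published += 1
    return published
```

At the quoted 3-5 FPS, each pass through the loop has a budget of roughly 200-333 ms, most of which would be spent inside the inference call.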

Full walkthrough in the docs


Environment

All just recipes honor these env vars (put them in .env; dotenv-load is on):

# Agent model
THOR_COSMOS_PROVIDER=bedrock              # bedrock | openai | ollama
THOR_COSMOS_MODEL_ID=global.anthropic.claude-opus-4-6-v1
AWS_REGION=us-west-2

# TRT-Edge-LLM binaries (Thor)
TRT_ROOT=/opt/tensorrt-edge-llm
COSMOS_SERVER_BIN=${TRT_ROOT}/build/examples/server/trt_edgellm_server
TRT_LLM_BUILD_BIN=${TRT_ROOT}/build/examples/llm/llm_build
TRT_VISUAL_BUILD_BIN=${TRT_ROOT}/build/examples/multimodal/visual_build

# Cosmos upstream repo paths (cloned alongside thor-cosmos by default)
COSMOS_PREDICT_REPO=../cosmos-predict2.5
COSMOS_TRANSFER_REPO=../cosmos-transfer2.5
COSMOS_REASON_REPO=../cosmos-reason2
COSMOS_XENNA_REPO=../cosmos-xenna
COSMOS_RL_REPO=../cosmos-rl
COSMOS_COOKBOOK_REPO=../cosmos-cookbook

# Serve / IO
VLM_HOST=127.0.0.1
VLM_PORT=8080
RTP_BIND=0.0.0.0
RTP_PORT=5600
NATS_URL=nats://127.0.0.1:4222
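
Python tools that need the same settings could read them from the environment along these lines. This is a sketch under assumptions: the Settings dataclass and load_settings helper are hypothetical, treating the values listed above as fallbacks is an assumption, and so is the http:// scheme used to build the server URL.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    trt_root: str
    server_bin: str
    vlm_host: str
    vlm_port: int
    rtp_port: int
    nats_url: str

    @property
    def vlm_base_url(self) -> str:
        # Assumes the server speaks plain HTTP.
        return f"http://{self.vlm_host}:{self.vlm_port}"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env vars, falling back to the values shown in this section."""
    trt_root = env.get("TRT_ROOT", "/opt/tensorrt-edge-llm")
    return Settings(
        trt_root=trt_root,
        # Mirrors COSMOS_SERVER_BIN=${TRT_ROOT}/build/examples/server/trt_edgellm_server
        server_bin=env.get(
            "COSMOS_SERVER_BIN", f"{trt_root}/build/examples/server/trt_edgellm_server"
        ),
        vlm_host=env.get("VLM_HOST", "127.0.0.1"),
        vlm_port=int(env.get("VLM_PORT", "8080")),
        rtp_port=int(env.get("RTP_PORT", "5600")),
        nats_url=env.get("NATS_URL", "nats://127.0.0.1:4222"),
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests without touching the real environment.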

ToolResult contract

Every tool returns:

{
    "status": "success" | "error",
    "content": [
        {"text": "..."},                                          # human-readable
        {"json": {...}},                                          # structured payload
        {"image": {"format": "jpeg", "source": {"bytes": b"..."}}},
    ],
}

Image-producing tools (rtp_capture_frame, video_extract_frames, image_read) embed the captured JPEG bytes so the agent can feed them straight into cosmos_inference on the next turn — no disk round-trip.
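
To make the hand-off concrete, here is a small sketch of building an image-bearing result and pulling the bytes back out on the next turn. The helper names image_result and first_image_bytes are hypothetical; only the dictionary layout follows the contract shown above.

```python
from typing import Any, Optional


def image_result(jpeg: bytes, text: str, meta: dict[str, Any]) -> dict[str, Any]:
    """Wrap JPEG bytes in a ToolResult-shaped dict with text and JSON alongside."""
    return {
        "status": "success",
        "content": [
            {"text": text},
            {"json": meta},
            {"image": {"format": "jpeg", "source": {"bytes": jpeg}}},
        ],
    }


def first_image_bytes(result: dict[str, Any]) -> Optional[bytes]:
    """Return the bytes of the first image block in a successful result, else None."""
    if result.get("status") != "success":
        return None
    for block in result.get("content", []):
        image = block.get("image")
        if image is not None:
            return image["source"]["bytes"]
    return None
```

A consumer that scans for the first "image" block, rather than indexing a fixed position, keeps working if a tool returns its content blocks in a different order.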


Docs

Landing page: cagataycali.github.io/thor-cosmos



Apache-2.0 · built for Physical AI · GitHub
