NVIDIA Cosmos Reason VLM provider for Strands Agents - physical AI reasoning, video understanding, and embodied intelligence
strands-cosmos
NVIDIA Cosmos toolkit for Strands Agents — from VLM reasoning to world-model generation, edge deployment, and evaluation.
Provides Cosmos-Reason2 as a Strands model provider plus 21 tools covering the entire NVIDIA Cosmos ecosystem: inference, video generation (Predict2.5), video-to-video (Transfer2.5), data curation (Xenna), post-training, distillation, quantization, edge deployment, and evaluation.
Demo
Dashcam safety analysis with Chain-of-Thought reasoning on Jetson AGX Thor
Install
pip install strands-cosmos
Developer Setup
git clone https://github.com/cagataycali/strands-cosmos && cd strands-cosmos
just setup-full # Installs system deps, Python deps, clones all Cosmos repos
just doctor # Verify everything
NVIDIA Jetson (Thor, Orin, AGX)
pip install strands-cosmos
strands-cosmos-fix-cublas # Fix CUBLAS for Jetson GPU architecture
Quick Start
from strands import Agent
from strands_cosmos import CosmosVisionModel
model = CosmosVisionModel(model_id="nvidia/Cosmos-Reason2-2B")
agent = Agent(model=model)
# Video understanding
agent("Caption in detail: <video>dashcam.mp4</video>")
# Image reasoning
agent("<image>robot_view.jpg</image> What should the robot do next?")
# Text-only physics reasoning
agent("What happens when a ball rolls off a table?")
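The prompts above embed media by wrapping a file path in `<video>` or `<image>` tags. As a minimal sketch, a small helper can pick the tag from the file extension. `media_prompt` and the extension sets are hypothetical and not part of strands-cosmos; the tag-first ordering mirrors the image example, and whether ordering matters to the model is not stated here.

```python
from pathlib import Path

# Hypothetical helper (not part of strands-cosmos): wraps a media path in the
# <video>/<image> tags used by the Quick Start prompts.
VIDEO_EXTS = {".mp4", ".mov", ".mkv", ".avi", ".webm"}  # assumed list
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}  # assumed list


def media_prompt(path: str, question: str) -> str:
    """Return a prompt with the media path wrapped in the matching tag."""
    ext = Path(path).suffix.lower()
    if ext in VIDEO_EXTS:
        tag = "video"
    elif ext in IMAGE_EXTS:
        tag = "image"
    else:
        raise ValueError(f"unrecognized media extension: {ext!r}")
    return f"<{tag}>{path}</{tag}> {question}"
```

The returned string can be passed straight to the agent, for example `agent(media_prompt("robot_view.jpg", "What should the robot do next?"))`.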
Tools
Use any tool inside a Strands Agent for full Cosmos pipeline automation:
| Category | Tools | Description |
|---|---|---|
| Reason2 VLM | cosmos_inference, cosmos_reason_hf, cosmos_serve | TRT server inference, HF direct inference, server lifecycle |
| Predict 2.5 | cosmos_predict_generate | World-model video generation (future frame prediction) |
| Transfer 2.5 | cosmos_transfer_generate | ControlNet video-to-video (depth/edge/sketch→video) |
| Model Lifecycle | cosmos_model_download, cosmos_quantize, cosmos_export_onnx, cosmos_build_engine | Download, FP8 quantize, ONNX export, TRT engine build |
| Training | cosmos_post_train, cosmos_distill | SFT/LoRA post-training, knowledge distillation |
| Data | cosmos_curate | Xenna data curation pipeline |
| Evaluation | cosmos_evaluate | FID/FVD/CSE/CLIP benchmark evaluation |
| I/O | rtp_capture_frame, nats_publish, video_probe, video_extract_frames, image_read | RTP capture, NATS messaging, video/image utilities |
| System | cosmos_sysinfo | GPU/platform diagnostics |
from strands import Agent
from strands_cosmos import cosmos_reason_hf, video_probe, cosmos_sysinfo
agent = Agent(tools=[cosmos_reason_hf, video_probe, cosmos_sysinfo])
agent("Check the system, then analyze the video at /tmp/scene.mp4")
Models
| Model | GPU Memory | Use Case |
|---|---|---|
| Cosmos-Reason2-2B | 24GB | Edge deployment (Jetson Thor/Orin) |
| Cosmos-Reason2-8B | 32GB | Cloud/desktop high-accuracy |
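The table above can drive a simple model choice. The sketch below picks the largest listed model whose memory figure fits a given budget. `pick_model` is a hypothetical helper, not a strands-cosmos API, and it treats the table's GPU Memory column as a hard threshold, which is an assumption.

```python
# GPU memory figures (GB) copied from the Models table.
MODEL_MEMORY_GB = {
    "nvidia/Cosmos-Reason2-2B": 24,
    "nvidia/Cosmos-Reason2-8B": 32,
}


def pick_model(available_gb: float) -> str:
    """Return the largest listed model whose memory figure fits the budget."""
    fitting = [(gb, name) for name, gb in MODEL_MEMORY_GB.items() if gb <= available_gb]
    if not fitting:
        raise ValueError(f"no listed model fits in {available_gb} GB")
    # max() compares the memory figure first, so the largest fitting model wins.
    return max(fitting)[1]
```

The result can be passed as `model_id` when constructing `CosmosVisionModel`.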
Performance (Jetson AGX Thor, Reason2-2B)
| Task | Load Time | Generation |
|---|---|---|
| Text inference | 7s | 1.4s (46 tokens) |
| Video caption | 7s | 2.2s (short clip @ 4fps) |
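For comparison with other setups, the text-inference row converts to a throughput figure. This is plain arithmetic on the numbers in the table; `tokens_per_second` is a hypothetical helper, and counting the load time as part of a cold first request is an assumption about how the two columns combine.

```python
def tokens_per_second(tokens: int, generation_s: float, load_s: float = 0.0) -> float:
    """Throughput over generation time, optionally including model load time."""
    return tokens / (generation_s + load_s)


# Text inference row: 46 tokens generated in 1.4 s after a 7 s load.
warm = tokens_per_second(46, 1.4)            # generation only
cold = tokens_per_second(46, 1.4, load_s=7)  # first request, load included
print(f"warm: {warm:.1f} tok/s, cold: {cold:.1f} tok/s")
# prints "warm: 32.9 tok/s, cold: 5.5 tok/s"
```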
Architecture
strands_cosmos/
├── cosmos_model.py # CosmosModel (text-only Strands Model)
├── cosmos_vision_model.py # CosmosVisionModel (video+image+text)
├── fix_cublas.py # Jetson CUBLAS compatibility fix
├── tools/ # 21 tools (full Cosmos pipeline)
│ ├── inference.py # TRT server inference
│ ├── reason_hf.py # HF Transformers direct inference
│ ├── serve.py # Server lifecycle management
│ ├── predict_generate.py # Predict2.5 world model
│ ├── transfer_generate.py # Transfer2.5 ControlNet
│ ├── model_download.py # HF model download
│ ├── quantize.py # FP8 quantization
│ ├── export_onnx.py # ONNX export
│ ├── build_engine.py # TRT engine build
│ ├── post_train.py # Post-training (SFT/LoRA)
│ ├── distill.py # Knowledge distillation
│ ├── curate.py # Xenna data curation
│ ├── evaluate.py # Benchmark evaluation
│ ├── rtp.py # GStreamer RTP capture
│ ├── nats_pub.py # NATS publish
│ ├── video_utils.py # ffprobe + frame extraction
│ ├── image_read.py # Base64 image read
│ └── sysinfo.py # System diagnostics
└── justfile # Developer workflow automation
Justfile (Developer Workflow)
just setup # Clone all 6 Cosmos ecosystem repos
just setup-full # Full setup: system deps + Python + repos + diagnostics
just doctor # Check repos, tools, GPU, platform compatibility
just install-trt-edge-llm # Build TensorRT-Edge-LLM from source (Jetson)
# Run pipelines
just predict-generate config.json
just transfer-generate config.json
just evaluate metrics.json
just serve-start
Configuration
model = CosmosVisionModel(
model_id="nvidia/Cosmos-Reason2-8B",
device_map="auto",
torch_dtype="auto",
reasoning=True, # Chain-of-thought <think>...</think>
fps=4, # Video sampling rate
min_vision_tokens=256,
max_vision_tokens=8192,
params={"max_tokens": 4096, "temperature": 0.6},
)
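With `reasoning=True` the model emits its chain of thought inside `<think>...</think>`. A minimal sketch for separating that trace from the final answer follows. It assumes the tags appear inline in the returned text, which this README does not confirm, and `split_reasoning` is a hypothetical helper rather than part of strands-cosmos.

```python
import re

# Non-greedy match so only the first <think> block is captured;
# DOTALL lets the reasoning span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block exists."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Drop the matched block and keep whatever surrounds it as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

Keeping the trace separate makes it easy to log the reasoning while showing only the answer to end users.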
Verified Platforms
| Platform | GPU | Status |
|---|---|---|
| Jetson AGX Thor | NVIDIA Thor 132GB | ✅ (with CUBLAS fix) |
| Jetson Orin | 32/64GB | ✅ (may need CUBLAS fix) |
| Desktop | A100 / H100 / RTX 4090 | ✅ |
| Cloud | Any CUDA 12+ GPU | ✅ |
Troubleshooting
CUBLAS_STATUS_INVALID_VALUE on Jetson
strands-cosmos-fix-cublas # Replaces torch's bundled CUBLAS with JetPack system CUBLAS
StopIteration in get_rope_index during video
Already handled — strands-cosmos pins transformers<5.3.0. If you see this, run:
pip install "transformers>=4.57.0,<5.3.0"
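To confirm that an environment satisfies the range above before loading a model, a version string can be checked in plain Python. This is a rough sketch: `in_supported_range` is hypothetical, and it compares only the numeric release segment, so pre-release suffixes such as `rc1` are ignored (unlike pip's own resolver).

```python
import re


def parse_release(version: str) -> tuple[int, int, int]:
    """Parse the leading numeric segment, e.g. '4.57.0.dev0' -> (4, 57, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")][:3]
    # Pad short versions such as '5.3' to three components.
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def in_supported_range(version: str, low: str = "4.57.0", high: str = "5.3.0") -> bool:
    """True if low <= version < high, matching the pin shown above."""
    return parse_release(low) <= parse_release(version) < parse_release(high)
```

The installed version can be obtained with `importlib.metadata.version("transformers")` and passed to `in_supported_range`.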
TRT tools return exit 127
This is expected on workstations: those tools run only on Jetson or inside a TRT Docker container. Run `just doctor` to see which tools work on your machine.
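Exit status 127 is the conventional shell code for "command not found", so a quick PATH check can explain the failure before a tool is invoked. The sketch below assumes the TRT tools shell out to external binaries; `missing_commands` is a hypothetical helper, and the binary names to check are left to the caller because this README does not list them.

```python
import shutil


def missing_commands(commands: list[str]) -> list[str]:
    """Return the subset of commands that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]
```

An empty result means every listed binary resolves on PATH; anything returned would produce exit 127 if run from a shell.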
Resources
- Cosmos Cookbook — Official recipes
- Cosmos-Reason2 — VLM source
- Strands Agents — Agent framework
- strands-mlx — Apple Silicon provider
License
Apache 2.0 | Built with NVIDIA Cosmos and Strands Agents