NVIDIA VLM plugin for Vision Agents
Project description
NVIDIA Plugin for Vision Agents
NVIDIA VLM integration for Vision Agents. Supports vision language models through NVIDIA's Chat Completions API with NVCF asset management.
Features
- Video understanding: Automatically buffers and forwards video frames to NVIDIA VLM models
- Streaming responses: Supports streaming text responses with real-time chunk events
- Asset management: Automatically uploads frames as assets and cleans them up after use
- Configurable frame rate and buffer duration for optimal performance
Installation
uv add vision-agents[nvidia]
Configuration
Set your NVIDIA API key:
export NVIDIA_API_KEY=your_nvidia_api_key
Usage
Basic VLM Usage
from vision_agents.plugins import nvidia
vlm = nvidia.VLM(
model="nvidia/cosmos-reason2-8b",
fps=1,
frame_buffer_seconds=10,
)
# VLM automatically buffers video frames when used with an Agent
response = await vlm.simple_response("What do you see?")
print(response.text)
With Custom Configuration
from vision_agents.plugins import nvidia
vlm = nvidia.VLM(
model="nvidia/cosmos-reason2-8b",
api_key="your-api-key",
fps=2,
frame_buffer_seconds=15,
frame_width=1280,
frame_height=720,
max_tokens=2048,
temperature=0.3,
top_p=0.8,
)
Configuration
| Parameter | Description | Default | Type |
|---|---|---|---|
model |
NVIDIA model ID | "nvidia/cosmos-reason2-8b" |
str |
api_key |
NVIDIA API token (or use NVIDIA_API_KEY env var) |
None |
Optional[str] |
fps |
Frames per second to buffer | 1 |
int |
frame_buffer_seconds |
Number of seconds to buffer | 10 |
int |
frame_width |
Width of video frames to send | 800 |
int |
frame_height |
Height of video frames to send | 600 |
int |
max_tokens |
Maximum response tokens | 1024 |
int |
temperature |
Temperature for sampling | 0.2 |
float |
top_p |
Top-p sampling parameter | 0.7 |
float |
frames_per_second |
Frames per second for video models | 8 |
int |
Dependencies
vision-agents: Core Vision Agents frameworkaiohttp>=3.9.0: Async HTTP client for API requests
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file vision_agents_plugins_nvidia-0.3.2.tar.gz.
File metadata
- Download URL: vision_agents_plugins_nvidia-0.3.2.tar.gz
- Upload date:
- Size: 6.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.27 {"installer":{"name":"uv","version":"0.9.27","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
37bdbae2b86214bc12cbf21ab83b493fa4cb0b58f3dd09c2cbf38b1d66e3b0fb
|
|
| MD5 |
749ba799be36367c243aab3edaeb5bef
|
|
| BLAKE2b-256 |
9e8617d831b6eab1a6ee67c1f84c0d3d8f04c442ed73e359e11c808478a92cea
|
File details
Details for the file vision_agents_plugins_nvidia-0.3.2-py3-none-any.whl.
File metadata
- Download URL: vision_agents_plugins_nvidia-0.3.2-py3-none-any.whl
- Upload date:
- Size: 11.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.18 {"installer":{"name":"uv","version":"0.9.18","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
14e6a2b2538037c8960bc1e2571816cb0fe5825202cee787ea3ac13ef512d359
|
|
| MD5 |
fd10db7aef8c626caef8f9a93a87ff41
|
|
| BLAKE2b-256 |
9f89bed8ba7928f92a6baa949ad3b335138b917d6cf23aecac94b916dddb7575
|