NVIDIA VLM plugin for Vision Agents

NVIDIA Plugin for Vision Agents

NVIDIA VLM integration for Vision Agents. Supports vision-language models (VLMs) through NVIDIA's Chat Completions API, with NVCF asset management for uploaded frames.

Features

  • Video understanding: Automatically buffers video frames and forwards them to NVIDIA VLMs
  • Streaming responses: Streams text responses and emits an event for each chunk in real time
  • Asset management: Automatically uploads frames as assets and deletes them after use
  • Configurable buffering: Adjustable frame rate (fps) and buffer duration (frame_buffer_seconds)
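
The interaction between frame rate and buffer duration can be pictured as a rolling window that holds at most fps × frame_buffer_seconds frames. The sketch below is illustrative only: RollingFrameBuffer is a hypothetical helper, not a class from the plugin, and it assumes that the oldest frames are discarded first, which this README does not confirm.

```python
from collections import deque


class RollingFrameBuffer:
    """Hypothetical helper: keeps only the newest fps * seconds frames."""

    def __init__(self, fps: int = 1, frame_buffer_seconds: int = 10) -> None:
        if fps <= 0 or frame_buffer_seconds <= 0:
            raise ValueError("fps and frame_buffer_seconds must be positive")
        self.capacity = fps * frame_buffer_seconds
        # A deque with maxlen drops the oldest item once it is full.
        self._frames = deque(maxlen=self.capacity)

    def add(self, frame: bytes) -> None:
        """Append a frame, evicting the oldest one if the buffer is full."""
        self._frames.append(frame)

    def snapshot(self) -> list:
        """Return the buffered frames, oldest first."""
        return list(self._frames)


# Assumption: frames are opaque byte strings; real frames would be images.
buffer = RollingFrameBuffer(fps=1, frame_buffer_seconds=10)
for i in range(25):
    buffer.add(f"frame-{i}".encode())

frames = buffer.snapshot()
print(len(frames), frames[0], frames[-1])
```

With the defaults (1 fps, 10 seconds) only the 10 most recent frames remain after 25 have been added, so a larger fps or buffer duration means more frames per request.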

Installation

uv add "vision-agents[nvidia]"
# or directly
uv add vision-agents-plugins-nvidia

Configuration

Set your NVIDIA API key:

export NVIDIA_API_KEY=your_nvidia_api_key
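
If you want to fail early when the key is missing, a small check before constructing the VLM can help. This is a minimal sketch: resolve_api_key is a hypothetical helper, and the precedence it applies (explicit argument first, then the NVIDIA_API_KEY environment variable) is an assumption rather than documented plugin behavior.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else NVIDIA_API_KEY."""
    key = api_key or os.environ.get("NVIDIA_API_KEY")
    if not key:
        raise RuntimeError("Set NVIDIA_API_KEY or pass api_key explicitly")
    return key
```

The returned value can then be passed as the api_key argument described in the parameter table.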

Usage

Basic VLM Usage

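import asyncio

from vision_agents.plugins import nvidia


async def main() -> None:
    vlm = nvidia.VLM(
        model="nvidia/cosmos-reason2-8b",
        fps=1,
        frame_buffer_seconds=10,
    )

    # The VLM buffers video frames automatically when it is used with an Agent.
    # simple_response is awaited, so the call must run inside an async function.
    response = await vlm.simple_response("What do you see?")
    print(response.text)


asyncio.run(main())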

With Custom Configuration

from vision_agents.plugins import nvidia

vlm = nvidia.VLM(
    model="nvidia/cosmos-reason2-8b",
    api_key="your-api-key",
    fps=2,
    frame_buffer_seconds=15,
    frame_width=1280,
    frame_height=720,
    max_tokens=2048,
    temperature=0.3,
    top_p=0.8,
)

Configuration Parameters

Parameter             Description                                        Default                      Type
model                 NVIDIA model ID                                    "nvidia/cosmos-reason2-8b"   str
api_key               NVIDIA API token (or use NVIDIA_API_KEY env var)   None                         Optional[str]
fps                   Frames per second to buffer                        1                            int
frame_buffer_seconds  Number of seconds to buffer                        10                           int
frame_width           Width of video frames to send                      800                          int
frame_height          Height of video frames to send                     600                          int
max_tokens            Maximum response tokens                            1024                         int
temperature           Temperature for sampling                           0.2                          float
top_p                 Top-p sampling parameter                           0.7                          float
frames_per_second     Frames per second for video models                 8                            int
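
When overriding several of these values, it can be convenient to merge them with the defaults and sanity-check them before constructing the VLM. The sketch below is one way to do that under stated assumptions: build_vlm_kwargs and DEFAULTS are hypothetical names, the defaults are copied from the table above (api_key and frames_per_second are left out for brevity), and the accepted ranges are common conventions for sampling parameters, not limits confirmed by the plugin.

```python
from typing import Any, Dict

# Defaults copied from the parameter table (api_key and frames_per_second omitted).
DEFAULTS: Dict[str, Any] = {
    "model": "nvidia/cosmos-reason2-8b",
    "fps": 1,
    "frame_buffer_seconds": 10,
    "frame_width": 800,
    "frame_height": 600,
    "max_tokens": 1024,
    "temperature": 0.2,
    "top_p": 0.7,
}

POSITIVE_INTS = ("fps", "frame_buffer_seconds", "frame_width",
                 "frame_height", "max_tokens")


def build_vlm_kwargs(**overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge overrides with defaults and validate them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"Unknown parameters: {sorted(unknown)}")
    settings = {**DEFAULTS, **overrides}
    for name in POSITIVE_INTS:
        if not isinstance(settings[name], int) or settings[name] <= 0:
            raise ValueError(f"{name} must be a positive integer")
    # Assumed ranges: temperature >= 0 and 0 < top_p <= 1.
    if settings["temperature"] < 0:
        raise ValueError("temperature must be non-negative")
    if not 0 < settings["top_p"] <= 1:
        raise ValueError("top_p must be in (0, 1]")
    return settings


kwargs = build_vlm_kwargs(fps=2, frame_buffer_seconds=15)
# Frames covered by the buffer window at these settings.
print(kwargs["fps"] * kwargs["frame_buffer_seconds"])
```

The resulting dictionary uses the same keyword names as the table, so it can be passed on as nvidia.VLM(**kwargs).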

Dependencies

  • vision-agents: Core Vision Agents framework
  • aiohttp>=3.9.0: Async HTTP client for API requests

Download files

Source Distribution

vision_agents_plugins_nvidia-0.5.9.tar.gz (6.8 kB)

Built Distribution

vision_agents_plugins_nvidia-0.5.9-py3-none-any.whl (11.3 kB)