Moondream 3 vision processor plugin for Vision Agents

Maintainers

nash0x7e2 tschellenbach

Moondream Plugin

This plugin provides Moondream 3 detection capabilities for vision-agents, enabling real-time zero-shot object detection on video streams. Choose between cloud-hosted or local processing depending on your needs.

Installation

uv add vision-agents-plugins-moondream

Choosing the Right Processor

CloudDetectionProcessor (Recommended for Most Users)

Use when: You want a simple setup with no infrastructure management
Pros: No model download, no GPU required, automatic updates
Cons: Requires API key, 2 RPS rate limit by default (can be increased)
Best for: Development, testing, low-to-medium volume applications

LocalDetectionProcessor (For Advanced Users)

Use when: You need higher throughput, have your own GPU infrastructure, or want to avoid rate limits
Pros: No rate limits, no API costs, full control over hardware
Cons: Requires GPU for best performance, model download on first use, infrastructure management
Best for: Production deployments, high-volume applications, Digital Ocean Gradient AI GPUs, or custom infrastructure

Quick Start

Using CloudDetectionProcessor (Hosted)

The CloudDetectionProcessor uses Moondream's hosted API. By default it has a 2 RPS (requests per second) rate limit and requires an API key. The rate limit can be adjusted by contacting the Moondream team to request a higher limit.

from vision_agents.plugins import moondream
from vision_agents.core import Agent

# Create a cloud processor with detection
processor = moondream.CloudDetectionProcessor(
    api_key="your-api-key",  # or set MOONDREAM_API_KEY env var
    detect_objects="person",  # or ["person", "car", "dog"] for multiple
    fps=30
)

# Use in an agent
agent = Agent(
    processors=[processor],
    llm=your_llm,  # placeholder: pass your own configured LLM instance
    # ... other components
)
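
The API key fallback described above (explicit argument first, then the MOONDREAM_API_KEY environment variable) can be checked before the processor is constructed. This is a minimal sketch: `resolve_api_key` is a hypothetical helper written for illustration and is not part of the plugin, which performs its own lookup.

```python
import os


def resolve_api_key(explicit_key=None, env=None):
    """Return the explicit key if given, otherwise MOONDREAM_API_KEY from env.

    Hypothetical helper for illustration; not part of the plugin.
    """
    env = os.environ if env is None else env
    key = explicit_key or env.get("MOONDREAM_API_KEY")
    if not key:
        # Fail early with a clear message instead of at the first API call.
        raise ValueError(
            "No Moondream API key: pass api_key=... or set MOONDREAM_API_KEY"
        )
    return key
```

Calling `resolve_api_key()` at startup surfaces a missing key immediately, and the returned value can be passed as `api_key` to `CloudDetectionProcessor`.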

Using LocalDetectionProcessor (On-Device)

If you are running on your own infrastructure or using a service like Digital Ocean's Gradient AI GPUs, you can use the LocalDetectionProcessor, which downloads the model from HuggingFace and runs it on device. By default it auto-detects the device, preferring CUDA, then MPS (Apple Silicon), then CPU. CUDA gives the best performance; actual throughput depends on your hardware configuration.

Note: The moondream3-preview model is gated and requires HuggingFace authentication:

Request access at https://huggingface.co/moondream/moondream3-preview
Set HF_TOKEN environment variable: export HF_TOKEN=your_token_here
Or run: huggingface-cli login
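
A quick pre-flight check can catch a missing token before the model download is attempted. The sketch below only inspects the HF_TOKEN environment variable; `hf_token_in_env` is a hypothetical helper, and it does not see a token stored on disk by `huggingface-cli login`, so a False result is a hint rather than proof that authentication will fail.

```python
import os


def hf_token_in_env(env=None):
    """Return True if HF_TOKEN is set to a non-empty value.

    Hypothetical helper for illustration. A token saved by
    `huggingface-cli login` lives on disk and is not detected here.
    """
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())


if not hf_token_in_env():
    print("HF_TOKEN is not set; export it or run `huggingface-cli login`.")
```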

from vision_agents.plugins import moondream
from vision_agents.core import Agent

# Create a local processor (no API key needed)
processor = moondream.LocalDetectionProcessor(
    detect_objects=["person", "car", "dog"],
    conf_threshold=0.3,
    device="cuda",  # or omit to auto-detect CUDA, then MPS, then CPU
    fps=30
)

# Use in an agent
agent = Agent(
    processors=[processor],
    llm=your_llm,  # placeholder: pass your own configured LLM instance
    # ... other components
)

Detect Multiple Objects

from vision_agents.plugins import moondream

# Detect multiple object types with zero-shot detection
processor = moondream.CloudDetectionProcessor(
    api_key="your-api-key",
    detect_objects=["person", "car", "dog", "basketball"],
    conf_threshold=0.3
)

# Access results for LLM
state = processor.state()
print(state["detections_summary"])  # "Detected: 2 persons, 1 car"
print(state["detections_count"])  # Total number of detections
print(state["last_image"])  # PIL Image for vision models
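
To make the shape of `detections_summary` concrete, here is a sketch of how a string such as "Detected: 2 persons, 1 car" could be built from a list of detected labels. `summarize_detections`, its naive pluralization, and the empty-case wording are assumptions for illustration; the plugin's own implementation may differ.

```python
from collections import Counter


def summarize_detections(labels):
    """Build a short summary string from detected labels.

    Hypothetical helper: pluralization is a naive "+s" and the
    empty-case message is invented for this example.
    """
    if not labels:
        return "No objects detected"
    counts = Counter(labels)  # keeps first-seen order of labels
    parts = []
    for label, n in counts.items():
        noun = label if n == 1 else f"{label}s"
        parts.append(f"{n} {noun}")
    return "Detected: " + ", ".join(parts)


print(summarize_detections(["person", "car", "person"]))
```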

Configuration

CloudDetectionProcessor Parameters

api_key: str - API key for Moondream Cloud API. If not provided, will attempt to read from MOONDREAM_API_KEY environment variable.
detect_objects: str | List[str] - Object(s) to detect using zero-shot detection. Can be any object name like "person", "car", "basketball". Default: "person"
conf_threshold: float - Confidence threshold for detections (default: 0.3)
fps: int - Frame processing rate (default: 30)
interval: int - Processing interval in seconds (default: 0)
max_workers: int - Thread pool size for CPU-intensive operations (default: 10)
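
The effect of `conf_threshold` can be pictured as a filter over raw detections. The sketch below assumes each detection is a dict with `label` and `confidence` keys and that a detection at exactly the threshold is kept; both are assumptions made for illustration, not the plugin's documented data model.

```python
def filter_by_confidence(detections, conf_threshold=0.3):
    """Keep detections whose confidence meets the threshold.

    Assumes dicts with "label" and "confidence" keys (hypothetical shape)
    and an inclusive comparison.
    """
    return [d for d in detections if d["confidence"] >= conf_threshold]


raw = [
    {"label": "person", "confidence": 0.91},
    {"label": "car", "confidence": 0.12},
    {"label": "dog", "confidence": 0.30},
]
kept = filter_by_confidence(raw, conf_threshold=0.3)
print([d["label"] for d in kept])
```

Raising the threshold trades recall for fewer false positives; lowering it does the opposite.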

Rate Limits: By default, the Moondream Cloud API has a 2 RPS (requests per second) rate limit. Contact the Moondream team to request a higher limit.
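
To see what a 2 RPS limit means for a 30 fps stream, the sketch below picks which frame timestamps could be sent while keeping at least 1/RPS seconds between requests. `select_frames` is a hypothetical helper; it illustrates the arithmetic only and does not describe how the plugin or the Moondream API enforces the limit.

```python
def select_frames(timestamps, max_rps=2.0):
    """Return the timestamps that may be sent without exceeding max_rps.

    Hypothetical helper: a simple minimum-gap throttle.
    """
    min_gap = 1.0 / max_rps  # minimum seconds between two requests
    selected = []
    last_sent = None
    for t in timestamps:
        if last_sent is None or t - last_sent >= min_gap:
            selected.append(t)
            last_sent = t
    return selected


fps = 30
one_second = [i / fps for i in range(fps)]  # timestamps of 30 frames
sent = select_frames(one_second, max_rps=2.0)
print(len(sent), "of", len(one_second), "frames sent")
```

At the default limit, only 2 of every 30 frames can reach the API, so the remaining frames have to be skipped or annotated with the most recent result.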

LocalDetectionProcessor Parameters

detect_objects: str | List[str] - Object(s) to detect using zero-shot detection. Can be any object name like "person", "car", "basketball". Default: "person"
conf_threshold: float - Confidence threshold for detections (default: 0.3)
fps: int - Frame processing rate (default: 30)
interval: int - Processing interval in seconds (default: 0)
max_workers: int - Thread pool size for CPU-intensive operations (default: 10)
device: str - Device to run inference on ('cuda', 'mps', or 'cpu'). Auto-detects CUDA, then MPS (Apple Silicon), then defaults to CPU. Default: None (auto-detect)
model_name: str - Hugging Face model identifier (default: "moondream/moondream3-preview")
options: AgentOptions - Model directory configuration. If not provided, the default options are used, with the model directory set to tempfile.gettempdir()

Performance: Performance will vary depending on your hardware configuration. CUDA is recommended for best performance on NVIDIA GPUs. The model will be downloaded from HuggingFace on first use.
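
The documented auto-detection order (CUDA, then MPS, then CPU) can be sketched as follows. `pick_device` and `probe_torch` are hypothetical helpers written for illustration, not the plugin's code; the probe uses `torch.cuda.is_available()` and `torch.backends.mps.is_available()` and falls back to CPU when PyTorch is not installed.

```python
def pick_device(requested=None, cuda_available=False, mps_available=False):
    """Return the requested device, or auto-detect: cuda, then mps, then cpu."""
    if requested is not None:
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def probe_torch():
    """Report (cuda_available, mps_available); both False without PyTorch."""
    try:
        import torch
    except ImportError:
        return False, False
    cuda = torch.cuda.is_available()
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend is not None and mps_backend.is_available())
    return cuda, mps


cuda, mps = probe_torch()
print("Selected device:", pick_device(None, cuda, mps))
```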

Video Publishing

The processor publishes annotated video frames with bounding boxes drawn on detected objects:

from vision_agents.plugins import moondream

processor = moondream.CloudDetectionProcessor(
    api_key="your-api-key",
    detect_objects=["person", "car"]
)

# The track will show:
# - Green bounding boxes around detected objects
# - Labels with confidence scores
# - Real-time annotation overlay
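
Before a box can be drawn on a frame, its coordinates have to be in pixels. The sketch below assumes detections carry normalized `x_min`, `y_min`, `x_max`, `y_max` values in the range 0 to 1 and formats a label with a two-decimal confidence score. The key names, the label format, and both helper functions are assumptions for illustration; the plugin's annotation code may work differently.

```python
def to_pixel_box(box, frame_width, frame_height):
    """Convert a normalized box to clamped integer pixel coordinates.

    Assumes keys x_min, y_min, x_max, y_max in [0, 1] (hypothetical shape).
    """
    def scale(value, size):
        pixel = int(round(value * size))
        return max(0, min(size - 1, pixel))  # stay inside the frame

    return (
        scale(box["x_min"], frame_width),
        scale(box["y_min"], frame_height),
        scale(box["x_max"], frame_width),
        scale(box["y_max"], frame_height),
    )


def format_label(label, confidence):
    """Format an overlay label such as 'person 0.87'."""
    return f"{label} {confidence:.2f}"


box = {"x_min": 0.25, "y_min": 0.5, "x_max": 0.75, "y_max": 1.0}
print(to_pixel_box(box, 640, 480), format_label("person", 0.874))
```

The resulting corners could then be passed to a drawing call such as OpenCV's `cv2.rectangle`, with green expressed as `(0, 255, 0)` in BGR order.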

Testing

The plugin ships with a test suite covering inference, annotation, and processor state:

# Run all tests
pytest plugins/moondream/tests/ -v

# Run specific test categories
pytest plugins/moondream/tests/ -k "inference" -v
pytest plugins/moondream/tests/ -k "annotation" -v
pytest plugins/moondream/tests/ -k "state" -v

Dependencies

Required

vision-agents - Core framework
moondream - Moondream SDK for cloud API (CloudDetectionProcessor only)
numpy>=2.0.0 - Array operations
pillow>=10.0.0 - Image processing
opencv-python>=4.8.0 - Video annotation
aiortc - WebRTC support

LocalDetectionProcessor Additional Dependencies

torch - PyTorch for model inference
transformers - HuggingFace transformers library for model loading


Release history

0.6.0: May 18, 2026
0.5.9: May 15, 2026
0.5.8: May 13, 2026
0.5.7: May 7, 2026
0.5.6: May 5, 2026
0.5.5: Apr 27, 2026
0.5.4: Apr 15, 2026
0.5.3: Apr 14, 2026
0.5.2: Apr 13, 2026
0.5.1: Apr 7, 2026
0.5.0: Apr 1, 2026
0.4.7: Mar 27, 2026
0.4.6: Mar 26, 2026
0.4.5: Mar 25, 2026
0.4.4: Mar 23, 2026
0.4.3: Mar 11, 2026
0.4.2: Mar 10, 2026
0.4.1: Mar 4, 2026
0.4.0: Mar 3, 2026
0.3.8: Feb 24, 2026
0.3.7: Feb 23, 2026
0.3.6: Feb 13, 2026
0.3.5: Feb 10, 2026
0.3.4: Feb 6, 2026
0.3.3: Feb 4, 2026
0.3.2: Jan 27, 2026
0.3.1: Jan 21, 2026
0.3.0: Jan 20, 2026
0.2.10: Jan 14, 2026
0.2.9: Jan 9, 2026
0.2.8: Jan 8, 2026
0.2.7: Jan 7, 2026
0.2.6: Dec 16, 2025
0.2.5: Dec 12, 2025
0.2.4: Dec 12, 2025
0.2.3: Dec 7, 2025
0.2.2: Nov 29, 2025
0.2.1: Nov 21, 2025
0.2.0: Nov 14, 2025
0.1.14: Nov 11, 2025
0.1.13: Nov 3, 2025 (this version)

Download files

Source Distribution

vision_agents_plugins_moondream-0.1.13.tar.gz (14.2 kB), uploaded Nov 3, 2025, Source

Built Distribution

vision_agents_plugins_moondream-0.1.13-py3-none-any.whl (37.3 kB), uploaded Nov 3, 2025, Python 3

File details

Details for the file vision_agents_plugins_moondream-0.1.13.tar.gz.

File metadata

Download URL: vision_agents_plugins_moondream-0.1.13.tar.gz
Upload date: Nov 3, 2025
Size: 14.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.7

File hashes

Hashes for vision_agents_plugins_moondream-0.1.13.tar.gz
Algorithm	Hash digest
SHA256	`b9e8d7dbf389fea59b4e6fbc0a892399f57f21aa35eecdde46d086da767b75c7`
MD5	`159794e257ede140655e1dc6fd86fb89`
BLAKE2b-256	`893d5720cdf4a7aaff60858ea59f304c15e9d794a01cfe50387af62f784206d7`

File details

Details for the file vision_agents_plugins_moondream-0.1.13-py3-none-any.whl.

File metadata

Download URL: vision_agents_plugins_moondream-0.1.13-py3-none-any.whl
Upload date: Nov 3, 2025
Size: 37.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.7

File hashes

Hashes for vision_agents_plugins_moondream-0.1.13-py3-none-any.whl
Algorithm	Hash digest
SHA256	`da86fbc8810bde55d1ec968853033a3b7ed2b7abeb521b6a80fbecf700dd0924`
MD5	`eaa803df78a25b73d42cb7dc554f2f68`
BLAKE2b-256	`718ea3c2a50ff2a9d7353dd140e8fad89cb062bdf65669a5cfdfce200367c1c6`
