# Moondream Python Client Library
Official Python client library for Moondream, a fast multi-function VLM. This client can target either Moondream Cloud or Moondream Station.
## Capabilities
Moondream goes beyond the typical VLM "query" ability to include additional visual functions:

| Method | Description |
|---|---|
| `caption` | Generate descriptive captions for images |
| `query` | Ask questions about image content |
| `detect` | Find bounding boxes around objects in images |
| `point` | Identify the center location of specified objects |
| `segment` | Generate an SVG path segmentation mask for objects |
Try it out on Moondream's playground.
## Installation

```bash
pip install moondream
```
## Quick Start
Choose how you want to run Moondream:
- Moondream Cloud — Get an API key from the cloud console
- Moondream Station — Run locally by installing Moondream Station
```python
import moondream as md
from PIL import Image

# Initialize with Moondream Cloud
model = md.vl(api_key="<your-api-key>")

# Or initialize with a local Moondream Station
model = md.vl(endpoint="http://localhost:2020/v1")

# Load an image
image = Image.open("path/to/image.jpg")

# Generate a caption
caption = model.caption(image)["caption"]
print("Caption:", caption)

# Ask a question
answer = model.query(image, "What's in this image?")["answer"]
print("Answer:", answer)

# Stream the response
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="", flush=True)
```
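If you move between Cloud and a local Station, a small helper can pick the constructor arguments at runtime. This is only a sketch: the `MOONDREAM_API_KEY` environment variable name is an assumption made here for illustration, not something the library reads on its own.

```python
import os

def vl_kwargs():
    """Choose md.vl() arguments: Cloud when an API key is set, local Station otherwise.

    MOONDREAM_API_KEY is a hypothetical variable name used in this sketch;
    the moondream library does not read it itself.
    """
    api_key = os.environ.get("MOONDREAM_API_KEY")
    if api_key:
        return {"api_key": api_key}
    return {"endpoint": "http://localhost:2020/v1"}

# Usage: model = md.vl(**vl_kwargs())
```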
## API Reference

### Constructor

```python
model = md.vl(api_key="<your-api-key>")             # Cloud
model = md.vl(endpoint="http://localhost:2020/v1")  # Local
```

### Methods
#### `caption(image, length="normal", stream=False)`

Generate a caption for an image.

Parameters:

- `image` — `Image.Image` or `EncodedImage`
- `length` — `"normal"`, `"short"`, or `"long"` (default: `"normal"`)
- `stream` — `bool` (default: `False`)

Returns: `CaptionOutput` — `{"caption": str | Generator}`

```python
caption = model.caption(image, length="short")["caption"]

# With streaming
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="", flush=True)
```
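With `stream=True` the `"caption"` value is a generator of text chunks, so joining the chunks should reproduce the full caption. A minimal accumulator, demonstrated here on a simulated chunk list rather than a live model call:

```python
def stream_to_text(chunks):
    """Join streamed text chunks (from caption or query) into one string."""
    return "".join(chunks)

# Simulated chunks; a real stream comes from model.caption(image, stream=True)["caption"]
print(stream_to_text(iter(["A tabby ", "cat on ", "a sofa."])))  # A tabby cat on a sofa.
```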
#### `query(image, question, stream=False)`

Ask a question about an image.

Parameters:

- `image` — `Image.Image` or `EncodedImage`
- `question` — `str`
- `stream` — `bool` (default: `False`)

Returns: `QueryOutput` — `{"answer": str | Generator}`

```python
answer = model.query(image, "What's in this image?")["answer"]

# With streaming
for chunk in model.query(image, "What's in this image?", stream=True)["answer"]:
    print(chunk, end="", flush=True)
```
#### `detect(image, object)`

Detect specific objects in an image.

Parameters:

- `image` — `Image.Image` or `EncodedImage`
- `object` — `str`

Returns: `DetectOutput` — `{"objects": List[Region]}`

```python
objects = model.detect(image, "car")["objects"]
```
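To draw or crop a detection you need pixel coordinates. A small sketch, assuming each `Region` dict carries coordinates normalized to [0, 1] (consistent with how `SpatialRef` is described in the Types section); the sample region values are hypothetical:

```python
def region_to_pixels(region, width, height):
    """Scale a Region dict (assumed normalized to [0, 1]) to integer pixel coords."""
    return (
        int(region["x_min"] * width),
        int(region["y_min"] * height),
        int(region["x_max"] * width),
        int(region["y_max"] * height),
    )

# Hypothetical detection on a 640x480 image
region = {"x_min": 0.25, "y_min": 0.5, "x_max": 0.75, "y_max": 1.0}
print(region_to_pixels(region, 640, 480))  # (160, 240, 480, 480)
```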
#### `point(image, object)`

Get coordinates of specific objects in an image.

Parameters:

- `image` — `Image.Image` or `EncodedImage`
- `object` — `str`

Returns: `PointOutput` — `{"points": List[Point]}`

```python
points = model.point(image, "person")["points"]
```
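As with regions, mapping `Point` dicts onto an image requires scaling by its size. A minimal helper, again assuming coordinates normalized to [0, 1]; the sample point is hypothetical:

```python
def points_to_pixels(points, width, height):
    """Scale Point dicts (assumed normalized to [0, 1]) to (x, y) pixel tuples."""
    return [(round(p["x"] * width), round(p["y"] * height)) for p in points]

print(points_to_pixels([{"x": 0.5, "y": 0.25}], 640, 480))  # [(320, 120)]
```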
#### `segment(image, object, spatial_refs=None, stream=False)`

Segment an object from an image and return an SVG path.

Parameters:

- `image` — `Image.Image` or `EncodedImage`
- `object` — `str`
- `spatial_refs` — `List[[x, y] | [x1, y1, x2, y2]]` — optional spatial hints (normalized 0-1)
- `stream` — `bool` (default: `False`)

Returns:

- Non-streaming: `SegmentOutput` — `{"path": str, "bbox": Region}`
- Streaming: Generator yielding update dicts

```python
result = model.segment(image, "cat")
svg_path = result["path"]
bbox = result["bbox"]  # {"x_min": ..., "y_min": ..., "x_max": ..., "y_max": ...}

# With spatial hint (point)
result = model.segment(image, "cat", spatial_refs=[[0.5, 0.5]])

# With streaming
for update in model.segment(image, "cat", stream=True):
    if "bbox" in update and not update.get("completed"):
        print(f"Bbox: {update['bbox']}")  # Available in first message
    if "chunk" in update:
        print(update["chunk"], end="")  # Coarse path chunks
    if update.get("completed"):
        print(f"Final path: {update['path']}")  # Refined path
        print(f"Final bbox: {update['bbox']}")
```
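If you only care about the final result but still want to stream, the update dicts can be folded into a single `(path, bbox)` pair. A sketch based on the streaming protocol shown above, exercised here on simulated updates (the path and bbox values are made up for illustration):

```python
def consume_segment_stream(updates):
    """Fold a stream of segment update dicts into (path, bbox).

    Early messages carry "bbox" and coarse "chunk"s; the final message has
    completed=True with the refined "path" and "bbox".
    """
    coarse = []
    for update in updates:
        if "chunk" in update:
            coarse.append(update["chunk"])
        if update.get("completed"):
            return update["path"], update["bbox"]
    # Fallback: stream ended without a completed message
    return "".join(coarse), None

# Simulated updates; real ones come from model.segment(image, "cat", stream=True)
updates = [
    {"bbox": {"x_min": 0.1, "y_min": 0.2, "x_max": 0.8, "y_max": 0.9}},
    {"chunk": "M10 20 "},
    {"chunk": "L30 40"},
    {"completed": True, "path": "M10 20 L30 40 Z",
     "bbox": {"x_min": 0.1, "y_min": 0.2, "x_max": 0.8, "y_max": 0.9}},
]
path, bbox = consume_segment_stream(updates)
print(path)  # M10 20 L30 40 Z
```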
#### `encode_image(image)`

Pre-encode an image for reuse across multiple calls.

Parameters:

- `image` — `Image.Image` or `EncodedImage`

Returns: `Base64EncodedImage`

```python
encoded = model.encode_image(image)
```
## Types

| Type | Description |
|---|---|
| `Image.Image` | PIL Image object |
| `EncodedImage` | Base class for encoded images |
| `Base64EncodedImage` | Output of `encode_image()`, subtype of `EncodedImage` |
| `Region` | Bounding box with `x_min`, `y_min`, `x_max`, `y_max` |
| `Point` | Coordinates with `x`, `y` indicating object center |
| `SpatialRef` | `[x, y]` point or `[x1, y1, x2, y2]` bbox, normalized to [0, 1] |
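Because a `SpatialRef` is normalized to [0, 1], a pixel location (for example, a user click) must be scaled down before being passed as a spatial hint. A minimal helper, assuming the pixel origin is the top-left corner:

```python
def to_spatial_ref(x_px, y_px, width, height):
    """Convert a pixel coordinate to an [x, y] SpatialRef normalized to [0, 1]."""
    return [x_px / width, y_px / height]

print(to_spatial_ref(320, 240, 640, 480))  # [0.5, 0.5]
# Usage: model.segment(image, "cat", spatial_refs=[to_spatial_ref(320, 240, 640, 480)])
```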