Official Python client for Moondream, a fast and efficient vision language model.

Moondream Python Client Library

Official Python client library for Moondream, a fast multi-function VLM. This client can target Moondream Cloud or run locally via Photon — on NVIDIA GPUs (Linux x86_64 / aarch64 or Windows) or Apple Silicon Macs.

Capabilities

Moondream goes beyond the typical VLM "query" ability to include more visual functions:

Method    Description
caption   Generate descriptive captions for images
query     Ask questions about image content
detect    Find bounding boxes around objects in images
point     Identify the center location of specified objects
segment   Generate an SVG path segmentation mask for objects

Try it out on Moondream's playground.

Installation

pip install moondream

Quick Start

Choose how you want to run Moondream:

  1. Moondream Cloud: get an API key from the cloud console.
  2. Moondream Photon: high-performance local inference engine on NVIDIA GPUs (Linux / Windows) or Apple Silicon Macs (macOS 13+). Requires an API key.

import moondream as md
from PIL import Image

# Initialize with Moondream Cloud
model = md.vl(api_key="<your-api-key>")

# Or, instead, initialize with local inference (Photon: NVIDIA GPU or Apple Silicon).
# Uncomment this line and remove the Cloud initialization above.
# model = md.vl(api_key="<your-api-key>", local=True)

# Load an image
image = Image.open("path/to/image.jpg")

# Generate a caption
caption = model.caption(image)["caption"]
print("Caption:", caption)

# Ask a question
answer = model.query(image, "What's in this image?")["answer"]
print("Answer:", answer)

# Stream the response
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="", flush=True)

API Reference

Constructor

model = md.vl(api_key="<your-api-key>")                        # Cloud
model = md.vl(api_key="<your-api-key>", local=True)            # Photon (local: NVIDIA GPU or Apple Silicon)
model = md.vl(api_key="<your-api-key>", model="moondream3-preview/ft_id@step")  # Finetune
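
Every constructor form above takes an API key, so it can help to keep the key out of source code. The following is a minimal sketch, not part of the library: the helper load_api_key and the environment variable name MOONDREAM_API_KEY are assumptions made for illustration.

```python
import os


def load_api_key(env_var: str = "MOONDREAM_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise a clear error.

    The variable name is a hypothetical convention, not one defined by the library.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Moondream API key"
        )
    return key


# Usage (requires the moondream package and a valid key):
# import moondream as md
# model = md.vl(api_key=load_api_key())
```

Failing early with an explicit message avoids passing an empty key to md.vl and debugging an authentication error later.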

Methods

caption(image, length="normal", stream=False)

Generate a caption for an image.

Parameters:

  • image: Image.Image or EncodedImage
  • length: "normal", "short", or "long" (default: "normal")
  • stream: bool (default: False)

Returns: CaptionOutput {"caption": str | Generator}

caption = model.caption(image, length="short")["caption"]

# With streaming
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="", flush=True)
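
With stream=True the "caption" field is a generator, so the full text is only available once the chunks have been consumed. A small helper can print the chunks as they arrive and still return the assembled string. This is a sketch: collect_stream is a hypothetical name, and it assumes each chunk is a plain str.

```python
from typing import Iterable


def collect_stream(chunks: Iterable[str], echo: bool = True) -> str:
    """Consume streamed text chunks and return the assembled string."""
    parts = []
    for chunk in chunks:
        if echo:
            print(chunk, end="", flush=True)
        parts.append(chunk)
    if echo:
        print()  # end the line once the stream finishes
    return "".join(parts)


# Usage (requires an initialized model and a loaded image):
# text = collect_stream(model.caption(image, stream=True)["caption"])
```

The same helper should work for the "answer" generator returned by query(..., stream=True), under the same assumption about chunk type.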

query(image, question, stream=False)

Ask a question about an image.

Parameters:

  • image: Image.Image or EncodedImage
  • question: str
  • stream: bool (default: False)

Returns: QueryOutput {"answer": str | Generator}

answer = model.query(image, "What's in this image?")["answer"]

# With streaming
for chunk in model.query(image, "What's in this image?", stream=True)["answer"]:
    print(chunk, end="", flush=True)

detect(image, object)

Detect specific objects in an image.

Parameters:

  • image: Image.Image or EncodedImage
  • object: str

Returns: DetectOutput {"objects": List[Region]}

objects = model.detect(image, "car")["objects"]
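
To crop or draw a detected object, the Region usually has to be turned into pixel coordinates. The sketch below assumes that x_min, y_min, x_max, and y_max are fractions of the image size in [0, 1]; this page states that only for SpatialRef, so verify it against your own output first. The helper region_to_pixel_box is a hypothetical name.

```python
from typing import Mapping, Tuple


def region_to_pixel_box(
    region: Mapping[str, float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Convert a Region to a (left, upper, right, lower) pixel box.

    Assumes the Region values are fractions of the image size in [0, 1].
    """

    def scale(value: float, size: int) -> int:
        # Clamp to the image bounds so slightly out-of-range values stay valid.
        return max(0, min(size, round(value * size)))

    return (
        scale(region["x_min"], width),
        scale(region["y_min"], height),
        scale(region["x_max"], width),
        scale(region["y_max"], height),
    )


# Usage with a PIL image (image.size is (width, height)):
# for obj in model.detect(image, "car")["objects"]:
#     crop = image.crop(region_to_pixel_box(obj, *image.size))
```

The tuple order matches what PIL's Image.crop expects, so the result can be passed to it directly.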

point(image, object)

Get coordinates of specific objects in an image.

Parameters:

  • image: Image.Image or EncodedImage
  • object: str

Returns: PointOutput {"points": List[Point]}

points = model.point(image, "person")["points"]
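
When several instances are found, a common follow-up is to pick the one nearest a location of interest, such as the image center. The sketch below assumes each Point behaves like a mapping with "x" and "y" keys, as the Types table suggests. The helper nearest_point is a hypothetical name.

```python
import math
from typing import Mapping, Optional, Sequence


def nearest_point(
    points: Sequence[Mapping[str, float]], x: float, y: float
) -> Optional[Mapping[str, float]]:
    """Return the point closest to (x, y), or None when `points` is empty."""
    if not points:
        return None
    return min(points, key=lambda p: math.hypot(p["x"] - x, p["y"] - y))


# Usage (requires an initialized model and a loaded image):
# points = model.point(image, "person")["points"]
# closest = nearest_point(points, 0.5, 0.5)
```

The reference location must be expressed in the same coordinate space as the returned points.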

segment(image, object, spatial_refs=None, stream=False)

Segment an object from an image and return an SVG path.

Parameters:

  • image: Image.Image or EncodedImage
  • object: str
  • spatial_refs: List[[x, y] | [x1, y1, x2, y2]], optional spatial hints (normalized 0-1)
  • stream: bool (default: False)

Returns:

  • Non-streaming: SegmentOutput {"path": str, "bbox": Region}
  • Streaming: Generator yielding update dicts

result = model.segment(image, "cat")
svg_path = result["path"]
bbox = result["bbox"]  # {"x_min": ..., "y_min": ..., "x_max": ..., "y_max": ...}

# With spatial hint (point)
result = model.segment(image, "cat", spatial_refs=[[0.5, 0.5]])

# With streaming
for update in model.segment(image, "cat", stream=True):
    if "bbox" in update and not update.get("completed"):
        print(f"Bbox: {update['bbox']}")  # Available in first message
    if "chunk" in update:
        print(update["chunk"], end="")  # Coarse path chunks
    if update.get("completed"):
        print(f"Final path: {update['path']}")  # Refined path
        print(f"Final bbox: {update['bbox']}")
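
If only the final result is needed, the streaming loop above can be folded into one function. The sketch below relies on the update keys shown in that loop ("bbox", "chunk", "completed", "path"). It also assumes, without confirmation from this page, that the coarse chunks concatenate into a single path string. The helper consume_segment_stream is a hypothetical name.

```python
from typing import Any, Dict, Iterable, Optional


def consume_segment_stream(updates: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Reduce streamed segment updates to a single result dict.

    Returns {"path", "bbox", "coarse_path"}; "path" falls back to the
    coarse path if no completed update arrives.
    """
    coarse_chunks = []
    bbox: Optional[Dict[str, float]] = None
    path: Optional[str] = None
    for update in updates:
        if "bbox" in update:
            bbox = update["bbox"]  # a later bbox overwrites the first one
        if "chunk" in update:
            coarse_chunks.append(update["chunk"])
        if update.get("completed"):
            path = update.get("path")
    coarse_path = "".join(coarse_chunks)
    return {
        "path": path if path is not None else coarse_path,
        "bbox": bbox,
        "coarse_path": coarse_path,
    }


# Usage (requires an initialized model and a loaded image):
# result = consume_segment_stream(model.segment(image, "cat", stream=True))
```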

encode_image(image)

Pre-encode an image for reuse across multiple calls.

Parameters:

  • image: Image.Image or EncodedImage

Returns: Base64EncodedImage

encoded = model.encode_image(image)
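
Pre-encoding is useful when several methods run on the same image, because each method accepts an EncodedImage in place of the PIL image. The sketch below encodes once and reuses the result. The helper analyze_image and its result keys are hypothetical; the method calls are the ones documented above.

```python
from typing import Any, Dict


def analyze_image(model: Any, image: Any, question: str, target: str) -> Dict[str, Any]:
    """Encode `image` once, then run caption, query, and detect on the result."""
    encoded = model.encode_image(image)  # reused by every call below
    return {
        "caption": model.caption(encoded)["caption"],
        "answer": model.query(encoded, question)["answer"],
        "objects": model.detect(encoded, target)["objects"],
    }


# Usage (requires an initialized model and a loaded image):
# report = analyze_image(model, image, "What's in this image?", "car")
```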

Types

Type                 Description
Image.Image          PIL Image object
EncodedImage         Base class for encoded images
Base64EncodedImage   Output of encode_image(), subtype of EncodedImage
Region               Bounding box with x_min, y_min, x_max, y_max
Point                Coordinates with x, y indicating object center
SpatialRef           [x, y] point or [x1, y1, x2, y2] bbox, normalized to [0, 1]
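
Because a SpatialRef is normalized to [0, 1], a click position or box measured in pixels has to be divided by the image size before it is passed as spatial_refs. A minimal sketch follows; both helper names are hypothetical.

```python
from typing import List


def pixel_point_to_spatial_ref(x: float, y: float, width: int, height: int) -> List[float]:
    """Normalize a pixel coordinate to an [x, y] spatial ref in [0, 1]."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    def clamp(value: float) -> float:
        return max(0.0, min(1.0, value))

    return [clamp(x / width), clamp(y / height)]


def pixel_box_to_spatial_ref(
    x1: float, y1: float, x2: float, y2: float, width: int, height: int
) -> List[float]:
    """Normalize a pixel box to an [x1, y1, x2, y2] spatial ref in [0, 1]."""
    return (
        pixel_point_to_spatial_ref(x1, y1, width, height)
        + pixel_point_to_spatial_ref(x2, y2, width, height)
    )


# Usage with a PIL image (image.size is (width, height)):
# ref = pixel_point_to_spatial_ref(320, 120, *image.size)
# result = model.segment(image, "cat", spatial_refs=[ref])
```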
