Moondream Python Client Library
Official Python client library for Moondream, a tiny vision language model that can analyze images and answer questions about them. This library supports both local inference and cloud-based API access.
Features
- Local Inference: Run the model directly on your machine using CPU
- Cloud API: Access Moondream's hosted service for faster inference
- Streaming: Stream responses token by token for real-time output
- Multiple Model Sizes: Choose between 0.5B and 2B parameter models
- Multiple Tasks: Caption images, answer questions, detect objects, and locate points
Installation
Install the package from PyPI:
pip install moondream==0.0.6
Quick Start
Using Cloud API
To use Moondream's cloud API, you'll need an API key. Sign up for a free account at console.moondream.ai to get one, then use it to initialize the client as shown below.
import moondream as md
from PIL import Image
# Initialize with API key
model = md.vl(api_key="your-api-key")
# Load an image
image = Image.open("path/to/image.jpg")
# Generate a caption
caption = model.caption(image)["caption"]
print("Caption:", caption)
# Ask a question
answer = model.query(image, "What's in this image?")["answer"]
print("Answer:", answer)
# Stream the response
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="", flush=True)
Using Local Inference
First, download the model weights. We recommend the int8 weights for most applications:
| Model | Precision | Download Size | Memory Usage | Download Link |
|---|---|---|---|---|
| Moondream 2B | int8 | 1,733 MiB | 2,624 MiB | Download |
| Moondream 2B | int4 | 1,167 MiB | 2,002 MiB | Download |
| Moondream 0.5B | int8 | 593 MiB | 996 MiB | Download |
| Moondream 0.5B | int4 | 422 MiB | 816 MiB | Download |
Then use the model locally:
import moondream as md
from PIL import Image
# Initialize with local model path
model = md.vl(model="path/to/moondream-2b-int8.bin")
# Load and encode image
image = Image.open("path/to/image.jpg")
# Since encoding an image is computationally expensive, you can encode it once
# and reuse the encoded version for multiple queries/captions/etc. This avoids
# having to re-encode the same image multiple times.
encoded_image = model.encode_image(image)
# Generate caption
caption = model.caption(encoded_image)["caption"]
print("Caption:", caption)
# Ask questions
answer = model.query(encoded_image, "What's in this image?")["answer"]
print("Answer:", answer)
API Reference
Constructor
model = md.vl(
    model="path/to/model.bin",  # For local inference
    api_key="your-api-key"      # For cloud API access
)
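Pass model for local inference or api_key for cloud API access, matching the Quick Start examples above. For instance:
# Local inference from a downloaded weights file
local_model = md.vl(model="path/to/moondream-2b-int8.bin")
# Cloud API using a key from console.moondream.ai
cloud_model = md.vl(api_key="your-api-key")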
Methods
caption(image, length="normal", stream=False, settings=None)
Generate a caption for an image.
result = model.caption(image)
# or with streaming
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="")
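The length parameter controls caption verbosity; the sketch below assumes "short" is accepted alongside the default "normal" (check the current docs for the supported values):
# Assumes length="short" requests a terser caption than the default
short_caption = model.caption(image, length="short")["caption"]
print("Short caption:", short_caption)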
query(image, question, stream=False, settings=None)
Ask a question about an image.
result = model.query(image, "What's in this image?")
# or with streaming
for chunk in model.query(image, "What's in this image?", stream=True)["answer"]:
    print(chunk, end="")
detect(image, object)
Detect and locate specific objects in an image.
result = model.detect(image, "car")
point(image, object)
Get coordinates of specific objects in an image.
result = model.point(image, "person")
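Both detect and point return plain dictionaries (see Response Types below), so the results can be inspected directly; for example:
# Print each returned entry to see the exact fields the model provides
for region in model.detect(image, "car")["objects"]:
    print("Region:", region)
for point in model.point(image, "person")["points"]:
    print("Point:", point)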
Input Types
- Images can be provided as:
  - PIL.Image.Image objects
  - Encoded image objects (from model.encode_image())
Response Types
All methods return typed dictionaries:
- CaptionOutput: {"caption": str | Generator}
- QueryOutput: {"answer": str | Generator}
- DetectOutput: {"objects": List[Region]}
- PointOutput: {"points": List[Point]}
Performance Notes
- Local inference currently only supports CPU execution
- CUDA (GPU) and MPS (Apple Silicon) support coming soon
- For optimal performance with GPU/MPS, use the PyTorch implementation for now
File details
Details for the file moondream-0.0.6.tar.gz.
File metadata
- Download URL: moondream-0.0.6.tar.gz
- Upload date:
- Size: 14.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.4 CPython/3.12.4 Darwin/23.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b923767530af9969330d10ac4d078d205668b78482566d09a389722988ea68fe |
| MD5 | 20eb0d7412ff29d1f656c43fc45fdf69 |
| BLAKE2b-256 | d1b6b797b9b31cc83d68dce7b1d6604aec6785cc6bcc34846e474c4fa591c09b |
File details
Details for the file moondream-0.0.6-py3-none-any.whl.
File metadata
- Download URL: moondream-0.0.6-py3-none-any.whl
- Upload date:
- Size: 16.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.4 CPython/3.12.4 Darwin/23.5.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0b0dfa11c51d63f89d7c317307117930b42353a66f7827783e697a41e81501b7 |
| MD5 | f6466d47b8275652f2446c8981adbe82 |
| BLAKE2b-256 | dc8a7024d97b886345190f12c15722274a38d493de3866417e5ba2d85e59d8fc |