# 🎨 Image Gen MCP Server

> *"Fine. I'll do it myself."* — Thanos (and also me, after trying five different MCP servers that couldn't mix-and-match image models)
I wanted a single, simple MCP server that lets agents generate and edit images across OpenAI, Google (Gemini/Imagen), Azure, Vertex, and OpenRouter—without yak‑shaving. So… here it is.
A multi‑provider Model Context Protocol (MCP) server for image generation and editing with a unified, type‑safe API. It returns MCP ImageContent blocks plus compact structured JSON so your client can route, log, or inspect results cleanly.
> [!IMPORTANT]
> This `README.md` is the canonical reference for API, capabilities, and usage. Some `/docs` files may lag behind.
## 🗺️ Table of Contents
- Why this exists
- Features
- Quick start (users)
- Quick start (developers)
- Configure `mcp.json`
- Tools API
- Providers & Models
- Python client example
- Environment Variables
- Running via FastMCP CLI
- Troubleshooting & FAQ
- Contributing & Releases
- License
## 🧠 Why this exists
Because I couldn’t find an MCP server that spoke multiple image providers with one sane schema. Some only generated, some only edited, some required summoning three different CLIs at midnight.
This one prioritizes:
- One schema across providers (AR & diffusion)
- Minimal setup (`uvx` or `pip`, drop a `mcp.json`, done)
- Type‑safe I/O with clear error shapes
- Discoverability: ask the server what models are live via `get_model_capabilities`
## ✨ Features
- Unified tools: `generate_image`, `edit_image`, `get_model_capabilities`
- Providers: OpenAI, Azure OpenAI, Google Gemini, Vertex AI (Imagen & Gemini), OpenRouter
- Output: MCP `ImageContent` blocks + small JSON metadata
- Quality/size/orientation normalization
- Masking support where engines allow it
- Fail‑soft errors with a stable shape: `{ code, message, details? }`
## 🚀 Quick start (users)
Install and use as a published package.
```bash
# With uv (recommended)
uv add image-gen-mcp

# Or with pip
pip install image-gen-mcp
```
Then configure your MCP client.
### Configure `mcp.json`

Use `uvx` to run the server in an isolated environment with the correct dependencies:
```json
{
  "mcpServers": {
    "image-gen-mcp": {
      "command": "uvx",
      "args": ["--from", "image-gen-mcp", "image-gen-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-key-here"
      }
    }
  }
}
```
### First call

```json
{
  "tool": "generate_image",
  "params": {
    "prompt": "A vibrant painting of a fox in a sunflower field",
    "provider": "openai",
    "model": "gpt-image-1"
  }
}
```
## 🧑‍💻 Quick start (developers)
Run from source for local development or contributions.
### Prereqs

- Python 3.12+
- `uv` (recommended)
### Install deps

```bash
uv sync --all-extras --dev
```
### Environment

```bash
cp .env.example .env
# Add your keys
```
### Run the server

```bash
# stdio (direct)
python -m image_gen_mcp.main

# via FastMCP CLI
fastmcp run image_gen_mcp/main.py:app
```
### Local VS Code `mcp.json` for testing

If you use a VS Code extension or local tooling that reads `.vscode/mcp.json`, here's a safe example for running the local server (do NOT commit secrets):
```json
{
  "servers": {
    "image-gen-mcp": {
      "command": "python",
      "args": ["-m", "image_gen_mcp.main"],
      "env": {
        "# NOTE": "Replace with your local keys for testing; do not commit.",
        "OPENROUTER_API_KEY": "__REPLACE_WITH_YOUR_KEY__"
      }
    }
  },
  "inputs": []
}
```
Use this to run the server from your workspace instead of installing the package from PyPI. For CI or shared repos, store secrets in the environment or a secret manager and avoid checking them into git.
### Dev tasks

```bash
uv run pytest -v
uv run ruff check .
uv run black --check .
uv run pyright
```
## 🧰 Tools API

All tools take named parameters. Outputs include structured JSON (for metadata/errors) and MCP `ImageContent` blocks (for actual images).

### `generate_image`
Create one or more images from a text prompt.
**Example**

```json
{
  "prompt": "A vibrant painting of a fox in a sunflower field",
  "provider": "openai",
  "model": "gpt-image-1",
  "n": 2,
  "size": "M",
  "orientation": "landscape"
}
```
**Parameters**

| Field | Type | Description |
|---|---|---|
| `prompt` | `str` | Required. Text description. |
| `provider` | `enum` | Required. `openai` \| `openrouter` \| `azure` \| `vertex` \| `gemini`. |
| `model` | `enum` | Required. Model id (see matrix). |
| `n` | `int` | Optional. Default 1; provider limits apply. |
| `size` | `enum` | Optional. `S` \| `M` \| `L`. |
| `orientation` | `enum` | Optional. `square` \| `portrait` \| `landscape`. |
| `quality` | `enum` | Optional. `draft` \| `standard` \| `high`. |
| `background` | `enum` | Optional. `transparent` \| `opaque` (when supported). |
| `negative_prompt` | `str` | Optional. Used when the provider supports it. |
| `directory` | `str` | Optional. Filesystem directory where the server should save generated images. If omitted, a unique temp directory is used. |
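Combining the optional fields above, a fuller `generate_image` request might look like this (all values are illustrative):

```json
{
  "prompt": "A vibrant painting of a fox in a sunflower field",
  "provider": "openai",
  "model": "gpt-image-1",
  "n": 2,
  "size": "L",
  "orientation": "portrait",
  "quality": "high",
  "background": "transparent",
  "directory": "/tmp/fox-images"
}
```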
### `edit_image`
Edit an image with a prompt and optional mask.
**Example**

```json
{
  "prompt": "Remove the background and make the subject wear a red scarf",
  "provider": "openai",
  "model": "gpt-image-1",
  "images": ["data:image/png;base64,..."],
  "mask": null
}
```
**Parameters**

| Field | Type | Description |
|---|---|---|
| `prompt` | `str` | Required. Edit instruction. |
| `images` | `list<str>` | Required. One or more source images (base64, data URL, or https URL). Most models use only the first image. |
| `mask` | `str` | Optional. Mask as base64/data URL/https URL. |
| `provider` | `enum` | Required. See above. |
| `model` | `enum` | Required. Model id (see matrix). |
| `n` | `int` | Optional. Default 1; provider limits apply. |
| `size` | `enum` | Optional. `S` \| `M` \| `L`. |
| `orientation` | `enum` | Optional. `square` \| `portrait` \| `landscape`. |
| `quality` | `enum` | Optional. `draft` \| `standard` \| `high`. |
| `background` | `enum` | Optional. `transparent` \| `opaque`. |
| `negative_prompt` | `str` | Optional. Negative prompt. |
| `directory` | `str` | Optional. Filesystem directory where the server should save edited images. If omitted, a unique temp directory is used. |
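The `images` and `mask` parameters accept data URLs; a tiny helper like the following (hypothetical, not part of this package) builds one from a local file:

```python
import base64


def to_data_url(path: str, mime: str = "image/png") -> str:
    """Encode a local image file as a data URL suitable for `images`/`mask`."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{b64}"
```
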
### `get_model_capabilities`
Discover which providers/models are actually enabled based on your environment.
**Example**

```json
{ "provider": "openai" }
```

Call with no params to list all enabled providers/models.

Output: a `CapabilitiesResponse` with providers, models, and features.
## 🧭 Providers & Models

Routing is handled by a `ModelFactory` that maps model → engine. A compact, curated list keeps things understandable.

### Model Matrix
| Model | Family | Providers | Generate | Edit | Mask |
|---|---|---|---|---|---|
| `gpt-image-1` | AR | openai, azure | ✅ | ✅ | ✅ (OpenAI/Azure) |
| `dall-e-3` | Diffusion | openai, azure | ✅ | ❌ | — |
| `gemini-2.5-flash-image-preview` | AR | gemini, vertex | ✅ | ✅ (maskless) | ❌ |
| `imagen-4.0-generate-001` | Diffusion | vertex | ✅ | ❌ | — |
| `imagen-3.0-generate-002` | Diffusion | vertex | ✅ | ❌ | — |
| `imagen-4.0-fast-generate-001` | Diffusion | vertex | ✅ | ❌ | — |
| `imagen-4.0-ultra-generate-001` | Diffusion | vertex | ✅ | ❌ | — |
| `imagen-3.0-capability-001` | Diffusion | vertex | ✅ | ✅ | ✅ (mask via mask config) |
| `google/gemini-2.5-flash-image-preview` | AR | openrouter | ✅ | ✅ (maskless) | ❌ |
### Provider Model Support
| Provider | Supported Models |
|---|---|
| `openai` | `gpt-image-1`, `dall-e-3` |
| `azure` | `gpt-image-1`, `dall-e-3` |
| `gemini` | `gemini-2.5-flash-image-preview` |
| `vertex` | `imagen-4.0-generate-001`, `imagen-4.0-fast-generate-001`, `imagen-4.0-ultra-generate-001`, `imagen-3.0-generate-002`, `imagen-3.0-capability-001`, `gemini-2.5-flash-image-preview` |
| `openrouter` | `google/gemini-2.5-flash-image-preview` |
## 🐍 Python client example

```python
import asyncio

from fastmcp import Client


async def main():
    # Passing the server script path lets FastMCP launch it over stdio.
    async with Client("image_gen_mcp/main.py") as client:
        # 1) Capabilities
        caps = await client.call_tool("get_model_capabilities")
        print("Capabilities:", caps.structured_content or caps.text)

        # 2) Generate
        gen_result = await client.call_tool(
            "generate_image",
            {
                "prompt": "a watercolor fox in a forest, soft light",
                "provider": "openai",
                "model": "gpt-image-1",
            },
        )
        print("Generate Result:", gen_result.structured_content)
        print("Image blocks:", len(gen_result.content))


asyncio.run(main())
```
## 🔐 Environment variables
Set only what you need:
| Variable | Required for | Description |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI | API key for OpenAI. |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI | Azure OpenAI key. |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI | Azure endpoint URL. |
| `AZURE_OPENAI_API_VERSION` | Azure OpenAI | Optional; default `2024-02-15-preview`. |
| `GEMINI_API_KEY` | Gemini | Gemini Developer API key. |
| `OPENROUTER_API_KEY` | OpenRouter | OpenRouter API key. |
| `VERTEX_PROJECT` | Vertex AI | GCP project id. |
| `VERTEX_LOCATION` | Vertex AI | GCP region (e.g. `us-central1`). |
| `VERTEX_CREDENTIALS_PATH` | Vertex AI | Optional path to GCP JSON credentials; ADC is supported. |
## 🏃 Running via FastMCP CLI
Supports multiple transports:
- stdio: `fastmcp run image_gen_mcp/main.py:app`
- SSE (HTTP): `fastmcp run image_gen_mcp/main.py:app --transport sse --host 127.0.0.1 --port 8000`
- HTTP: `fastmcp run image_gen_mcp/main.py:app --transport http --host 127.0.0.1 --port 8000 --path /mcp`
## Design notes

- Schema: public contract in `image_gen_mcp/schema.py` (Pydantic).
- Engines: modular adapters in `image_gen_mcp/engines/`, selected by `ModelFactory`.
- Capabilities: discovered dynamically via `image_gen_mcp/settings.py`.
- Errors: stable JSON error shape `{ code, message, details? }`.
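Since errors always arrive in that shape, a client can branch on them with a tiny check like the following (hypothetical helper, not part of the package):

```python
def parse_tool_error(payload):
    """Return (code, message) if payload matches the { code, message, details? } shape, else None."""
    if isinstance(payload, dict) and "code" in payload and "message" in payload:
        return payload["code"], payload["message"]
    return None
```
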
## 🛟 Troubleshooting & FAQ

**Which size does `S | M | L` map to?**
Sizes are normalized; exact pixel dimensions vary by provider.
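Conceptually, the normalization behaves like a lookup table. The pixel values below are purely illustrative, not the server's actual mapping:

```python
# Purely illustrative dimensions; each engine maps S/M/L to its own supported sizes.
SIZE_MAP = {
    ("M", "square"): (1024, 1024),
    ("M", "landscape"): (1536, 1024),
    ("M", "portrait"): (1024, 1536),
}


def normalize_size(size: str = "M", orientation: str = "square") -> tuple[int, int]:
    """Resolve a (size, orientation) pair to pixel dimensions, defaulting to square M."""
    return SIZE_MAP.get((size, orientation), (1024, 1024))
```
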
**Can I mask with Gemini or Imagen?**
Gemini engines support maskless editing only. For masked edits, use `gpt-image-1` (OpenAI/Azure) or `imagen-3.0-capability-001` (Vertex), per the model matrix.
**How do I discover what's live right now?**
Call `get_model_capabilities` (optionally with `"provider": "openai"`, etc.).
## ⚠️ Testing remarks

I tested this project locally using the OpenRouter-backed model only. I could not access Gemini or OpenAI from my location (Hong Kong) due to regional restrictions — thanks, US government — so I couldn't fully exercise those providers.
Because of that limitation, the gemini/vertex and openai (including Azure) adapters may contain bugs or untested edge cases. If you use those providers and find issues, please open an issue or, even better, submit a pull request with a fix — contributions are welcome.
Suggested info to include when filing an issue:
- Your provider and model (e.g., `openai:gpt-image-1`, `vertex:imagen-4.0-generate-001`)
- Full stderr/server logs showing the error
- Minimal reproduction steps or a short test script
Thanks — and PRs welcome!
## 🤝 Contributing & Releases
PRs welcome! Please run tests and linters locally.
### Release process (GitHub Actions)

1. **Automated (recommended)**
   - Actions → Manual Release
   - Pick version bump: patch / minor / major
   - The workflow tags, builds the changelog, and publishes to PyPI
2. **Manual**
   - `git tag vX.Y.Z`
   - `git push origin vX.Y.Z`
   - Create a GitHub Release from the tag
## 📄 License

Apache-2.0 — see `LICENSE`.