MCP image generation server with modular engines and routing

🎨 Image Gen MCP Server

"Fine. I'll do it myself." — Thanos (and also me, after trying five different MCP servers that couldn't mix-and-match image models)
I wanted a single, simple MCP server that lets agents generate and edit images across OpenAI, Google (Gemini/Imagen), Azure, Vertex, and OpenRouter—without yak‑shaving. So… here it is.

A multi‑provider Model Context Protocol (MCP) server for image generation and editing with a unified, type‑safe API. It returns MCP ImageContent blocks plus compact structured JSON so your client can route, log, or inspect results cleanly.

> [!IMPORTANT]
> This README.md is the canonical reference for API, capabilities, and usage. Some /docs files may lag behind.

🧠 Why this exists

Because I couldn’t find an MCP server that spoke multiple image providers with one sane schema. Some only generated, some only edited, some required summoning three different CLIs at midnight.
This one prioritizes:

One schema across providers (AR & diffusion)
Minimal setup (uvx or pip, drop in an mcp.json, done)
Type‑safe I/O with clear error shapes
Discoverability: ask the server what models are live via get_model_capabilities

✨ Features

Unified tools: generate_image, edit_image, get_model_capabilities
Providers: OpenAI, Azure OpenAI, Google Gemini, Vertex AI (Imagen & Gemini), OpenRouter
Output: MCP ImageContent blocks + small JSON metadata
Quality/size/orientation normalization
Masking support where engines allow it
Fail‑soft errors with stable shape: { code, message, details? }
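
On the client side, the stable error shape can be mirrored with a small type. This is a minimal sketch, assuming the error arrives as a plain dict; `ToolError`, `parse_error`, and the `provider_error` code value are hypothetical names for illustration, not part of the package.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ToolError:
    """Client-side mirror of the { code, message, details? } shape."""

    code: str
    message: str
    details: Optional[dict[str, Any]] = None


def parse_error(payload: dict[str, Any]) -> Optional[ToolError]:
    """Return a ToolError if the payload has the error fields, else None."""
    if "code" not in payload or "message" not in payload:
        return None
    return ToolError(
        code=str(payload["code"]),
        message=str(payload["message"]),
        details=payload.get("details"),  # optional field, may be absent
    )


# "provider_error" is a made-up code used only for this demo.
err = parse_error({"code": "provider_error", "message": "Upstream call failed"})
print(err)
```

Because the shape is stable, a client can branch on `code` without parsing free-form messages.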

🚀 Quick start (users)

Install and use as a published package.

# With uv (recommended)
uv add image-gen-mcp

# Or with pip
pip install image-gen-mcp

Then configure your MCP client.

Configure `mcp.json`

Use uvx to run in an isolated env with correct deps:

{
  "mcpServers": {
    "image-gen-mcp": {
      "command": "uvx",
      "args": ["--from", "image-gen-mcp", "image-gen-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-key-here"
      }
    }
  }
}

First call

{
  "tool": "generate_image",
  "params": {
    "prompt": "A vibrant painting of a fox in a sunflower field",
    "provider": "openai",
    "model": "gpt-image-1"
  }
}
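
The `tool`/`params` block above is shorthand. On the wire, MCP clients send tool calls as JSON-RPC 2.0 requests using the `tools/call` method, and most clients build that envelope for you. The sketch below only illustrates the mapping; `tools_call_request` is a hypothetical helper name.

```python
import json
from typing import Any


def tools_call_request(request_id: int, tool: str, params: dict[str, Any]) -> str:
    """Wrap a tool name and its parameters in an MCP `tools/call` request."""
    envelope = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": params},
    }
    return json.dumps(envelope, indent=2)


print(
    tools_call_request(
        1,
        "generate_image",
        {
            "prompt": "A vibrant painting of a fox in a sunflower field",
            "provider": "openai",
            "model": "gpt-image-1",
        },
    )
)
```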

🧑‍💻 Quick start (developers)

Run from source for local development or contributions.

Prereqs

Python 3.12+
uv (recommended)

Install deps

uv sync --all-extras --dev

Environment

cp .env.example .env
# Add your keys

Run the server

# stdio (direct)
python -m image_gen_mcp.main

# via FastMCP CLI
fastmcp run image_gen_mcp/main.py:app

Local VS Code `mcp.json` for testing

If you use a VS Code extension or local tooling that reads .vscode/mcp.json, here's a safe example to run the local server (do NOT commit secrets):

{
  "servers": {
    "image-gen-mcp": {
      "command": "python",
      "args": ["-m", "image_gen_mcp.main"],
      "env": {
        "# NOTE": "Replace with your local keys for testing; do not commit.",
        "OPENROUTER_API_KEY": "__REPLACE_WITH_YOUR_KEY__"
      }
    }
  },
  "inputs": []
}

Use this to run the server from your workspace instead of installing the package from PyPI. For CI or shared repos, store secrets in the environment or a secret manager and avoid checking them into git.

Dev tasks

uv run pytest -v
uv run ruff check .
uv run black --check .
uv run pyright

🧰 Tools API

All tools take named parameters. Outputs include structured JSON (for metadata/errors) and MCP ImageContent blocks (for actual images).

`generate_image`

Create one or more images from a text prompt.

Example

{
  "prompt": "A vibrant painting of a fox in a sunflower field",
  "provider": "openai",
  "model": "gpt-image-1",
  "n": 2,
  "size": "M",
  "orientation": "landscape"
}

Parameters

| Field | Type | Description |
| --- | --- | --- |
| `prompt` | str | Required. Text description. |
| `provider` | enum | Required. `openai` \| `openrouter` \| `azure` \| `vertex` \| `gemini`. |
| `model` | enum | Required. Model id (see matrix). |
| `n` | int | Optional. Default 1; provider limits apply. |
| `size` | enum | Optional. `S` \| `M` \| `L`. |
| `orientation` | enum | Optional. `square` \| `portrait` \| `landscape`. |
| `quality` | enum | Optional. `draft` \| `standard` \| `high`. |
| `background` | enum | Optional. `transparent` \| `opaque` (when supported). |
| `negative_prompt` | str | Optional. Used when provider supports it. |
| `directory` | str | Optional. Filesystem directory where the server should save generated images. If omitted, a unique temp directory is used. |
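
A client can check arguments against the documented enums before making a call. This is a sketch of such a pre-check, not the server's own validation (which lives in its Pydantic schema); `build_generate_args` is a hypothetical helper.

```python
from typing import Any

# Allowed values, transcribed from the parameter table above.
PROVIDERS = {"openai", "openrouter", "azure", "vertex", "gemini"}
ENUM_FIELDS = {
    "size": {"S", "M", "L"},
    "orientation": {"square", "portrait", "landscape"},
    "quality": {"draft", "standard", "high"},
    "background": {"transparent", "opaque"},
}


def build_generate_args(
    prompt: str, provider: str, model: str, **optional: Any
) -> dict[str, Any]:
    """Validate arguments locally and return the dict to send to generate_image."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    for key, choices in ENUM_FIELDS.items():
        if key in optional and optional[key] not in choices:
            raise ValueError(f"{key} must be one of {sorted(choices)}")
    if "n" in optional and int(optional["n"]) < 1:
        raise ValueError("n must be at least 1")
    return {"prompt": prompt, "provider": provider, "model": model, **optional}


args = build_generate_args(
    "A vibrant painting of a fox in a sunflower field",
    "openai",
    "gpt-image-1",
    n=2,
    size="M",
    orientation="landscape",
)
print(args)
```

The model id is passed through unchecked here, since the set of live models depends on your environment.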

`edit_image`

Edit an image with a prompt and optional mask.

Example

{
  "prompt": "Remove the background and make the subject wear a red scarf",
  "provider": "openai",
  "model": "gpt-image-1",
  "images": ["data:image/png;base64,..."],
  "mask": null
}

Parameters

| Field | Type | Description |
| --- | --- | --- |
| `prompt` | str | Required. Edit instruction. |
| `images` | list<str> | Required. One or more source images (base64, data URL, or https URL). Most models use only the first image. |
| `mask` | str | Optional. Mask as base64/data URL/https URL. |
| `provider` | enum | Required. See above. |
| `model` | enum | Required. Model id (see matrix). |
| `n` | int | Optional. Default 1; provider limits apply. |
| `size` | enum | Optional. `S` \| `M` \| `L`. |
| `orientation` | enum | Optional. `square` \| `portrait` \| `landscape`. |
| `quality` | enum | Optional. `draft` \| `standard` \| `high`. |
| `background` | enum | Optional. `transparent` \| `opaque`. |
| `negative_prompt` | str | Optional. Negative prompt. |
| `directory` | str | Optional. Filesystem directory where the server should save edited images. If omitted, a unique temp directory is used. |
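
Since `images` and `mask` accept data URLs, a local file has to be encoded first. A minimal sketch using only the standard library; `to_data_url` is a hypothetical helper name, and `photo.png` in the usage comment is a placeholder path.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str) -> str:
    """Encode a local image file as data:<mime>;base64,<payload>."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image file: {path}")
    payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{payload}"


# Usage (placeholder path):
#   edit_args = {
#       "prompt": "Remove the background",
#       "provider": "openai",
#       "model": "gpt-image-1",
#       "images": [to_data_url("photo.png")],
#   }
```

The MIME type is guessed from the file extension, so rename files with missing or misleading extensions before encoding.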

`get_model_capabilities`

Discover which providers/models are actually enabled based on your environment.

Example

{ "provider": "openai" }

Call with no params to list all enabled providers/models.

Output: a CapabilitiesResponse with providers, models, and features.

🧭 Providers & Models

Routing is handled by a ModelFactory that maps model → engine. A compact, curated list keeps things understandable.

Model Matrix

| Model | Family | Providers | Generate | Edit | Mask |
| --- | --- | --- | --- | --- | --- |
| `gpt-image-1` | AR | `openai`, `azure` | ✅ | ✅ | ✅ (OpenAI/Azure) |
| `dall-e-3` | Diffusion | `openai`, `azure` | ✅ | ❌ | — |
| `gemini-2.5-flash-image-preview` | AR | `gemini`, `vertex` | ✅ | ✅ (maskless) | ❌ |
| `imagen-4.0-generate-001` | Diffusion | `vertex` | ✅ | ❌ | — |
| `imagen-3.0-generate-002` | Diffusion | `vertex` | ✅ | ❌ | — |
| `imagen-4.0-fast-generate-001` | Diffusion | `vertex` | ✅ | ❌ | — |
| `imagen-4.0-ultra-generate-001` | Diffusion | `vertex` | ✅ | ❌ | — |
| `imagen-3.0-capability-001` | Diffusion | `vertex` | ✅ | ✅ | ✅ (mask via mask config) |
| `google/gemini-2.5-flash-image-preview` | AR | `openrouter` | ✅ | ✅ (maskless) | ❌ |

Provider Model Support

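| Provider | Supported Models |
| --- | --- |
| `openai` | `gpt-image-1`, `dall-e-3` |
| `azure` | `gpt-image-1`, `dall-e-3` |
| `gemini` | `gemini-2.5-flash-image-preview` |
| `vertex` | `imagen-4.0-generate-001`, `imagen-4.0-fast-generate-001`, `imagen-4.0-ultra-generate-001`, `imagen-3.0-generate-002`, `imagen-3.0-capability-001`, `gemini-2.5-flash-image-preview` |
| `openrouter` | `google/gemini-2.5-flash-image-preview` |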
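
The model matrix can also be expressed as a lookup table for client-side checks. The sketch below is transcribed by hand from the matrix and is not the package's ModelFactory; `Capability`, `MATRIX`, and `supports` are hypothetical names. At runtime, `get_model_capabilities` remains the source of truth.

```python
from typing import NamedTuple


class Capability(NamedTuple):
    generate: bool
    edit: bool
    mask: bool


# (provider, model) -> capability flags, copied from the model matrix.
MATRIX: dict[tuple[str, str], Capability] = {
    ("openai", "gpt-image-1"): Capability(True, True, True),
    ("azure", "gpt-image-1"): Capability(True, True, True),
    ("openai", "dall-e-3"): Capability(True, False, False),
    ("azure", "dall-e-3"): Capability(True, False, False),
    ("gemini", "gemini-2.5-flash-image-preview"): Capability(True, True, False),
    ("vertex", "gemini-2.5-flash-image-preview"): Capability(True, True, False),
    ("vertex", "imagen-4.0-generate-001"): Capability(True, False, False),
    ("vertex", "imagen-3.0-generate-002"): Capability(True, False, False),
    ("vertex", "imagen-4.0-fast-generate-001"): Capability(True, False, False),
    ("vertex", "imagen-4.0-ultra-generate-001"): Capability(True, False, False),
    ("vertex", "imagen-3.0-capability-001"): Capability(True, True, True),
    ("openrouter", "google/gemini-2.5-flash-image-preview"): Capability(True, True, False),
}


def supports(provider: str, model: str, feature: str) -> bool:
    """True if the matrix lists `feature` (generate/edit/mask) for this pair."""
    cap = MATRIX.get((provider, model))
    return bool(cap and getattr(cap, feature))


print(supports("openai", "gpt-image-1", "mask"))
print(supports("openai", "dall-e-3", "edit"))
```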

🐍 Python client example

import asyncio
from fastmcp import Client


async def main():
    # Passing a .py path makes the client launch the server itself over stdio,
    # so run this from the repository root; no separate server process is needed.
    async with Client("image_gen_mcp/main.py") as client:
        # 1) Capabilities
        caps = await client.call_tool("get_model_capabilities")
        print("Capabilities:", caps.structured_content or caps.content)

        # 2) Generate
        gen_result = await client.call_tool(
            "generate_image",
            {
                "prompt": "a watercolor fox in a forest, soft light",
                "provider": "openai",
                "model": "gpt-image-1",
            },
        )
        print("Generate Result:", gen_result.structured_content)
        print("Image blocks:", len(gen_result.content))


asyncio.run(main())
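
The server already writes images to `directory`, but a client may also want to persist the returned image blocks itself. A sketch, assuming each block exposes the MCP ImageContent fields `type`, `data` (base64), and `mimeType`; `save_image_blocks` is a hypothetical helper, and the demo uses a stand-in object rather than a real server response.

```python
import base64
import tempfile
from pathlib import Path
from types import SimpleNamespace

EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp"}


def save_image_blocks(blocks, out_dir: str) -> list[Path]:
    """Decode every image block into out_dir; non-image blocks are skipped."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, block in enumerate(blocks):
        if getattr(block, "type", None) != "image":
            continue
        suffix = EXTENSIONS.get(block.mimeType, ".bin")
        path = target / f"image_{index}{suffix}"
        path.write_bytes(base64.b64decode(block.data))
        saved.append(path)
    return saved


# Stand-in for an ImageContent block; real ones come from gen_result.content.
fake_block = SimpleNamespace(
    type="image",
    mimeType="image/png",
    data=base64.b64encode(b"demo-bytes").decode("ascii"),
)
print(save_image_blocks([fake_block], tempfile.mkdtemp()))
```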

🔐 Environment variables

Set only what you need:

| Variable | Required for | Description |
| --- | --- | --- |
| `OPENAI_API_KEY` | OpenAI | API key for OpenAI. |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI | Azure OpenAI key. |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI | Azure endpoint URL. |
| `AZURE_OPENAI_API_VERSION` | Azure OpenAI | Optional; default `2024-02-15-preview`. |
| `GEMINI_API_KEY` | Gemini | Gemini Developer API key. |
| `OPENROUTER_API_KEY` | OpenRouter | OpenRouter API key. |
| `VERTEX_PROJECT` | Vertex AI | GCP project id. |
| `VERTEX_LOCATION` | Vertex AI | GCP region (e.g. `us-central1`). |
| `VERTEX_CREDENTIALS_PATH` | Vertex AI | Optional path to GCP JSON; ADC supported. |
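
Before starting the server, it can help to check which providers your environment appears to configure. This sketch is only a guess at the rule: it assumes a provider needs every non-optional variable from the table, and the server's real detection in `image_gen_mcp/settings.py` may differ. `REQUIRED_VARS` and `configured_providers` are hypothetical names.

```python
import os
from collections.abc import Mapping

# Non-optional variables per provider, taken from the table above.
REQUIRED_VARS: dict[str, tuple[str, ...]] = {
    "openai": ("OPENAI_API_KEY",),
    "azure": ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"),
    "gemini": ("GEMINI_API_KEY",),
    "openrouter": ("OPENROUTER_API_KEY",),
    "vertex": ("VERTEX_PROJECT", "VERTEX_LOCATION"),
}


def configured_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """List providers whose required variables are all set and non-empty."""
    return [
        provider
        for provider, names in REQUIRED_VARS.items()
        if all(env.get(name) for name in names)
    ]


# Vertex is excluded here because VERTEX_LOCATION is missing.
print(configured_providers({"OPENAI_API_KEY": "sk-test", "VERTEX_PROJECT": "demo"}))
# prints ['openai']
```

Call `configured_providers()` with no arguments to inspect the current process environment.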

🏃 Running via FastMCP CLI

Supports multiple transports:

stdio: fastmcp run image_gen_mcp/main.py:app
SSE (HTTP): fastmcp run image_gen_mcp/main.py:app --transport sse --host 127.0.0.1 --port 8000
HTTP: fastmcp run image_gen_mcp/main.py:app --transport http --host 127.0.0.1 --port 8000 --path /mcp
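
When the server runs over HTTP, a FastMCP client can connect by URL instead of launching a subprocess. A sketch, assuming the HTTP command above with `--path /mcp`; `mcp_url` and `list_tool_names` are hypothetical helpers, and the network call only works while the server is running.

```python
def mcp_url(host: str, port: int, path: str = "/mcp") -> str:
    """Build the endpoint URL matching the --host/--port/--path flags."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


async def list_tool_names(url: str) -> list[str]:
    """Connect to a running server and return the names of its tools."""
    # Imported lazily so the URL helper works without fastmcp installed.
    from fastmcp import Client

    async with Client(url) as client:
        tools = await client.list_tools()
        return [tool.name for tool in tools]


# With the HTTP server from above running:
#   import asyncio
#   print(asyncio.run(list_tool_names(mcp_url("127.0.0.1", 8000))))
```

If everything is wired up, the result should include the three tools listed under Features.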

Design notes

Schema: public contract in image_gen_mcp/schema.py (Pydantic).
Engines: modular adapters in image_gen_mcp/engines/, selected by ModelFactory.
Capabilities: discovered dynamically via image_gen_mcp/settings.py.
Errors: stable JSON error { code, message, details? }.

🛟 Troubleshooting & FAQ

What sizes do S | M | L map to?
Sizes are normalized; exact pixel dimensions vary by provider.

Can I mask with Gemini or Imagen?
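Gemini models are maskless. For Imagen, the model matrix lists mask support only for imagen-3.0-capability-001 (via mask config); for other masked edits, use gpt-image-1 on OpenAI/Azure.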

How do I discover what’s live right now?
Call get_model_capabilities (optionally with "provider": "openai" etc.).

⚠️ Testing remarks

I tested this project locally using the openrouter-backed model only. I could not access Gemini or OpenAI from my location (Hong Kong) due to regional restrictions — thanks, US government — so I couldn't fully exercise those providers.

Because of that limitation, the gemini/vertex and openai (including Azure) adapters may contain bugs or untested edge cases. If you use those providers and find issues, please open an issue or, even better, submit a pull request with a fix — contributions are welcome.

Suggested info to include when filing an issue:

Your provider and model (e.g., openai:gpt-image-1, vertex:imagen-4.0-generate-001)
Full stderr/server logs showing the error
Minimal reproduction steps or a short test script

Thanks — and PRs welcome!

🤝 Contributing & Releases

PRs welcome! Please run tests and linters locally.

Release process (GitHub Actions)

Automated (recommended)
- Actions → Manual Release
- Pick version bump: patch / minor / major
- The workflow tags, builds the changelog, and publishes to PyPI

Manual
- git tag vX.Y.Z
- git push origin vX.Y.Z
- Create a GitHub Release from the tag

📄 License

Apache-2.0 — see LICENSE.

Release history

- 1.4.0 (Sep 11, 2025)
- 1.3.1 (Sep 10, 2025)
- 1.3.0 (Sep 10, 2025), this version
- 1.2.2 (Sep 10, 2025)
- 1.2.1 (Sep 9, 2025)
- 1.2.0 (Sep 9, 2025)
- 1.1.1 (Sep 8, 2025)
- 1.0.0 (Sep 8, 2025)
- 0.1.5 (Sep 8, 2025)

