
Gemini 3 Pro Image MCP server with advanced features: high-resolution output (1K-4K), reference images (up to 14), Google Search grounding, and thinking mode

Project description

Gemini 3 Pro Image MCP Server 🎨

Professional MCP server for Google's Gemini 3 Pro Image - state-of-the-art image generation with advanced reasoning, high-resolution output, and Google Search grounding.

✨ Features

Gemini 3 Pro Image Capabilities

  • High-Resolution Output: Generate images in 1K, 2K, and 4K resolutions
  • Advanced Text Rendering: Create legible, stylized text in infographics, menus, diagrams, and marketing assets
  • Up to 14 Reference Images: Mix up to 14 reference images in total (including at most 6 object images and at most 5 human images) for consistent style and characters
  • Google Search Grounding: Use real-time data from Google Search (weather, stocks, events, maps)
  • Thinking Mode: The model uses a reasoning process to refine the composition before generating the final output

Advanced Capabilities

  • 🤖 AI Prompt Enhancement: Automatically optimize prompts using Gemini Flash for superior results
  • 🔍 Google Search Integration: Generate images based on real-time information
  • 🎨 Reference Images: Use up to 14 images for style consistency and character preservation
  • 📐 Flexible Aspect Ratios: Support for 10 aspect ratios (1:1, 16:9, 9:16, 3:2, 4:3, 4:5, 5:4, 2:3, 3:4, 21:9)
  • 💭 Thought Process Visibility: See the model's thinking process (interim images and reasoning)
  • 🚀 Batch Processing: Generate multiple images efficiently with parallel processing
  • 🎯 Dual Modalities: Get both text explanations and images in responses

Production Ready

  • Comprehensive error handling and validation
  • Configurable settings via environment variables
  • Detailed logging and debugging
  • MCP resources for configuration and model information

🎬 Showcase - Gemini 3 Pro Image Features

Gemini 3 Pro Image - Experience state-of-the-art image generation with advanced reasoning and high-resolution output.

Key Features in Action

All images can be generated with 4K resolution and AI prompt enhancement enabled.

Example Use Cases

1. High-Resolution Professional Assets

Generate a 4K image of "modern office interior with natural lighting"
- Model: gemini-3-pro-image-preview
- Image Size: 4K
- Aspect Ratio: 16:9

2. Real-Time Data Visualization

Generate an image with Google Search grounding:
"Visualize the current weather forecast for the next 5 days in San Francisco as a clean, modern weather chart. Add a visual on what I should wear each day"
- Enable Google Search: true
- Aspect Ratio: 16:9

3. Reference Image Consistency

Use reference images to maintain consistent characters:
- Provide up to 5 human reference images
- Provide up to 6 object reference images
- Generate "An office group photo of these people, they are making funny faces"

4. Advanced Text Rendering

Generate infographics, menus, or diagrams with legible text:
"Create a restaurant menu with elegant typography showing appetizers, mains, and desserts"
- Image Size: 2K
- Aspect Ratio: 3:4

🔥 Why Gemini 3 Pro Image Is Powerful

  1. State-of-the-Art Quality: Built-in generation capabilities up to 4K resolution
  2. Advanced Reasoning: Thinking mode refines composition before final output
  3. Real-Time Grounding: Google Search integration for accurate, current data
  4. Character Consistency: Use up to 14 reference images to keep characters and style consistent
  5. Professional Features: Advanced text rendering for infographics and marketing

🚀 Quick Start

Prerequisites

Installation

Option 1: Using uv (Recommended)

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install and run the server
uvx ultimate-gemini-mcp

Option 2: Using pip

pip install ultimate-gemini-mcp

Option 3: From Source

git clone <repository-url>
cd ultimate-gemini-mcp
uv sync

Configuration

Create a .env file in your project directory:

cp .env.example .env
# Edit .env and add your GEMINI_API_KEY

Or set environment variables directly:

export GEMINI_API_KEY=your_api_key_here

📖 Usage

With Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "ultimate-gemini": {
      "command": "uvx",
      "args": ["ultimate-gemini-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Important Notes:

  1. Images are automatically saved to ~/gemini_images (your home directory). You can optionally set OUTPUT_DIR to customize this location:

    • macOS: "OUTPUT_DIR": "/Users/yourusername/custom_folder"
    • Windows: "OUTPUT_DIR": "C:\\Users\\YourUsername\\custom_folder"
  2. uvx path issues on macOS: If you get spawn uvx ENOENT errors, use the full path to uvx:

    "command": "/Users/yourusername/.local/bin/uvx"
    

Config file locations:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
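
If you prefer to add the entry from a script rather than by hand, the following is a minimal sketch, not something shipped with the package. The helper name `add_server_entry` is hypothetical; it mirrors the JSON shown above and only works on an in-memory dict, so reading and writing the config file is left to you.

```python
import json
from typing import Optional


def add_server_entry(config: dict, api_key: str, output_dir: Optional[str] = None) -> dict:
    """Return a copy of a claude_desktop_config.json dict with the server entry added.

    Hypothetical helper for illustration only.
    """
    env = {"GEMINI_API_KEY": api_key}
    if output_dir is not None:
        # Optional: without it, images are saved to ~/gemini_images.
        env["OUTPUT_DIR"] = output_dir

    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))  # keep servers that are already configured
    servers["ultimate-gemini"] = {
        "command": "uvx",
        "args": ["ultimate-gemini-mcp"],
        "env": env,
    }
    updated["mcpServers"] = servers
    return updated


if __name__ == "__main__":
    print(json.dumps(add_server_entry({}, "your-api-key-here"), indent=2))
```

The function copies the incoming dict instead of mutating it, so an existing configuration can be compared with the result before anything is written to disk.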

With Claude Code (VS Code)

# Add MCP server to Claude Code
claude mcp add ultimate-gemini \
  --env GEMINI_API_KEY=your-api-key \
  -- uvx ultimate-gemini-mcp

Note: Images are automatically saved to ~/gemini_images. To customize, add --env OUTPUT_DIR=/your/custom/path.

With Cursor

Add to Cursor's MCP configuration (.cursor/mcp.json):

{
  "mcpServers": {
    "ultimate-gemini": {
      "command": "uvx",
      "args": ["ultimate-gemini-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Note: Images are automatically saved to ~/gemini_images. Optionally add "OUTPUT_DIR": "/your/custom/path" to customize.

🎯 Available Models

Gemini 3 Pro Image

  • gemini-3-pro-image-preview (default): State-of-the-art image generation optimized for professional asset production with:
    • Built-in 1K, 2K, and 4K resolution support
    • Advanced text rendering capabilities
    • Up to 14 reference images for consistency
    • Google Search grounding for real-time data
    • Thinking mode with reasoning process
    • Support for both TEXT and IMAGE response modalities

🛠️ Tools

generate_image

Generate professional images using Gemini 3 Pro Image with advanced features.

Parameters:

  • prompt (required): Text description of the image to generate
  • model: Model to use (default: gemini-3-pro-image-preview)
  • enhance_prompt: Automatically enhance prompt using AI (default: true)
  • aspect_ratio: Aspect ratio like 1:1, 16:9, 9:16, 3:2, 4:5, etc. (default: 1:1)
  • image_size: Resolution: 1K, 2K, or 4K (default: 1K)
  • output_format: Image format: png, jpeg, webp (default: png)
  • reference_image_paths: List of paths to reference images (up to 14 total)
    • Maximum 6 object images for high-fidelity inclusion
    • Maximum 5 human images for character consistency
  • enable_google_search: Enable Google Search grounding for real-time data (default: false)
  • response_modalities: Response types like ["TEXT", "IMAGE"] (default: both)
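
The documented limits can be checked on the client side before a request is sent. The sketch below is illustrative and is not the server's own validation code: `build_generate_image_args` is a hypothetical helper, and the allowed values are copied from the parameter list and the aspect ratio list in this README.

```python
from typing import List, Optional

# Values taken from this README; the server itself may accept more or fewer.
ASPECT_RATIOS = {"1:1", "16:9", "9:16", "3:2", "4:3", "4:5", "5:4", "2:3", "3:4", "21:9"}
IMAGE_SIZES = {"1K", "2K", "4K"}
OUTPUT_FORMATS = {"png", "jpeg", "webp"}
MAX_REFERENCE_IMAGES = 14


def build_generate_image_args(
    prompt: str,
    aspect_ratio: str = "1:1",
    image_size: str = "1K",
    output_format: str = "png",
    reference_image_paths: Optional[List[str]] = None,
    enable_google_search: bool = False,
) -> dict:
    """Validate arguments against the documented limits and return them as a dict."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio}")
    if image_size not in IMAGE_SIZES:
        raise ValueError(f"unsupported image_size: {image_size}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")

    paths = list(reference_image_paths or [])
    if len(paths) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"at most {MAX_REFERENCE_IMAGES} reference images are allowed")

    return {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "image_size": image_size,
        "output_format": output_format,
        "reference_image_paths": paths,
        "enable_google_search": enable_google_search,
    }
```

The per-category limits (6 object images, 5 human images) are not checked here, because a file path alone does not say what an image depicts.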

Examples:

1. Basic image generation:
   Generate an image of "a serene mountain landscape at sunset with a lake reflection"

2. High-resolution with specific aspect ratio:
   Generate a 4K image of "modern minimalist architecture" with aspect_ratio 16:9

3. With Google Search grounding:
   Generate an image with Google Search enabled: "Current weather map for New York City"

4. With reference images:
   Generate an image with reference_image_paths: ["/path/person1.png", "/path/person2.png"]
   and prompt: "An office group photo of these people making funny faces"

batch_generate

Process multiple prompts efficiently with parallel batch processing.

Parameters:

  • prompts (required): List of text prompts
  • model: Model to use for all images
  • enhance_prompt: Enhance all prompts (default: true)
  • aspect_ratio: Aspect ratio for all images
  • batch_size: Parallel processing size (default: from config)

Example:

Batch generate images for these prompts:
1. "minimalist logo design for a tech startup"
2. "modern dashboard UI design"
3. "mobile app wireframe"
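
To illustrate what a parallel batch size means in practice, here is a sketch of one common pattern: prompts are split into chunks of `batch_size`, and each chunk runs concurrently. This is an assumption about the general technique, not the server's actual implementation. `run_in_batches` and `fake_generate` are hypothetical names, and the default of 8 is borrowed from the documented MAX_BATCH_SIZE default.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_in_batches(
    prompts: List[str],
    generate: Callable[[str], Awaitable[str]],
    batch_size: int = 8,
) -> List[str]:
    """Run `generate` over prompts, at most `batch_size` at a time, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: List[str] = []
    for start in range(0, len(prompts), batch_size):
        chunk = prompts[start:start + batch_size]
        # gather() runs the chunk concurrently and returns results in input order.
        results.extend(await asyncio.gather(*(generate(p) for p in chunk)))
    return results


async def fake_generate(prompt: str) -> str:
    """Stand-in for a real image request; it only echoes the prompt."""
    await asyncio.sleep(0)
    return f"image for: {prompt}"


if __name__ == "__main__":
    prompts = ["minimalist logo design for a tech startup", "modern dashboard UI design", "mobile app wireframe"]
    print(asyncio.run(run_in_batches(prompts, fake_generate, batch_size=2)))
```

Chunking caps the number of requests in flight, which helps to stay under API rate limits at the cost of waiting for the slowest request in each chunk.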

🎨 Advanced Features

AI Prompt Enhancement

When enabled (default), the server uses Gemini Flash to automatically enhance your prompts:

Original: a cat wearing a space helmet

Enhanced: A photorealistic portrait of a domestic tabby cat wearing a futuristic space helmet, close-up composition, warm studio lighting, detailed fur texture, reflective helmet visor showing subtle reflections, soft focus background, professional photography style

The enhanced prompt adds details about subject, composition, lighting, and style, so you get better results without needing prompt engineering expertise.

Google Search Grounding

Generate images based on real-time data:

Generate an image with Google Search enabled:
- prompt: "Visualize the current weather forecast for San Francisco as a modern chart"
- enable_google_search: true

The response will include grounding metadata with search sources used.

Reference Images for Consistency

Maintain consistent characters and objects across generations:

Generate an image with:
- prompt: "An office group photo of these people, they are making funny faces"
- reference_image_paths: ["/path/person1.png", "/path/person2.png", "/path/person3.png"]
- aspect_ratio: "5:4"
- image_size: "2K"

You can provide up to 14 reference images (max 6 objects, max 5 humans).

High-Resolution Assets

Generate professional 4K assets:

Generate a 4K image of "minimalist logo design for a tech startup"
with image_size: "4K" and aspect_ratio: "1:1"

⚙️ Configuration

Environment Variables

Variable                  | Description                      | Default
GEMINI_API_KEY            | Google Gemini API key (required) | -
OUTPUT_DIR                | Directory for generated images   | ~/gemini_images
ENABLE_PROMPT_ENHANCEMENT | Enable AI prompt enhancement     | true
ENABLE_BATCH_PROCESSING   | Enable batch processing          | true
DEFAULT_MODEL             | Default model                    | gemini-3-pro-image-preview
DEFAULT_IMAGE_SIZE        | Default resolution               | 2K
ENABLE_GOOGLE_SEARCH      | Enable Google Search grounding   | false
REQUEST_TIMEOUT           | API request timeout (seconds)    | 60
MAX_BATCH_SIZE            | Maximum parallel batch size      | 8
LOG_LEVEL                 | Logging level                    | INFO
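
As a rough illustration of how these variables and defaults fit together, the sketch below reads them into a typed object. It is not the server's settings loader: `Settings`, `load_settings`, and the accepted boolean spellings are assumptions made for this example. Only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    # Assumed spellings for "true"; the real server may parse booleans differently.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    output_dir: str
    enable_prompt_enhancement: bool
    enable_batch_processing: bool
    default_model: str
    default_image_size: str
    enable_google_search: bool
    request_timeout: int
    max_batch_size: int
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, using the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY not found")
    return Settings(
        gemini_api_key=api_key,
        output_dir=os.path.expanduser(env.get("OUTPUT_DIR", "~/gemini_images")),
        enable_prompt_enhancement=_as_bool(env.get("ENABLE_PROMPT_ENHANCEMENT", "true")),
        enable_batch_processing=_as_bool(env.get("ENABLE_BATCH_PROCESSING", "true")),
        default_model=env.get("DEFAULT_MODEL", "gemini-3-pro-image-preview"),
        default_image_size=env.get("DEFAULT_IMAGE_SIZE", "2K"),
        enable_google_search=_as_bool(env.get("ENABLE_GOOGLE_SEARCH", "false")),
        request_timeout=int(env.get("REQUEST_TIMEOUT", "60")),
        max_batch_size=int(env.get("MAX_BATCH_SIZE", "8")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing a plain dict as `env` makes it easy to try out a configuration without touching the real environment.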

📚 MCP Resources

models://list

View all available models with descriptions and features.

settings://config

View current server configuration.

🎭 Use Cases

Web Development

  • Hero images and banners
  • UI/UX mockups and wireframes
  • Logo and branding assets
  • Placeholder images

App Development

  • App icons and splash screens
  • User interface elements
  • Marketing materials
  • Documentation images

Content Creation

  • Blog post illustrations
  • Social media graphics
  • Presentation visuals
  • Product mockups

Creative Projects

  • Character design iterations
  • Concept art exploration
  • Style variations
  • Scene composition

📊 Gemini 3 Pro Image Features

Feature                 | Support               | Details
Resolution Options      | ✅ 1K, 2K, 4K         | Built-in high-resolution generation
Reference Images        | ✅ Up to 14           | Max 6 objects and max 5 humans for consistency
Google Search Grounding | ✅ Real-time data     | Weather, stocks, events, maps
Thinking Mode           | ✅ Advanced reasoning | Visible thought process and interim images
Text Rendering          | ✅ Advanced           | Legible text in infographics, menus, diagrams
Aspect Ratios           | ✅ 10 options         | Full flexibility for any format
Response Modalities     | ✅ TEXT + IMAGE       | Dual output modes
Prompt Enhancement      | ✅ Built-in           | AI-powered optimization
Thought Signatures      | ✅ Automatic          | Preserved across multi-turn interactions

Best For: Professional assets, marketing, real-time visualization

🐛 Troubleshooting

"spawn uvx ENOENT" error

  • Cause: Claude Desktop cannot find the uvx command in its PATH
  • Solution: Use the full path to uvx in your config:
    "command": "/Users/yourusername/.local/bin/uvx"
    
  • Find your uvx location with: which uvx
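
On Windows, or from a script, the same lookup can be done with Python's standard library. This is a small sketch; `resolve_command` is a hypothetical helper name, while `shutil.which` is the standard-library function that searches PATH.

```python
import shutil
from typing import Optional


def resolve_command(name: str = "uvx") -> Optional[str]:
    """Return the absolute path of a command found on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    path = resolve_command()
    print(path or "uvx was not found on PATH")
```

If a path is printed, use it as the "command" value in your config. Desktop applications can start with a different PATH than your terminal, which is why a command that works in a shell may still fail there.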

Custom output directory

  • Default: Images are automatically saved to ~/gemini_images in your home directory
  • Customize: Set OUTPUT_DIR in your MCP config if you want a different location:
    "env": {
      "GEMINI_API_KEY": "your-key",
      "OUTPUT_DIR": "/your/custom/path"
    }
    

"GEMINI_API_KEY not found"

  • Add your API key to .env or environment variables
  • Get a free key at Google AI Studio

"Content blocked by safety filters"

  • Modify your prompt to comply with content policies
  • Try rephrasing without potentially sensitive content

"Rate limit exceeded"

  • Wait a few moments and try again
  • Consider upgrading your API plan for higher limits

Images not saving

  • Check that OUTPUT_DIR exists and is writable
  • Verify you have sufficient disk space
  • Create the directory manually: mkdir -p /path/to/your/images
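
The checks above can be combined into one quick diagnostic. The sketch below is illustrative and not part of the server; `check_output_dir` is a hypothetical helper that optionally creates the directory and then confirms it is writable by creating a temporary file in it.

```python
import tempfile
from pathlib import Path


def check_output_dir(path: str, create: bool = False) -> Path:
    """Return the resolved directory if it exists and is writable, else raise."""
    directory = Path(path).expanduser()
    if create:
        # Equivalent to: mkdir -p <path>
        directory.mkdir(parents=True, exist_ok=True)
    if not directory.is_dir():
        raise FileNotFoundError(f"{directory} does not exist or is not a directory")
    # Creating a temporary file fails with an OSError if the directory is not
    # writable; the file is deleted again when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory):
        pass
    return directory


if __name__ == "__main__":
    print(check_output_dir("~/gemini_images", create=True))
```

Writing a real file tests the permissions directly, which is more dependable than inspecting mode bits on network or synced folders.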

🤝 Contributing

Contributions are welcome! This project combines the best features from multiple MCP servers:

  • mcp-image (TypeScript): Prompt enhancement and editing features
  • nanobanana-mcp-server (Python): Architecture and FastMCP integration
  • gemini-imagen-mcp-server (TypeScript): Imagen API support and batch processing

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Built on the excellent work of:

Ready to create amazing AI-generated images? Install now and start generating! 🚀
