MCP server for image generation, editing, and analysis using OpenAI's gpt-image-1 model
GPT Image MCP Server
A Model Context Protocol (MCP) server for image generation, editing, and analysis powered by OpenAI's gpt-image-1 model. Built with FastMCP, it generates YouTube thumbnails, blog headers, social media images, or any custom image, with optional reference-image support and platform-aware optimization.
Install
pip install gpt-image-mcp
# or
uv add gpt-image-mcp
Set OPENAI_API_KEY in your environment, then run:
gpt-image-mcp
Perfect for Content Creators: Generate professional thumbnails with your photo automatically positioned and branded consistently, or get creative when you want variety.
Features
Specialized Content Generation
- YouTube Thumbnails: Optimized for engagement (1536×1024 landscape format)
- Blog Images: Professional headers and featured images
- Social Media: Platform-optimized content for Instagram, Twitter, Facebook
- General Purpose: Flexible image generation for any use case
Reference Image Integration
- Personal Branding: Use your photos to create consistent thumbnails
- Style Preservation: Maintains facial features and appearance from reference images
- Custom Layouts: Generate thumbnails in your established style (positioning, text placement, colors)
- High Input Fidelity: Advanced reference image processing for accurate results
- Creative Flexibility: Choose between consistent branding or creative freedom
- Multiple Composition Styles: Centered, dynamic, left/right positioning, or fully experimental
Advanced AI Integration
- GPT-Image-1 Support: Uses OpenAI's gpt-image-1 image generation model
- Multi-Model Fallback: Automatic fallback to DALL-E 3 for reliability
- Smart Prompt Optimization: Enhanced prompts based on content type
- Batch Processing: Generate multiple images concurrently
Platform Intelligence
- Auto-Sizing: Intelligent size selection based on content type
- Style Variants: Professional, casual, dramatic, minimalist, educational
- Emotional Tones: Excited, confident, friendly, serious, and more
- Brand Integration: Custom color schemes and consistent styling
Analysis & Optimization
- Effectiveness Scoring: Thumbnail analysis with 0-10 effectiveness scoring
- Platform Optimization: Convert images for specific platforms
- Improvement Suggestions: Actionable recommendations for better performance
- Best Practices: Built-in knowledge of platform requirements
Installation
Prerequisites
- Python 3.11+
- OpenAI API key with GPT-Image-1/DALL-E 3 access
- UV package manager (recommended)
Quick Start
# Clone the repository
git clone https://github.com/labeveryday/gpt-image-mcp.git
cd gpt-image-mcp
# Install dependencies
uv sync
# Configure your API key
cp .env.example .env
# Edit .env and add: OPENAI_API_KEY=your_key_here
# Test the installation
uv run python demo.py
Usage
MCP Client Integration (Recommended)
This server is designed to work with MCP clients like Claude Code. Add it to your MCP configuration:
{
  "name": "gpt-image-mcp",
  "command": "uv",
  "args": ["run", "gpt-image-mcp"],
  "cwd": "/path/to/gpt-image-mcp"
}
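As a minimal sketch, the same entry can be generated from Python so the cwd path is always absolute. The helper name build_server_entry is hypothetical, and where the resulting JSON has to be placed depends on your MCP client.

```python
import json
from pathlib import Path


def build_server_entry(project_dir: str) -> dict:
    """Return an MCP server entry mirroring the JSON shown above."""
    return {
        "name": "gpt-image-mcp",
        "command": "uv",
        "args": ["run", "gpt-image-mcp"],
        # Resolve to an absolute path so the client can launch the server from anywhere.
        "cwd": str(Path(project_dir).expanduser().resolve()),
    }


if __name__ == "__main__":
    print(json.dumps(build_server_entry("."), indent=2))
```

Run it from the cloned repository and paste the printed entry into your client's configuration.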
Quick MCP Examples
Once connected, you can simply ask Claude:
"Generate a YouTube thumbnail for my Python tutorial"
→ Creates professional thumbnail (default strict mode)
"Generate a creative YouTube thumbnail with me centered"
→ Uses creative mode with centered composition
"Generate a thumbnail using my photo with 'LEARN CODING' text"
→ Uses reference image with professional layout
"Be experimental with the layout and try something artistic"
→ Uses experimental creative mode for unique designs
Starting the MCP Server (Manual)
# Start with UV (recommended)
uv run gpt-image-mcp
# Or run the server directly
uv run python src/gpt_image_mcp/server.py
Demo Usage
# Run the demo to test functionality
uv run python demo.py
# Test individual features
uv run python -c "from demo import demo_youtube_thumbnail; import asyncio; asyncio.run(demo_youtube_thumbnail())"
Available Tools
1. generate_image - Primary Image Generation
Generate optimized images for any platform or purpose. The reference_image field accepts a file path or base64 data.
{
  "prompt": "Excited tech reviewer with the latest gadget, studio lighting",
  "content_type": "youtube_thumbnail",
  "style": "professional",
  "emotional_tone": "excited",
  "size": "1536x1024",
  "include_text_overlay": true,
  "text_overlay": "Amazing New Tech!",
  "brand_colors": ["#FF6B6B", "#4ECDC4"],
  "reference_image": "/path/to/your/photo.jpg",
  "creative_mode": false,
  "composition_style": "right",
  "layout_freedom": "standard"
}
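Before sending a generate_image request, a client can catch obvious mistakes locally. The sketch below is not the server's own validation: validate_request is a hypothetical helper, the allowed values are copied from the Content Types and Styles lists in this README, and the defaults for missing fields ("general", "professional") are assumptions.

```python
import re

CONTENT_TYPES = {"youtube_thumbnail", "blog_header", "blog_featured", "social_media", "general"}
STYLES = {"professional", "casual", "dramatic", "minimalist", "educational", "entertainment"}
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def validate_request(request: dict) -> list[str]:
    """Return a list of problems found in a generate_image request (empty if none)."""
    problems = []
    if not request.get("prompt"):
        problems.append("prompt is required")
    # Missing fields fall back to assumed defaults and therefore pass.
    if request.get("content_type", "general") not in CONTENT_TYPES:
        problems.append(f"unknown content_type: {request['content_type']}")
    if request.get("style", "professional") not in STYLES:
        problems.append(f"unknown style: {request['style']}")
    for color in request.get("brand_colors", []):
        if not HEX_COLOR.fullmatch(color):
            problems.append(f"brand color is not #RRGGBB: {color}")
    return problems
```

An empty list means the request passed these checks; the server may still reject it for other reasons.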
2. generate_reference_thumbnail - Personal Branding
Create thumbnails using your photo in your established style. The reference_image field accepts a file path or base64 data.
{
  "reference_image": "/Users/me/photos/headshot.png",
  "main_text": "5 TECH SIDE HUSTLES",
  "secondary_text": "THAT MAKE $10K/MONTH",
  "topic": "entrepreneurship",
  "style_override": "professional",
  "creative_mode": false,
  "composition_style": "right",
  "layout_freedom": "standard"
}
3. analyze_thumbnail - AI-Powered Analysis
Get effectiveness scores and improvement suggestions.
{
  "image_data": "base64_encoded_image_data",
  "platform": "youtube",
  "content_category": "education"
}
4. optimize_for_platform - Platform Conversion
Adapt existing images for different platforms.
{
  "image_data": "base64_encoded_image_data",
  "target_platform": "instagram",
  "optimization_focus": ["engagement", "readability"]
}
5. generate_batch - Bulk Generation
Generate multiple images efficiently.
{
  "requests": [
    {"prompt": "Tutorial thumbnail 1", "content_type": "youtube_thumbnail"},
    {"prompt": "Tutorial thumbnail 2", "content_type": "youtube_thumbnail"}
  ],
  "max_concurrent": 3
}
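The max_concurrent field caps how many generations run at once. One common way to get that behavior is an asyncio semaphore, sketched below; run_batch and the generate callable are hypothetical stand-ins, and the server's actual implementation may differ.

```python
import asyncio


async def run_batch(requests, generate, max_concurrent=3):
    """Run generate(request) for every request, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(request):
        # Each task waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await generate(request)

    # gather returns results in the same order as the input requests.
    return await asyncio.gather(*(run_one(r) for r in requests))
```

Because results come back in input order, the caller can pair each image with the request that produced it.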
6. get_prompt_suggestions - Prompt Enhancement
Get AI suggestions for better prompts.
{
  "content_type": "youtube_thumbnail",
  "current_prompt": "Python tutorial video"
}
Supported Sizes & Platforms
| Platform | Optimal Size | Aspect Ratio | Notes |
|---|---|---|---|
| YouTube | 1792×1024 | ~16:9 | OpenAI supported landscape |
| Instagram | 1024×1024 | 1:1 | Square format |
| Twitter | 1792×1024 | ~16:9 | Wide landscape format |
| Facebook | 1792×1024 | ~16:9 | Cover images |
| Blog Header | 1792×1024 | ~16:9 | Professional headers |
| Blog Featured | 1024×1792 | ~9:16 | Portrait format |
All sizes use OpenAI's currently supported dimensions: 1024×1024, 1024×1792, and 1792×1024.
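The "~16:9" entries are approximations. Reducing each size to lowest terms shows the exact ratio: 1792×1024 is 7:4 (1.75), slightly narrower than 16:9 (about 1.778). The 1536×1024 size that appears in the generate_image example is included for comparison. The helper name aspect_ratio is only for illustration.

```python
from math import gcd


def aspect_ratio(width: int, height: int) -> tuple[int, int]:
    """Reduce width:height to lowest terms."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


for width, height in [(1024, 1024), (1792, 1024), (1024, 1792), (1536, 1024)]:
    w, h = aspect_ratio(width, height)
    print(f"{width}x{height} -> {w}:{h} ({width / height:.3f})")
```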
Reference Image Handling
File Path Support
Reference images can be provided as either file paths or base64 encoded data:
// Using file paths (recommended - automatic resizing)
"reference_image": "/Users/you/photos/headshot.jpg"
"reference_image": "./images/profile.png"
"reference_image": "/home/user/pictures/photo.jpg"
// Using base64 data (backward compatibility)
"reference_image": "iVBORw0KGgoAAAANSUhEUgAA..."
Automatic Image Processing
- Large Image Handling: Input images over 2MB are automatically resized
- Format Support: JPEG, PNG, WebP, and other common formats
- Size Optimization: YouTube thumbnails are optimized to stay under 2MB
- Quality Preservation: Smart resizing maintains image quality
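To illustrate how a single reference_image string can carry either a path or base64 data, here is a minimal stdlib sketch. It is not the server's code: load_reference_image is a hypothetical name, the check order (existing file first, then base64) is an assumption, and 2MB is taken to mean 2 × 1024 × 1024 bytes.

```python
import base64
from pathlib import Path

MAX_INPUT_BYTES = 2 * 1024 * 1024  # assumed meaning of the 2MB threshold


def load_reference_image(value: str) -> tuple[bytes, bool]:
    """Return (image_bytes, needs_resize) for a file path or a base64 string."""
    path = Path(value).expanduser()
    try:
        is_file = path.is_file()
    except (OSError, ValueError):
        # A long base64 string is not a usable file name on most systems.
        is_file = False
    if is_file:
        data = path.read_bytes()
    else:
        # validate=True raises on characters outside the base64 alphabet.
        data = base64.b64decode(value, validate=True)
    return data, len(data) > MAX_INPUT_BYTES
```

The second element of the result tells the caller whether the input exceeds the threshold and would need resizing before upload.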
Content Types & Styles
Content Types
- youtube_thumbnail - High-impact video thumbnails (auto-optimized under 2MB)
- blog_header - Professional article headers
- blog_featured - Featured/hero images
- social_media - General social content
- general - Flexible general-purpose images
Styles
- professional - Clean, business-appropriate
- casual - Relaxed, approachable
- dramatic - High-contrast, bold
- minimalist - Simple, elegant
- educational - Clear, instructional
- entertainment - Fun, engaging
Emotional Tones
- excited - High energy, enthusiastic
- confident - Strong, authoritative
- friendly - Warm, approachable
- curious - Intriguing, mysterious
- serious - Professional, formal
- surprised - Attention-grabbing
- dramatic - Intense, compelling
Creative Mode System
DEFAULT: Strict Professional Mode
- creative_mode=False (default) - Consistent, reliable professional layouts
- Person positioned right, text on left, red banner for emphasis
- Perfect for consistent branding and professional thumbnails
- This is the recommended default for most users
CREATIVE MODE: When You Want Variety
- creative_mode=True - Unlocks flexible and experimental options
- Only activated when you specifically request creative freedom
Layout Freedom Levels (when creative_mode=True)
- standard - Consistent branding (same as strict mode)
- flexible - Some creative freedom while maintaining best practices
- experimental - Complete creative freedom with unconventional designs
Composition Styles (when creative_mode=True)
- left - Position person on the left side
- right - Position person on the right side
- centered - Center the person prominently
- dynamic - Use energetic, dynamic positioning
- creative - Experiment with artistic composition techniques
Usage Patterns
# Professional consistency (RECOMMENDED DEFAULT)
# Just use the tool without creative parameters
# Creative with structure
creative_mode=True, layout_freedom="flexible", composition_style="centered"
# Full creative freedom
creative_mode=True, layout_freedom="experimental", composition_style="creative"
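The interaction between the three parameters can be sketched as a small function. This is an interpretation of the rules above, not the server's code: it assumes layout_freedom and composition_style are ignored unless creative_mode is true, and resolve_layout is a hypothetical name.

```python
def resolve_layout(creative_mode=False, layout_freedom="standard", composition_style="right"):
    """Return the (layout_freedom, composition_style) pair that would take effect."""
    if not creative_mode:
        # Strict mode: person on the right, text on the left, whatever else was passed.
        return "standard", "right"
    return layout_freedom, composition_style
```

For example, resolve_layout(True, "flexible", "centered") keeps both requested values, while any call with creative_mode=False collapses to the strict layout.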
File Storage
Temporary Image Storage
Generated images are automatically saved to the operating system's temporary directory:
- macOS: /var/folders/.../gpt-image-mcp/
- Windows: C:\Users\{user}\AppData\Local\Temp\gpt-image-mcp\
- Linux: /tmp/gpt-image-mcp/
Automatic Cleanup:
- Files older than 24 hours are automatically deleted
- Cleanup runs on server startup and via the cleanup_temp_files tool
- Unique filenames prevent conflicts: image_20250825_142324_3566695c.png
Manual Management:
# Check temp directory status
uv run python -c "from src.gpt_image_mcp.file_manager import temp_image_manager; print(temp_image_manager.get_temp_dir_info())"
# Clean up old files manually
uv run python -c "from src.gpt_image_mcp.file_manager import temp_image_manager; print(f'Cleaned {temp_image_manager.cleanup_old_files()} files')"
Configuration
Environment Variables (.env)
# Required
OPENAI_API_KEY=your_openai_api_key
# Optional - Model Configuration
DEFAULT_MODEL=gpt-image-1 # Primary model (OpenAI's best)
IMAGE_MODEL=gpt-image-1 # Direct image model
FALLBACK_MODEL=dall-e-3 # Fallback option
# Optional - Performance
MAX_CONCURRENT_GENERATIONS=5 # Batch processing limit
TIMEOUT_SECONDS=120 # Request timeout
RATE_LIMIT_PER_MINUTE=30 # API rate limiting
# Optional - Quality
DEFAULT_QUALITY=auto # Image quality
ENABLE_COMPRESSION=true # File size optimization
MAX_IMAGE_SIZE_MB=10.0 # Size limits
# Optional - Logging
LOG_LEVEL=INFO # DEBUG for verbose logging
ENABLE_DETAILED_LOGGING=false # Request/response logging
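For readers who want to see how such variables typically become typed settings, here is a stdlib-only sketch. The project itself mentions Pydantic, so its real config module is likely different; Settings and load_settings are hypothetical names, only a subset of the variables is shown, and the defaults are the values listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    default_model: str
    fallback_model: str
    max_concurrent: int
    timeout_seconds: int
    enable_compression: bool


def load_settings(env=os.environ) -> Settings:
    """Read a subset of the variables above, using the listed defaults when unset."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise ValueError("OPENAI_API_KEY is required")
    return Settings(
        api_key=api_key,
        default_model=env.get("DEFAULT_MODEL", "gpt-image-1"),
        fallback_model=env.get("FALLBACK_MODEL", "dall-e-3"),
        max_concurrent=int(env.get("MAX_CONCURRENT_GENERATIONS", "5")),
        timeout_seconds=int(env.get("TIMEOUT_SECONDS", "120")),
        enable_compression=env.get("ENABLE_COMPRESSION", "true").lower() == "true",
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests.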
Examples
MCP Usage with Claude (Recommended)
Simply ask Claude naturally; the MCP server handles the technical details:
User: "Generate a YouTube thumbnail for my Python tutorial with 'MASTER PYTHON FAST' text"
Claude: Creates a professional thumbnail with:
- Your photo positioned on the right
- Bold white text on the left
- Red banner for emphasis
- Professional dark background
User: "Be creative with the layout and center me in the composition"
Claude: Uses creative_mode=True, composition_style="centered" for artistic variety
User: "Generate 5 different thumbnail variations for my coding series"
Claude: Uses batch generation with different styles and compositions
Direct API Usage (Advanced)
Professional Consistent Thumbnail (Default)
{
  "prompt": "Professional YouTube thumbnail about Python programming",
  "content_type": "youtube_thumbnail",
  "text_overlay": "MASTER PYTHON FAST!",
  "reference_image": "base64_encoded_headshot"
}
Creative Experimental Thumbnail
{
  "prompt": "Creative coding tutorial thumbnail",
  "content_type": "youtube_thumbnail",
  "text_overlay": "CODE CREATIVELY",
  "reference_image": "base64_encoded_headshot",
  "creative_mode": true,
  "layout_freedom": "experimental",
  "composition_style": "dynamic"
}
Standard YouTube Thumbnail (No Reference)
request = {
    "prompt": "Enthusiastic developer coding Python, modern setup, vibrant colors",
    "content_type": "youtube_thumbnail",
    "style": "professional",
    "emotional_tone": "excited",
    "text_overlay": "Master Python Fast!",
    "brand_colors": ["#3776ab", "#ffd343"],  # Python colors
}
Blog Header Image
request = {
    "prompt": "Modern digital workspace with analytics and growth charts",
    "content_type": "blog_header",
    "topic": "business growth",
    "target_audience": "entrepreneurs",
    "style": "professional",
}
Social Media Post
request = {
    "prompt": "Cozy coffee shop workspace with laptop and notebook",
    "content_type": "social_media",
    "style": "casual",
    "emotional_tone": "friendly",
    "size": "1024x1024",  # Instagram square
}
Testing & Development
Test Reference Image Functionality
# Test with sample superhero image
uv run examples/superhero_thumbnail_test.py
# Test with your own photo
uv run examples/test_reference_thumbnail.py /path/to/your/photo.jpg
# Demo creative mode options (no API calls)
uv run examples/demo_creative_modes.py
# Test all creative modes (requires API key)
uv run examples/test_creative_modes.py
# Run demo for general testing
uv run python demo.py
Run Tests
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=src/gpt_image_mcp
# Test specific functionality
uv run python demo.py
Code Quality
# Format code
uv run black src/ tests/
# Lint code
uv run ruff check src/ tests/
# Type checking
uv run mypy src/
Development Server
# Start in development mode with detailed logging
LOG_LEVEL=DEBUG ENABLE_DETAILED_LOGGING=true uv run python src/gpt_image_mcp/server.py
Integration Examples
Direct MCP Usage
import asyncio

from mcp import ClientSession, StdioServerParameters, stdio_client


async def generate_thumbnail():
    # Launch the server as a subprocess and talk to it over stdio.
    server = StdioServerParameters(command="uv", args=["run", "gpt-image-mcp"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            # The MCP handshake must complete before any tool call.
            await session.initialize()
            return await session.call_tool("generate_image", {
                "prompt": "Amazing tech review thumbnail",
                "content_type": "youtube_thumbnail",
            })


# Run it
result = asyncio.run(generate_thumbnail())
Claude Code Integration
The server integrates seamlessly with Claude Code for AI-powered content creation workflows.
Troubleshooting
Common Issues
API Key Errors
# Verify your API key is set
echo $OPENAI_API_KEY
# Check API key validity
uv run python -c "from openai import OpenAI; OpenAI().models.list(); print('API key accepted')"
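Echoing the key prints the full secret to the terminal and to shell history. If you only need to confirm that the variable is set, a masked check such as the sketch below avoids that. The helper name describe_api_key is hypothetical, and the check says nothing about whether OpenAI accepts the key.

```python
import os


def describe_api_key(env=os.environ) -> str:
    """Report whether OPENAI_API_KEY is set, showing only its last four characters."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return "OPENAI_API_KEY is not set"
    return f"OPENAI_API_KEY is set (ends with ...{key[-4:]})"


if __name__ == "__main__":
    print(describe_api_key())
```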
Image Generation Fails
- Simplify complex prompts
- Check API credits and rate limits
- Try fallback models (DALL-E 3)
Large File Sizes
- Enable compression: ENABLE_COMPRESSION=true
- Reduce quality: DEFAULT_QUALITY=medium
- Check size limits: MAX_IMAGE_SIZE_MB=10
Rate Limiting
- Adjust concurrent requests: MAX_CONCURRENT_GENERATIONS=3
- Increase timeout: TIMEOUT_SECONDS=180
- Lower rate limit: RATE_LIMIT_PER_MINUTE=20
Debug Mode
# Enable verbose logging
LOG_LEVEL=DEBUG ENABLE_DETAILED_LOGGING=true uv run gpt-image-mcp
# Check server health
uv run python -c "from src.gpt_image_mcp.config import settings; print(settings)"
Performance Notes
- OpenAI API Compatibility: Uses OpenAI-supported image dimensions (1024×1024, 1792×1024, 1024×1792)
- Optimized Tool Schemas: Simplified models for better MCP client compatibility
- Batch Processing: Use generate_batch for multiple images
- Fallback Strategy: Automatic model fallback ensures reliability
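The fallback strategy amounts to trying models in order until one succeeds. A minimal sketch of that pattern follows; generate_with_fallback and the generate callable are hypothetical, the model order is taken from the DEFAULT_MODEL and FALLBACK_MODEL settings, and the server's real error handling is not shown in this README.

```python
def generate_with_fallback(prompt, generate, models=("gpt-image-1", "dall-e-3")):
    """Try each model in order; return (model, result) from the first that succeeds."""
    errors = []
    for model in models:
        try:
            return model, generate(model, prompt)
        except Exception as exc:  # a real client would catch narrower API errors
            errors.append(f"{model}: {exc}")
    # Every model failed: surface all collected errors at once.
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Returning the model name alongside the result lets the caller log which model actually produced the image.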
Contributing
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make changes with proper Pydantic validation
- Add tests for new functionality
- Run code quality checks: uv run black src/ && uv run ruff check src/
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Built with FastMCP for clean MCP server architecture
- Powered by OpenAI GPT and DALL-E models
- Uses Pydantic for robust data validation
- Package management with UV
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: See docs/ directory for detailed guides
Happy image generating!