Gemini Gen MCP

MCP Server for Gemini Image and Audio generation using Google's Gemini AI models.

Features

This MCP server provides tools to:

  • Generate images from text using Gemini's Flash Image model
  • Generate audio from text using the Gemini 2.5 Flash Preview TTS model

Installation

From PyPI

pip install gemini-gen-mcp

From Source

git clone https://github.com/ServiceStack/gemini-gen-mcp.git
cd gemini-gen-mcp
pip install -e .

Prerequisites

You need a Google Gemini API key to use this server. Get one from Google AI Studio.

Environment Variables

Variable              Required  Default              Description
GEMINI_API_KEY        Yes       -                    Your Google Gemini API key
GEMINI_DOWNLOAD_PATH  No        /tmp/gemini_gen_mcp  Directory where generated files are saved

Set the environment variables:

export GEMINI_API_KEY='your-api-key-here'
export GEMINI_DOWNLOAD_PATH='/path/to/downloads'  # optional
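
The same lookup can be sketched in Python. This is an illustration of how the two variables and the documented default fit together, not the server's actual code; `load_settings` and `DEFAULT_DOWNLOAD_PATH` are hypothetical names.

```python
import os
from pathlib import Path

# Default copied from the environment variable table above.
DEFAULT_DOWNLOAD_PATH = "/tmp/gemini_gen_mcp"


def load_settings(env=None):
    """Return (api_key, download_path) resolved from environment variables.

    Hypothetical helper for illustration; the server may resolve these differently.
    """
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        # The key is marked as required, so fail early when it is missing.
        raise RuntimeError("GEMINI_API_KEY is required but not set")
    # Fall back to the documented default when the variable is unset or empty.
    download_path = Path(env.get("GEMINI_DOWNLOAD_PATH") or DEFAULT_DOWNLOAD_PATH)
    return api_key, download_path
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise without touching the real environment.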

Generated files are organized by type and date:

  • Images: $GEMINI_DOWNLOAD_PATH/images/YYYY-MM-DD/
  • Audio: $GEMINI_DOWNLOAD_PATH/audios/YYYY-MM-DD/

Each generated file includes a companion .info.json file with generation metadata.
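
As a rough sketch of that layout, the helpers below build the dated directory for a given output type and list the companion metadata files in it. The function names are hypothetical, and two details are assumptions: that companion files can be found by the `.info.json` suffix alone, and that the caller supplies the date (the source does not say whether the server uses local time or UTC).

```python
from datetime import date
from pathlib import Path


def output_dir(download_path, kind, day):
    """Build the dated directory for one kind of output ("images" or "audios")."""
    if kind not in ("images", "audios"):
        raise ValueError(f"unknown kind: {kind!r}")
    # date.isoformat() yields YYYY-MM-DD, matching the documented layout.
    return Path(download_path) / kind / day.isoformat()


def list_metadata_files(directory):
    """Return the companion .info.json files in a directory, sorted by name."""
    return sorted(Path(directory).glob("*.info.json"))
```

For example, `output_dir("/tmp/gemini_gen_mcp", "images", date(2025, 1, 31))` points at `/tmp/gemini_gen_mcp/images/2025-01-31`.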

Usage

Running the Server

Run the MCP server directly:

gemini-gen-mcp

Or as a Python module:

python -m gemini_gen_mcp.server

Using with Claude Desktop

See CLAUDE_CONFIG.md for detailed instructions.

Add this to your llmspy.org MCP configuration or to claude_desktop_config.json:

{
  "mcpServers": {
    "gemini-gen": {
      "description": "Gemini Image and Audio TTS generation",
      "command": "uvx",
      "args": [
        "gemini-gen-mcp"
      ],
      "env": {
        "GEMINI_API_KEY": "$GEMINI_API_KEY"
      }
    }
  }
}
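
If the configuration file already lists other servers, the entry has to be merged in rather than pasted over them. A minimal sketch of that merge follows; `add_gemini_gen` is a hypothetical helper that works on an already parsed config, and reading or writing the actual file is left out because its location depends on your client and platform.

```python
import json


def add_gemini_gen(config):
    """Return a copy of a parsed MCP config with the gemini-gen entry added."""
    merged = dict(config)
    # Copy the existing server map so the caller's dict is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers["gemini-gen"] = {
        "description": "Gemini Image and Audio TTS generation",
        "command": "uvx",
        "args": ["gemini-gen-mcp"],
        "env": {"GEMINI_API_KEY": "$GEMINI_API_KEY"},
    }
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # "other" and "other-server" are placeholders for an existing entry.
    existing = {"mcpServers": {"other": {"command": "other-server"}}}
    print(json.dumps(add_gemini_gen(existing), indent=2))
```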

Development Server

For development, you can run this server using uv:

{
  "mcpServers": {
    "gemini-gen": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/path/to/ServiceStack/gemini-gen-mcp",
        "gemini-gen-mcp"
      ],
      "env": {
        "GEMINI_API_KEY": "$GEMINI_API_KEY"
      }
    }
  }
}

Available Tools

text_to_image

Generate images from text descriptions using Gemini's image generation models.

Parameters:

  • prompt (string, required): Text description of the image to generate
  • model (string, optional): Gemini model to use
    • gemini-2.5-flash-image (default)
    • gemini-3-pro-image-preview
  • aspect_ratio (string, optional): Aspect ratio for the generated image (default: "1:1")
    • Supported: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
  • temperature (float, optional): Sampling temperature for image generation (default: 1.0)
  • top_p (float, optional): Nucleus sampling parameter

Example:

{
  "prompt": "A serene mountain landscape at sunset with a lake",
  "model": "gemini-2.5-flash-image",
  "aspect_ratio": "16:9",
  "temperature": 1.0
}
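
For callers that build these arguments in code, a small client-side check can catch an unsupported aspect ratio before the request is sent. The sketch below is an assumption-laden illustration: `build_image_args` and `call_text_to_image` are hypothetical helpers, the defaults are copied from the parameter list above, and the call path assumes the official `mcp` Python SDK is installed and that `gemini-gen-mcp` is on your PATH.

```python
import os

# Values copied from the parameter list above.
SUPPORTED_ASPECT_RATIOS = {
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
}


def build_image_args(prompt, model="gemini-2.5-flash-image",
                     aspect_ratio="1:1", temperature=1.0, top_p=None):
    """Validate inputs and return the arguments dict for text_to_image."""
    if not prompt:
        raise ValueError("prompt is required")
    if aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    args = {
        "prompt": prompt,
        "model": model,
        "aspect_ratio": aspect_ratio,
        "temperature": temperature,
    }
    if top_p is not None:
        # Only send top_p when the caller set it explicitly.
        args["top_p"] = top_p
    return args


async def call_text_to_image(arguments):
    """Start the server over stdio and invoke the text_to_image tool."""
    # Imported lazily so build_image_args works without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(
        command="gemini-gen-mcp",
        env={"GEMINI_API_KEY": os.environ["GEMINI_API_KEY"]},
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            return await session.call_tool("text_to_image", arguments=arguments)
```

With an API key set, `asyncio.run(call_text_to_image(build_image_args("A serene mountain landscape at sunset with a lake", aspect_ratio="16:9")))` would issue the request from the example above. The shape of the returned result is not described in this README, so inspect it before relying on specific fields.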

text_to_audio

Generate audio/speech from text using Gemini's TTS models. Output is saved in WAV format.

Parameters:

  • text (string, required): Text to convert to speech
  • model (string, optional): Gemini TTS model to use
    • gemini-2.5-flash-preview-tts (default)
    • gemini-2.5-pro-preview-tts
  • voice (string, optional): Voice to use for speech generation (default: "Kore")

Available Voices:

Voice      Style       Voice          Style          Voice         Style
Zephyr     Bright      Puck           Upbeat         Charon        Informative
Kore       Firm        Fenrir         Excitable      Leda          Youthful
Orus       Firm        Aoede          Breezy         Callirrhoe    Easy-going
Autonoe    Bright      Enceladus      Breathy        Iapetus       Clear
Umbriel    Easy-going  Algieba        Smooth         Despina       Smooth
Erinome    Clear       Algenib        Gravelly       Rasalgethi    Informative
Laomedeia  Upbeat      Achernar       Soft           Alnilam       Firm
Schedar    Even        Gacrux         Mature         Pulcherrima   Forward
Achird     Friendly    Zubenelgenubi  Casual         Vindemiatrix  Gentle
Sadachbia  Lively      Sadaltager     Knowledgeable  Sulafat       Warm

Example:

{
  "text": "Hello, this is a test of the Gemini text to speech system.",
  "model": "gemini-2.5-flash-preview-tts",
  "voice": "Kore"
}
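
To pick a voice by style or to reject a misspelled name before calling the tool, the voice list can be kept as a lookup table. This is a client-side sketch only: `VOICE_STYLES`, `voices_with_style`, and `build_audio_args` are hypothetical names, the data is copied from the Available Voices list, and whether the server itself validates voice names is not stated here.

```python
# Voice name -> style, copied from the Available Voices list.
VOICE_STYLES = {
    "Zephyr": "Bright", "Puck": "Upbeat", "Charon": "Informative",
    "Kore": "Firm", "Fenrir": "Excitable", "Leda": "Youthful",
    "Orus": "Firm", "Aoede": "Breezy", "Callirrhoe": "Easy-going",
    "Autonoe": "Bright", "Enceladus": "Breathy", "Iapetus": "Clear",
    "Umbriel": "Easy-going", "Algieba": "Smooth", "Despina": "Smooth",
    "Erinome": "Clear", "Algenib": "Gravelly", "Rasalgethi": "Informative",
    "Laomedeia": "Upbeat", "Achernar": "Soft", "Alnilam": "Firm",
    "Schedar": "Even", "Gacrux": "Mature", "Pulcherrima": "Forward",
    "Achird": "Friendly", "Zubenelgenubi": "Casual", "Vindemiatrix": "Gentle",
    "Sadachbia": "Lively", "Sadaltager": "Knowledgeable", "Sulafat": "Warm",
}

DEFAULT_VOICE = "Kore"


def voices_with_style(style):
    """Return voice names whose style matches, ignoring case, sorted by name."""
    wanted = style.lower()
    return sorted(name for name, s in VOICE_STYLES.items() if s.lower() == wanted)


def build_audio_args(text, model="gemini-2.5-flash-preview-tts", voice=DEFAULT_VOICE):
    """Validate inputs and return the arguments dict for text_to_audio."""
    if not text:
        raise ValueError("text is required")
    if voice not in VOICE_STYLES:
        raise ValueError(f"unknown voice: {voice!r}")
    return {"text": text, "model": model, "voice": voice}
```

For instance, `voices_with_style("Firm")` returns `["Alnilam", "Kore", "Orus"]`.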

Development

Setup Development Environment

# Clone the repository
git clone https://github.com/ServiceStack/gemini-gen-mcp.git
cd gemini-gen-mcp

# Install in editable mode with dependencies
pip install -e .

Running Tests

# Install test dependencies
pip install pytest pytest-asyncio

# Run tests (or, with uv: uv run pytest tests -v)
pytest tests -v

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues and questions, please use the GitHub Issues page.
