
Multi-AI MCP bridge: Gemini + OpenRouter (400+ models) — text, code, image, video, TTS, RAG, Deep Research


omni-ai-mcp

The complete AI bridge for Claude Code — Gemini's exclusive capabilities (video, TTS, 1M context, RAG, Deep Research) plus 400+ models via OpenRouter. One MCP server, every AI model, zero friction.

License: MIT · Python 3.9+ · Version 4.0.1 · PyPI · MCP Compatible


Why This Exists

Claude is exceptional at reasoning and code generation. But sometimes you need more:

  • A second opinion from a different AI model (GPT-4o, Llama, Mistral, Claude via OpenRouter)
  • Real-time web search with Google grounding and source citations
  • Autonomous deep research that runs for minutes and produces structured reports from 40+ sources
  • Video generation with Veo 3.1 — the only MCP server with native audio video generation
  • Image generation with Imagen up to 4K resolution
  • Text-to-speech with 30 natural voices and multi-speaker support
  • RAG for querying your own documents with citations
  • Large codebase analysis with Gemini's 1M token context window
  • Multi-turn conversations with cloud persistence (55-day retention, resume from any device)
  • Access to 400+ models through one unified interface

omni-ai-mcp bridges Claude Code with Google Gemini and OpenRouter, enabling Claude to orchestrate any AI model as a tool.


What's New in v4.0.0–4.0.1

Multi-Provider: Gemini + OpenRouter

# Ask any of 400+ models — auto-routes from model name
ask_model("Explain quantum computing", model="openai/gpt-4o")
ask_model("Write a poem", model="meta-llama/llama-3.3-70b-instruct")
ask_model("Review this code", model="gemini-3.1-pro-preview")  # auto-routes to Gemini native API

# If no Gemini key but OpenRouter key exists, Gemini models route via OpenRouter automatically
ask_model("Summarize this", model="gemini-3-flash-preview")  # -> google/ prefix on OpenRouter

# Discover all available models
gemini_list_models()

Dynamic Model Registry

No more hardcoded model IDs. The server discovers available models at runtime and always uses the latest. If a model is deprecated, it automatically falls back to the next available option.

# Override via env vars if needed:
export GEMINI_MODEL_PRO=gemini-3.1-pro-preview
export OPENROUTER_DEFAULT_MODEL=openai/gpt-4o
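
The fallback behaviour described in this section could look roughly like the sketch below. It is not the project's code: `resolve_model`, the `PREFERENCES` table, and the fallback order are assumptions made for illustration. Only the `GEMINI_MODEL_PRO` variable and the model IDs are taken from this README.

```python
import os
from typing import Dict, Iterable, List, Mapping

# Hypothetical preference order per alias, most preferred first (an assumption).
PREFERENCES: Dict[str, List[str]] = {
    "pro": ["gemini-3.1-pro-preview", "gemini-3-flash-preview"],
    "flash": ["gemini-3-flash-preview", "gemini-2.5-flash"],
}

# Aliases that can be pinned through an environment variable.
ENV_OVERRIDES: Dict[str, str] = {"pro": "GEMINI_MODEL_PRO"}


def resolve_model(
    alias: str,
    available: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the model ID to use for an alias such as "pro" or "flash"."""
    # 1. An explicit override always wins.
    override = env.get(ENV_OVERRIDES.get(alias, ""))
    if override:
        return override
    # 2. Otherwise take the first preferred model that is currently listed.
    live = set(available)
    for candidate in PREFERENCES.get(alias, []):
        if candidate in live:
            return candidate
    raise LookupError(f"no available model for alias {alias!r}")
```

If the preferred model has disappeared from the live list, the next entry is used, which is the "next available option" behaviour in miniature.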

Smart Routing Rules

  1. Explicit Gemini model + GEMINI_API_KEY -> always Gemini native API (fastest, cheapest)
  2. Gemini model + no Gemini key + OPENROUTER_API_KEY -> OpenRouter google/ prefix (automatic fallback)
  3. veo-*, imagen-*, deep-research-* models -> Gemini native only (no OpenRouter equivalent)
  4. OpenRouter model (openai/, meta-llama/, etc.) -> OpenRouter (requires OPENROUTER_API_KEY)
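
The four rules above can be read as a small decision function. The sketch below is one possible reading, not the server's actual router: `route` and its return shape are hypothetical, and it assumes that any ID containing a `/` is an OpenRouter model ID.

```python
from typing import Optional, Tuple

# Model families that only exist on the Gemini native API (rule 3).
GEMINI_ONLY_PREFIXES = ("veo-", "imagen-", "deep-research-")


def route(
    model: str,
    gemini_key: Optional[str],
    openrouter_key: Optional[str],
) -> Tuple[str, str]:
    """Return (provider, model_id_to_send) for a requested model name."""
    if model.startswith(GEMINI_ONLY_PREFIXES):
        if not gemini_key:
            raise ValueError(f"{model} requires GEMINI_API_KEY")
        return ("gemini", model)
    if "/" in model:  # rule 4: vendor-prefixed IDs such as openai/gpt-4o
        if not openrouter_key:
            raise ValueError(f"{model} requires OPENROUTER_API_KEY")
        return ("openrouter", model)
    if model.startswith("gemini-"):
        if gemini_key:  # rule 1: native API whenever a Gemini key exists
            return ("gemini", model)
        if openrouter_key:  # rule 2: fall back to the google/ prefix
            return ("openrouter", "google/" + model)
        raise ValueError("set GEMINI_API_KEY or OPENROUTER_API_KEY")
    raise ValueError(f"cannot route unknown model {model!r}")
```

For example, `route("gemini-3-flash-preview", None, "or-key")` yields the `google/`-prefixed ID on OpenRouter, while the same call with a Gemini key stays on the native API.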

PyPI Distribution

pip install omni-ai-mcp
omni-ai-mcp-setup  # interactive setup wizard

Claude Desktop Extension (.dxt)

Install with one click — no Python setup required:

  1. Download omni-ai-mcp-vX.Y.Z.dxt from GitHub Releases
  2. Double-click the file (macOS/Windows) or drag it into Claude Desktop
  3. Enter your Gemini API key when prompted (OpenRouter key is optional)
  4. Done — all 20 tools are immediately available in Claude Desktop

The .dxt bundle includes all Python dependencies — users don't need to install anything else.


20 Tools

Multi-Provider

Tool | Description
ask_model | Ask any AI: Gemini or 400+ models via OpenRouter, auto-routed from the model name
gemini_list_models | Live model discovery: Gemini registry + OpenRouter catalog, deprecation warnings

Text & Reasoning

Tool | Description | Model
ask_gemini | Text generation with thinking mode, multi-turn, dual storage (local/cloud) | Gemini 3.1 Pro
gemini_code_review | Security, performance, and quality analysis | Gemini 3.1 Pro
gemini_brainstorm | Creative ideation with 6 methodologies (SCAMPER, TRIZ, etc.) | Gemini 3.1 Pro
gemini_challenge | Devil's advocate: find flaws in ideas, plans, and code | Gemini 3.1 Pro

Code

Tool | Description | Model
gemini_analyze_codebase | Whole-codebase analysis up to 1M tokens / 5MB | Gemini 3.1 Pro
gemini_generate_code | Structured code generation with dry-run preview and XML output | Gemini 3.1 Pro

Research & Web

Tool | Description | Model
gemini_web_search | Real-time search with Google grounding & citations | Gemini 3 Flash
gemini_deep_research | Autonomous 5-60 min research, 40+ sources, structured report | Deep Research Agent

RAG

Tool | Description
gemini_file_search | Query documents with citations
gemini_create_file_store | Create document stores
gemini_upload_file | Upload PDF, DOCX, code, etc.
gemini_list_file_stores | List available stores

Media (Gemini exclusive)

Tool | Description | Model
gemini_analyze_image | Vision: describe, OCR, Q&A on images | Gemini 3 Flash
gemini_generate_image | Imagen: up to 4K resolution | Gemini 3 Pro Image
gemini_generate_video | Veo 3.1: 4-8s with native audio (dialogue, effects, ambient) | Veo 3.1
gemini_text_to_speech | 30 natural voices, multi-speaker dialogue | Gemini 2.5 Flash TTS

Conversation

Tool | Description
gemini_list_conversations | List history: title, mode, turns, last activity
gemini_delete_conversation | Delete by ID or title (partial match)

Quick Start

Prerequisites

  • Python 3.9+
  • Claude Code CLI
  • Gemini API key (available free from Google AI Studio)
  • (Optional) OpenRouter API key — openrouter.ai for 400+ models

Install from PyPI (Recommended)

pip install omni-ai-mcp
omni-ai-mcp-setup

The setup wizard configures Claude Code automatically.

Install from Source

git clone https://github.com/marmyx77/omni-ai-mcp.git
cd omni-ai-mcp

# Gemini only
./setup.sh YOUR_GEMINI_API_KEY

# Gemini + OpenRouter (400+ models)
./setup.sh YOUR_GEMINI_API_KEY YOUR_OPENROUTER_KEY

Restart Claude Code. Verify:

claude mcp list
# omni-ai-mcp: Connected

Manual Install

pip install 'mcp[cli]>=1.0.0' 'google-genai>=1.55.0' pydantic defusedxml filelock

mkdir -p ~/.claude-mcp-servers/omni-ai-mcp
cp -r app/ run.py pyproject.toml ~/.claude-mcp-servers/omni-ai-mcp/

claude mcp add omni-ai-mcp --scope user \
  -e GEMINI_API_KEY=YOUR_KEY \
  -e OPENROUTER_API_KEY=YOUR_OR_KEY \
  -- python3 ~/.claude-mcp-servers/omni-ai-mcp/run.py

Usage Examples

Multi-Model AI Orchestration

"Ask GPT-4o to review this authentication function"
-> ask_model(model="openai/gpt-4o", prompt="review this auth function...")

"Compare how Gemini and Llama respond to this design question"
-> ask_model(model="gemini-3.1-pro-preview", ...)
-> ask_model(model="meta-llama/llama-3.3-70b-instruct", ...)

"Get a Mistral opinion on this French legal document"
-> ask_model(model="mistralai/mistral-large-2512", ...)

Conversations with Memory

Gemini remembers previous context across calls via continuation_id:

# First turn
"Ask Gemini to analyze @src/auth.py for security issues"
# Returns: continuation_id: abc-123

# Follow-up — Gemini remembers the previous analysis
"Ask Gemini (continuation_id: abc-123) how to fix the SQL injection"

Dual Storage Mode

Mode | Storage | Retention | Use
local (default) | SQLite | 3h (configurable) | Development, quick chats
cloud | Google Interactions API | 55 days | Long projects, cross-device

# Start a named cloud conversation
"Ask Gemini (mode=cloud, title='Architecture Review'): Analyze my microservices design"
# Returns: continuation_id: int_v1_abc123...

# Resume from any device within 55 days
"Ask Gemini (continuation_id: int_v1_abc123...): What about the database layer?"

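For the local mode, a continuation_id can be pictured as a key into a small SQLite table whose rows expire after the configured TTL. The sketch below illustrates that idea only: the `LocalConversationStore` class, its table layout, and the expiry rule (measured from the most recent turn) are assumptions, not the schema used in `persistence.py`.

```python
import sqlite3
import time
import uuid
from typing import List, Optional, Tuple

TTL_SECONDS = 3 * 3600  # mirrors the 3h default of GEMINI_CONVERSATION_TTL_HOURS


class LocalConversationStore:
    """Hypothetical SQLite-backed conversation store with time-based expiry."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS turns ("
            "continuation_id TEXT, role TEXT, content TEXT, created_at REAL)"
        )

    def append(
        self,
        role: str,
        content: str,
        continuation_id: Optional[str] = None,
        now: Optional[float] = None,
    ) -> str:
        """Store one turn; start a new thread when no continuation_id is given."""
        cid = continuation_id or str(uuid.uuid4())
        ts = time.time() if now is None else now
        self.db.execute("INSERT INTO turns VALUES (?, ?, ?, ?)", (cid, role, content, ts))
        self.db.commit()
        return cid

    def history(self, continuation_id: str, now: Optional[float] = None) -> List[Tuple[str, str]]:
        """Return (role, content) pairs, or [] if the thread is unknown or expired."""
        ts = time.time() if now is None else now
        (last,) = self.db.execute(
            "SELECT MAX(created_at) FROM turns WHERE continuation_id = ?",
            (continuation_id,),
        ).fetchone()
        if last is None or ts - last > TTL_SECONDS:
            return []
        return self.db.execute(
            "SELECT role, content FROM turns WHERE continuation_id = ? ORDER BY rowid",
            (continuation_id,),
        ).fetchall()
```

The `now` parameter exists only so that expiry can be exercised without waiting three hours.
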
Deep Research

Autonomous research agent that runs 5-60 minutes:

"Deep research: Compare AI agent frameworks in 2025 — LangGraph, AutoGen, CrewAI"

The agent will:

  1. Plan a comprehensive research strategy
  2. Execute multiple targeted web searches
  3. Synthesize findings from 40+ sources
  4. Produce a structured report with citations

Use cases: market research, competitive analysis, technical deep dives, trend analysis, literature reviews.

Codebase Analysis

Leverage Gemini's 1M token context to analyze entire codebases at once:

"Analyze codebase src/**/*.py with focus on security"
"Analyze codebase ['app/', 'tests/'] — find architecture issues"

Analysis types: architecture, security, refactoring, documentation, dependencies, general

@File References

Include file contents directly in prompts:

"Ask Gemini to review @src/auth.py for security issues"
"Brainstorm improvements for @README.md"
"Code review @*.py with focus on performance"
"Analyze codebase @src/**/*.ts"

Supported patterns: @file.py, @src/main.py, @*.py, @src/**/*.ts, @. (directory listing)
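
As a rough illustration of how @ references might be expanded, the sketch below pulls the patterns out of a prompt and globs them under a root directory. The `expand_refs` helper and the `AT_REF` regex are hypothetical, the `@.` directory listing is deliberately left out, and the real expansion code in `app/utils/` may work differently.

```python
import re
from pathlib import Path
from typing import List

# Characters allowed in a reference are an assumption made for this sketch.
AT_REF = re.compile(r"@([\w./*\-]+)")


def expand_refs(prompt: str, root: Path) -> List[Path]:
    """Return files under root matched by @patterns, as paths relative to root."""
    root = root.resolve()
    matches: List[Path] = []
    for pattern in AT_REF.findall(prompt):
        candidate = Path(pattern)
        if not candidate.parts or candidate.is_absolute():
            continue  # "@." and absolute paths are out of scope here
        for path in sorted(root.glob(pattern)):
            resolved = path.resolve()
            inside = resolved.is_relative_to(root)  # keep results inside the root
            if resolved.is_file() and inside and resolved not in matches:
                matches.append(resolved)
    return [p.relative_to(root) for p in matches]
```

Resolving each match and checking it against the root is a simple form of the path sandboxing listed under Security: a symlink that points outside the root is silently skipped.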

Video Generation

"Generate a video of ocean waves at sunset, seagulls flying, sound of waves and wind"

  • Duration: 4-8 seconds
  • Resolution: 720p or 1080p (1080p requires the full 8-second duration)
  • Native audio: dialogue, sound effects, ambient sounds
  • For dialogue: use quotes ("Hello," she said)
  • For sounds: describe explicitly (engine roaring, birds chirping)

Image Generation

"Generate an image of a futuristic Tokyo street at night, neon lights reflecting on wet pavement,
cinematic, shot on 35mm lens"

  • Resolution: up to 4K with Pro model
  • Aspect ratios: 1:1, 16:9, 9:16, 3:2, 4:5, and more
  • Use descriptive sentences, not keyword lists

Text-to-Speech

"Convert this article to speech using the Charon voice (informative, neutral)"

30 available voices — Bright: Zephyr, Autonoe / Upbeat: Puck, Laomedeia / Informative: Charon, Rasalgethi / Warm: Sulafat, Vindemiatrix / and 22 more.

Multi-speaker dialogue:

gemini_text_to_speech(
    text="Host: Welcome!\nGuest: Thanks for having me!",
    speakers=[
        {"name": "Host", "voice": "Charon"},
        {"name": "Guest", "voice": "Aoede"}
    ]
)

Image Analysis

"Analyze this screenshot and extract all visible text: @screenshot.png"
"Describe what's in this diagram and explain the architecture: @diagram.png"

Supported formats: PNG, JPG, JPEG, GIF, WEBP

RAG (Document Search)

# 1. Create a store
"Create a Gemini file store called 'project-docs'"

# 2. Upload files
"Upload the API specification PDF to the project-docs store"

# 3. Query
"Search the project-docs store: What are the rate limits?"

Challenge Tool

Get critical analysis before implementing — find flaws early:

"Challenge this plan with focus on security:
We'll store user sessions in localStorage and use MD5 for passwords"

The tool acts as a Devil's Advocate — it will NOT agree with you. Focus areas: general, security, performance, maintainability, scalability, cost

Code Generation

"Generate a Python FastAPI endpoint for JWT authentication with refresh tokens"

Output is structured XML that Claude can apply directly:

<GENERATED_CODE>
<FILE action="create" path="src/auth.py">
# Complete implementation here...
</FILE>
</GENERATED_CODE>

Options: dry_run=true to preview without writing, language, style (production/prototype/minimal), output_dir
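
To make the XML format and the dry_run option concrete, here is a minimal sketch of a consumer for that output. `FileEdit`, `parse_generated_code`, and `apply_edits` are hypothetical names. The sketch uses the standard-library parser for brevity (the project's dependency list includes defusedxml, which is the safer choice for untrusted input) and assumes the file contents are well-formed, escaped XML text.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List, NamedTuple


class FileEdit(NamedTuple):
    action: str
    path: str
    content: str


def parse_generated_code(xml_text: str) -> List[FileEdit]:
    """Extract one FileEdit per <FILE> element inside <GENERATED_CODE>."""
    root = ET.fromstring(xml_text)
    if root.tag != "GENERATED_CODE":
        raise ValueError(f"unexpected root element: {root.tag}")
    return [
        FileEdit(f.get("action", "create"), f.get("path", ""), (f.text or "").strip("\n"))
        for f in root.iter("FILE")
    ]


def apply_edits(edits: List[FileEdit], output_dir: Path, dry_run: bool = True) -> List[Path]:
    """Return the target paths; write files only when dry_run is False."""
    base = output_dir.resolve()
    targets: List[Path] = []
    for edit in edits:
        target = (base / edit.path).resolve()
        if not target.is_relative_to(base):
            raise ValueError(f"refusing to write outside {base}: {edit.path}")
        targets.append(target)
        if not dry_run and edit.action == "create":
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(edit.content + "\n", encoding="utf-8")
    return targets
```

Calling `apply_edits(edits, Path("."), dry_run=True)` only reports where files would land, which matches the preview behaviour described for dry_run.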

Thinking Mode

"Ask Gemini with high thinking level:
Design an optimal database schema for a social media platform at scale"

Levels: off (default), low (fast reasoning), high (deep analysis)


Model Selection

Text Models

Alias | Resolved Model | Best For
pro | gemini-3.1-pro-preview | Complex reasoning, coding, analysis
flash | gemini-3-flash-preview | Balanced speed/quality
fast / flash-lite | gemini-3.1-flash-lite-preview | High-volume, simple tasks

Models are resolved dynamically at runtime — if a model is deprecated, the registry automatically falls back to the next available option.

OpenRouter Models (via ask_model)

Provider | Example Model ID | Notes
OpenAI | openai/gpt-4o | GPT-4o, o3, o4-mini
Meta | meta-llama/llama-3.3-70b-instruct | Open source, fast
Anthropic | anthropic/claude-3.5-sonnet | Claude via OpenRouter
Mistral | mistralai/mistral-large-2512 | Strong on EU languages
Google | google/gemini-3.1-pro-preview | Gemini via OpenRouter (fallback)
340+ more | - | gemini_list_models() to browse

Configuration

All settings via environment variables:

Variable | Default | Description
GEMINI_API_KEY | required | Google Gemini API key
OPENROUTER_API_KEY | - | OpenRouter key (enables ask_model for 400+ models)
GEMINI_MODEL_PRO | gemini-3.1-pro-preview | Override Pro text model
GEMINI_MODEL_FLASH | gemini-2.5-flash | Static fallback model
GEMINI_MODEL_DEEP_RESEARCH | deep-research-pro-preview | Override research agent
OPENROUTER_DEFAULT_MODEL | openai/gpt-4o | Default OpenRouter model
GEMINI_SANDBOX_ROOT | cwd | Root directory for file access
GEMINI_SANDBOX_ENABLED | true | Enable path sandboxing
GEMINI_MAX_FILE_SIZE | 102400 | Max file size in bytes (100KB)
GEMINI_CONVERSATION_TTL_HOURS | 3 | Local conversation expiry
GEMINI_CONVERSATION_MAX_TURNS | 50 | Max turns per thread
GEMINI_LOG_DIR | ~/.omni-ai-mcp | Log & DB directory
GEMINI_LOG_FORMAT | text | json or text
GEMINI_DISABLED_TOOLS | - | Comma-separated tool names to disable
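
A few of these variables, read with the defaults listed above, might be loaded as in the sketch below. The `Settings` dataclass and `load_settings` function are hypothetical, and the accepted spellings for booleans ("1", "true", "yes") are an assumption. The project's own loader lives under `app/core/` and may differ.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class Settings:
    gemini_api_key: Optional[str]
    openrouter_api_key: Optional[str]
    sandbox_root: Path
    sandbox_enabled: bool
    max_file_size: int
    conversation_ttl_hours: float
    conversation_max_turns: int
    disabled_tools: Tuple[str, ...]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    disabled = env.get("GEMINI_DISABLED_TOOLS", "")
    enabled = env.get("GEMINI_SANDBOX_ENABLED", "true").strip().lower()
    return Settings(
        gemini_api_key=env.get("GEMINI_API_KEY") or None,
        openrouter_api_key=env.get("OPENROUTER_API_KEY") or None,
        sandbox_root=Path(env.get("GEMINI_SANDBOX_ROOT") or Path.cwd()),
        sandbox_enabled=enabled in {"1", "true", "yes"},
        max_file_size=int(env.get("GEMINI_MAX_FILE_SIZE", "102400")),
        conversation_ttl_hours=float(env.get("GEMINI_CONVERSATION_TTL_HOURS", "3")),
        conversation_max_turns=int(env.get("GEMINI_CONVERSATION_MAX_TURNS", "50")),
        # Split the comma-separated list and drop empty entries.
        disabled_tools=tuple(t.strip() for t in disabled.split(",") if t.strip()),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests.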

Claude Code Plugin

Slash Commands

Included in .claude/commands/ (auto-available in Claude Code when working inside this repo, or copy to ~/.claude/commands/ for global access):

Command | Action
/gemini <prompt> | Ask Gemini Pro anything
/gemini-research <topic> | Autonomous deep research (40+ sources, 5-30 min)
/gemini-review <file> | Code review focused on bugs, security, performance
/gemini-challenge <idea> | Devil's Advocate: find flaws in a plan or architecture
/gemini-analyze <path> | Codebase analysis with 1M token context window
/gemini-brainstorm <topic> | Structured brainstorming (6 methodologies)
/gemini-models | List available models (Gemini + OpenRouter)
/ask-model [model] <prompt> | Ask any model: GPT-4o, Llama, Mistral, Gemini, etc.
/cowork <task> | Claude + Gemini working in parallel on the same task

Subagents

Included in .claude/agents/ — Claude Code activates these automatically based on context:

Agent | Trigger | Capability
gemini-researcher | "research X", "investigate Y", "find sources on Z" | Deep Research Agent, 40+ sources
gemini-analyzer | "analyze codebase", "security audit", "review architecture" | 1M token context window
model-orchestrator | "ask GPT-4o", "compare models", "use Llama for this" | Routes to 400+ models via OpenRouter
cowork | "second opinion", "verify with Gemini", "stress test this" | Claude + Gemini parallel analysis with synthesis

Install globally (all projects)

cp -r .claude/commands/* ~/.claude/commands/
cp -r .claude/agents/* ~/.claude/agents/

Multi-Model Architecture

omni-ai-mcp uses Claude as the orchestrator with other models as tools:

User -> Claude Code
            (orchestrates)
       omni-ai-mcp tools
       +-- ask_model("openai/gpt-4o")       -> OpenRouter -> GPT-4o
       +-- ask_model("meta-llama/...")       -> OpenRouter -> Llama 3
       +-- ask_model("gemini-3.1-pro...")    -> Gemini API (native)
       +-- ask_gemini(...)                   -> Gemini API -> Gemini Pro
       +-- gemini_deep_research(...)         -> Gemini API -> Deep Research Agent
       +-- gemini_generate_video(...)        -> Gemini API -> Veo 3.1

This differs from provider-replacement tools such as claude-code-router, which replace Claude's backend entirely. omni-ai-mcp keeps Claude in control while giving it access to every AI model as a tool.


Architecture

omni-ai-mcp/
+-- app/
|   +-- server.py              # FastMCP -- 20 @mcp.tool() registrations
|   +-- core/                  # Config, structured logging, security
|   +-- services/
|   |   +-- gemini.py          # Gemini client + generate_with_fallback()
|   |   +-- model_registry.py  # Dynamic model discovery (API + cache)
|   |   +-- openrouter.py      # OpenRouter client (OpenAI-compatible)
|   |   +-- persistence.py     # SQLite conversation storage + index
|   +-- tools/                 # Tool implementations by domain
|   |   +-- text/              # ask_gemini, ask_model, models, brainstorm, etc.
|   |   +-- code/              # analyze_codebase, generate_code
|   |   +-- media/             # image, video, TTS
|   |   +-- web/               # web_search, deep_research
|   |   +-- rag/               # file_store, file_search, upload
|   +-- schemas/               # Pydantic v2 validation
|   +-- utils/                 # @file expansion, token estimation
+-- tests/                     # 174+ tests (unit + integration)
+-- .claude/
|   +-- commands/              # Slash commands
|   +-- agents/                # Subagents
+-- setup.sh                   # One-command install
+-- manifest.json              # DXT extension manifest (Claude Desktop)
+-- pyproject.toml

Security

  • Path sandboxing: all file access restricted to GEMINI_SANDBOX_ROOT
  • Secrets sanitization: API keys masked in logs (Google, AWS, GitHub, OpenAI, Anthropic, Slack, JWT, Bearer tokens)
  • XML sanitization: LLM output sanitized before parsing to prevent injection
  • Atomic file writes: automatic backups before any file modification
  • Input validation: Pydantic v2 schemas at all tool boundaries
  • Rate limiting: via provider-side quotas
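
As an illustration of the secrets-sanitization idea, a log filter can mask anything that looks like a credential before the line is written. The patterns below are rough approximations chosen for this sketch, and `mask_secrets` is a hypothetical helper. Real key formats vary, and the project's own pattern list is not reproduced here.

```python
import re

# Illustrative shapes only; each pattern is an assumption about a key format.
_SECRET_PATTERNS = [
    re.compile(r"AIza[0-9A-Za-z_\-]{35}"),        # Google-style API key
    re.compile(r"\bsk-[A-Za-z0-9_\-]{20,}"),      # "sk-" prefixed provider keys
    re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}"),  # GitHub-style tokens
    re.compile(r"(?<=bearer )[A-Za-z0-9._~+/\-]+=*", re.IGNORECASE),  # Bearer tokens
]


def mask_secrets(text: str) -> str:
    """Replace anything matching a known secret shape with a placeholder."""
    for pattern in _SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Masking at the logging boundary means a key that ends up in an exception message or a request dump is still kept out of the log files.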

Docker Deployment

# Build and run
docker-compose up -d

# With log viewer (port 8080)
docker-compose --profile monitoring up -d

Docker features: non-root user, read-only filesystem with tmpfs, health check every 30s, resource limits (2 CPU, 2GB RAM), log rotation.


Troubleshooting

MCP not connecting

claude mcp list           # check registration
claude mcp remove omni-ai-mcp
./setup.sh YOUR_API_KEY   # re-register
# Restart Claude Code

Import or syntax errors

python3 -c "from app.core.config import config; print(f'v{config.version}')"
python3 -m pytest tests/unit/ -q

Video / Image generation timeouts

  • Video generation can take 1-6 minutes — this is normal
  • Large images (4K) take longer
  • Check your Gemini API quota at AI Studio

OpenRouter errors

  • Verify OPENROUTER_API_KEY is set and has credit
  • Check the model ID with gemini_list_models(include_openrouter=True)
  • Use the exact model ID from the list

API key update

claude mcp remove omni-ai-mcp
claude mcp add omni-ai-mcp --scope user \
  -e GEMINI_API_KEY=NEW_KEY \
  -e OPENROUTER_API_KEY=NEW_OR_KEY \
  -- python3 ~/.claude-mcp-servers/omni-ai-mcp/run.py

API Costs

Feature | Notes
Text generation | Free tier available · $0.075-0.30 per 1M tokens
Web Search | ~$14 per 1000 queries
File Search indexing | $0.15 per 1M tokens (one-time)
Image generation | Varies by resolution and model
Video generation | Varies by duration/resolution
Text-to-speech | Varies by character count
OpenRouter | Per-model pricing, see openrouter.ai/models

See Google AI pricing for Gemini rates.
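
For a back-of-the-envelope estimate, the listed rates can be combined as in the sketch below. The figures are copied from the table above and are approximate (the web search rate is given as "~$14"). `estimate_usd` is a hypothetical helper, and the media and OpenRouter rows are left out because the table gives no numbers for them.

```python
from typing import Tuple

TEXT_USD_PER_M_TOKENS = (0.075, 0.30)  # low and high end of the listed range
SEARCH_USD_PER_1K_QUERIES = 14.0       # approximate, per the table
INDEX_USD_PER_M_TOKENS = 0.15          # one-time File Search indexing


def estimate_usd(text_tokens: int, searches: int, indexed_tokens: int) -> Tuple[float, float]:
    """Return a (low, high) cost estimate in USD, rounded to cents."""
    fixed = (
        searches / 1000 * SEARCH_USD_PER_1K_QUERIES
        + indexed_tokens / 1_000_000 * INDEX_USD_PER_M_TOKENS
    )
    low, high = (text_tokens / 1_000_000 * rate for rate in TEXT_USD_PER_M_TOKENS)
    return round(fixed + low, 2), round(fixed + high, 2)
```

With 2M text tokens, 500 searches, and 4M indexed tokens, the search and indexing parts contribute about $7.00 and $0.60, and text generation adds between $0.15 and $0.60 on top.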


Development

# Run tests
python -m pytest tests/unit/ -v
python -m pytest tests/integration/ -v  # requires GEMINI_API_KEY

# Quick import check
python -c "from app.core.config import config; print(f'v{config.version}')"

# Reinstall after changes
rsync -a app/ ~/.claude-mcp-servers/omni-ai-mcp/app/
# Restart Claude Code

Adding a New Tool

  1. Create app/tools/<domain>/my_tool.py with @tool(name="...", description="...", input_schema=...)
  2. Import in app/tools/<domain>/__init__.py
  3. Add Pydantic schema in app/schemas/inputs.py
  4. Write tests in tests/unit/

See CLAUDE.md for the full development guide.


Changelog

v4.0.1

  • Fixed routing: Gemini models always use native API when key available (even if provider=openrouter)
  • Added OpenRouter fallback for Gemini text models when no Gemini key (google/ prefix)
  • Fixed Python 3.11 f-string syntax in challenge tool
  • Fixed stale unit test imports (103 -> 174 passing tests)
  • Fixed model registry: corrected flash model names (gemini-3-flash-preview, gemini-3.1-flash-lite-preview)

v4.0.0

  • ask_model: 400+ models via OpenRouter — auto-routes from model name
  • gemini_list_models: live model discovery with deprecation warnings
  • Dynamic model registry: no more hardcoded model IDs
  • PyPI distribution: pip install omni-ai-mcp
  • Claude Code plugin: slash commands + subagents
  • GitHub Actions CI/CD with Trusted Publishing

v3.3.0

  • Dual storage mode for ask_gemini: local (SQLite) or cloud (Interactions API, 55-day retention)
  • gemini_list_conversations, gemini_delete_conversation
  • Cross-platform file locking

v3.2.0

  • gemini_deep_research: autonomous multi-step research (5-60 min, 40+ sources)

v3.0.0

  • FastMCP migration, SQLite persistence, security hardening

Contributing

Contributions are welcome! Open an issue or pull request on GitHub.


License

MIT — see LICENSE
