Gemini Omni Flash MCP server for text-to-video, image-to-video, reference-guided video, and conversational video editing

Gemini Omni MCP

MCP server for Google's Gemini Omni Flash video model — text-to-video, image-to-video, reference-guided video, and conversational video editing with native audio, straight from your AI agent.

Setup

Get a Gemini API key from Google AI Studio, then add the server to your MCP config.

Claude Desktop / Claude Code / Cursor

Add to your MCP config (mcp.json / .claude.json / claude_desktop_config.json):

{
  "mcpServers": {
    "gemini-omni": {
      "command": "uvx",
      "args": ["gemini-omni-mcp@latest"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
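
If you would rather script this step, the entry above can be merged into an existing config file. This is a minimal sketch, assuming the config is plain JSON with a top-level mcpServers object; the function name add_gemini_omni_server is illustrative and not part of the package.

```python
import json
from pathlib import Path


def add_gemini_omni_server(config_path: Path, api_key: str) -> dict:
    """Merge the gemini-omni entry into an MCP config file and return the result."""
    # Start from the existing config if there is one, so other servers are kept.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["gemini-omni"] = {
        "command": "uvx",
        "args": ["gemini-omni-mcp@latest"],
        "env": {"GEMINI_API_KEY": api_key},
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Point config_path at whichever of the files listed above your client reads. The sketch does not handle an existing file that is empty or not valid JSON.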

Droid CLI

droid mcp add gemini-omni "uvx gemini-omni-mcp@latest" --env GEMINI_API_KEY=your-api-key-here

Generated MP4s are saved to ~/gemini_omni_videos by default (set OUTPUT_DIR to change).

Features

Text-to-video: prompt-only MP4 generation with generated audio (music, ambience, SFX)
Image-to-video: animate a single reference image with motion and camera direction
Reference-to-video: up to 6 reference images to lock subjects, style, or props
Conversational editing: iterate on a generated video via previous_interaction_id, or upload your own MP4 and edit it
Prompt role tags: <FIRST_FRAME> and <IMAGE_REF_N> bind reference images to roles
Timing cues: [0-3s], [3-6s], [6-10s] direct the action beat by beat
Batch generation: run multiple prompts in conservative parallel batches (max 4)
URI or inline delivery: robust Files API polling and download built in
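
This README does not show how the Files API polling works internally. As a rough illustration only, a poll-until-ready loop with an interval and a deadline (mirroring the FILE_POLL_INTERVAL and FILE_POLL_TIMEOUT settings) might look like the sketch below. get_state is a hypothetical callable standing in for whatever call the server makes, and the state strings "ACTIVE" and "FAILED" are assumptions.

```python
import time
from typing import Callable


def wait_until_active(
    get_state: Callable[[], str],
    poll_interval: float = 5.0,
    poll_timeout: float = 600.0,
) -> None:
    """Poll get_state() until it returns "ACTIVE" or the timeout elapses."""
    deadline = time.monotonic() + poll_timeout
    while True:
        state = get_state()  # hypothetical: one status lookup per iteration
        if state == "ACTIVE":
            return
        if state == "FAILED":
            raise RuntimeError("file processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"file not active after {poll_timeout}s")
        time.sleep(poll_interval)
```

Using time.monotonic() for the deadline keeps the timeout unaffected by system clock changes.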

Output is 720p 24fps MP4 with SynthID watermarking (preview-quality model).

Showcase

All videos below were generated by this server with gemini-omni-flash-preview. Watch them with sound on.

Corgi on a hoverboard

A corgi wearing tiny goggles rides a glowing hoverboard through a neon-lit Tokyo street at night, rain reflections on the pavement, camera tracking alongside, single continuous shot, cinematic lighting, upbeat synthwave music.

https://github.com/user-attachments/assets/4b9a8e87-7db0-469d-a261-3c354f7fe9b8

Astronaut latte art

An astronaut in a white spacesuit pours latte art into a floating cup inside a cozy moon-base cafe, Earth visible through a large window, steam swirling in low gravity, slow dolly-in, warm lighting, gentle ambient cafe sounds.

https://github.com/user-attachments/assets/88072ef8-1ce4-4c80-a05a-bfb5531d1271

Origami ocean

An origami paper whale swims gracefully through a stylized paper-craft ocean, paper waves folding and unfolding, paper seagulls gliding above, soft sunlight, camera slowly orbiting, calm orchestral score.

https://github.com/user-attachments/assets/c1e31341-e53c-4a55-af12-d6bb9432dcf5

Tools

`generate_video`

Generates or edits one MP4 and returns JSON with video.path, interaction_id, and metadata.

Argument	Type	Description
`prompt`	string	Scene, motion, camera, lighting, mood, and audio direction
`task`	string?	`text_to_video`, `image_to_video`, `reference_to_video`, or `edit`. Inferred if omitted
`aspect_ratio`	string?	`16:9` (default) or `9:16`
`duration_seconds`	int?	Optional preview field, `3` to `10`
`reference_image_paths`	list?	Up to 6 local image paths
`input_video_path`	string?	Local MP4 to upload and edit
`delivery`	string?	`uri` (default, recommended) or `inline`
`previous_interaction_id`	string?	Continue editing a generated video
`enhance_prompt`	bool?	Optional LLM prompt enhancement, default `false`
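
To make the constraints in the table concrete, here is a client-side check you could run before calling the tool. It is a sketch under assumptions: the function name check_generate_video_args is hypothetical, and the server's own validation may differ.

```python
from typing import Any

TASKS = {"text_to_video", "image_to_video", "reference_to_video", "edit"}
ASPECT_RATIOS = {"16:9", "9:16"}
DELIVERY_MODES = {"uri", "inline"}


def check_generate_video_args(args: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a generate_video argument dict."""
    problems = []
    if not str(args.get("prompt", "")).strip():
        problems.append("prompt is required")
    if args.get("task") is not None and args["task"] not in TASKS:
        problems.append(f"unknown task: {args['task']}")
    if args.get("aspect_ratio", "16:9") not in ASPECT_RATIOS:
        problems.append("aspect_ratio must be 16:9 or 9:16")
    duration = args.get("duration_seconds")
    if duration is not None and not 3 <= duration <= 10:
        problems.append("duration_seconds must be between 3 and 10")
    if len(args.get("reference_image_paths") or []) > 6:
        problems.append("at most 6 reference images are allowed")
    if args.get("delivery", "uri") not in DELIVERY_MODES:
        problems.append("delivery must be uri or inline")
    return problems
```

An empty list means the arguments are consistent with the table; omitted optional arguments are treated as their documented defaults.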

`batch_generate`

Runs multiple prompts in parallel batches. Parallelism is capped at `MAX_BATCH_SIZE` (default 4).

Argument	Type	Description
`prompts`	list	One prompt per video
`task`, `aspect_ratio`, `duration_seconds`, `reference_image_paths`, `delivery`, `enhance_prompt`	—	Shared across the batch, same semantics as `generate_video`
`batch_size`	int?	Parallelism, capped at `MAX_BATCH_SIZE`
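
One way to picture capped parallelism is to split the prompts into chunks of batch_size and run each chunk concurrently. The sketch below uses asyncio with a hypothetical generate_one coroutine in place of the real generation call. It illustrates the batching pattern only and is not the server's implementation.

```python
import asyncio
from typing import Awaitable, Callable

MAX_BATCH_SIZE = 4  # default cap from the Configuration section


async def run_in_batches(
    prompts: list[str],
    generate_one: Callable[[str], Awaitable[str]],
    batch_size: int = MAX_BATCH_SIZE,
) -> list[str]:
    """Run generate_one over prompts, at most batch_size at a time, in order."""
    # Clamp the requested parallelism to the range 1..MAX_BATCH_SIZE.
    size = max(1, min(batch_size, MAX_BATCH_SIZE))
    results: list[str] = []
    for start in range(0, len(prompts), size):
        chunk = prompts[start:start + size]
        # Each chunk runs concurrently; the next chunk waits for it to finish.
        results.extend(await asyncio.gather(*(generate_one(p) for p in chunk)))
    return results
```

Chunking is simple and keeps results in prompt order, at the cost of waiting for the slowest item in each chunk. A semaphore-based version would keep all slots busy instead.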

Configuration

Everything is configured through environment variables (or a local .env):

Variable	Default	Description
`GEMINI_API_KEY`	—	Required. `GOOGLE_API_KEY` also accepted
`OUTPUT_DIR`	`~/gemini_omni_videos`	Where generated MP4s are saved
`DEFAULT_ASPECT_RATIO`	`16:9`	`16:9` or `9:16`
`DEFAULT_DELIVERY`	`uri`	`uri` or `inline`
`DEFAULT_DURATION_SECONDS`	unset	Optional 3-10s target
`REQUEST_TIMEOUT`	`300`	Generation timeout in seconds
`FILE_POLL_INTERVAL`	`5.0`	Seconds between Files API polls
`FILE_POLL_TIMEOUT`	`600`	Max seconds waiting for file activation
`MAX_BATCH_SIZE`	`4`	Max parallel generations
`ENABLE_PROMPT_ENHANCEMENT`	`false`	LLM-enhance prompts before generation
`LOG_LEVEL`	`INFO`	Logging level
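
As an illustration of how these variables and defaults fit together, the sketch below reads them into a small dataclass. The Settings class and load_settings function are hypothetical and not taken from the package; only the variable names and defaults come from the table above. It does not load a .env file, and treating only the string "true" as enabled is an assumption.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str
    output_dir: Path
    aspect_ratio: str
    delivery: str
    duration_seconds: Optional[int]
    request_timeout: float
    file_poll_interval: float
    file_poll_timeout: float
    max_batch_size: int
    enable_prompt_enhancement: bool
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    api_key = env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY")
    if not api_key:
        raise ValueError("GEMINI_API_KEY (or GOOGLE_API_KEY) is required")
    duration = env.get("DEFAULT_DURATION_SECONDS")  # unset by default
    return Settings(
        api_key=api_key,
        output_dir=Path(env.get("OUTPUT_DIR", "~/gemini_omni_videos")).expanduser(),
        aspect_ratio=env.get("DEFAULT_ASPECT_RATIO", "16:9"),
        delivery=env.get("DEFAULT_DELIVERY", "uri"),
        duration_seconds=int(duration) if duration else None,
        request_timeout=float(env.get("REQUEST_TIMEOUT", "300")),
        file_poll_interval=float(env.get("FILE_POLL_INTERVAL", "5.0")),
        file_poll_timeout=float(env.get("FILE_POLL_TIMEOUT", "600")),
        max_batch_size=int(env.get("MAX_BATCH_SIZE", "4")),
        # Assumption: only the string "true" (any case) enables enhancement.
        enable_prompt_enhancement=env.get("ENABLE_PROMPT_ENHANCEMENT", "false").lower() == "true",
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing a plain dict instead of os.environ makes the function easy to exercise in tests.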

Prompting tips

Ask for a "single continuous shot" and "no scene cuts" for one-scene outputs.
Always include audio direction, for example "gentle ambient sound, no dialogue".
For edits, keep the prompt short and add "Keep everything else the same".
Use <FIRST_FRAME> and <IMAGE_REF_N> tags to bind reference-image roles.
Timing cues like [0-3s], [3-6s], and [6-10s] work well.
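
These tips can be combined in a small helper. The following is a sketch with hypothetical function names (timed_prompt, edit_args). The parts taken from this README are the timing-cue format, the suggested phrases, and the previous_interaction_id argument; that interaction_id is a top-level key of the returned JSON is an assumption.

```python
def timed_prompt(beats: list[tuple[int, int, str]], audio: str) -> str:
    """Join (start, end, action) beats into one prompt with timing cues."""
    cues = " ".join(f"[{start}-{end}s] {action}" for start, end, action in beats)
    return f"Single continuous shot, no scene cuts. {cues} Audio: {audio}."


def edit_args(result: dict, change: str) -> dict:
    """Build generate_video arguments for a follow-up edit of a previous result."""
    return {
        "task": "edit",
        # Short edit prompt plus the phrase suggested in the tips above.
        "prompt": f"{change} Keep everything else the same.",
        # Assumption: interaction_id sits at the top level of the result JSON.
        "previous_interaction_id": result["interaction_id"],
    }
```

For example, timed_prompt([(0, 3, "The whale surfaces."), (3, 6, "It dives.")], "calm orchestral score, no dialogue") yields a single prompt string with [0-3s] and [3-6s] cues followed by the audio direction.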

Limitations

Preview model: 720p, 24fps, MP4 only, SynthID-watermarked.
System instructions, temperature, negative prompts, voice edits, YouTube sources, and multi-video reasoning are unsupported.
Uploaded-video editing is unavailable in some regions.

Development

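git clone https://github.com/nikships/gemini-omni-mcp
cd gemini-omni-mcp
uv sync --all-extras          # install dependencies, including optional extras
uv run ruff format .          # format the code
uv run ruff check .           # lint
uv run mypy gemini_omni_mcp/  # type-check the package
uv run pytest                 # run the tests
uv build                      # build the sdist and wheel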

Releases are automated: every push to main bumps the version and publishes to PyPI (see PUBLISHING.md).

License

Gemini Omni MCP is licensed under the MIT license. See LICENSE for details.

Release history

1.0.2 (this version): Jul 1, 2026
1.0.1: Jul 1, 2026

Download files

Source Distribution

gemini_omni_mcp-1.0.2.tar.gz (1.6 MB), uploaded Jul 1, 2026, Source

Built Distribution

gemini_omni_mcp-1.0.2-py3-none-any.whl (27.0 kB), uploaded Jul 1, 2026, Python 3

File details

Details for the file gemini_omni_mcp-1.0.2.tar.gz.

File metadata

Download URL: gemini_omni_mcp-1.0.2.tar.gz
Upload date: Jul 1, 2026
Size: 1.6 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for gemini_omni_mcp-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`893c0cf4f25b843758c2b5288aba8f52db9d18da77bef8f4964bd521901e5d7b`
MD5	`b8c03dfb1428b88a589c2ab387206cf1`
BLAKE2b-256	`2f012cf531f6cb21e3484bd3735df88dab97f3c01e0c6ba4da82ed0fc8280b8d`
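
To check a downloaded archive against the SHA256 digest listed above, one option is Python's hashlib. This is a minimal sketch; the function name sha256_matches is illustrative, and the caller supplies the path to the downloaded file.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for gemini_omni_mcp-1.0.2.tar.gz
EXPECTED_SHA256 = "893c0cf4f25b843758c2b5288aba8f52db9d18da77bef8f4964bd521901e5d7b"


def sha256_matches(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()
```

The same function works for the wheel if you pass its SHA256 digest as expected.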

File details

Details for the file gemini_omni_mcp-1.0.2-py3-none-any.whl.

File metadata

Download URL: gemini_omni_mcp-1.0.2-py3-none-any.whl
Upload date: Jul 1, 2026
Size: 27.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for gemini_omni_mcp-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9d7aad194e518abff1cf960ed51501e965856d3d02760d9c15ed144a16bf4d31`
MD5	`ae1f43a35980cfda2f04ba46d15dd67c`
BLAKE2b-256	`853e0b6fe3f9efe6dd1e50d5390b5bc973756f85032a8871728ca1027f7351ac`
