FastMCP server for reading local audio and video files with Google Gen AI.
multimodal-reader-mcp
MCP server for reading local audio and video files with Google Gen AI and returning structured observations, timelines, and transcripts.
It analyzes a local media file and returns:
- a short summary
- a timeline of key moments
- transcript snippets for spoken or visible text
- key observations and notable signals
- relevant clues tailored to the user's question
- open questions plus a confidence level
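The fields listed above could be modeled roughly as follows. This is only a sketch: the class and field names (`MediaReading`, `TimelineEntry`, `relevant_clues`, and so on) and the confidence levels are hypothetical, and the package's actual response schema may differ. The sample values are made up for illustration.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class TimelineEntry:
    # Hypothetical shape for one key moment in the media file.
    timestamp: str  # e.g. "00:00:05"
    description: str


@dataclass
class MediaReading:
    # Hypothetical container mirroring the list of returned items.
    summary: str
    timeline: list[TimelineEntry] = field(default_factory=list)
    transcript_snippets: list[str] = field(default_factory=list)
    observations: list[str] = field(default_factory=list)
    relevant_clues: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)
    confidence: str = "medium"  # assumed levels: "low", "medium", "high"


# Illustrative values only, not real tool output.
reading = MediaReading(
    summary="Two speakers discuss a product launch.",
    timeline=[TimelineEntry("00:00:05", "First speaker introduces the topic")],
    confidence="high",
)

# Convert to plain dicts and lists, e.g. for JSON serialization.
result = asdict(reading)
```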
Requirements
- uv
- Python 3.14
- GOOGLE_API_KEY
Model configuration
The default model is gemini-2.5-flash.
You can override the default model for all requests by setting this environment variable:
MULTIMODAL_READER_MODEL
A model can also be passed directly in an individual read_media tool call.
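One plausible way to combine these three sources is sketched below. The precedence order (explicit argument first, then `MULTIMODAL_READER_MODEL`, then the built-in default) is an assumption, and `resolve_model` is a hypothetical helper, not a function from the package.

```python
import os

DEFAULT_MODEL = "gemini-2.5-flash"


def resolve_model(model=None, environ=None):
    """Pick a model name: explicit argument, then env var, then default.

    Assumed precedence; the real server may resolve this differently.
    """
    env = os.environ if environ is None else environ
    if model:
        return model
    # Fall back to the environment override, then to the built-in default.
    return env.get("MULTIMODAL_READER_MODEL") or DEFAULT_MODEL
```

Passing `environ` explicitly keeps the helper easy to test without touching the real process environment.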
MCP client configuration
Example Cursor MCP config:
{
  "mcpServers": {
    "multimodal-reader": {
      "command": "uvx",
      "args": ["multimodal-reader-mcp"],
      "env": {
        "GOOGLE_API_KEY": "${env:GOOGLE_API_KEY}",
        "MULTIMODAL_READER_MODEL": "gemini-2.5-flash"
      }
    }
  }
}
Tool
The package exposes one MCP tool:
read_media(file_path, question=None, model="gemini-2.5-flash")
file_path must be an absolute path to a local media file.
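A caller might check the absolute-path requirement before invoking the tool. The helper below is a minimal sketch using only the standard library; `check_media_path` is a hypothetical name, and the server's own validation and error messages may differ.

```python
from pathlib import Path


def check_media_path(file_path: str) -> Path:
    """Return the path if it is absolute and points to an existing file."""
    path = Path(file_path)
    if not path.is_absolute():
        raise ValueError(f"file_path must be absolute, got: {file_path!r}")
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {file_path}")
    return path
```

A relative path such as `clip.mp4` can be converted first with `str(Path("clip.mp4").resolve())`.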
File details
Details for the file multimodal_reader_mcp-0.1.0.tar.gz.
File metadata
- Download URL: multimodal_reader_mcp-0.1.0.tar.gz
- Upload date:
- Size: 31.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.30
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d30190a7c67f6004393f91ca237045ea8707cd32ab69878accd5619240d5dcc4 |
| MD5 | 876e572eab2dabee5ccbb407d932143c |
| BLAKE2b-256 | 94f94e7be5e8f70f690af402a5b43fee5e429a1dfd54e039995532529585358b |
File details
Details for the file multimodal_reader_mcp-0.1.0-py3-none-any.whl.
File metadata
- Download URL: multimodal_reader_mcp-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.30
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b51bcc3ba4705f77e2a70570047b7b013609ffd295e7630b33bfeb04d0863a4f |
| MD5 | 943aef9e72ae1c202ea1ea538b946f7e |
| BLAKE2b-256 | 41198e42af6ed0304fa762b4d76e4254d3e8e8514640250df80278ed96c5f56a |
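A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `matches_published` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published("multimodal_reader_mcp-0.1.0.tar.gz", "<SHA256 from the table>")` should return `True` for an intact download of the source distribution.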