
FastMCP server for reading local audio and video files with Google Gen AI.


multimodal-reader-mcp

MCP server for reading local audio and video files with Google Gen AI and returning structured observations, timelines, and transcripts.

It analyzes a local media file and returns:

  • a short summary
  • a timeline of key moments
  • transcript snippets for spoken or visible text
  • key observations and notable signals
  • relevant clues tailored to the user's question
  • open questions plus a confidence level
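A client could model the fields listed above roughly as follows. This is a minimal sketch: the class and field names (MediaReport, TimelineEntry, and so on) are assumptions made for illustration and are not taken from the server's actual response schema.

```python
from dataclasses import dataclass, field


@dataclass
class TimelineEntry:
    # Hypothetical shape: a timestamp label plus a description of the moment.
    timestamp: str
    description: str


@dataclass
class MediaReport:
    # All field names are illustrative assumptions mirroring the list above.
    summary: str
    timeline: list[TimelineEntry] = field(default_factory=list)
    transcript_snippets: list[str] = field(default_factory=list)
    observations: list[str] = field(default_factory=list)
    relevant_clues: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)
    confidence: str = "unknown"


# Made-up values, only to show how the structure would be filled in.
report = MediaReport(
    summary="Two speakers discuss a release schedule.",
    timeline=[TimelineEntry("00:12", "First speaker introduces the agenda")],
    confidence="medium",
)
print(report.timeline[0].timestamp)
```

Inspect a real response before relying on any of these names.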

Requirements

  • uv
  • Python 3.14
  • GOOGLE_API_KEY

Model configuration

The default model is gemini-2.5-flash.

To override the default model for all requests, set this environment variable:

  • MULTIMODAL_READER_MODEL

You can also pass model directly in a read_media tool call.
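The three sources of a model name described above can be combined as in the sketch below. The helper name resolve_model is hypothetical, and the precedence order (explicit argument, then MULTIMODAL_READER_MODEL, then the default) is an assumption. The package's own resolution logic may differ.

```python
import os

DEFAULT_MODEL = "gemini-2.5-flash"  # default stated in this README


def resolve_model(explicit=None, env=None):
    """Pick a model name: explicit argument, then env override, then default.

    The precedence order is an assumption for illustration.
    """
    env = os.environ if env is None else env
    return explicit or env.get("MULTIMODAL_READER_MODEL") or DEFAULT_MODEL


# No override anywhere: fall back to the documented default.
print(resolve_model(env={}))
# An explicit argument is assumed to win over the environment variable.
print(resolve_model("gemini-2.5-pro", env={"MULTIMODAL_READER_MODEL": "gemini-2.5-flash"}))
```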

MCP client configuration

Example Cursor MCP config:

{
  "mcpServers": {
    "multimodal-reader": {
      "command": "uvx",
      "args": ["multimodal-reader-mcp"],
      "env": {
        "GOOGLE_API_KEY": "${env:GOOGLE_API_KEY}",
        "MULTIMODAL_READER_MODEL": "gemini-2.5-flash"
      }
    }
  }
}
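If an MCP config already lists other servers, the entry above can be merged in rather than pasted over the file. The sketch below does that with plain dictionaries. The function name add_server_entry and the "other-server" entry are hypothetical, and reading or writing the actual config file is left out.

```python
import json


def add_server_entry(config, model="gemini-2.5-flash"):
    """Return a copy of an MCP config dict with the multimodal-reader entry added."""
    servers = dict(config.get("mcpServers", {}))
    servers["multimodal-reader"] = {
        "command": "uvx",
        "args": ["multimodal-reader-mcp"],
        "env": {
            # Left as a literal string: the example config above uses this
            # interpolation syntax, so Python does not expand it here.
            "GOOGLE_API_KEY": "${env:GOOGLE_API_KEY}",
            "MULTIMODAL_READER_MODEL": model,
        },
    }
    return {**config, "mcpServers": servers}


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other-command"}}}
merged = add_server_entry(existing)
print(json.dumps(merged, indent=2))
```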

Tool

The package exposes one MCP tool:

  • read_media(file_path, question=None, model="gemini-2.5-flash")

file_path must be an absolute path to a local media file.
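Because file_path must be absolute, a caller may want to check it before sending the request. The sketch below builds an arguments dictionary for read_media using the parameter names from the signature above. The helper build_read_media_args is hypothetical, and how the dictionary is sent depends on the MCP client in use.

```python
from pathlib import Path


def build_read_media_args(file_path, question=None, model=None):
    """Build the arguments dict for a read_media call, rejecting relative paths."""
    if not Path(file_path).is_absolute():
        raise ValueError(f"file_path must be absolute, got: {file_path!r}")
    args = {"file_path": str(file_path)}
    # Optional parameters are omitted when unset so the server defaults apply.
    if question is not None:
        args["question"] = question
    if model is not None:
        args["model"] = model
    return args


# resolve() turns a relative name into an absolute path; the file need not exist.
absolute = Path("clip.mp4").resolve()
print(build_read_media_args(absolute, question="Who is speaking?"))
```

Omitting model from the dictionary lets the server apply its own default instead of pinning one on the client side.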

