Video analysis using Google's Gemini 2.5 Flash API
Merleau
"The world is not what I think, but what I live through." — Maurice Merleau-Ponty
A CLI tool for video understanding using Google's Gemini API. Named after Maurice Merleau-Ponty, the phenomenologist philosopher whose work on perception inspires how this tool helps you perceive your videos.
https://github.com/user-attachments/assets/e2c5b476-ddab-49ab-a35c-9ae5e880c25c
Why Merleau?
Google Gemini is the only major AI provider with native video understanding: Claude doesn't support video, and GPT-4o requires frame-extraction workarounds. Merleau aims to be the first CLI that understands the video itself rather than analyzing extracted frames.
Features
- Native Gemini video processing - Upload and analyze videos directly
- YouTube URL support - Analyze videos directly from YouTube (free preview)
- Customizable prompts - Ask any question about your video
- Cost estimation - Token usage tracking and cost breakdown
- Multiple models - Support for different Gemini models
- Web UI - Streamlit app for browser-based analysis
Use cases
Clone apps
Take a screencast of your app and ask:
- "What are the main features of this app?"
- "What are the main UI elements?"
- "What are the main user flows?"
Extract code from a coding screencast
ponty https://www.youtube.com/watch?v=Be0ceKN81S8 -p "Extract the text in the first claude code session" -e md
Installation
Using uv (recommended):
uv sync
Or install from PyPI:
pip install merleau
Configuration
- Get a Gemini API key from Google AI Studio
- Set the API key as an environment variable or create a .env file:
  GEMINI_API_KEY=your_api_key_here
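The lookup described above can be sketched in Python. This is a minimal illustration, not Merleau's actual loading code: the helper names read_dotenv and resolve_api_key are hypothetical, and giving the environment variable precedence over the .env file is an assumption.

```python
import os
from pathlib import Path


def read_dotenv(path):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    file = Path(path)
    if not file.is_file():
        return values
    for raw in file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(environ=None, dotenv_path=".env"):
    """Return GEMINI_API_KEY from the environment, else from the .env file.

    Assumption: the environment variable wins when both are present.
    """
    environ = os.environ if environ is None else environ
    key = environ.get("GEMINI_API_KEY") or read_dotenv(dotenv_path).get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    return key
```

Calling resolve_api_key() with no arguments checks the process environment first and then a .env file in the current directory.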
Usage
# Basic video analysis
ponty video.mp4
# Analyze a YouTube video directly
ponty https://youtu.be/VIDEO_ID
ponty https://www.youtube.com/watch?v=VIDEO_ID
# Custom prompt
ponty video.mp4 -p "Summarize the key points in this video"
# Use a different model
ponty video.mp4 -m gemini-2.0-flash
# Export analysis to markdown
ponty video.mp4 -e md
# Hide cost information
ponty video.mp4 --no-cost
Web UI
Try it online: https://merleau.streamlit.app/
The web app supports both file uploads and YouTube URLs (paste a URL in the YouTube tab to preview and analyze directly).
Or run locally:
pip install "merleau[web]"
streamlit run streamlit_app.py
Options
| Option | Description |
|---|---|
| -p, --prompt | Prompt for the analysis (default: "Explain what happens in this video") |
| -m, --model | Gemini model to use (default: gemini-2.5-flash) |
| -e, --export | Export analysis to file (supported formats: md) |
| --no-cost | Hide usage and cost information |
| -V, --version | Show version and exit |
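When calling the CLI from a script, the options in the table can be assembled programmatically. The following is a rough sketch that uses only the flags documented above; build_ponty_command is a hypothetical helper and not part of the merleau package.

```python
import shlex


def build_ponty_command(source, prompt=None, model=None, export=None, show_cost=True):
    """Build the argument list for a ponty invocation from the documented options."""
    cmd = ["ponty", source]  # source is a local file path or a YouTube URL
    if prompt is not None:
        cmd += ["-p", prompt]
    if model is not None:
        cmd += ["-m", model]
    if export is not None:
        cmd += ["-e", export]
    if not show_cost:
        cmd.append("--no-cost")
    return cmd


if __name__ == "__main__":
    cmd = build_ponty_command(
        "video.mp4",
        prompt="Summarize the key points in this video",
        export="md",
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To execute it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with prompts that contain spaces or quotes.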
Reducing Costs with Compression
Compressing videos before analysis can reduce API costs by roughly 10-15%, typically without degrading analysis quality, because Gemini's token count is affected by video resolution and bitrate.
Quick Compression with ffmpeg
# Basic compression (recommended)
ffmpeg -i input.mp4 -vcodec libx264 -crf 28 -preset medium -vf "scale=1280:-2" output.mp4
# Aggressive compression (smaller file, lower quality)
ffmpeg -i input.mp4 -vcodec libx264 -crf 32 -preset medium -vf "scale=640:-2" output.mp4
# Re-encode audio to 128 kbps AAC (keeps speech available for analysis)
ffmpeg -i input.mp4 -vcodec libx264 -crf 28 -preset medium -vf "scale=1280:-2" -acodec aac -b:a 128k output.mp4
Compression Options Explained
| Option | Description |
|---|---|
| -crf 28 | Quality level (18-28 recommended, higher = smaller file) |
| -preset medium | Encoding speed/quality tradeoff |
| -vf "scale=1280:-2" | Resize to 1280px width, maintain aspect ratio |
| -an | Remove audio (if not needed) |
| -acodec aac -b:a 128k | Compress audio to 128 kbps AAC |
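To batch-compress several files, the ffmpeg options above can be assembled in a small script. This is a sketch under stated assumptions: build_ffmpeg_command and its audio parameter are hypothetical names chosen for illustration, and it emits only the ffmpeg flags listed in the table.

```python
def build_ffmpeg_command(src, dst, crf=28, width=1280, preset="medium", audio="default"):
    """Build an ffmpeg argument list from the compression options above.

    audio: "default" leaves audio handling to ffmpeg,
           "aac" re-encodes audio to 128 kbps AAC,
           "none" drops the audio track with -an.
    """
    if audio not in ("default", "aac", "none"):
        raise ValueError(f"unsupported audio mode: {audio!r}")
    cmd = [
        "ffmpeg", "-i", src,
        "-vcodec", "libx264",
        "-crf", str(crf),
        "-preset", preset,
        "-vf", f"scale={width}:-2",  # -2 keeps the aspect ratio with an even height
    ]
    if audio == "aac":
        cmd += ["-acodec", "aac", "-b:a", "128k"]
    elif audio == "none":
        cmd.append("-an")
    cmd.append(dst)
    return cmd


if __name__ == "__main__":
    # Equivalent to the "aggressive compression" command, without audio.
    print(" ".join(build_ffmpeg_command("input.mp4", "output.mp4", crf=32, width=640, audio="none")))
    # To execute a command: subprocess.run(cmd, check=True)
```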
Cost Comparison Example
| Version | File Size | Prompt Tokens | Input Cost |
|---|---|---|---|
| Original (1080p) | 52 MB | 14,757 | $0.00221 |
| Compressed (720p) | 2.6 MB | 13,157 | $0.00197 |
| Savings | 95% | 10.8% | 10.8% |
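The figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes the input costs were computed at $0.15 per 1M prompt tokens, which is the rate that matches the table's values; the function names are illustrative only.

```python
INPUT_RATE_PER_MILLION = 0.15  # USD per 1M prompt tokens (assumed; matches the table)


def input_cost(prompt_tokens, rate_per_million=INPUT_RATE_PER_MILLION):
    """Input cost in USD for a given number of prompt tokens."""
    return prompt_tokens * rate_per_million / 1_000_000


def savings_percent(before, after):
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100


original_tokens, compressed_tokens = 14_757, 13_157

print(f"Original input cost:   ${input_cost(original_tokens):.5f}")
print(f"Compressed input cost: ${input_cost(compressed_tokens):.5f}")
print(f"Token savings:         {savings_percent(original_tokens, compressed_tokens):.1f}%")
print(f"File size savings:     {savings_percent(52, 2.6):.0f}%")
```

Because cost scales linearly with prompt tokens, the token savings and the input cost savings are the same percentage, while the much larger file size reduction does not translate directly into cost.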
Output
The CLI provides:
- Video content analysis from Gemini
- Token usage breakdown (prompt, response, total)
- Estimated cost based on Gemini pricing
Pricing Reference
Gemini 2.5 Flash (as of 2025):
- Input: $0.15 per 1M tokens (text/image), $0.075 per 1M tokens (video)
- Output: $0.60 per 1M tokens, $3.50 per 1M thinking tokens
A 1-hour video costs approximately $0.11-0.32 to analyze.
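A cost breakdown from token counts might be computed along these lines. This is a sketch, not the CLI's actual implementation: the Rates class and estimate_cost function are hypothetical, the default rates are the text/image figures listed above, and billing thinking tokens separately from regular output tokens is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens, taken from the pricing list above."""
    input_per_million: float = 0.15
    output_per_million: float = 0.60
    thinking_per_million: float = 3.50


def estimate_cost(prompt_tokens, response_tokens, thinking_tokens=0, rates=Rates()):
    """Return a per-category cost breakdown in USD.

    Assumption: thinking tokens are billed separately at the thinking rate.
    """
    breakdown = {
        "input": prompt_tokens * rates.input_per_million / 1_000_000,
        "output": response_tokens * rates.output_per_million / 1_000_000,
        "thinking": thinking_tokens * rates.thinking_per_million / 1_000_000,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


if __name__ == "__main__":
    for name, usd in estimate_cost(14_757, 500).items():
        print(f"{name:>8}: ${usd:.5f}")
```

Passing a different Rates instance, for example Rates(input_per_million=0.075) for the video input rate, adjusts the estimate without changing the function.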
File details
Details for the file merleau-0.5.0.tar.gz.
File metadata
- Download URL: merleau-0.5.0.tar.gz
- Upload date:
- Size: 129.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9fd75069ddb337986d1983616b8452f8bb50ff0ad108e5a179a535e74b87c6a1 |
| MD5 | 48b83f664a4b8c018775caeae60d3da5 |
| BLAKE2b-256 | 0c3f000a55ccd998eba289b39ed9db4215286922d214b5518199308e4ff16c57 |
Provenance
The following attestation bundles were made for merleau-0.5.0.tar.gz:
Publisher: python-publish.yml on yanndebray/merleau
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: merleau-0.5.0.tar.gz
- Subject digest: 9fd75069ddb337986d1983616b8452f8bb50ff0ad108e5a179a535e74b87c6a1
- Sigstore transparency entry: 926943099
- Sigstore integration time:
- Permalink: yanndebray/merleau@c89b72482bce5b9de75785ea55d039a1f8ef4a90
- Branch / Tag: refs/tags/v0.5.0
- Owner: https://github.com/yanndebray
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@c89b72482bce5b9de75785ea55d039a1f8ef4a90
- Trigger Event: release
File details
Details for the file merleau-0.5.0-py3-none-any.whl.
File metadata
- Download URL: merleau-0.5.0-py3-none-any.whl
- Upload date:
- Size: 6.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 14b56aac47827123b8ff4f2594569a220ca29ca635d48582fc2e249055d14144 |
| MD5 | bf7af2d601cdda624e7e17f0f80e9f56 |
| BLAKE2b-256 | 02338c6553076e82375984fc5e64c9f29cc1bedd9fe5fc734d22f1588baf4aa9 |
Provenance
The following attestation bundles were made for merleau-0.5.0-py3-none-any.whl:
Publisher: python-publish.yml on yanndebray/merleau
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: merleau-0.5.0-py3-none-any.whl
- Subject digest: 14b56aac47827123b8ff4f2594569a220ca29ca635d48582fc2e249055d14144
- Sigstore transparency entry: 926943100
- Sigstore integration time:
- Permalink: yanndebray/merleau@c89b72482bce5b9de75785ea55d039a1f8ef4a90
- Branch / Tag: refs/tags/v0.5.0
- Owner: https://github.com/yanndebray
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@c89b72482bce5b9de75785ea55d039a1f8ef4a90
- Trigger Event: release