Extract frames and transcripts from video files for LLM context and multimodal pipelines.
clip2context
Extract frames and transcripts from video files — structured output ready for LLM context, multimodal pipelines, or archival.
Given one or more video files, clip2context produces:
- Frames — high-quality WebP images at a configurable frame rate, plus a JSON manifest mapping each frame to its timestamp.
- Transcript — plain text, timestamped JSON segments, and a human-readable timed text file, generated by OpenAI Whisper.
Requirements
- Python 3.12+
- FFmpeg (must be on PATH)
Install FFmpeg via your package manager:
# macOS
brew install ffmpeg
# Ubuntu / Debian
sudo apt install ffmpeg
# Windows (winget)
winget install ffmpeg
Installation
pip install clip2context
Or from source with uv:
git clone <repo-url>
cd clip2context
uv sync
Usage
clip2context <video_path> [<video_path> ...] [options]
Arguments
| Argument | Description |
|---|---|
| `video_paths` | One or more video files or directories containing videos. |
| `--output-dir DIR` | Base directory for all output (default: `output/`). |
| `--fps FLOAT` | Frames per second to extract (default: `1.0`). Use `0.5` for one frame every two seconds. |
| `--quality 1-100` | WebP compression quality (default: `95`). Lower = smaller files. |
| `--only-frames` | Extract frames only; skip transcription. |
| `--only-transcripts` | Extract transcripts only; skip frame extraction. |
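To get a feel for how `--fps` affects output size, the arithmetic can be sketched as below. This is a rough estimate that assumes frames are sampled at t = 0, 1/fps, 2/fps, and so on; the count FFmpeg actually writes may differ by a frame at the end of the clip. The function name `estimate_frame_count` is hypothetical and not part of clip2context.

```python
import math


def estimate_frame_count(duration_seconds: float, fps: float = 1.0) -> int:
    """Estimate how many frames a clip of the given length yields at `fps`.

    Assumes one frame at t = 0 and one every 1/fps seconds after that.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    if fps <= 0:
        raise ValueError("fps must be positive")
    return math.ceil(duration_seconds * fps)


def seconds_between_frames(fps: float) -> float:
    """Return the sampling interval in seconds for a given fps value."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1.0 / fps
```

Under these assumptions, a 90-second clip at `--fps 0.5` gives about 45 frames, and a 10-minute clip at the default 1 fps gives about 600.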
Examples
# Process a single video with defaults (1 fps, quality 95)
clip2context interview.mp4
# Process all videos in a folder, 1 frame every 2 seconds
clip2context ./recordings/ --fps 0.5
# Transcripts only, custom output directory
clip2context lecture.mp4 --only-transcripts --output-dir ./results
# Frames only, lower quality for smaller file sizes
clip2context demo.mov --only-frames --fps 2 --quality 75
# Process multiple videos at once
clip2context video1.mp4 video2.mp4 video3.mp4
Python API
You can also use clip2context programmatically:
from clip2context.main import run
# Full extraction (frames + transcript)
run("interview.mp4")
# Custom options
run(
    "lecture.mp4",
    output_base="results/",
    fps=0.5,
    quality=80,
    do_frames=True,
    do_transcript=True,
)
Or use the individual extractors directly:
from clip2context.extract_frames import extract_frames
from clip2context.extract_transcript import extract_transcript
# Extract frames → returns (output_dir, frame_count)
output_dir, count = extract_frames("video.mp4", "output/frames", fps=1.0, quality=95)
# Transcribe audio → returns output_dir
output_dir = extract_transcript("video.mp4", "output/transcript")
Output layout
Each video produces output under <output_dir>/<video_stem>/:
output/
└── interview/
    ├── frames/
    │   ├── frame_0001.webp
    │   ├── frame_0002.webp
    │   ├── …
    │   └── frames_manifest.json
    └── transcript/
        ├── transcript_raw.txt
        ├── transcript_timestamped.json
        └── transcript_timed.txt
frames_manifest.json
Maps each frame file to its timestamp:
[
  {
    "frame_filename": "frame_0001.webp",
    "timestamp_seconds": 0.0,
    "timestamp_formatted": "00:00:00"
  },
  ...
]
transcript_timestamped.json
Timestamped segments from Whisper, with start and end times in seconds:
[
  {
    "start": 0.0,
    "end": 4.28,
    "text": "Welcome to today's session."
  },
  ...
]
transcript_timed.txt
Human-readable transcript with timestamps:
[00:00:00] Welcome to today's session.
[00:00:04] Let's get started.
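Lines in this format can be rebuilt from the JSON segments if needed, for instance after filtering or editing them. The sketch below assumes the timestamp is the segment start truncated to whole seconds, which matches the sample but is not confirmed by the package; `format_timestamp` and `timed_line` are hypothetical names.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS, dropping the fraction."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def timed_line(segment: dict) -> str:
    """Render one transcript segment as '[HH:MM:SS] text'."""
    return f"[{format_timestamp(segment['start'])}] {segment['text'].strip()}"
```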
Supported formats
.mp4 .mov .avi .mkv .webm .m4v .flv .wmv
License
MIT