video2flow
Extract video frames as descriptive text flows for LLM consumption.
Many multimodal models, including Claude and GPT-4o, accept still images but
not raw video files. video2flow bridges the gap: extract frames → generate
timestamped descriptions → feed the text flow to any LLM.
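To make the "timestamped descriptions" step concrete, here is a minimal sketch of how per-frame descriptions could be joined into a text flow. It is an illustration only, not video2flow's actual implementation: the helper names `format_timestamp` and `build_text_flow`, the `[MM:SS]` line format, and the assumption that frame `i` was sampled at `i / fps` seconds are all hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as MM:SS (fractions are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def build_text_flow(descriptions: list[str], fps: float = 1.0) -> str:
    """Join per-frame descriptions into one timestamped transcript.

    Assumption: frame i was sampled at i / fps seconds into the video.
    """
    lines = []
    for index, text in enumerate(descriptions):
        lines.append(f"[{format_timestamp(index / fps)}] {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder descriptions; a real run would produce these per frame.
    sample = ["A person opens a laptop.", "A code editor fills the screen."]
    print(build_text_flow(sample, fps=0.5))
```

A plain-text transcript like this can be pasted into any text-only prompt, which is the point of converting frames to a flow.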
Installation
pip install video2flow
Usage
# Extract frames from a video
video2flow extract video.mp4 -o frames/ --fps 1
# Quick description (without vision API)
video2flow describe video.mp4 --max-frames 10
# Full pipeline
video2flow pipeline video.mp4 -o video_flow/
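The commands above can also be driven from a Python script. The following is a minimal sketch, assuming the `video2flow` executable is on `PATH` and accepts exactly the arguments shown in the usage examples; `pipeline_command` and `run_pipeline` are hypothetical helper names, not part of the package.

```python
import subprocess


def pipeline_command(video: str, out_dir: str) -> list[str]:
    """Build the argument list for the full-pipeline command shown above."""
    return ["video2flow", "pipeline", video, "-o", out_dir]


def run_pipeline(video: str, out_dir: str) -> None:
    """Run the pipeline and raise CalledProcessError on a non-zero exit."""
    # Passing a list (not a shell string) avoids quoting problems in paths.
    subprocess.run(pipeline_command(video, out_dir), check=True)
```

Calling `run_pipeline("video.mp4", "video_flow/")` is then equivalent to the full-pipeline command line.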
LLM Integration
Pass the output JSON to any LLM:
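import json

# Load the flow written by the pipeline; the context manager closes the file
with open("video_flow/flow.json", encoding="utf-8") as f:
    flow = json.load(f)

prompt = flow["usage"]["example_prompt"]
# Then: response = llm.invoke(prompt)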
For detailed visual understanding, send the extracted frame images directly to a multimodal model alongside the flow transcript.
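One way to pair frames with the transcript is to base64-encode a handful of images and bundle them with the text. The sketch below is provider-neutral and rests on assumptions: frames are `.jpg` files in a single directory, and the `{"type": ...}` dictionaries are a made-up shape that would need to be mapped onto whatever message format your model's API expects. `encode_frame` and `build_multimodal_parts` are hypothetical names.

```python
import base64
from pathlib import Path


def encode_frame(path: Path) -> dict:
    """Read one image file and return it as a base64-encoded part."""
    data = base64.b64encode(path.read_bytes()).decode("ascii")
    return {"type": "image", "name": path.name, "base64": data}


def build_multimodal_parts(
    transcript: str, frame_dir: str, max_frames: int = 5
) -> list[dict]:
    """Return the transcript followed by up to max_frames encoded frames.

    Assumption: frames are .jpg files whose sorted names follow time order.
    """
    frames = sorted(Path(frame_dir).glob("*.jpg"))[:max_frames]
    parts = [{"type": "text", "text": transcript}]
    parts.extend(encode_frame(p) for p in frames)
    return parts
```

Capping the number of frames keeps the request small; the text flow carries the timeline while the images supply detail for the moments that matter.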
License
MIT