Minimal video generation and processing library.
videopython
Minimal, LLM-friendly Python library for programmatic video editing, processing, and AI video workflows.
Full documentation: videopython.com
Disclaimer: This project started as a hand-written hobby project, but most of the code is now produced by LLM agents. Humans still drive direction, approve changes, and own design decisions.
Installation
1. Install FFmpeg
# macOS
brew install ffmpeg
# Ubuntu / Debian
sudo apt-get install ffmpeg
# Windows (Chocolatey)
choco install ffmpeg
2. Install videopython
pip install videopython # core video/audio editing
pip install "videopython[ai]" # + local AI features (GPU recommended)
Requires Python >=3.10, <3.14. AI features run locally: no cloud API keys are required, but model weights are downloaded on first use.
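To confirm that step 1 worked, you can check that the ffmpeg binary is reachable from Python. This is a minimal standard-library sketch; the helper name `ffmpeg_available` is made up for this example and is not part of videopython.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if `binary` can be found on the current PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    status = "found" if ffmpeg_available() else "missing, install it first"
    print(f"ffmpeg: {status}")
```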
Quick Start
Imperative editing
Every editing primitive is an Operation subclass — a Pydantic model
whose fields ARE the JSON wire format. Apply one to a Video:
from videopython.base import Video, CutSeconds, Resize, Fade
video = Video.from_path("raw.mp4")
video = CutSeconds(start=10, end=25).apply(video)
video = Resize(width=1080, height=1920).apply(video)
video = Fade(mode="in", duration=0.5).apply(video)
video.save("output.mp4")
Concatenate clips with + (both clips must share the same fps and dimensions):
combined = video_a + video_b
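The constraint on concatenation can be pictured as a pre-flight check on two clips. The sketch below uses only the standard library; `ClipInfo` and `concat_problems` are hypothetical names for illustration and say nothing about how videopython performs the check internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipInfo:
    """Hypothetical stand-in for the properties that must match."""
    fps: float
    width: int
    height: int


def concat_problems(a: ClipInfo, b: ClipInfo) -> list[str]:
    """List the reasons two clips could not be joined frame for frame."""
    problems = []
    if a.fps != b.fps:
        problems.append(f"fps differs: {a.fps} vs {b.fps}")
    if (a.width, a.height) != (b.width, b.height):
        problems.append(
            f"dimensions differ: {a.width}x{a.height} vs {b.width}x{b.height}"
        )
    return problems


portrait = ClipInfo(fps=30, width=1080, height=1920)
landscape = ClipInfo(fps=30, width=1920, height=1080)
print(concat_problems(portrait, portrait))   # no problems reported
print(concat_problems(portrait, landscape))  # one dimension mismatch
```

When the clips differ, resizing one of them first (for example with Resize, as shown earlier) is the usual way to bring them in line before using +.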
JSON editing plans
Define multi-segment edits as JSON — the format LLM-driven workflows
generate against. VideoEdit.json_schema() returns the schema:
from videopython.editing import VideoEdit
plan = {
    "segments": [{
        "source": "raw.mp4",
        "start": 10.0,
        "end": 20.0,
        "operations": [
            {"op": "resize", "width": 1080, "height": 1920},
            {"op": "color_adjust", "saturation": 1.15, "contrast": 1.05},
            {"op": "fade", "mode": "in", "duration": 0.5,
             "window": {"stop": 0.5}},
        ],
    }],
}
edit = VideoEdit.from_dict(plan)
edit.validate() # dry-run via metadata, no frames loaded
edit.run_to_file("output.mp4") # stream to disk, ~constant memory
run_to_file() pipes ffmpeg decode → per-frame effects → ffmpeg encode,
so memory stays bounded even for hour-long sources. Use edit.run()
instead if you want the result back in memory as a Video.
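The decode, per-frame effects, encode chain described above keeps memory bounded because only one frame is in flight at a time. The following toy sketch shows that idea with Python generators. It does not call ffmpeg or videopython; the `decode`, `apply_effects`, `encode`, and `brighten` functions are stand-ins invented for this example.

```python
from typing import Callable, Iterable, Iterator

Frame = list[int]  # toy frame: a flat list of pixel values


def decode(n_frames: int, size: int = 4) -> Iterator[Frame]:
    """Stand-in for the decoder: yields one frame at a time."""
    for i in range(n_frames):
        yield [i % 256] * size


def apply_effects(
    frames: Iterable[Frame], effects: list[Callable[[Frame], Frame]]
) -> Iterator[Frame]:
    """Run every effect on each frame as it passes through."""
    for frame in frames:
        for effect in effects:
            frame = effect(frame)
        yield frame


def encode(frames: Iterable[Frame]) -> int:
    """Stand-in for the encoder: consumes frames, keeps only a count."""
    written = 0
    for _ in frames:
        written += 1
    return written


def brighten(frame: Frame) -> Frame:
    """Example effect: raise every pixel by 10, clamped to 255."""
    return [min(255, p + 10) for p in frame]


# No list of frames is ever built, so memory use does not grow with length.
frames_written = encode(apply_effects(decode(1000), [brighten]))
print(frames_written)  # 1000
```

Collecting the generator into a list instead would correspond to the in-memory path, where the whole result is held at once.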
AI generation
from videopython.ai import TextToImage, ImageToVideo, TextToSpeech
from videopython.base import Resize
image = TextToImage().generate_image("A cinematic mountain sunrise")
video = ImageToVideo().generate_video(image=image)
audio = TextToSpeech().generate_audio("Welcome to videopython.")
video = Resize(width=1080, height=1920).apply(video)
video.add_audio(audio).save("ai_video.mp4")
LLM & AI Agent Integration
The library is built for LLM-driven editing. Two surfaces matter:
1. Plan schema for tool / structured-output calls.
VideoEdit.json_schema() returns a JSON Schema covering segments,
post_operations, and a discriminated union over every registered
Operation. Drop it into any LLM API:
from videopython.editing import VideoEdit
schema = VideoEdit.json_schema()
# Anthropic: tools=[{"name": "edit", "input_schema": schema}]
# OpenAI: tools=[{"type": "function",
# "function": {"name": "edit", "parameters": schema}}]
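Spelled out, the two commented payload shapes above look roughly like the sketch below. The `schema` dict is a small placeholder standing in for the output of VideoEdit.json_schema(), and the tool description text is an assumption made for this example; no API request is sent.

```python
import json

# Placeholder standing in for VideoEdit.json_schema(); the real schema is
# much larger. Only the wrapping around it matters here.
schema = {
    "type": "object",
    "properties": {"segments": {"type": "array"}},
    "required": ["segments"],
}

# The description is an illustrative choice, not taken from the library.
TOOL_NAME = "edit"
TOOL_DESCRIPTION = "Produce a video editing plan."


def anthropic_tool(schema: dict) -> dict:
    """Tool entry in the shape used for Anthropic tool definitions."""
    return {
        "name": TOOL_NAME,
        "description": TOOL_DESCRIPTION,
        "input_schema": schema,
    }


def openai_tool(schema: dict) -> dict:
    """Tool entry in the shape used for OpenAI function tools."""
    return {
        "type": "function",
        "function": {
            "name": TOOL_NAME,
            "description": TOOL_DESCRIPTION,
            "parameters": schema,
        },
    }


print(json.dumps(anthropic_tool(schema), indent=2))
print(json.dumps(openai_tool(schema), indent=2))
```

Either dict would then be passed in the `tools` list of the respective client call.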
Validate the LLM's output without touching the filesystem, then run it:
edit = VideoEdit.from_dict(plan)
edit.validate() # catches bad ops, time ranges, fps mismatches
edit.run_to_file("output.mp4")
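To give a feel for what a metadata-only dry run can catch, here is a simplified standard-library sketch. It is not the library's validate() implementation: `check_plan`, the `durations` mapping, and the `known_ops` set are all hypothetical, and the sketch assumes every segment carries explicit `start` and `end` values.

```python
def check_plan(
    plan: dict, durations: dict[str, float], known_ops: set[str]
) -> list[str]:
    """Collect plan errors using only source durations, never frames."""
    errors = []
    for i, seg in enumerate(plan.get("segments", [])):
        source = seg["source"]
        if source not in durations:
            errors.append(f"segment {i}: unknown source {source!r}")
            continue
        start, end = seg["start"], seg["end"]
        # The range must be non-empty and lie inside the source.
        if not 0 <= start < end <= durations[source]:
            errors.append(f"segment {i}: bad time range {start}-{end}")
        for op in seg.get("operations", []):
            if op.get("op") not in known_ops:
                errors.append(f"segment {i}: unknown op {op.get('op')!r}")
    return errors


plan = {
    "segments": [{
        "source": "raw.mp4",
        "start": 10.0,
        "end": 20.0,
        "operations": [
            {"op": "resize", "width": 1080, "height": 1920},
            {"op": "sparkle"},  # deliberately invalid
        ],
    }],
}
errors = check_plan(plan, {"raw.mp4": 15.0}, {"resize", "fade"})
for message in errors:
    print(message)
```

Returning a list of messages rather than raising on the first problem makes it easy to feed all errors back to the LLM in one retry prompt.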
2. Operation discovery for agent loops. Every registered op exposes its own Pydantic schema, so an agent can introspect what's available without hardcoded lists:
from videopython.base import Operation, OpCategory
for op_id, cls in Operation.registry().items():
    # Use the first docstring line as a summary; tolerate a missing docstring
    summary = (cls.__doc__ or "").strip().split("\n")[0]
    print(f"{op_id}: {summary}")

schema = Operation.get("color_adjust").model_json_schema()  # per-op schema
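One common way to build a self-populating registry with dispatch on an "op" key is `__init_subclass__`. The toy sketch below illustrates that general pattern with plain classes. It is not videopython's implementation (which is Pydantic-based), and `Op`, `ToyResize`, and `ToyFade` are invented for this example.

```python
class Op:
    """Toy base class; subclasses register themselves under `op_id`."""

    _registry: dict[str, type] = {}
    op_id: str = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.op_id:
            Op._registry[cls.op_id] = cls

    def __init__(self, **params):
        self.params = params

    @classmethod
    def from_dict(cls, data: dict) -> "Op":
        """Pick the subclass named by the "op" key and build it."""
        data = dict(data)  # copy so the caller's dict is untouched
        op_id = data.pop("op")
        if op_id not in cls._registry:
            raise ValueError(f"unknown op: {op_id!r}")
        return cls._registry[op_id](**data)


class ToyResize(Op):
    """Resize frames to a target width and height."""
    op_id = "resize"


class ToyFade(Op):
    """Fade in or out over a duration."""
    op_id = "fade"


# Discovery: no hardcoded list, the registry filled itself at class creation.
for op_id, op_cls in sorted(Op._registry.items()):
    print(f"{op_id}: {(op_cls.__doc__ or '').strip()}")

op = Op.from_dict({"op": "resize", "width": 1080, "height": 1920})
print(type(op).__name__, op.params)
```

The "op" key plays the role of the discriminator: it selects which class receives the remaining fields.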
Field constraints (minimum, maximum, enum, exclusiveMinimum,
nullability) flow through to the schema, so LLMs that support
constrained generation produce valid parameters on the first try.
For ops that need side-channel data (e.g. silence_removal and
add_subtitles need a Transcription), pass it via context:
edit.run(context={"transcription": my_transcription})
Docs: Editing Plans | Operations | LLM Integration Guide
Features
videopython.base - core editing (no AI dependencies)
| Area | Highlights |
|---|---|
| Video I/O | Video, VideoMetadata, FrameIterator - load, save, inspect |
| Operation foundation | Operation, Effect, TimeRange, OpCategory - Pydantic base + auto-registry + discriminated-union schema |
| Editing plans | VideoEdit, SegmentConfig - JSON/LLM-friendly multi-segment plans with JSON Schema generation, dry-run validation, and streaming run_to_file |
| Transforms | Cut (time/frame), resize, crop, FPS resampling, speed change, reverse, freeze frame, silence removal |
| Effects | Blur, zoom, color grading, vignette, Ken Burns, image overlay, fade, text overlay, volume adjust |
| Audio | Load/save, overlay, concat, normalize, time-stretch, silence detection, segment classification |
| Text | Transcription data classes, TranscriptionOverlay for subtitle rendering |
| Scene detection | Histogram-based scene boundaries (detect, detect_streaming, detect_parallel) |
API docs: Core | Video | Audio | Editing Plans | Operations | Transforms | Effects | Text
videopython.ai - local AI features (install with [ai])
| Area | Highlights |
|---|---|
| Generation | TextToVideo, ImageToVideo, TextToImage, TextToSpeech, TextToMusic |
| Understanding | AudioToText (transcription), AudioClassifier, SceneVLM (structured visual scene description), FaceTracker (per-shot face tracks) |
| Scene detection | SemanticSceneDetector (neural scene boundaries) |
| Video analysis | VideoAnalyzer - full-pipeline analysis combining multiple AI capabilities |
| Transforms | FaceTrackingCrop |
| Dubbing | VideoDubber - voice cloning and revoicing with timing sync |
API docs: Generation | Understanding | Transforms | Dubbing
Development
See DEVELOPMENT.md for local setup, testing, and contribution workflow.
File details
Details for the file videopython-0.31.3.tar.gz.
File metadata
- Download URL: videopython-0.31.3.tar.gz
- Upload date:
- Size: 142.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b770b9de37d1b6345c8252d4bf42ace2d073870709d9b374128b4312bb994090 |
| MD5 | ca50ab8306aae0e681bb733c3f07da0d |
| BLAKE2b-256 | b98a5a7379e837d4de29935d87fd01c70a2bd1178ae3b13584747c7415d7eb53 |
Provenance
The following attestation bundles were made for videopython-0.31.3.tar.gz:
Publisher: publish.yml on BartWojtowicz/videopython
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: videopython-0.31.3.tar.gz
- Subject digest: b770b9de37d1b6345c8252d4bf42ace2d073870709d9b374128b4312bb994090
- Sigstore transparency entry: 1540021022
- Sigstore integration time:
- Permalink: BartWojtowicz/videopython@7c8dcf0112812542f4c810914789949795a6b93a
- Branch / Tag: refs/heads/main
- Owner: https://github.com/BartWojtowicz
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7c8dcf0112812542f4c810914789949795a6b93a
- Trigger Event: push
File details
Details for the file videopython-0.31.3-py3-none-any.whl.
File metadata
- Download URL: videopython-0.31.3-py3-none-any.whl
- Upload date:
- Size: 164.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bd4dbd35aa41c8c12423e682abb97cc0f6b1caa3e9b6a6a36d3dfb56c4b1dcfe |
| MD5 | b9984dc22702b5cceaf4fdf19e135d90 |
| BLAKE2b-256 | 2dd37f05d72bed02416f122e17fba90aa88dcf05cc9afd495bbbc502d8697c5d |
Provenance
The following attestation bundles were made for videopython-0.31.3-py3-none-any.whl:
Publisher: publish.yml on BartWojtowicz/videopython
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: videopython-0.31.3-py3-none-any.whl
- Subject digest: bd4dbd35aa41c8c12423e682abb97cc0f6b1caa3e9b6a6a36d3dfb56c4b1dcfe
- Sigstore transparency entry: 1540021127
- Sigstore integration time:
- Permalink: BartWojtowicz/videopython@7c8dcf0112812542f4c810914789949795a6b93a
- Branch / Tag: refs/heads/main
- Owner: https://github.com/BartWojtowicz
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7c8dcf0112812542f4c810914789949795a6b93a
- Trigger Event: push