Scriptable demo video recording for apps, terminals, and AI agents.
Project description
demo-video-recorder
Scriptable demo video recording for Python agents and humans. The package separates reusable recording primitives from project-specific demo steps, so an agent can inspect a project, write a small record_demo.py, react to CLI output, and produce an MP4 with burned subtitles and optional narration audio.
The built-in backend uses the installed ffmpeg and ffprobe executables for screen capture, encoding, probing, subtitle burn-in, and narration audio muxing. Windows capture uses gdigrab; macOS capture uses avfoundation. Linux capture is not implemented yet.
Install
uv sync
External tools required for recording:
ffmpeg -version
ffprobe -version
For Web UI demos, install the Playwright browser binaries once:
uv run playwright install chromium
On macOS, high-quality burned subtitles require an ffmpeg build with the subtitles filter, which depends on libass. The default Homebrew ffmpeg formula does not include it. Install a libass-enabled build such as ffmpeg-full, then put it on PATH:
brew install ffmpeg-full
export PATH="/opt/homebrew/opt/ffmpeg-full/bin:$PATH"
ffmpeg -hide_banner -filters | rg subtitles
On macOS, the first real recording attempt may require granting Screen Recording permission to Terminal, iTerm, or the Python host (IDE, VS Code) you run the script from. You can preflight that prompt without recording by running uv run python mac_request_access.py. Add --new-window to check Terminal.app specifically.
Quick Start
Record the bundled CLI example:
uv run python record.py --new-window
Add narration audio with Edge TTS:
uv run python record.py --new-window --tts
Print the available Edge TTS speakers:
uv run python record.py --list-speakers
Test only the narration path without opening a new terminal window or recording the screen:
uv run python record.py --audio-only
On macOS, record.py defaults to --check-access, which requests Screen Recording permission before capture starts and stops early if access is still denied.
The example app lives in examples/guessing_game.py; the recording script is examples/record_guessing_game.py.
The example intentionally uses a random secret number, so the recorder reads the app output and chooses guesses from the hints instead of replaying fixed inputs.
Record the bundled Web UI example:
uv run python examples/record_webui_app.py
That script serves examples/webui_app/ on localhost, opens it with Playwright, fills inputs, selects date/color/dropdown values, animates a slider, clicks a button, waits for the output, and writes out/webui-demo.mp4.
Defaults
The built-in defaults mirror the original PowerShell script:
from demo_video_recorder import DEFAULTS
DEFAULTS.words_per_minute # 170
DEFAULTS.min_pause_seconds # 2.0
DEFAULTS.command_lead_seconds # 0.0
DEFAULTS.typed_character_delay # 0.018
DEFAULTS.capture_framerate # 15
DEFAULTS.video_scale_width # 1280
Use FAST_SMOKE_TEST_DEFAULTS for quick local script checks, not polished final videos.
CLI Demo API
from demo_video_recorder import CLIDemoRecorder
from demo_video_recorder import EdgeTTSBackend
def main():
tts = EdgeTTSBackend(
save_dir="out/demo.tts",
speaker="en-US-AvaMultilingualNeural",
speed="+0%",
volume="+0%",
)
r = CLIDemoRecorder("out/demo.mp4", words_per_minute=165, tts=tts)
try:
r.open_terminal(
title="Demo",
top=True,
window_size=(1200, 1200),
start_recording=True,
clear=True,
)
prepared = r.synthesize_if_tts_enabled(
"The app responds to typed input while subtitles explain the action."
)
r.explain("Today we'll demo the main workflow.")
r.run(["python", "app.py"], interactive=True, command_label="python app.py")
r.expect_output(">")
marker = r.mark_output()
r.input("help")
r.expect_regex(r"Commands?:", since=marker)
r.explain(prepared)
r.input("quit")
r.stop_app()
finally:
r.close()
if r.is_recording:
r.stop_recording()
Useful methods:
open_terminal(...): configures the terminal and can start recording immediately.clear(): clears the current terminal withclearorcls.run(..., interactive=True): starts a CLI app and streams stdout/stderr to the recorded terminal.input("text"): types into the active CLI app with a configurable typing delay.expect_output("text"): waits until expected app output appears.expect_regex(r"..."): waits for a regex match and returns the match object.mark_output()/output_since(marker): isolate output caused by one action.output_text("stdout")andoutput_text("stderr"): inspect streams separately.explain("..."): adds narration subtitles and, when TTS is configured, also generates a spoken narration clip.explain(prepared_explanation): reuses pre-generated narration text and audio without repeating the same string literal.synthesize_explanation_audio("..."): prepares aSynthesizedExplanationahead of time soexplain()does not need to wait on synthesis during capture.synthesize_if_tts_enabled("..."): returns a prepared explanation when TTS is configured, or the trimmed text when it is not. Prefer it over synthesize_explanation_audio as your smoke test won't end up doing the time costly synthesize all the time. use text directly if you do not use tts at all.EdgeTTSBackend.list_speakers(): returns available Edge voices so you can choose one that fits the audience and tone.stop_recording(): stops capture, trims subtitles to video duration, and writes the final MP4 with subtitles and narration audio.render_narration_audio(): exports just the synthesized narration timeline, useful for--audio-onlytest runs.
When new_window=True is used, the recorder re-runs the script in a dedicated terminal session. On Windows it opens a new console; on macOS it opens a new Terminal.app window and captures that window instead of the whole display when bounds are available. Worker stdout and stderr are also mirrored to out/<name>.worker.log. If the worker fails, the parent process prints the log tail so the recording script is easier to debug.
Platform notes for terminal window control:
- Windows supports
maximize,top=True, andwindow_size=(w, h)for the recorder-managed console window. - macOS now applies
maximizeandwindow_size=(w, h)as a best-effort resize for Terminal.app and iTerm windows by scripting their window bounds. - macOS does not currently support persistent
top=True/ always-on-top behavior. The recorder can bring the terminal to the front, but Terminal.app and iTerm do not expose a portable AppleScript API for keeping a normal window above all other apps.
When TTS is enabled, explain() uses the real generated audio length instead of the word-count estimate. If synthesis latency could show up in the capture, pre-generate the clip and pass it straight into explain(prepared_explanation). Intermediate per-line clips are removed after the final output unless keep_tts_audio=True.
GUI or App Window API
Currently it can capture the app window, more controls will be added later
from demo_video_recorder import DemoVideoRecorder
def main():
r = DemoVideoRecorder("out/notepad-demo.mp4")
try:
r.open_app(["notepad.exe"], title_hint="Untitled - Notepad", capture_window=True)
r.start_capture_window()
r.explain("Notepad is open and the window is being captured.")
finally:
r.close()
if r.is_recording:
r.stop_recording()
Web UI Demo API
WebUIRecorder is built for browser demos. It defaults to Playwright's own video recorder, which works for headless browser contexts and produces the raw MP4 that the existing subtitle and narration pipeline finalizes with ffmpeg.
from demo_video_recorder import WebUIRecorder
def main():
r = WebUIRecorder("out/web-demo.mp4", headless=True, viewport=(1280, 720))
try:
r.serve("dist", 8000)
r.open_web("/")
r.explain("The local web app is open.")
r.find_input(label="Email address").fill("ada@example.com")
r.find_input(label="Date of birth").set_date("1991-08-14")
r.find_select(label="Salary tier").select_option(label="$100,000 to $150,000")
r.find_input("input", type="range").set_range(8)
r.find("button", text="Review intake details").click()
r.find("aside", text="ada@example.com").highlight()
finally:
r.close()
if r.is_recording:
r.stop_recording()
Useful methods:
serve(path, port=8000): serves a static folder overhttp://127.0.0.1:<port>.open_web(url=None, ...): opens a URL. Bare domains such asexample.combecomehttps://example.com; relative paths such as/demouse the served folder.find(...): bs4-style element lookup that raisesWebElementNotFoundErrorwhen nothing visible is found.find_optional(...): same lookup, returningNonewhen nothing is found.find_all(...): returns all matched elements.find_input(...)/find_all_input(...): restrict lookup toinputandtextareacontrols and returnWebInputElement.find_select(...)/find_all_select(...): restrict lookup toselectcontrols and returnWebSelectElement.- Element methods:
highlight(),click(),double_click(),hover(),wait(),text(), andattribute(). Highlights smooth-scroll the element into view first. - Input/control methods:
fill(),type(),clear(),set_value(),set_range(),set_date(),set_color(),set_files(),press(),check(),uncheck(), andselect_option(). - Visual control methods show recorder-friendly UI before committing values: select dropdown options, date calendars, color swatches, animated range movement, and whole-label highlights for radio/checkbox controls.
- Form methods:
submit().
find() accepts name and attrs like Beautiful Soup, plus Playwright-friendly selectors:
r.find("button", text="Save")
r.find("input", {"name": "email"})
r.find("input", _class="field-control", text="Email")
r.find(selector="[data-testid='submit']")
r.find(role="button", name="Continue")
r.find(label="Email address").fill("ada@example.com")
If you need to record an actual visible browser window instead of Playwright's page video, pass video_backend="ffmpeg" and run headed with headless=False.
Agent Usage
See AGENT.md for instructions aimed at coding agents. The intended flow is:
- Inspect the target project.
- Write a small deterministic recording script.
- Use
explain()around visible actions. - Run and fix the script until
out/<name>.mp4is created.
Publish Notes
Build locally with:
uv build
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file demo_video_recorder-0.1.0.tar.gz.
File metadata
- Download URL: demo_video_recorder-0.1.0.tar.gz
- Upload date:
- Size: 142.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1a5cf38f5d194c42ba6337e4b21d154167f496ce49838f0271132c8b000f04a5
|
|
| MD5 |
f2c624959ef77e9de2fdf63ad261ac02
|
|
| BLAKE2b-256 |
da186b28c53729b60bb60ef283bb22aa8300c91ec4127f31a02a0b3da9bb2a33
|
File details
Details for the file demo_video_recorder-0.1.0-py3-none-any.whl.
File metadata
- Download URL: demo_video_recorder-0.1.0-py3-none-any.whl
- Upload date:
- Size: 45.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
49bb513c9fe445abf8558145b6e7e858c1c6f975fb642bd2ad3845b5d788a698
|
|
| MD5 |
d50bc55480ff8eb44bd227a6731e32b9
|
|
| BLAKE2b-256 |
58bbb6880e320f9939584d0ff24c3d6bf4e3c5ca1bfde0841b65175dabdba016
|