Scriptable demo video recording for apps, terminals, and AI agents.

demo-video-recorder

Scriptable demo video recording for Python agents and humans. The package separates reusable recording primitives from project-specific demo steps, so an agent can inspect a project, write a small record_demo.py, react to CLI output, and produce an MP4 with burned subtitles and optional narration audio.

The built-in backend uses the installed ffmpeg and ffprobe executables for screen capture, encoding, probing, subtitle burn-in, and narration audio muxing. Windows capture uses gdigrab; macOS capture uses avfoundation. Linux capture is not implemented yet.
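
The platform split described above can be pictured as a small lookup. This is only an illustrative sketch: the function name capture_format and the use of sys.platform values as keys are assumptions, not the package's actual internals.

```python
import sys

# Mapping taken from the paragraph above. The dictionary and the function
# name are hypothetical; the package may organise this differently.
_CAPTURE_FORMATS = {
    "win32": "gdigrab",        # Windows screen capture
    "darwin": "avfoundation",  # macOS screen capture
}


def capture_format(platform: str = sys.platform) -> str:
    """Return the ffmpeg input format used for screen capture on a platform."""
    try:
        return _CAPTURE_FORMATS[platform]
    except KeyError:
        # Linux (and anything else) is not covered by the built-in backend yet.
        raise NotImplementedError(
            f"screen capture is not implemented for {platform!r}"
        ) from None
```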

Install

uv sync

External tools required for recording:

ffmpeg -version
ffprobe -version
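
If you prefer to check for both tools from Python before a recording run, a minimal preflight sketch could look like this. The helper name missing_tools is hypothetical and not part of the package; it only relies on the standard library's shutil.which.

```python
import shutil


def missing_tools(names=("ffmpeg", "ffprobe")):
    """Return the required executables that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are available.")
```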

For Web UI demos, install the Playwright browser binaries once:

uv run playwright install chromium

On macOS, high-quality burned subtitles require an ffmpeg build with the subtitles filter, which depends on libass. The default Homebrew ffmpeg formula does not include it. Install a libass-enabled build such as ffmpeg-full, then put it on PATH:

brew install ffmpeg-full
export PATH="/opt/homebrew/opt/ffmpeg-full/bin:$PATH"
ffmpeg -hide_banner -filters | rg subtitles
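
The same check can be done from Python when ripgrep is not installed. The sketch below is an assumption-laden helper, not part of the package: it treats the second whitespace-separated column of each row printed by ffmpeg -filters as the filter name.

```python
import shutil
import subprocess


def has_filter(filters_listing: str, name: str) -> bool:
    """Return True if `name` appears as a filter name in `ffmpeg -filters` output."""
    for line in filters_listing.splitlines():
        parts = line.split()
        # Assumed row layout: "<flags> <name> <inputs->outputs> <description...>"
        if len(parts) >= 2 and parts[1] == name:
            return True
    return False


def ffmpeg_has_subtitles_filter() -> bool:
    """Ask the ffmpeg found on PATH whether it offers the subtitles filter."""
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        return False
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-filters"],
        capture_output=True,
        text=True,
        check=False,
    )
    return has_filter(result.stdout, "subtitles")
```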

On macOS, the first real recording attempt may require granting Screen Recording permission to Terminal, iTerm, or the Python host (IDE, VS Code) you run the script from. You can preflight that prompt without recording by running uv run python mac_request_access.py. Add --new-window to check Terminal.app specifically.

Quick Start

Record the bundled CLI example:

uv run python record.py --new-window

Add narration audio with Edge TTS:

uv run python record.py --new-window --tts

Print the available Edge TTS speakers:

uv run python record.py --list-speakers

Test only the narration path without opening a new terminal window or recording the screen:

uv run python record.py --audio-only

On macOS, record.py defaults to --check-access, which requests Screen Recording permission before capture starts and stops early if access is still denied.

The example app lives in examples/guessing_game.py; the recording script is examples/record_guessing_game.py. The example intentionally uses a random secret number, so the recorder reads the app output and chooses guesses from the hints instead of replaying fixed inputs.
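
One way to choose guesses from hints is a plain binary search over the remaining range. The sketch below is not the code from examples/record_guessing_game.py; the hint wording ("higher" / "lower"), the 1 to 100 range, and all function names are assumptions made for illustration.

```python
def next_guess(low: int, high: int) -> int:
    """Pick the midpoint of the remaining range."""
    return (low + high) // 2


def narrow(low: int, high: int, guess: int, hint: str):
    """Shrink the range based on a hint line read from the app output."""
    text = hint.lower()
    if "higher" in text:
        return guess + 1, high
    if "lower" in text:
        return low, guess - 1
    return guess, guess  # no recognised hint: treat the guess as correct


def solve(secret: int, low: int = 1, high: int = 100):
    """Simulate a game locally and return the sequence of guesses made."""
    guesses = []
    while low <= high:
        guess = next_guess(low, high)
        guesses.append(guess)
        if guess == secret:
            return guesses
        # Stand-in for the hint the real app would print.
        hint = "Try higher" if guess < secret else "Try lower"
        low, high = narrow(low, high, guess, hint)
    raise ValueError("secret is outside the search range")
```

In a real recording script, the hint would come from the captured app output rather than from the simulated comparison inside solve().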

Record the bundled Web UI example:

uv run python examples/record_webui_app.py

That script serves examples/webui_app/ on localhost, opens it with Playwright, fills inputs, selects date/color/dropdown values, animates a slider, clicks a button, waits for the output, and writes out/webui-demo.mp4.

Defaults

The built-in defaults mirror the original PowerShell script:

from demo_video_recorder import DEFAULTS

DEFAULTS.words_per_minute          # 170
DEFAULTS.min_pause_seconds         # 2.0
DEFAULTS.command_lead_seconds      # 0.0
DEFAULTS.typed_character_delay     # 0.018
DEFAULTS.capture_framerate         # 15
DEFAULTS.video_scale_width         # 1280

Use FAST_SMOKE_TEST_DEFAULTS for quick local script checks, not polished final videos.
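
To get a feel for how words_per_minute and min_pause_seconds might interact, the sketch below estimates how long a subtitle stays on screen. The formula (reading time from the word count, with the minimum pause as a floor) is an assumption about the intent of these defaults, not the package's documented behaviour, and the function name is hypothetical.

```python
def estimated_display_seconds(
    text: str,
    words_per_minute: int = 170,
    min_pause_seconds: float = 2.0,
) -> float:
    """Estimate subtitle display time from the word count (assumed formula)."""
    words = len(text.split())
    reading_seconds = words * 60.0 / words_per_minute
    # Short lines still stay visible for at least the minimum pause.
    return max(min_pause_seconds, reading_seconds)
```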

CLI Demo API

from demo_video_recorder import CLIDemoRecorder
from demo_video_recorder import EdgeTTSBackend


def main():
    tts = EdgeTTSBackend(
        save_dir="out/demo.tts",
        speaker="en-US-AvaMultilingualNeural",
        speed="+0%",
        volume="+0%",
    )
    r = CLIDemoRecorder("out/demo.mp4", words_per_minute=165, tts=tts)
    try:
        r.open_terminal(
            title="Demo",
            top=True,
            window_size=(1200, 1200),
            start_recording=True,
            clear=True,
        )
        prepared = r.synthesize_if_tts_enabled(
            "The app responds to typed input while subtitles explain the action."
        )
        r.explain("Today we'll demo the main workflow.")
        r.run(["python", "app.py"], interactive=True, command_label="python app.py")
        r.expect_output(">")
        marker = r.mark_output()
        r.input("help")
        r.expect_regex(r"Commands?:", since=marker)
        r.explain(prepared)
        r.input("quit")
        r.stop_app()
    finally:
        r.close()
        if r.is_recording:
            r.stop_recording()

Useful methods:

  • open_terminal(...): configures the terminal and can start recording immediately.
  • clear(): clears the current terminal with clear or cls.
  • run(..., interactive=True): starts a CLI app and streams stdout/stderr to the recorded terminal.
  • input("text"): types into the active CLI app with a configurable typing delay.
  • expect_output("text"): waits until expected app output appears.
  • expect_regex(r"..."): waits for a regex match and returns the match object.
  • mark_output() / output_since(marker): isolate output caused by one action.
  • output_text("stdout") and output_text("stderr"): inspect streams separately.
  • explain("..."): adds narration subtitles and, when TTS is configured, also generates a spoken narration clip.
  • explain(prepared_explanation): reuses pre-generated narration text and audio without repeating the same string literal.
  • synthesize_explanation_audio("..."): prepares a SynthesizedExplanation ahead of time so explain() does not need to wait on synthesis during capture.
  • synthesize_if_tts_enabled("..."): returns a prepared explanation when TTS is configured, or the trimmed text when it is not. Prefer it over synthesize_explanation_audio so that smoke tests without TTS skip the time-consuming synthesis step. If you do not use TTS at all, pass the text directly to explain().
  • EdgeTTSBackend.list_speakers(): returns available Edge voices so you can choose one that fits the audience and tone.
  • stop_recording(): stops capture, trims subtitles to video duration, and writes the final MP4 with subtitles and narration audio.
  • render_narration_audio(): exports just the synthesized narration timeline, useful for --audio-only test runs.
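
The idea behind mark_output() and output_since(marker) can be illustrated with a tiny stand-alone buffer. This is a conceptual sketch only: the OutputBuffer class and its method names are hypothetical and do not reflect how CLIDemoRecorder stores output internally.

```python
import re


class OutputBuffer:
    """Accumulates captured text and lets callers slice it by marker."""

    def __init__(self):
        self._text = ""

    def append(self, chunk: str) -> None:
        """Record another chunk of app output."""
        self._text += chunk

    def mark(self) -> int:
        """Return a marker for the current end of the captured output."""
        return len(self._text)

    def since(self, marker: int) -> str:
        """Return only the output captured after the marker was taken."""
        return self._text[marker:]

    def search(self, pattern: str, since: int = 0):
        """Search for a regex, ignoring anything captured before `since`."""
        return re.search(pattern, self._text[since:])
```

Taking a marker just before sending input means a later match cannot be satisfied by stale text, such as a banner printed at startup.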

When new_window=True is used, the recorder re-runs the script in a dedicated terminal session. On Windows it opens a new console; on macOS it opens a new Terminal.app window and captures that window instead of the whole display when bounds are available. Worker stdout and stderr are also mirrored to out/<name>.worker.log. If the worker fails, the parent process prints the log tail so the recording script is easier to debug.
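
If you want to inspect the worker log yourself after a failed run, a small tail helper is enough. This sketch is independent of the package; the function name log_tail and the default of 20 lines are assumptions.

```python
from collections import deque
from pathlib import Path


def log_tail(path, max_lines: int = 20):
    """Return the last `max_lines` lines of a log file, or [] if it is missing."""
    log = Path(path)
    if not log.exists():
        return []
    with log.open("r", encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the most recent lines while streaming.
        return [line.rstrip("\n") for line in deque(handle, maxlen=max_lines)]
```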

Platform notes for terminal window control:

  • Windows supports maximize, top=True, and window_size=(w, h) for the recorder-managed console window.
  • macOS now applies maximize and window_size=(w, h) as a best-effort resize for Terminal.app and iTerm windows by scripting their window bounds.
  • macOS does not currently support persistent top=True / always-on-top behavior. The recorder can bring the terminal to the front, but Terminal.app and iTerm do not expose a portable AppleScript API for keeping a normal window above all other apps.

When TTS is enabled, explain() uses the real generated audio length instead of the word-count estimate. If synthesis latency could show up in the capture, pre-generate the clip and pass it straight into explain(prepared_explanation). Intermediate per-line clips are removed after the final output unless keep_tts_audio=True.

GUI or App Window API

Currently this API can capture the app window; more controls will be added later.

from demo_video_recorder import DemoVideoRecorder


def main():
    r = DemoVideoRecorder("out/notepad-demo.mp4")
    try:
        r.open_app(["notepad.exe"], title_hint="Untitled - Notepad", capture_window=True)
        r.start_capture_window()
        r.explain("Notepad is open and the window is being captured.")
    finally:
        r.close()
        if r.is_recording:
            r.stop_recording()

Web UI Demo API

WebUIRecorder is built for browser demos. It defaults to Playwright's own video recorder, which works for headless browser contexts and produces the raw MP4 that the existing subtitle and narration pipeline finalizes with ffmpeg.

from demo_video_recorder import WebUIRecorder


def main():
    r = WebUIRecorder("out/web-demo.mp4", headless=True, viewport=(1280, 720))
    try:
        r.serve("dist", 8000)
        r.open_web("/")
        r.explain("The local web app is open.")
        r.find_input(label="Email address").fill("ada@example.com")
        r.find_input(label="Date of birth").set_date("1991-08-14")
        r.find_select(label="Salary tier").select_option(label="$100,000 to $150,000")
        r.find_input("input", type="range").set_range(8)
        r.find("button", text="Review intake details").click()
        r.find("aside", text="ada@example.com").highlight()
    finally:
        r.close()
        if r.is_recording:
            r.stop_recording()

Useful methods:

  • serve(path, port=8000): serves a static folder over http://127.0.0.1:<port>.
  • open_web(url=None, ...): opens a URL. Bare domains such as example.com become https://example.com; relative paths such as /demo use the served folder.
  • find(...): bs4-style element lookup that raises WebElementNotFoundError when nothing visible is found.
  • find_optional(...): same lookup, returning None when nothing is found.
  • find_all(...): returns all matched elements.
  • find_input(...) / find_all_input(...): restrict lookup to input and textarea controls and return WebInputElement.
  • find_select(...) / find_all_select(...): restrict lookup to select controls and return WebSelectElement.
  • Element methods: highlight(), click(), double_click(), hover(), wait(), text(), and attribute(). Highlights smooth-scroll the element into view first.
  • Input/control methods: fill(), type(), clear(), set_value(), set_range(), set_date(), set_color(), set_files(), press(), check(), uncheck(), and select_option().
  • Visual control methods show recorder-friendly UI before committing values: select dropdown options, date calendars, color swatches, animated range movement, and whole-label highlights for radio/checkbox controls.
  • Form methods: submit().
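
The URL handling described for open_web() can be sketched as a small resolver. This is an illustration of the stated rules, not the package's implementation: the function name resolve_target, the served_base default, and the treatment of strings that already contain a scheme are assumptions.

```python
from urllib.parse import urljoin


def resolve_target(target: str, served_base: str = "http://127.0.0.1:8000") -> str:
    """Turn an open_web-style target into a full URL (assumed rules)."""
    if "://" in target:
        # Already a full URL: leave it untouched.
        return target
    if target.startswith("/"):
        # Relative path: resolve against the locally served folder.
        return urljoin(served_base + "/", target)
    # Bare domain such as example.com: assume HTTPS.
    return "https://" + target
```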

find() accepts name and attrs like Beautiful Soup, plus Playwright-friendly selectors:

r.find("button", text="Save")
r.find("input", {"name": "email"})
r.find("input", _class="field-control", text="Email")
r.find(selector="[data-testid='submit']")
r.find(role="button", name="Continue")
r.find(label="Email address").fill("ada@example.com")

If you need to record an actual visible browser window instead of Playwright's page video, pass video_backend="ffmpeg" and run headed with headless=False.

Agent Usage

See AGENT.md for instructions aimed at coding agents. The intended flow is:

  1. Inspect the target project.
  2. Write a small deterministic recording script.
  3. Use explain() around visible actions.
  4. Run and fix the script until out/<name>.mp4 is created.

Publish Notes

Build locally with:

uv build
