
CLI-first screen recording and SOP rendering harness for AI agents.


Screen Harness


Screen Harness is a CLI-first macOS tool for recording screen workflows and turning them into SOP videos and Markdown documents. It records an immutable raw.mp4 and stores editable events in timeline.json. From that single timeline it renders an intro card, frosted-glass step overlays, highlight strokes, an outro card, and a Markdown SOP.

The MVP stays intentionally small: it proves the local recording → rendering loop before adding a daemon, global hotkeys, or cloud AI providers.

What It Does

  • Records the macOS screen with FFmpeg (AVFoundation).
  • Optionally crops the recording to a single window region (no Dock, no menu bar) via the region=(x, y, w, h) argument on start_recording.
  • Lets an agent or user add structured timeline events from Python helpers (intro, step, caption, highlight_region, outro, …).
  • Generates sop.srt, sop.ass, and sop.md from timeline.json.
  • Renders final.mp4 without mutating raw.mp4. The training template produces:
    • Intro card — branded charcoal background, wordmark, title, "STARTING IN" countdown.
    • Main recording — fixed-position frosted-glass step card (FFmpeg boxblur over the underlying frame, off-white tint) with a teal step number and dark title; teal highlight strokes around the regions you call out.
    • Outro card — held end frame with title, optional subtitle, and the project URL in the accent color.
  • Loads reusable helpers from agent-workspace/agent_helpers.py.
  • Supports manual/offline transcript input for AI-style SOP caption generation.
  • Scans transcript and timeline text for sensitive strings such as emails and tokens.
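
The sensitive-string scan in the last bullet could look something like the sketch below. The patterns and the `scan_text` name are illustrative assumptions, not Screen Harness's actual rules, and a real scanner would likely need more patterns than these two.

```python
import re

# Hypothetical patterns; the real scanner's rules are not documented here.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Long runs of token-like characters, such as API keys.
    "token": re.compile(r"\b[A-Za-z0-9_-]{32,}\b"),
}


def scan_text(text):
    """Return (kind, matched_string) pairs for sensitive-looking strings."""
    findings = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((kind, match.group(0)))
    return findings
```

Running the same function over both the transcript text and the text fields in timeline.json would cover the two sources the bullet mentions.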

The visual system uses Color Hunt's "DevDark" palette: #222831 charcoal, #393E46 graphite, #00ADB5 teal accent, #EEEEEE off-white. The same palette drives intro, step card, highlight strokes, and outro for visual continuity.
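
As a rough illustration of how the palette and the frosted-glass card could map onto FFmpeg filters, the sketch below builds filtergraph strings with crop, boxblur, overlay, and drawbox. The function names, blur strength, tint opacity, and stroke thickness are assumptions made for this example; the project's real filtergraph may differ.

```python
ACCENT = "0x00ADB5"  # teal accent from the palette
TINT = "0xEEEEEE"    # off-white from the palette


def frosted_card_filter(x, y, w, h, blur=10, tint_alpha=0.6):
    """Build a filtergraph that blurs one rectangle and tints it off-white."""
    return (
        "[0:v]split[base][src];"
        f"[src]crop={w}:{h}:{x}:{y},boxblur={blur}:2[blur];"
        f"[base][blur]overlay={x}:{y},"
        f"drawbox=x={x}:y={y}:w={w}:h={h}:color={TINT}@{tint_alpha}:t=fill"
    )


def highlight_filter(x, y, w, h, thickness=4):
    """Build a drawbox filter that strokes a region in the accent color."""
    return f"drawbox=x={x}:y={y}:w={w}:h={h}:color={ACCENT}:t={thickness}"
```

A string like this can be passed to `ffmpeg -i raw.mp4 -filter_complex "<graph>" out.mp4`, which leaves raw.mp4 untouched.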

Install

uv sync
uv run screen-harness init
uv run screen-harness doctor

Expected MVP signals:

screen capture: ok
render ffmpeg: ok

The microphone may be reported as not detected; the current MVP works without microphone input.

Run The Safari/GitHub Demo

The current demo script forces Safari to a deterministic 1920×1080 window, derives the crop region from Safari's actual bounds (so the recording excludes the Dock and menu bar), opens the repository on github.com, calls out the URL bar / repo header / file list / README, and finally holds a 4-second outro card with the project URL.

uv run screen-harness -c 'exec(open("examples/expense_sop.py").read())'

The script prints the recording directory and rendered final.mp4 path when it completes. To find the latest demo later:

ls -td recordings/safari_github_repo_demo_* | head -1
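
Deriving the crop region from Safari's bounds, as the demo does, might be sketched as follows. AppleScript reports window bounds as left, top, right, bottom, while the region argument expects (x, y, w, h). The three function names here are hypothetical, and the demo script may obtain the bounds differently.

```python
import subprocess


def bounds_to_region(left, top, right, bottom):
    """Convert AppleScript window bounds to an (x, y, w, h) rectangle."""
    return (left, top, right - left, bottom - top)


def parse_bounds(text):
    """Parse osascript output such as '0, 25, 1920, 1105'."""
    left, top, right, bottom = (int(part) for part in text.split(","))
    return left, top, right, bottom


def safari_region():
    """Ask Safari for its front window bounds (macOS only)."""
    script = 'tell application "Safari" to get bounds of front window'
    out = subprocess.run(
        ["osascript", "-e", script],
        capture_output=True, text=True, check=True,
    ).stdout
    return bounds_to_region(*parse_bounds(out))
```

On Retina displays the bounds are reported in points, so the values may need scaling before they match captured pixels.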

Expected files in the recording directory:

raw.mp4
timeline.json
sop.srt
sop.ass
sop.md
intro.ass
intro-source.mp4
intro.mp4
main.mp4
outro.ass
outro-source.mp4
outro.mp4
final.mp4
metadata.json
ffmpeg.log
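
A quick way to confirm that a run produced everything in the list above is a small check like this one. It is a standalone sketch, not part of the Screen Harness CLI, and `missing_outputs` is a hypothetical name.

```python
from pathlib import Path

# File names copied from the expected-files list above.
EXPECTED = [
    "raw.mp4", "timeline.json", "sop.srt", "sop.ass", "sop.md",
    "intro.ass", "intro-source.mp4", "intro.mp4", "main.mp4",
    "outro.ass", "outro-source.mp4", "outro.mp4", "final.mp4",
    "metadata.json", "ffmpeg.log",
]


def missing_outputs(recording_dir):
    """Return the expected file names that are absent from recording_dir."""
    root = Path(recording_dir)
    return [name for name in EXPECTED if not (root / name).is_file()]
```

An empty list means the recording directory is complete.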

To re-render after editing captions or events:

uv run screen-harness render <recording_id>

Use the professional training template explicitly:

uv run screen-harness render <recording_id> --template training

Helper API (used inside screen-harness -c '<python>')

| Helper | Purpose |
| --- | --- |
| `start_recording(name, *, region=None, capture_cursor=True, capture_mouse_clicks=True)` | Start FFmpeg AVFoundation capture. region=(x, y, w, h) crops to that screen rect; combine with AppleScript-driven window placement to record a single app. |
| `stop_recording()` | Finalize the capture and update metadata.json. |
| `wait(seconds)`, `wait_for_user(msg)` | Pace the script. |
| `intro(title, *, subtitle=None, countdown=5)` | Configure the pre-roll intro card. |
| `chapter(title)` | Mark a chapter (currently ignored by the training render). |
| `step(title, *, note=None, number=None)` | Add a timeline step; renders inside the frosted-glass step card. |
| `caption(text, *, duration=None)` | Add a caption track entry, written to sop.srt. |
| `click(x, y, *, label=None)` | Draw a small red dot at a click position. |
| `highlight_region(x, y, w, h, *, text=None, duration=3.0, color=None, thickness=None)` | Draw a colored stroke around a region (defaults to teal #00ADB5). |
| `redact_region(x, y, w, h, *, reason=None, duration=None)` | Black-fill a sensitive region. |
| `outro(title="Thanks for watching", *, subtitle=None, url=None, duration=4.0)` | Hold a branded end card after the main recording. |
| `render(*, template="training" \| "debug")` | Render final.mp4 with the chosen template. |
| `transcribe()`, `generate_ai_sop()`, `scan_redactions()` | Transcript + AI SOP + redaction scan. |
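
To make the caption-to-sop.srt relationship concrete, here is a minimal sketch of SRT serialization. The (start, duration, text) tuple shape is an assumption for illustration; the actual timeline.json schema is not shown in this README.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(captions):
    """Serialize (start_seconds, duration_seconds, text) tuples as SRT."""
    blocks = []
    for index, (start, duration, text) in enumerate(captions, start=1):
        end = start + duration
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

Note that SRT uses a comma before the milliseconds, unlike the dot used in the manual transcript format.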

Minimal Helper Example

uv run screen-harness -c '
start_recording("quick_demo")
intro("This video demonstrates the quick demo", subtitle="A short Screen Harness example", countdown=5)
step("Open the target app", note="Prepare the workflow for recording.")
caption("Open the target app and prepare the workflow.")
wait(3)
outro("Thanks for watching", url="github.com/frankyxhl/screen-harness")
stop_recording()
render(template="training")
'

AI SOP Generation

Create a manual transcript beside a recording:

recordings/<recording_id>/manual_transcript.txt

Timed lines are preferred:

00:00:01.000 --> 00:00:03.000 Open the target app.
00:00:03.500 --> 00:00:06.000 Submit the form.
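
The timed-line format above can be parsed with a short regular expression. This is a sketch of one way to read it, not the parser Screen Harness ships; `parse_timed_line` is a hypothetical name, and untimed lines are reported as None here.

```python
import re

# HH:MM:SS.mmm --> HH:MM:SS.mmm followed by the caption text.
TIMED_LINE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> "
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) (.+)$"
)


def parse_timed_line(line):
    """Return (start_seconds, end_seconds, text), or None if untimed."""
    match = TIMED_LINE.match(line.strip())
    if match is None:
        return None
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups()[:8])
    start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
    end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
    return start, end, match.group(9)
```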

Then run:

uv run screen-harness transcribe <recording_id>
uv run screen-harness sop ai-generate <recording_id>
uv run screen-harness redact scan <recording_id>
uv run screen-harness render <recording_id>

Project Layout

src/screen_harness/       CLI, recording, rendering, timeline, transcript, SOP logic
examples/                 Runnable demo scripts
agent-workspace/          User-editable helper and domain-skill workspace
interaction-skills/       Reusable interaction notes
rules/                    Alfred planning and change records
tests/                    Unit, BDD, and end-to-end tests
SKILL.md                  AI-agent oriented install + usage prompt

Development

uv run pytest -q
uv run --with pytest-cov pytest --cov=src/screen_harness --cov-report=term-missing -q
uv run python -m compileall -q src tests
af validate

Current test posture: 90 tests, ~86% line coverage across unit, BDD-style, and FFmpeg-backed end-to-end tests (the e2e tests skip cleanly when FFmpeg lacks the subtitles / drawbox filters).
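
One way such a clean skip could be implemented is to inspect the output of `ffmpeg -filters` before running the end-to-end tests. The helper names below are hypothetical and the project's own check may work differently.

```python
import shutil
import subprocess


def parse_filter_names(listing):
    """Extract filter names from the text printed by `ffmpeg -filters`."""
    names = set()
    for line in listing.splitlines():
        parts = line.split()
        # Filter rows look like: "T.. drawbox V->V Draw a colored box ..."
        if len(parts) >= 3 and "->" in parts[2]:
            names.add(parts[1])
    return names


def ffmpeg_has_filters(*required):
    """Return True if the local ffmpeg build provides every named filter."""
    if shutil.which("ffmpeg") is None:
        return False
    out = subprocess.run(
        ["ffmpeg", "-hide_banner", "-filters"],
        capture_output=True, text=True,
    ).stdout
    return set(required) <= parse_filter_names(out)
```

A test module could then guard its e2e cases with `pytest.mark.skipif(not ffmpeg_has_filters("subtitles", "drawbox"), reason="FFmpeg lacks required filters")`.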

CI/CD

GitHub Actions runs on pull requests and pushes to main:

  • test: Python 3.14 with pytest and compileall.
  • alfred: validates rules/ with uvx --from fx-alfred af validate.
  • package: builds source and wheel distributions with uv build, then uploads dist/* as the screen-harness-dist workflow artifact.

Use as a Claude Skill

SKILL.md at the repo root is an AI-agent oriented install + usage prompt — drop the file (or its frontmatter+body) into a Claude Code or Claude Agent SDK skill bundle and the model can drive Screen Harness end-to-end without further hand-holding.


