
GUI interaction capture - platform-agnostic event streams with time-aligned media


OpenAdapt Capture

License: MIT | Python 3.10+

OpenAdapt Capture is the data collection component of the OpenAdapt GUI automation ecosystem.

Capture platform-agnostic GUI interaction streams with time-aligned screenshots and audio for training ML models or replaying workflows.

Status: Pre-alpha.


The OpenAdapt Ecosystem

                          OpenAdapt GUI Automation Pipeline
                          =================================

    +-----------------+          +------------------+          +------------------+
    |                 |          |                  |          |                  |
    | openadapt-      |  ------> | openadapt-ml     |  ------> |    Deploy        |
    | capture         |  Convert | (Train & Eval)   |  Export  |    (Inference)   |
    |                 |          |                  |          |                  |
    +-----------------+          +------------------+          +------------------+
          |                             |                             |
          v                             v                             v
    - Record GUI                  - Fine-tune VLMs              - Run trained
      interactions                - Evaluate on                   agent on new
    - Mouse, keyboard,              benchmarks (WAA)              tasks
      screen, audio               - Compare models              - Real-time
    - Privacy scrubbing           - Cloud GPU training            automation

Component          Purpose                                   Repository
openadapt-capture  Record human demonstrations               GitHub
openadapt-ml       Train and evaluate GUI automation models  GitHub
openadapt-privacy  PII scrubbing for recordings              GitHub

Installation

uv add openadapt-capture

This includes everything needed to capture and replay GUI interactions (mouse, keyboard, screen recording).

For audio capture with Whisper transcription (large download):

uv add "openadapt-capture[audio]"

Quick Start

Capture

from openadapt_capture import Recorder

# Record GUI interactions
with Recorder("./my_capture", task_description="Demo task") as recorder:
    # Captures mouse, keyboard, and screen until context exits
    input("Press Enter to stop recording...")

Replay / Analysis

from openadapt_capture import Capture

# Load and iterate over time-aligned events
capture = Capture.load("./my_capture")

for action in capture.actions():
    # Each action has an associated screenshot
    print(f"{action.timestamp}: {action.type} at ({action.x}, {action.y})")
    screenshot = action.screenshot  # PIL Image at time of action

Low-Level API

from openadapt_capture.capture import CaptureSession
from openadapt_capture.db import create_db, get_session_for_path
from openadapt_capture.db import crud
from openadapt_capture.db.models import Recording, ActionEvent

# Create the database inside the capture directory
engine, Session = create_db("/path/to/capture_dir/recording.db")
session = Session()

# Insert a recording
recording = crud.insert_recording(session, {
    "timestamp": 1700000000.0,
    "monitor_width": 1920,
    "monitor_height": 1080,
    "platform": "win32",
    "task_description": "My task",
})

# Insert an event (second argument is the event timestamp)
crud.insert_action_event(session, recording, 1700000001.0, {
    "name": "click",
    "mouse_x": 100.0,
    "mouse_y": 200.0,
    "mouse_button_name": "left",
    "mouse_pressed": True,
})

# Query events back by loading the capture directory that holds recording.db
capture = CaptureSession.load("/path/to/capture_dir")
actions = list(capture.actions())
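
Because the recording is stored in a SQLite file, it can also be inspected with Python's standard `sqlite3` module. The sketch below does not assume any table names from openadapt-capture; it only lists whatever tables the file contains. The helper name `list_tables` is hypothetical.

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in a SQLite file, sorted alphabetically."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        # Always release the file handle, even if the query fails.
        connection.close()
    return [name for (name,) in rows]
```

Note that `sqlite3.connect` creates an empty database when the path does not exist, so check the path first when pointing this at a real capture directory.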

Event Types

Raw events (captured):

  • mouse.move, mouse.down, mouse.up, mouse.scroll
  • key.down, key.up

Actions (processed):

  • mouse.singleclick, mouse.doubleclick, mouse.drag
  • key.type (merged keystrokes into text)

Architecture

The recorder uses a multi-process architecture copied from legacy OpenAdapt:

  • Reader threads: Capture mouse, keyboard, screen, and window events into a central queue
  • Processor thread: Routes events to type-specific write queues
  • Writer processes: Persist events to SQLAlchemy DB (one process per event type)
  • Action-gated video: Only encodes video frames when user actions occur

A capture directory has the following layout:

capture_directory/
├── recording.db           # SQLite: events, screenshots, window events, perf stats
├── oa_recording-{ts}.mp4  # Screen recording (action-gated)
└── audio.flac             # Audio (optional)
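
The reader, processor, and writer stages described above can be sketched with the standard library. This is a simplified illustration under stated assumptions, not the recorder's code: writers run as threads rather than separate processes, events are synthetic tuples, and a plain list stands in for the database. All function names are hypothetical.

```python
import queue
import threading

# Sentinel that tells a stage to shut down. The identity check works
# across threads; separate processes would need a picklable marker.
STOP = object()


def read_events(event_type: str, count: int, central: queue.Queue) -> None:
    """Reader: emit synthetic events of one type into the central queue."""
    for sequence in range(count):
        central.put((event_type, sequence))


def route_events(central: queue.Queue, write_queues: dict[str, queue.Queue]) -> None:
    """Processor: forward each event to the write queue for its type."""
    while (item := central.get()) is not STOP:
        event_type, _ = item
        write_queues[event_type].put(item)
    for write_queue in write_queues.values():
        write_queue.put(STOP)  # propagate shutdown to every writer


def write_events(write_queue: queue.Queue, store: list) -> None:
    """Writer: persist events (here, append to a list standing in for the DB)."""
    while (item := write_queue.get()) is not STOP:
        store.append(item)


def run_pipeline(counts: dict[str, int]) -> dict[str, list]:
    """Run one reader and one writer per event type, joined by a single router."""
    central: queue.Queue = queue.Queue()
    write_queues = {event_type: queue.Queue() for event_type in counts}
    stores: dict[str, list] = {event_type: [] for event_type in counts}

    readers = [
        threading.Thread(target=read_events, args=(event_type, count, central))
        for event_type, count in counts.items()
    ]
    router = threading.Thread(target=route_events, args=(central, write_queues))
    writers = [
        threading.Thread(target=write_events, args=(write_queues[t], stores[t]))
        for t in counts
    ]

    for thread in [*readers, router, *writers]:
        thread.start()
    for thread in readers:
        thread.join()
    central.put(STOP)  # all readers are done, so the router can drain and stop
    router.join()
    for thread in writers:
        thread.join()
    return stores


stores = run_pipeline({"mouse": 3, "key": 2})
print(stores)
```

Giving each event type its own write queue means a slow writer for one type does not block persistence of the others.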

Performance Testing

Run a performance test with synthetic input:

uv run python scripts/perf_test.py

The script records for 10 seconds while pynput Controllers generate synthetic input, then reports:

  • Wall/CPU time and memory usage
  • Event counts and action types
  • Output file sizes
  • Memory usage plot (saved to capture directory)
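
For a rough idea of how wall time, CPU time, and memory can be measured around a block of work, here is a minimal standard-library sketch. It is not what `scripts/perf_test.py` does internally; the `measure` helper is hypothetical, and `tracemalloc` only tracks memory allocated by Python, not the whole process.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Call func and return (result, stats) with wall time, CPU time, and peak memory."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        result = func(*args, **kwargs)
    finally:
        cpu_seconds = time.process_time() - cpu_start
        wall_seconds = time.perf_counter() - wall_start
        # get_traced_memory returns (current, peak) in bytes.
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    stats = {
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "peak_bytes": peak_bytes,
    }
    return result, stats


result, stats = measure(sum, range(1000))
print(result, stats)
```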

Run integration tests (requires accessibility permissions):

uv run pytest tests/test_performance.py -v -m slow

Visualization

Generate animated demos and interactive viewers from recordings:

Animated GIF Demo

from openadapt_capture import Capture, create_demo

capture = Capture.load("./my_capture")
create_demo(capture, output="demo.gif", fps=10, max_duration=15)

Interactive HTML Viewer

from openadapt_capture import Capture, create_html

capture = Capture.load("./my_capture")
create_html(capture, output="viewer.html", include_audio=True)

Sharing Recordings

Share recordings between machines using Magic Wormhole:

# On the sending machine
capture share send ./my_capture
# Shows a code like: 7-guitarist-revenge

# On the receiving machine
capture share receive 7-guitarist-revenge

The share command compresses the recording, sends it via Magic Wormhole, and extracts it on the receiving end. No account or setup required - just share the code.
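
The compress and extract steps on either side of the transfer can be sketched with the standard library. This is an illustration only: the archive format used by the share command is not stated here, so the zip format, the helper names, and the file contents below are assumptions. The transfer itself is omitted.

```python
import shutil
import tempfile
from pathlib import Path


def pack_capture(capture_dir: Path, archive_stem: Path) -> Path:
    """Compress the contents of capture_dir into <archive_stem>.zip."""
    return Path(shutil.make_archive(str(archive_stem), "zip", root_dir=str(capture_dir)))


def unpack_capture(archive: Path, destination: Path) -> None:
    """Extract a previously packed capture into destination."""
    destination.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive), str(destination))


with tempfile.TemporaryDirectory() as workspace:
    root = Path(workspace)
    capture_dir = root / "my_capture"
    capture_dir.mkdir()
    (capture_dir / "recording.db").write_bytes(b"placeholder")

    # The archive is written outside capture_dir so it does not include itself.
    archive = pack_capture(capture_dir, root / "my_capture_archive")
    archive_suffix = archive.suffix
    unpack_capture(archive, root / "received")
    restored_bytes = (root / "received" / "recording.db").read_bytes()

print(restored_bytes)
```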

Optional Extras

Extra    Features
audio    Audio capture + Whisper transcription
privacy  PII scrubbing (openadapt-privacy)
share    Recording sharing via Magic Wormhole
all      Everything

Development

uv sync --dev
uv run pytest tests/ -v --ignore=tests/test_browser_bridge.py

# Run slow integration tests (requires accessibility permissions)
uv run pytest tests/ -v -m slow

Related Projects

License

MIT
