GUI interaction capture - platform-agnostic event streams with time-aligned media
OpenAdapt Capture
OpenAdapt Capture is the data collection component of the OpenAdapt GUI automation ecosystem.
Capture platform-agnostic GUI interaction streams with time-aligned screenshots and audio for training ML models or replaying workflows.
Status: Pre-alpha.
The OpenAdapt Ecosystem
OpenAdapt GUI Automation Pipeline
=================================
+-----------------+          +------------------+          +------------------+
|                 |          |                  |          |                  |
|   openadapt-    |  ------> |   openadapt-ml   |  ------> |      Deploy      |
|   capture       |  Convert |  (Train & Eval)  |  Export  |   (Inference)    |
|                 |          |                  |          |                  |
+-----------------+          +------------------+          +------------------+
         |                             |                             |
         v                             v                             v
- Record GUI                 - Fine-tune VLMs              - Run trained
  interactions               - Evaluate on                   agent on new
- Mouse, keyboard,             benchmarks (WAA)              tasks
  screen, audio              - Compare models              - Real-time
- Privacy scrubbing          - Cloud GPU training            automation
| Component | Purpose | Repository |
|---|---|---|
| openadapt-capture | Record human demonstrations | GitHub |
| openadapt-ml | Train and evaluate GUI automation models | GitHub |
| openadapt-privacy | PII scrubbing for recordings | GitHub |
Installation
uv add openadapt-capture
This includes everything needed to capture and replay GUI interactions (mouse, keyboard, screen recording).
For audio capture with Whisper transcription (large download):
uv add "openadapt-capture[audio]"
Quick Start
Capture
from openadapt_capture import Recorder
# Record GUI interactions
with Recorder("./my_capture", task_description="Demo task") as recorder:
    # Captures mouse, keyboard, and screen until the context exits
    input("Press Enter to stop recording...")
Replay / Analysis
from openadapt_capture import Capture
# Load and iterate over time-aligned events
capture = Capture.load("./my_capture")
for action in capture.actions():
    # Each action has an associated screenshot
    print(f"{action.timestamp}: {action.type} at ({action.x}, {action.y})")
    screenshot = action.screenshot  # PIL Image at time of action
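One way to picture the time alignment: for each action, pick the latest screenshot taken at or before the action's timestamp. The sketch below shows that lookup with the standard library only. `latest_frame_before` and the sample timestamps are hypothetical, and this is not necessarily how openadapt-capture pairs screenshots with actions internally.

```python
from bisect import bisect_right


def latest_frame_before(frame_times, action_time):
    """Return the index of the last frame captured at or before action_time.

    frame_times must be sorted ascending. Returns None when the action
    precedes every frame.
    """
    idx = bisect_right(frame_times, action_time) - 1
    return idx if idx >= 0 else None


# Hypothetical timestamps in seconds since the epoch.
frame_times = [1700000000.0, 1700000000.5, 1700000001.0, 1700000002.0]
action_times = [1700000000.7, 1700000001.0, 1699999999.0]

matches = [latest_frame_before(frame_times, t) for t in action_times]
print(matches)
```

Binary search keeps the lookup at O(log n) per action, which matters once a recording holds thousands of frames.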
Low-Level API
from openadapt_capture.db import create_db, get_session_for_path
from openadapt_capture.db import crud
from openadapt_capture.db.models import Recording, ActionEvent
# Create a database
engine, Session = create_db("/path/to/recording.db")
session = Session()
# Insert a recording
recording = crud.insert_recording(session, {
    "timestamp": 1700000000.0,
    "monitor_width": 1920,
    "monitor_height": 1080,
    "platform": "win32",
    "task_description": "My task",
})
# Insert events
crud.insert_action_event(session, recording, 1700000001.0, {
    "name": "click",
    "mouse_x": 100.0,
    "mouse_y": 200.0,
    "mouse_button_name": "left",
    "mouse_pressed": True,
})
# Query events back
from openadapt_capture.capture import CaptureSession
capture = CaptureSession.load("/path/to/capture_dir")
actions = list(capture.actions())
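Once the actions are in a list, a per-type count and the overall time span make a quick sanity check. The sketch below assumes each action exposes `type` and `timestamp` attributes, as in the Quick Start. `FakeAction` is a hypothetical stand-in so the snippet runs without a recording, and the type strings are illustrative.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FakeAction:
    """Hypothetical stand-in for an action loaded from a capture."""
    timestamp: float
    type: str


def summarize(actions):
    """Count actions per type and compute the span between first and last."""
    counts = Counter(a.type for a in actions)
    if not actions:
        return counts, 0.0
    times = [a.timestamp for a in actions]
    return counts, max(times) - min(times)


actions = [
    FakeAction(1700000001.0, "mouse.singleclick"),
    FakeAction(1700000002.5, "key.type"),
    FakeAction(1700000004.0, "mouse.singleclick"),
]
counts, duration = summarize(actions)
print(dict(counts), duration)
```

With a real recording, the list from `capture.actions()` would take the place of the hand-built `actions` list, provided the attribute names match.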
Event Types
Raw events (captured):
- mouse.move, mouse.down, mouse.up, mouse.scroll
- key.down, key.up
Actions (processed):
- mouse.singleclick, mouse.doubleclick, mouse.drag
- key.type (merged keystrokes into text)
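To illustrate how raw events might collapse into processed actions, the sketch below merges runs of key.down events into a single key.type action. The event dictionary shape (`type`, `timestamp`, `key`), the `merge_keystrokes` name, and the `max_gap` threshold are all assumptions made for this example. The library's actual merging rules may differ.

```python
def merge_keystrokes(events, max_gap=1.0):
    """Merge runs of key.down events into key.type actions.

    A run ends when a non-key event arrives or when the gap between
    consecutive key.down events exceeds max_gap seconds. key.up events
    are dropped; all other events pass through unchanged.
    """
    merged = []
    run = None  # the key.type action currently being built

    for event in events:
        if event["type"] == "key.up":
            continue
        if event["type"] == "key.down":
            if run is not None and event["timestamp"] - run["end"] <= max_gap:
                # Extend the current run with this keystroke.
                run["text"] += event["key"]
                run["end"] = event["timestamp"]
            else:
                # Start a new run.
                run = {
                    "type": "key.type",
                    "timestamp": event["timestamp"],
                    "end": event["timestamp"],
                    "text": event["key"],
                }
                merged.append(run)
            continue
        # Any other event breaks the current run.
        run = None
        merged.append(event)
    return merged


raw = [
    {"type": "key.down", "timestamp": 0.0, "key": "h"},
    {"type": "key.up", "timestamp": 0.05, "key": "h"},
    {"type": "key.down", "timestamp": 0.2, "key": "i"},
    {"type": "key.up", "timestamp": 0.25, "key": "i"},
    {"type": "mouse.down", "timestamp": 0.9, "x": 10, "y": 20},
    {"type": "key.down", "timestamp": 1.0, "key": "!"},
]
merged_actions = merge_keystrokes(raw)
print([(a["type"], a.get("text")) for a in merged_actions])
```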
Architecture
The recorder uses a multi-process architecture copied from legacy OpenAdapt:
- Reader threads: Capture mouse, keyboard, screen, and window events into a central queue
- Processor thread: Routes events to type-specific write queues
- Writer processes: Persist events to SQLAlchemy DB (one process per event type)
- Action-gated video: Only encodes video frames when user actions occur
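The reader, processor, and writer stages above can be pictured as queues feeding one another. The sketch below is a toy version that uses threads for every stage (the recorder itself is described as using separate writer processes) and plain lists in place of the database. The event shape and all names are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel that tells a worker to shut down


def processor(central, write_queues):
    """Route each event from the central queue to the queue for its type."""
    while True:
        event = central.get()
        if event is STOP:
            # Propagate shutdown to every writer.
            for q in write_queues.values():
                q.put(STOP)
            return
        write_queues[event["type"]].put(event)


def writer(q, sink):
    """Drain one type-specific queue into a list standing in for the DB."""
    while True:
        event = q.get()
        if event is STOP:
            return
        sink.append(event)


central = queue.Queue()
write_queues = {"action": queue.Queue(), "screen": queue.Queue()}
sinks = {name: [] for name in write_queues}

workers = [threading.Thread(target=processor, args=(central, write_queues))]
workers += [
    threading.Thread(target=writer, args=(write_queues[name], sinks[name]))
    for name in write_queues
]
for w in workers:
    w.start()

# A reader would push events here as they are observed.
for i in range(3):
    central.put({"type": "action", "timestamp": float(i)})
central.put({"type": "screen", "timestamp": 0.5})
central.put(STOP)

for w in workers:
    w.join()
print({name: len(events) for name, events in sinks.items()})
```

Giving each event type its own writer means a slow sink for one type does not block the others, which is presumably the motivation for the per-type split.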
capture_directory/
├── recording.db # SQLite: events, screenshots, window events, perf stats
├── oa_recording-{ts}.mp4 # Screen recording (action-gated)
└── audio.flac # Audio (optional)
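Because recording.db is a SQLite file, Python's built-in `sqlite3` module can inspect it without importing openadapt-capture. The table names are not documented here, so the sketch below only lists whatever tables exist along with their row counts. It runs against a throwaway database with a made-up `demo_events` table.

```python
import os
import sqlite3
import tempfile


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Names come from sqlite_master itself, so quoting them is enough here.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


# Demo against a throwaway database; "demo_events" is a made-up table name.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "recording.db")
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE demo_events (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO demo_events (name) VALUES (?)", [("click",), ("type",)]
    )
    conn.commit()
    conn.close()
    counts = table_row_counts(demo_path)

print(counts)
```

Pointing `table_row_counts` at a real capture directory's recording.db should reveal the actual schema.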
Performance Testing
Run a performance test with synthetic input:
uv run python scripts/perf_test.py
This records for 10 seconds using pynput Controllers, then reports:
- Wall/CPU time and memory usage
- Event counts and action types
- Output file sizes
- Memory usage plot (saved to capture directory)
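For a rough idea of how wall time, CPU time, and memory can be collected around a workload, the sketch below uses only the standard library. It is not the implementation of scripts/perf_test.py. `measure` and `workload` are hypothetical names, and `tracemalloc` only tracks memory allocated by Python, not by native extensions.

```python
import time
import tracemalloc


def measure(fn, *args, **kwargs):
    """Run fn and report wall time, CPU time, and peak traced memory."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        result = fn(*args, **kwargs)
    finally:
        cpu = time.process_time() - cpu_start
        wall = time.perf_counter() - wall_start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, {"wall_s": wall, "cpu_s": cpu, "peak_bytes": peak}


def workload(n):
    """Hypothetical stand-in for a recording session."""
    return sum(len(str(i)) for i in list(range(n)))


result, stats = measure(workload, 10_000)
print(stats)
```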
Run integration tests (requires accessibility permissions):
uv run pytest tests/test_performance.py -v -m slow
Visualization
Generate animated demos and interactive viewers from recordings:
Animated GIF Demo
from openadapt_capture import Capture, create_demo
capture = Capture.load("./my_capture")
create_demo(capture, output="demo.gif", fps=10, max_duration=15)
Interactive HTML Viewer
from openadapt_capture import Capture, create_html
capture = Capture.load("./my_capture")
create_html(capture, output="viewer.html", include_audio=True)
Sharing Recordings
Share recordings between machines using Magic Wormhole:
# On the sending machine
capture share send ./my_capture
# Shows a code like: 7-guitarist-revenge
# On the receiving machine
capture share receive 7-guitarist-revenge
The share command compresses the recording, sends it via Magic Wormhole, and extracts it on the receiving end. No account or setup required - just share the code.
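The compress and extract steps can be approximated with `shutil`. The sketch below leaves out the Magic Wormhole transfer entirely and assumes a zip archive; the format the share command actually uses is not stated here. `pack` and `unpack` are hypothetical helpers.

```python
import shutil
import tempfile
from pathlib import Path


def pack(capture_dir, dest_dir):
    """Compress a capture directory into a .zip archive and return its path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    base = dest_dir / Path(capture_dir).name
    return Path(shutil.make_archive(str(base), "zip", root_dir=str(capture_dir)))


def unpack(archive_path, target_dir):
    """Extract an archive produced by pack() into target_dir."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive_path), extract_dir=str(target_dir))
    return target_dir


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    capture_dir = tmp / "my_capture"
    capture_dir.mkdir()
    (capture_dir / "recording.db").write_bytes(b"placeholder")

    archive = pack(capture_dir, tmp / "outbox")
    restored = unpack(archive, tmp / "inbox" / "my_capture")
    restored_files = sorted(p.name for p in restored.iterdir())

print(restored_files)
```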
Optional Extras
| Extra | Features |
|---|---|
| audio | Audio capture + Whisper transcription |
| privacy | PII scrubbing (openadapt-privacy) |
| share | Recording sharing via Magic Wormhole |
| all | Everything |
Development
uv sync --dev
uv run pytest tests/ -v --ignore=tests/test_browser_bridge.py
# Run slow integration tests (requires accessibility permissions)
uv run pytest tests/ -v -m slow
Related Projects
- openadapt-ml - Train and evaluate GUI automation models
- openadapt-privacy - PII detection and scrubbing for recordings
- openadapt-evals - Benchmark evaluation for GUI agents
- Windows Agent Arena - Benchmark for Windows GUI agents
License
MIT