GUI automation with ML - record, train, deploy, evaluate
OpenAdapt: AI-First Process Automation with Large Multimodal Models (LMMs)
OpenAdapt is the open source software adapter between Large Multimodal Models (LMMs) and traditional desktop and web GUIs.
Record GUI demonstrations, train ML models, and evaluate agents - all from a unified CLI.
Join us on Discord | Documentation | OpenAdapt.ai
Architecture
OpenAdapt v1.0+ uses a modular meta-package architecture. The main openadapt package provides a unified CLI and depends on focused sub-packages via PyPI:
| Package | Description | Repository |
|---|---|---|
| openadapt | Meta-package with unified CLI | This repo |
| openadapt-capture | Event recording and storage | openadapt-capture |
| openadapt-ml | ML engine, training, inference | openadapt-ml |
| openadapt-evals | Benchmark evaluation | openadapt-evals |
| openadapt-viewer | HTML visualization | openadapt-viewer |
| openadapt-grounding | UI element localization | openadapt-grounding |
| openadapt-retrieval | Multimodal demo retrieval | openadapt-retrieval |
| openadapt-privacy | PII/PHI scrubbing | openadapt-privacy |
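To see which of these packages are present in the current environment from inside a script, the standard library's `importlib.metadata` can be queried by distribution name. This is a minimal sketch, not part of OpenAdapt itself; the helper name `installed_versions` is hypothetical, and the distribution names are assumed to match the package names listed above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the package table above (assumed to match PyPI names).
PACKAGES = [
    "openadapt",
    "openadapt-capture",
    "openadapt-ml",
    "openadapt-evals",
    "openadapt-viewer",
    "openadapt-grounding",
    "openadapt-retrieval",
    "openadapt-privacy",
]


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

For interactive use, `openadapt version` from the CLI covers the same ground.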
Installation
Install what you need:
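# Quotes keep shells such as zsh from treating the brackets as a glob pattern
pip install openadapt             # Minimal CLI only
pip install "openadapt[capture]"  # GUI capture/recording
pip install "openadapt[ml]"       # ML training and inference
pip install "openadapt[evals]"    # Benchmark evaluation
pip install "openadapt[privacy]"  # PII/PHI scrubbing
pip install "openadapt[core]"     # capture, ml, viewer (full training workflow)
pip install "openadapt[all]"      # Everything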
Requirements: Python 3.10+
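A setup script can check the interpreter version before attempting an install. The sketch below uses only `sys.version_info`; the helper name `meets_requirement` is hypothetical and not part of OpenAdapt.

```python
import sys

# Minimum interpreter version stated in the requirements above.
MIN_PYTHON = (3, 10)


def meets_requirement(version_info=None):
    """Return True if (major, minor) of the given version is at least MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


if meets_requirement():
    print("Python version OK")
else:
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python 3.10+ is required, found {found}")
```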
Quick Start
1. Record a demonstration
openadapt capture start --name my-task
# Perform actions in your GUI, then press Ctrl+C to stop
2. Train a model
openadapt train start --capture my-task --model qwen3vl-2b
3. Evaluate
openadapt eval run --checkpoint training_output/model.pt --benchmark waa
4. View recordings
openadapt capture view my-task
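The four steps can also be driven from a Python script by shelling out to the CLI. The following is a sketch under assumptions: the helper names `quick_start_commands` and `run_all` are hypothetical, the flags and default values are copied from the commands above, and it defaults to a dry run that only prints each command. Note that `openadapt capture start` records until interrupted, so running it unattended would block.

```python
import shlex
import subprocess


def quick_start_commands(name, model="qwen3vl-2b",
                         checkpoint="training_output/model.pt", benchmark="waa"):
    """Build the Quick Start commands as argument lists (values mirror the README)."""
    return [
        ["openadapt", "capture", "start", "--name", name],
        ["openadapt", "train", "start", "--capture", name, "--model", model],
        ["openadapt", "eval", "run", "--checkpoint", checkpoint, "--benchmark", benchmark],
        ["openadapt", "capture", "view", name],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("$ " + shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(cmd, check=True)


run_all(quick_start_commands("my-task"))
```

Passing argument lists rather than a single shell string avoids quoting problems if a capture name contains spaces.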
CLI Reference
openadapt capture start --name <name>     # Start recording
openadapt capture stop                    # Stop recording
openadapt capture list                    # List captures
openadapt capture view <name>             # Open capture viewer
openadapt train start --capture <name>    # Train model on capture
openadapt train status                    # Check training progress
openadapt train stop                      # Stop training
openadapt eval run --checkpoint <path>    # Evaluate trained model
openadapt eval run --agent api-claude     # Evaluate API agent
openadapt eval mock --tasks 10            # Run mock evaluation
openadapt serve --port 8080               # Start dashboard server
openadapt version                         # Show installed versions
openadapt doctor                          # Check system requirements
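When calling the CLI from automation, it can help to confirm the executable is on `PATH` before invoking a subcommand. The sketch below wraps `openadapt doctor`; the helper name `run_doctor` is hypothetical, and the output format and exit-code meaning of `doctor` are not documented here, so the result is returned unparsed.

```python
import shutil
import subprocess


def run_doctor(executable="openadapt"):
    """Run `<executable> doctor` if the executable is on PATH, else return None."""
    path = shutil.which(executable)
    if path is None:
        return None
    # The output format is not specified in this README, so it is captured as plain text.
    return subprocess.run([path, "doctor"], capture_output=True, text=True)


result = run_doctor()
if result is None:
    print("openadapt CLI not found on PATH")
else:
    print(result.stdout)
```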
How It Works
See the full Architecture Documentation for detailed diagrams.
flowchart TB
%% Main workflow phases
subgraph Record["1. RECORD"]
direction TB
DEMO[User Demo] --> CAPTURE[openadapt-capture]
end
subgraph Train["2. TRAIN"]
direction TB
DATA[Captured Data] --> ML[openadapt-ml]
end
subgraph Deploy["3. DEPLOY"]
direction TB
MODEL[Trained Model] --> AGENT[Agent Policy]
AGENT --> REPLAY[Action Replay]
end
subgraph Evaluate["4. EVALUATE"]
direction TB
BENCH[Benchmarks] --> EVALS[openadapt-evals]
EVALS --> METRICS[Metrics]
end
%% Main flow connections
CAPTURE --> DATA
ML --> MODEL
AGENT --> BENCH
%% Viewer - independent component
VIEWER[openadapt-viewer]
VIEWER -.->|"view at any phase"| Record
VIEWER -.->|"view at any phase"| Train
VIEWER -.->|"view at any phase"| Deploy
VIEWER -.->|"view at any phase"| Evaluate
%% Optional packages with integration points
PRIVACY[openadapt-privacy]
RETRIEVAL[openadapt-retrieval]
GROUNDING[openadapt-grounding]
PRIVACY -.->|"PII/PHI scrubbing"| CAPTURE
RETRIEVAL -.->|"demo retrieval"| ML
GROUNDING -.->|"UI localization"| REPLAY
%% Styling
classDef corePhase fill:#e1f5fe,stroke:#01579b
classDef optionalPkg fill:#fff3e0,stroke:#e65100,stroke-dasharray: 5 5
classDef viewerPkg fill:#e8f5e9,stroke:#2e7d32,stroke-dasharray: 3 3
class Record,Train,Deploy,Evaluate corePhase
class PRIVACY,RETRIEVAL,GROUNDING optionalPkg
class VIEWER viewerPkg
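The solid arrows in the flowchart can be read as a small directed graph. As an illustration only, the sketch below transcribes those edges into a dictionary and uses a breadth-first search to list the stages between two nodes; the node names come from the diagram, while `EDGES` and `path` are hypothetical names and nothing here reflects OpenAdapt's internal code.

```python
from collections import deque

# Solid edges transcribed from the flowchart above (dashed, optional links omitted).
EDGES = {
    "DEMO": ["CAPTURE"],
    "CAPTURE": ["DATA"],
    "DATA": ["ML"],
    "ML": ["MODEL"],
    "MODEL": ["AGENT"],
    "AGENT": ["REPLAY", "BENCH"],
    "BENCH": ["EVALS"],
    "EVALS": ["METRICS"],
}


def path(start, goal):
    """Return the shortest list of nodes from start to goal, or None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        current = queue.popleft()
        node = current[-1]
        if node == goal:
            return current
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(current + [nxt])
    return None


print(" -> ".join(path("DEMO", "METRICS")))
```

The trained agent feeds two branches: action replay (deployment) and benchmarks (evaluation), which is why `AGENT` has two outgoing edges.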
OpenAdapt:
- Records screenshots and user input events
- Trains ML models on demonstrations
- Generates and replays synthetic input via model completions
- Evaluates agents on GUI automation benchmarks
Key differentiators:
- Model agnostic - works with any LMM
- Auto-prompted from human demonstration (not user-prompted)
- Works with all desktop GUIs including virtualized and web
- Open source (MIT license)
Key Concepts
Meta-Package Structure
OpenAdapt v1.0+ uses a modular architecture where the main openadapt package acts as a meta-package that coordinates focused sub-packages:
- Core Packages: Essential for the main workflow
  - openadapt-capture - Records screenshots and input events
  - openadapt-ml - Trains models on demonstrations
  - openadapt-evals - Evaluates agents on benchmarks
- Optional Packages: Enhance specific workflow phases
  - openadapt-privacy - Integrates at Record phase for PII/PHI scrubbing
  - openadapt-retrieval - Integrates at Train phase for multimodal demo retrieval
  - openadapt-grounding - Integrates at Deploy phase for UI element localization
- Independent Components:
  - openadapt-viewer - HTML visualization that works with any phase
Two Paths to Automation
- Custom Training Path: Record demonstrations -> Train your own model -> Deploy agent
  - Best for: Repetitive tasks specific to your workflow
  - Requires: openadapt[core]
- API Agent Path: Use pre-trained LMM APIs (Claude, GPT-4V, etc.) -> Evaluate on benchmarks
  - Best for: General-purpose automation, rapid prototyping
  - Requires: openadapt[evals]
Installation Paths
Choose your installation based on your use case:
What do you want to do?
|
+-- Just evaluate API agents on benchmarks?
|   +-- pip install "openadapt[evals]"
|
+-- Train custom models on your demonstrations?
|   +-- pip install "openadapt[core]"
|
+-- Full suite with all optional packages?
|   +-- pip install "openadapt[all]"
|
+-- Minimal CLI only (add packages later)?
    +-- pip install openadapt
| Installation | Included Packages | Use Case |
|---|---|---|
| openadapt | CLI only | Start minimal, add what you need |
| openadapt[evals] | + evals | Benchmark API agents (Claude, GPT-4V) |
| openadapt[core] | + capture, ml, viewer | Full training workflow |
| openadapt[all] | + privacy, retrieval, grounding | Everything including optional enhancements |
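The same choice can be encoded for provisioning scripts. This is a sketch, assuming the extras names shown above; the goal labels and the helper name `pip_command` are hypothetical. The command is returned as an argument list, so the square brackets never pass through a shell and need no quoting.

```python
# Goal labels are illustrative; extras names come from the table above.
EXTRAS = {
    "evaluate": "evals",   # benchmark API agents
    "train": "core",       # capture, ml, viewer
    "everything": "all",   # core plus optional packages
    "minimal": None,       # CLI only
}


def pip_command(goal):
    """Return the pip argument list for a goal; raises KeyError for unknown goals."""
    extra = EXTRAS[goal]
    target = "openadapt" if extra is None else f"openadapt[{extra}]"
    return ["pip", "install", target]


print(" ".join(pip_command("train")))
```

The resulting list can be handed to `subprocess.run(..., check=True)` when the install should actually be performed.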
Demos
- https://twitter.com/abrichr/status/1784307190062342237
- https://www.loom.com/share/9d77eb7028f34f7f87c6661fb758d1c0
Permissions
macOS: Grant Accessibility, Screen Recording, and Input Monitoring permissions to your terminal. See permissions guide.
Windows: Run as Administrator if needed for input capture.
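A launcher script could surface the relevant note for the current platform before recording starts. The sketch below relies on `platform.system()`, which returns `"Darwin"` on macOS and `"Windows"` on Windows; the helper name `permission_note` and the fallback message are hypothetical, and the code only prints a reminder without checking or granting any permission.

```python
import platform

# Notes copied from the Permissions section above, keyed by platform.system() values.
PERMISSION_NOTES = {
    "Darwin": "Grant Accessibility, Screen Recording, and Input Monitoring "
              "permissions to your terminal.",
    "Windows": "Run as Administrator if needed for input capture.",
}


def permission_note(system=None):
    """Return the permission reminder for a platform name, or a generic fallback."""
    if system is None:
        system = platform.system()
    return PERMISSION_NOTES.get(system, "No platform-specific permission note listed.")


print(permission_note())
```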
Legacy Version
The monolithic OpenAdapt codebase (v0.46.0) is preserved in the legacy/ directory.
To use the legacy version:
pip install openadapt==0.46.0
See docs/LEGACY_FREEZE.md for migration guide and details.
Contributing
- Join Discord
- Pick an issue from the relevant sub-package repository
- Submit a PR
For sub-package development:
git clone https://github.com/OpenAdaptAI/openadapt-ml # or other sub-package
cd openadapt-ml
pip install -e ".[dev]"
Related Projects
- OpenAdaptAI/SoM - Set-of-Mark prompting
- OpenAdaptAI/pynput - Input monitoring fork
- OpenAdaptAI/atomacos - macOS accessibility
Support
- Discord: https://discord.gg/yF527cQbDG
- Issues: Use the relevant sub-package repository
- Architecture docs: GitHub Wiki
License
MIT License - see LICENSE for details.