Canonical Pydantic schemas for computer-use agents: ComputerState, Action, ActionResult, UINode
openadapt-types
Canonical Pydantic schemas for computer-use agents.
pip install openadapt-types
What's in the box
| Schema | Purpose |
|---|---|
| ComputerState | Screen state: screenshot + UI element graph + window context |
| UINode | Single UI element with role, bbox, hierarchy, platform anchors |
| Action | Agent action with typed action space + flexible targeting |
| ActionTarget | Where to act: node_id > description > (x, y) coordinates |
| ActionResult | Execution outcome with error taxonomy + state delta |
| Episode / Step | Complete task trajectory (observation → action → result) |
| FailureRecord | Classified failure for dataset pipelines |
Quick start
from openadapt_types import (
    Action, ActionTarget, ActionType,
    ComputerState, UINode, BoundingBox,
)

# Describe what's on screen
state = ComputerState(
    viewport=(1920, 1080),
    nodes=[
        UINode(node_id="n0", role="window", name="My App", children_ids=["n1"]),
        UINode(node_id="n1", role="button", name="Submit", parent_id="n0",
               bbox=BoundingBox(x=500, y=400, width=100, height=40)),
    ],
)

# Agent decides what to do
action = Action(
    type=ActionType.CLICK,
    target=ActionTarget(node_id="n1"),
    reasoning="Click Submit to proceed",
)

# Render element tree for LLM prompts
print(state.to_text_tree())
# [n0] window: My App
# [n1] button: Submit
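To make the idea behind the text tree concrete, here is a minimal standard-library sketch of how a flat node list with parent/child ids could be rendered as an indented outline. It is not the library's implementation: `Node` and `render_text_tree` are hypothetical stand-ins, and the two-space indentation is an assumption, so the output of `to_text_tree()` may differ in detail.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for UINode; only the fields used in Quick start."""
    node_id: str
    role: str
    name: str
    parent_id: Optional[str] = None
    children_ids: list = field(default_factory=list)


def render_text_tree(nodes, indent="  "):
    """Render nodes depth-first, one line per node, indented by depth."""
    by_id = {n.node_id: n for n in nodes}
    roots = [n for n in nodes if n.parent_id is None]
    lines = []

    def visit(node, depth):
        lines.append(f"{indent * depth}[{node.node_id}] {node.role}: {node.name}")
        for child_id in node.children_ids:
            child = by_id.get(child_id)
            if child is not None:  # skip ids that point at no known node
                visit(child, depth + 1)

    for root in roots:
        visit(root, 0)
    return "\n".join(lines)


nodes = [
    Node("n0", "window", "My App", children_ids=["n1"]),
    Node("n1", "button", "Submit", parent_id="n0"),
]
tree = render_text_tree(nodes)
print(tree)
```

The sketch assumes the node graph is acyclic; a production renderer would also want to guard against cycles in `children_ids`.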
Action targeting
ActionTarget supports three grounding strategies (in priority order):
# 1. Element-based (preferred — most robust)
ActionTarget(node_id="n1")
# 2. Description-based (resolved by grounding module)
ActionTarget(description="the blue submit button")
# 3. Coordinate-based (fallback)
ActionTarget(x=550, y=420)
ActionTarget(x=0.29, y=0.39, is_normalized=True)
Agents SHOULD produce node_id or description. The runtime resolves either one to coordinates.
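The priority order above can be sketched as a small resolver. This is an illustration under stated assumptions, not the package's runtime: `Box`, `Target`, `resolve_target`, and the `ground` callback are hypothetical names, clicking the center of the bounding box is an assumed convention, and the bounding box is assumed to use pixel units with a top-left origin.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Box:
    """Hypothetical stand-in for BoundingBox (pixels, top-left origin assumed)."""
    x: float
    y: float
    width: float
    height: float


@dataclass
class Target:
    """Hypothetical stand-in for ActionTarget with the fields shown above."""
    node_id: Optional[str] = None
    description: Optional[str] = None
    x: Optional[float] = None
    y: Optional[float] = None
    is_normalized: bool = False


def resolve_target(target, boxes, viewport, ground=None):
    """Return pixel (x, y), trying node_id, then description, then coordinates."""
    # 1. Element-based: use the center of the node's bounding box.
    if target.node_id is not None and target.node_id in boxes:
        box = boxes[target.node_id]
        return (box.x + box.width / 2, box.y + box.height / 2)
    # 2. Description-based: delegate to a caller-supplied grounding function.
    if target.description is not None and ground is not None:
        point = ground(target.description)
        if point is not None:
            return point
    # 3. Coordinate-based: scale normalized values by the viewport size.
    if target.x is not None and target.y is not None:
        if target.is_normalized:
            width, height = viewport
            return (target.x * width, target.y * height)
        return (target.x, target.y)
    raise ValueError("target could not be resolved to coordinates")


boxes = {"n1": Box(x=500, y=400, width=100, height=40)}
viewport = (1920, 1080)
by_node = resolve_target(Target(node_id="n1"), boxes, viewport)
by_norm = resolve_target(Target(x=0.29, y=0.39, is_normalized=True), boxes, viewport)
by_pixel = resolve_target(Target(x=550, y=420), boxes, viewport)
```

With the Quick start bounding box, the element-based path lands on the same point as the pixel fallback `(550, 420)`, which is why a node_id stays valid when the layout shifts and raw coordinates do not.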
Compatibility with existing schemas
Converters for three existing OpenAdapt schema formats:
from openadapt_types._compat import (
    from_benchmark_observation,    # openadapt-evals BenchmarkObservation
    from_benchmark_action,         # openadapt-evals BenchmarkAction
    from_ml_observation,           # openadapt-ml Observation
    from_ml_action,                # openadapt-ml Action
    from_omnimcp_screen_state,     # omnimcp ScreenState
    from_omnimcp_action_decision,  # omnimcp ActionDecision
)
# Convert existing data
state = from_benchmark_observation(obs.__dict__)
action = from_benchmark_action(act.__dict__)
JSON Schema
Export for language-agnostic tooling:
import json
from openadapt_types import ComputerState, Action, Episode
# Get JSON Schema
schema = ComputerState.model_json_schema()
print(json.dumps(schema, indent=2))
Design principles
- Pydantic v2 — runtime validation, JSON Schema export, fast serialization
- Pixels + structure — always capture both visual and semantic UI state
- Node graph — full element tree, not just focused element
- Platform-agnostic — same schema for Windows, macOS, Linux, web
- Extension-friendly — raw, attributes, metadata fields everywhere
- Backward compatible — _compat converters for gradual migration
Dependencies
Just pydantic>=2.0. No ML libraries, no heavy deps.
License
MIT