Canonical Pydantic schemas for computer-use agents: ComputerState, Action, ActionResult, UINode

openadapt-types

Canonical Pydantic schemas for computer-use agents.

pip install openadapt-types

What's in the box

Schema          Purpose
ComputerState   Screen state: screenshot + UI element graph + window context
UINode          Single UI element with role, bbox, hierarchy, platform anchors
Action          Agent action with typed action space + flexible targeting
ActionTarget    Where to act: node_id > description > (x, y) coordinates
ActionResult    Execution outcome with error taxonomy + state delta
Episode / Step  Complete task trajectory (observation → action → result)
FailureRecord   Classified failure for dataset pipelines

Quick start

from openadapt_types import (
    Action, ActionTarget, ActionType,
    ComputerState, UINode, BoundingBox,
)

# Describe what's on screen
state = ComputerState(
    viewport=(1920, 1080),
    nodes=[
        UINode(node_id="n0", role="window", name="My App", children_ids=["n1"]),
        UINode(node_id="n1", role="button", name="Submit", parent_id="n0",
               bbox=BoundingBox(x=500, y=400, width=100, height=40)),
    ],
)

# Agent decides what to do
action = Action(
    type=ActionType.CLICK,
    target=ActionTarget(node_id="n1"),
    reasoning="Click Submit to proceed",
)

# Render element tree for LLM prompts
print(state.to_text_tree())
# [n0] window: My App
#   [n1] button: Submit
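
For readers curious how a flat node list linked by parent and child IDs can become an indented outline like the one above, here is a minimal stand-alone sketch. It uses a plain dataclass (Node) and a hypothetical render_tree helper instead of the library's classes, and it is not the actual to_text_tree implementation; only the output format is taken from the quick start.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    # Stand-in for UINode with only the fields used in the quick start.
    node_id: str
    role: str
    name: str
    parent_id: Optional[str] = None
    children_ids: List[str] = field(default_factory=list)


def render_tree(nodes: List[Node], indent: str = "  ") -> str:
    """Render a flat node list as an indented outline, one node per line."""
    by_id: Dict[str, Node] = {n.node_id: n for n in nodes}
    lines: List[str] = []

    def visit(node: Node, depth: int) -> None:
        lines.append(f"{indent * depth}[{node.node_id}] {node.role}: {node.name}")
        for child_id in node.children_ids:
            child = by_id.get(child_id)
            if child is not None:  # skip dangling references
                visit(child, depth + 1)

    # Roots are the nodes without a parent.
    for root in (n for n in nodes if n.parent_id is None):
        visit(root, 0)
    return "\n".join(lines)


nodes = [
    Node("n0", "window", "My App", children_ids=["n1"]),
    Node("n1", "button", "Submit", parent_id="n0"),
]
print(render_tree(nodes))
# [n0] window: My App
#   [n1] button: Submit
```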

Action targeting

ActionTarget supports three grounding strategies (in priority order):

# 1. Element-based (preferred — most robust)
ActionTarget(node_id="n1")

# 2. Description-based (resolved by grounding module)
ActionTarget(description="the blue submit button")

# 3. Coordinate-based (fallback)
ActionTarget(x=550, y=420)
ActionTarget(x=0.29, y=0.39, is_normalized=True)

Agents SHOULD produce node_id or description. The runtime resolves either one to coordinates.
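
To make the priority order concrete, the following sketch shows one way a runtime could turn a target into pixel coordinates. It is an illustration under stated assumptions, not the library's resolver: Box, Target, and resolve_target are hypothetical stand-ins, the node case assumes a click at the center of the bounding box, and the ground callback represents whatever grounding module handles descriptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Box:
    # Stand-in for BoundingBox (pixels).
    x: float
    y: float
    width: float
    height: float


@dataclass
class Target:
    # Stand-in for ActionTarget with the fields shown above.
    node_id: Optional[str] = None
    description: Optional[str] = None
    x: Optional[float] = None
    y: Optional[float] = None
    is_normalized: bool = False


def resolve_target(
    target: Target,
    boxes: Dict[str, Box],
    viewport: Tuple[int, int],
    ground: Optional[Callable[[str], Tuple[int, int]]] = None,
) -> Tuple[int, int]:
    """Return (x, y) in pixels, trying node_id, then description, then coordinates."""
    # 1. Element-based: assume the center of the node's bounding box.
    if target.node_id is not None and target.node_id in boxes:
        b = boxes[target.node_id]
        return (round(b.x + b.width / 2), round(b.y + b.height / 2))
    # 2. Description-based: delegate to a grounding callback, if one is supplied.
    if target.description is not None and ground is not None:
        return ground(target.description)
    # 3. Coordinate-based: scale normalized values by the viewport size.
    if target.x is not None and target.y is not None:
        if target.is_normalized:
            width, height = viewport
            return (round(target.x * width), round(target.y * height))
        return (round(target.x), round(target.y))
    raise ValueError("target could not be resolved")


boxes = {"n1": Box(x=500, y=400, width=100, height=40)}
viewport = (1920, 1080)
print(resolve_target(Target(node_id="n1"), boxes, viewport))  # (550, 420)
print(resolve_target(Target(x=0.29, y=0.39, is_normalized=True), boxes, viewport))  # (557, 421)
```

Because the node case reads the bounding box at resolution time, a node_id target stays valid when the element moves between observations, which is one reason to prefer it over raw coordinates.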

Compatibility with existing schemas

Converters for three existing OpenAdapt schema formats:

from openadapt_types._compat import (
    from_benchmark_observation,   # openadapt-evals BenchmarkObservation
    from_benchmark_action,        # openadapt-evals BenchmarkAction
    from_ml_observation,          # openadapt-ml Observation
    from_ml_action,               # openadapt-ml Action
    from_omnimcp_screen_state,    # omnimcp ScreenState
    from_omnimcp_action_decision, # omnimcp ActionDecision
)

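# Convert existing data. Here obs and act are objects you already have,
# e.g. an openadapt-evals BenchmarkObservation and BenchmarkAction.
state = from_benchmark_observation(obs.__dict__)
action = from_benchmark_action(act.__dict__)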

JSON Schema

Export for language-agnostic tooling:

import json
from openadapt_types import ComputerState, Action, Episode

# Get JSON Schema
schema = ComputerState.model_json_schema()
print(json.dumps(schema, indent=2))
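
Once exported, the schema is ordinary JSON that tools in any language can consume. As a small sketch, the snippet below walks a schema dictionary and lists each property with its declared type and whether it is required. The sample_schema dictionary is a hand-written stand-in, not the real output of ComputerState.model_json_schema(), and summarize_schema is a hypothetical helper.

```python
import json

# Hand-written stand-in for an exported schema (illustrative only).
sample_schema = {
    "title": "Example",
    "type": "object",
    "properties": {
        "node_id": {"type": "string"},
        "role": {"type": "string"},
        "name": {"type": "string"},
    },
    "required": ["node_id", "role"],
}


def summarize_schema(schema: dict) -> list:
    """List each top-level property as 'name: type (required|optional)'."""
    required = set(schema.get("required", []))
    rows = []
    for prop, spec in schema.get("properties", {}).items():
        # Properties defined through $ref or anyOf carry no direct "type" key.
        type_name = spec.get("type", "(composite)")
        flag = "required" if prop in required else "optional"
        rows.append(f"{prop}: {type_name} ({flag})")
    return rows


# Round-trip through JSON text, as a consumer reading a schema file would.
loaded = json.loads(json.dumps(sample_schema))
for row in summarize_schema(loaded):
    print(row)
```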

Design principles

  • Pydantic v2: runtime validation, JSON Schema export, fast serialization
  • Pixels + structure: always capture both visual and semantic UI state
  • Node graph: full element tree, not just focused element
  • Platform-agnostic: same schema for Windows, macOS, Linux, web
  • Extension-friendly: raw, attributes, metadata fields everywhere
  • Backward compatible: _compat converters for gradual migration

Dependencies

Just pydantic>=2.0. No ML libraries, no heavy deps.

License

MIT

Download files

Source Distribution

openadapt_types-0.2.0.tar.gz (52.3 kB view details)

Uploaded Source

Built Distribution

openadapt_types-0.2.0-py3-none-any.whl (20.2 kB view details)

Uploaded Python 3

File details

Details for the file openadapt_types-0.2.0.tar.gz.

File metadata

  • Download URL: openadapt_types-0.2.0.tar.gz
  • Size: 52.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for openadapt_types-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       d4c5d72f62b2bd2f001647898f5658ff0d240289e870196eae91b608d7115283
MD5          5fec3385028a6578ccc64f686731986b
BLAKE2b-256  8c082170a6dd59eb7800030be781d8648dda628645b8e90d8fa5a2f6d2f4329c

Provenance

The following attestation bundles were made for openadapt_types-0.2.0.tar.gz:

Publisher: publish.yml on OpenAdaptAI/openadapt-types

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file openadapt_types-0.2.0-py3-none-any.whl.

File hashes

Hashes for openadapt_types-0.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       9817fe3df70d5872696a4b04b2d603f8c13c81b5c626f5d4bb09229a30fc4b94
MD5          abf3fd3bf26c99b112eb4b27ae948516
BLAKE2b-256  c0f60d247a729cb167581055fae62006abb0e6117218ed92f2201fb04513c27f

Provenance

The following attestation bundles were made for openadapt_types-0.2.0-py3-none-any.whl:

Publisher: publish.yml on OpenAdaptAI/openadapt-types
