Agent-as-Annotators: Structured Distillation of Web Agent Capabilities

These details have been verified by PyPI

Project links

Owner

McGill NLP

GitHub Statistics

These details have not been verified by PyPI

Project description

Agent-as-Annotators (A3)

💾 Code	📄 Paper	🌐 Website
🤗 Dataset	🤖 Model	📦 PyPI

Structured Distillation of Web Agent Capabilities Enables Generalization

Xing Han Lù, Siva Reddy

This repository contains the code for the A3 framework, which uses LLMs to systematically generate synthetic web agent training data by decomposing the annotation process into three roles: Task Designer, Annotator, and Supervisor.

Installation

pip install agent-as-annotators

Or install from source:

git clone https://github.com/McGill-NLP/agent-as-annotators.git
cd agent-as-annotators
pip install -e .

Quick Start: Evaluation

1. Serve a model with vLLM

vllm serve --config configs/vllm/Qwen3.5-9B.yaml

2. Run evaluation

a3-eval --benchmark webarena_test --model A3-qwen3.5-9b

Pipeline: Generating A3-Synth

The A3 pipeline generates synthetic training data in 5 steps:

Step 1: Create personas

python scripts/create_personas.py

Step 2: Generate task intents (via exploration)

a3-explore
python scripts/generate_task_intents.py

Step 3: Create A3-Synth task configs

python scripts/create_synth_configs.py

Step 4: Collect trajectories

a3-synth --benchmark a3_synth --model gemini-3-pro

Step 5: Convert to training data

python scripts/convert_trajectories_to_json.py
python scripts/generate_rft_data.py

Training

a3-train --config configs/train/qwen3.5-9b.json

Training uses SFT with FSDP for multi-GPU parallelism. See configs/train/ for hyperparameters and configs/accelerate/ for FSDP configuration.

CLI Commands

Command	Description
`a3-eval`	Run evaluation on WebArena, VisualWebArena, WorkArena, MiniWoB
`a3-synth`	Run trajectory collection for A3-Synth
`a3-explore`	Run environment exploration
`a3-train`	Fine-tune a model with SFT
`a3-screen-utils`	Screen session management utilities

Project Structure

agent-as-annotators/
  agent_as_annotators/       # Core package
    cli/                     # CLI entry points (eval, synth, explore, train)
    modeling.py              # Agent model wrapper (vLLM, Gemini, OpenAI)
    prompts/                 # All prompt templates
    judge/                   # Inverted evaluation protocol (Judge module)
    benchmarks/a3_synth/     # A3-Synth benchmark registration
    exploration/             # Exploration task registration
    utils/                   # Utilities
    configs/a3_synth/        # A3-Synth task configurations
  configs/
    model_configs.json       # Model registry
    train/                   # Training hyperparameters
    vllm/                    # vLLM serving configs
    accelerate/              # FSDP configs
  scripts/                   # Data pipeline scripts

Project details

These details have been verified by PyPI

Project links

Owner

McGill NLP

GitHub Statistics

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.1.0

Apr 10, 2026

0.0.1

Apr 7, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agent_as_annotators-0.1.0.tar.gz (72.7 kB view details)

Uploaded Apr 10, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

agent_as_annotators-0.1.0-py3-none-any.whl (83.2 kB view details)

Uploaded Apr 10, 2026 Python 3

File details

Details for the file agent_as_annotators-0.1.0.tar.gz.

File metadata

Download URL: agent_as_annotators-0.1.0.tar.gz
Upload date: Apr 10, 2026
Size: 72.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agent_as_annotators-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`0dc96976826a2f4f77efa9215846592d0e85828b35e0dc6fb099f7d1316cda5f`
MD5	`063de4070a28ab473647f7a68b798f26`
BLAKE2b-256	`c9adc4d75e07dc02944d763981d20c66c94a349b840bda0a5fbc71304346fcc7`

See more details on using hashes here.

File details

Details for the file agent_as_annotators-0.1.0-py3-none-any.whl.

File metadata

Download URL: agent_as_annotators-0.1.0-py3-none-any.whl
Upload date: Apr 10, 2026
Size: 83.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for agent_as_annotators-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ebaf6fe8a7805b332109aa17b51faffefe9a50257ffd15ad758300c13002d6ff`
MD5	`71fb46d70bec334521e009b0138645c0`
BLAKE2b-256	`ce45b56b98b2dd096007f8d77f7d82b397a7573ab9f4b7d4f7c8d6fb8b060668`

See more details on using hashes here.

agent-as-annotators 0.1.0

Navigation

Verified details

Project links

Owner

GitHub Statistics

Unverified details

Meta

Classifiers

Project description

Agent-as-Annotators (A3)

Installation

Quick Start: Evaluation

1. Serve a model with vLLM

2. Run evaluation

Pipeline: Generating A3-Synth

Step 1: Create personas

Step 2: Generate task intents (via exploration)

Step 3: Create A3-Synth task configs

Step 4: Collect trajectories

Step 5: Convert to training data

Training

CLI Commands

Project Structure

Project details

Verified details

Project links

Owner

GitHub Statistics

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes