BrowserGym extensions for DoomArena

BrowserGym + AgentLab Threat Models

This repository contains tools and scripts for defining and testing threat models in the BrowserGym + AgentLab agentic framework. It lets researchers and developers evaluate the security posture of LLM-powered web agents against attack vectors such as security warning popups and malicious banners.

Overview

The framework provides a structured way to:

  • Simulate various attack scenarios against web agents
  • Measure attack success rate (ASR), task success rate (TSR), and attack stealth rate
  • Evaluate the robustness of different LLMs against these attacks
  • Generate comprehensive reports of attack effectiveness

This toolkit specifically focuses on testing agents in the BrowserGym setting.

Installation

  1. Run pip install -e doomarena/browsergym from the repository root, or run pip install doomarena-browsergym to get the latest release
  2. Start your WebArena instance and set the environment variable DOOMARENA_WEBARENA_BASE_URL="http://XXXX.com"
  3. Run the tests with pytest doomarena/browsergym/tests (to exclude the WebArena tests, add the -m 'not local' flag)
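
Before launching the tests or an experiment, it can help to confirm that the variable from step 2 is set to something usable. The following is a minimal sketch, not part of doomarena-browsergym: the helper name get_webarena_base_url is hypothetical, and only the variable name DOOMARENA_WEBARENA_BASE_URL comes from the steps above.

```python
import os
from urllib.parse import urlparse

# Variable name taken from installation step 2.
ENV_VAR = "DOOMARENA_WEBARENA_BASE_URL"


def get_webarena_base_url(env=None):
    """Return the configured WebArena base URL, or raise if it is unusable.

    Hypothetical helper for illustration; it is not part of the package.
    """
    env = os.environ if env is None else env
    url = env.get(ENV_VAR, "").strip()
    if not url:
        raise RuntimeError(f"{ENV_VAR} is not set")
    parsed = urlparse(url)
    # Require an explicit http(s) scheme and a host name.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise RuntimeError(f"{ENV_VAR} is not an http(s) URL: {url!r}")
    return url
```

Calling get_webarena_base_url() with no arguments reads the real environment; passing a dict makes the check easy to try without touching your shell.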

Usage

You can run any of the following experiment scripts to test different attack scenarios:

  • run_banner_attack_webarena_reddit_notext.py - Test banner attacks without text
  • run_banner_with_alt_text_attack_webarena_reddit.py - Test banner attacks with alt text
  • run_popup_attack_webarena_reddit.py - Test security warning popup attacks
  • run_bgym_experiment.py - Run a general BrowserGym experiment with customizable parameters

For example:

python -m doomarena.browsergym.scripts.run_bgym_experiment

Experiment Configuration Parameters

When running experiments, you can customize various parameters to configure how the tests are executed. Here's an explanation of the key parameters that can be set:

# Example usage:
run_bgym_experiment(
    base_url="<path to your webarena instance>",
    bgym_experiments=bgym_experiments,
    relaunch=False,
    n_jobs=0,  # 0 displays the browser, 1 runs headless, >1 runs parallel headless with "ray" (incompatible with debugger)
    max_steps=15,  # lower for faster testing, use 15 for full task
    skip_reset_and_massage=True,  # skipping is faster, but set False for reproducing numbers
)

  • base_url: The base URL of the WebArena instance that the experiments run against.

  • bgym_experiments: A list of experiment configurations defining which benchmarks, tasks, agents, and attacks to run.

  • relaunch: When set to True, reruns experiments whose results already exist but come from an aborted run. The default is False, which avoids duplicate runs.

  • n_jobs: Controls parallelization of experiment execution:

    • 0: Runs with visible browser UI (non-headless mode); useful for demo/debugging
    • 1: Runs in headless mode (no UI) with a single process
    • >1: Runs in parallel headless mode using Ray with the specified number of workers
  • max_steps: Maximum number of steps each agent is allowed to take per task. Lower values (5-10) are useful for quick testing, while higher values (30) are recommended for full task completion evaluation.

  • skip_reset_and_massage: When set to True, skips the environment reset to speed up debugging. Always set this to False when running experiments whose numbers you intend to report or reproduce.
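
The three n_jobs modes described above can be summarized in a small lookup. This is only an illustrative sketch of the documented behavior, not code from the package: describe_n_jobs is a hypothetical name, and the single-worker value for n_jobs=0 and the rejection of negative values are assumptions, since the text does not cover them.

```python
def describe_n_jobs(n_jobs: int) -> dict:
    """Summarize the execution mode documented for a given n_jobs value.

    Hypothetical helper for illustration only.
    """
    if n_jobs < 0:
        # Assumption: negative values are not described, so reject them here.
        raise ValueError("n_jobs must be >= 0")
    if n_jobs == 0:
        # Visible browser UI; assumed to run as a single worker.
        return {"headless": False, "parallel": False, "workers": 1}
    if n_jobs == 1:
        # Headless, single process.
        return {"headless": True, "parallel": False, "workers": 1}
    # Headless and parallel (via Ray, per the parameter description).
    return {"headless": True, "parallel": True, "workers": n_jobs}


for value in (0, 1, 4):
    print(value, describe_n_jobs(value))
```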

Results

Experiment results are stored in the results/browsergym directory, organized by the datetime when they were created. Each results directory contains:

  • Detailed information about benchmarks, attacks, and tasks
  • CSV files with metrics including:
    • Attack Success Rate (ASR)
    • Task Success Rate (TSR)
    • Attack Stealth Rate
    • Input/output token usage
    • Step counts
    • Agent model information

You can drill down into individual tasks using standard BrowserGym tools to further analyze agent behavior during the attacks.

To aggregate the results into CSV files, run:

from doomarena.browsergym.scripts import collect_results

collect_results('<Path to your experiments>')  # recursive

This will produce one attack_df.csv per task, as well as a globally aggregated one.
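
Once an aggregated CSV exists, the headline rates can be recomputed from it. The sketch below assumes one row per run and two boolean columns named attack_successful and task_successful; these column names and the sample rows are made up for illustration, so check the header of your own attack_df.csv and adjust. ASR and TSR are computed here as the fraction of rows where the respective flag is true.

```python
import csv
import io


def summarize(rows):
    """Compute ASR and TSR as fractions of rows whose flag is 'True'.

    Column names are hypothetical; adapt them to the real CSV header.
    """
    rows = list(rows)
    if not rows:
        return {"n": 0, "ASR": None, "TSR": None}

    def rate(column):
        hits = sum(r[column].strip().lower() == "true" for r in rows)
        return hits / len(rows)

    return {
        "n": len(rows),
        "ASR": rate("attack_successful"),
        "TSR": rate("task_successful"),
    }


# Made-up sample data standing in for an aggregated attack_df.csv.
sample = io.StringIO(
    "task,attack_successful,task_successful\n"
    "t1,True,False\n"
    "t2,False,True\n"
    "t3,True,True\n"
    "t4,False,False\n"
)
summary = summarize(csv.DictReader(sample))
print(summary)
```

To run this on real results, replace the StringIO sample with open('<path to attack_df.csv>', newline='').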

Attack Types

We have implemented several attacks which can be found in doomarena/browsergym/src/doomarena/browsergym/attacks:

  1. Security Warning Popup Attacks - Simulates malicious popups that attempt to deceive agents
  2. Banner Attacks - Tests agents against misleading banner elements
  3. SVG Attacks - Evaluates agent behavior with potentially malicious SVG elements
