VisualWebArena benchmark for BrowserGym

This package provides browsergym.visualwebarena, which is an unofficial port of the VisualWebArena benchmark for BrowserGym.

Note: the original VisualWebArena codebase has been slightly adapted to ensure compatibility.

Setup

  1. Install the package
pip install browsergym-visualwebarena
  2. Download tokenizer resources
python -c "import nltk; nltk.download('punkt')"
  3. Set up the web servers (follow the visualwebarena README).

  4. Set the URLs as environment variables (note the VWA_ prefix; $BASE_URL stands for the address of the host running the web servers)

export VWA_CLASSIFIEDS="$BASE_URL:9001/"
export VWA_CLASSIFIEDS_RESET_TOKEN="4b61655535e7ed388f0d40a93600254c"  # Default reset token for classifieds site, change if you edited its docker-compose.yml
export VWA_SHOPPING="$BASE_URL:7770/"
export VWA_REDDIT="$BASE_URL:9999"
export VWA_WIKIPEDIA="$BASE_URL:8888/wikipedia_en_all_maxi_2022-05/A/User:The_other_Kiwix_guy/Landing"
export VWA_HOMEPAGE="$BASE_URL:4399"
  5. Set up an OpenAI API key
export OPENAI_API_KEY=...

NOTE: be mindful of costs, as VisualWebArena calls GPT-4 for certain evaluations (llm_fuzzy_match).
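
Before launching any task, it can help to confirm that the variables from the setup steps are actually visible to the Python process. The snippet below is a minimal sketch, not part of the package: `REQUIRED_VWA_VARS` and `missing_vwa_vars` are hypothetical names introduced here, and the list of variables is simply copied from the export lines in this README. It makes no assumption about whether browsergym.visualwebarena performs a similar check itself.

```python
import os

# Variable names copied from the export lines in the setup steps.
# This tuple is an illustration, not something the package exposes.
REQUIRED_VWA_VARS = (
    "VWA_CLASSIFIEDS",
    "VWA_CLASSIFIEDS_RESET_TOKEN",
    "VWA_SHOPPING",
    "VWA_REDDIT",
    "VWA_WIKIPEDIA",
    "VWA_HOMEPAGE",
    "OPENAI_API_KEY",
)


def missing_vwa_vars(environ=None):
    """Return the names that are unset or blank (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # A value made only of whitespace is treated as missing.
    return [name for name in REQUIRED_VWA_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vwa_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All VisualWebArena environment variables are set.")
```

Running the script in the same shell where the export commands were issued reports any variable that was skipped or left empty. It does not check that the URLs are reachable.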
