VisualWebArena benchmark for BrowserGym

This package provides browsergym.visualwebarena, which is an unofficial port of the VisualWebArena benchmark for BrowserGym.

Note: the original VisualWebArena codebase has been slightly adapted to ensure compatibility.

Setup

  1. Install the package
pip install browsergym-visualwebarena
  2. Download tokenizer resources
python -c "import nltk; nltk.download('punkt')"
  3. Set up the web servers (follow the visualwebarena README).

  4. Set up the URLs as environment variables (note the VWA_ prefix)

export VWA_CLASSIFIEDS="$BASE_URL:9001/"
export VWA_CLASSIFIEDS_RESET_TOKEN="4b61655535e7ed388f0d40a93600254c"  # Default reset token for classifieds site, change if you edited its docker-compose.yml
export VWA_SHOPPING="$BASE_URL:7770/"
export VWA_REDDIT="$BASE_URL:9999"
export VWA_WIKIPEDIA="$BASE_URL:8888/wikipedia_en_all_maxi_2022-05/A/User:The_other_Kiwix_guy/Landing"
export VWA_HOMEPAGE="$BASE_URL:4399"
  5. Set up an OpenAI API key
export OPENAI_API_KEY=...
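
Before launching any task, it can help to confirm that the variables exported in the setup steps are actually visible to Python. The following is a minimal sketch, not part of the package: the helper name `missing_variables` is hypothetical, and the list of names is simply copied from the steps above.

```python
import os

# Variable names copied from the setup steps above.
REQUIRED_VARIABLES = (
    "VWA_CLASSIFIEDS",
    "VWA_CLASSIFIEDS_RESET_TOKEN",
    "VWA_SHOPPING",
    "VWA_REDDIT",
    "VWA_WIKIPEDIA",
    "VWA_HOMEPAGE",
    "OPENAI_API_KEY",
)


def missing_variables(env=None):
    """Return the required variable names that are unset or empty in env.

    Hypothetical helper: env defaults to the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All VisualWebArena environment variables are set.")
```

This only checks that each variable is set and non-empty; it does not verify that the web servers are reachable at those URLs.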

Note: be mindful of costs, as VisualWebArena calls GPT-4 for certain evaluations (llm_fuzzy_match).
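
Once setup is complete, the tasks should be discoverable through the Gymnasium registry. The sketch below assumes that importing browsergym.visualwebarena registers its tasks, and that their IDs start with `browsergym/visualwebarena`; both are assumptions to check against the BrowserGym documentation. The function names are hypothetical.

```python
# Assumption: tasks are registered under this Gymnasium ID prefix.
TASK_PREFIX = "browsergym/visualwebarena"


def filter_vwa_task_ids(env_ids):
    """Keep only the IDs that start with the assumed VisualWebArena prefix."""
    return sorted(env_id for env_id in env_ids if env_id.startswith(TASK_PREFIX))


def registered_vwa_task_ids():
    """List VisualWebArena task IDs found in the Gymnasium registry.

    Requires gymnasium and browsergym-visualwebarena to be installed.
    """
    import gymnasium as gym
    import browsergym.visualwebarena  # noqa: F401  (assumed to register tasks on import)

    return filter_vwa_task_ids(gym.envs.registry.keys())
```

A returned ID could then be passed to `gym.make(...)`, followed by the usual `reset()` and `step()` calls, provided the web servers from the setup steps are running.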
