Profile of McGill NLP

Last released Apr 25, 2025

BrowserGym integration for the WebLINX benchmark

Last released Jul 11, 2025

Official library for AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Last released Apr 10, 2026

Agent-as-Annotators: Structured Distillation of Web Agent Capabilities

Last released Apr 5, 2026

LLM2Vec-Gen: Generative Embeddings from Large Language Models

Last released Jun 25, 2024

Llama-powered agents for automatic web browsing

Last released Sep 19, 2023

Empirical evaluation of retrieval-augmented instruction-following models

Last released Oct 1, 2024

The official WebLINX library

Last released Apr 11, 2023

The Statcan Dialogue Dataset

Last released Feb 26, 2025

SafeArena is a benchmark for agent safety
