Meta Agents Research Environments is a research-driven environment designed to simulate complex, real-life tasks that span several minutes and require multiple steps to solve. Unlike static simulation environments, this platform introduces a dynamic setting where the state of the environment evolves and new information is continuously integrated.

Meta Agents Research Environments (ARE)

A research environment for simulating complex, real-life tasks that require multi-step reasoning and dynamic adaptation.

Meta Agents Research Environments (ARE) is a platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, this research platform introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges. In particular, ARE runs the Gaia2 benchmark, a follow-up to Gaia that evaluates a broader range of agent capabilities.

Background
Install
Usage
API
Contributing
License

Background

ARE addresses critical gaps in AI agent evaluation by providing:

Dynamic Environments: Scenarios that evolve over time with new information and changing conditions
Multi-Step Reasoning: Complex tasks requiring 10+ steps and several minutes to complete
Real-World Focus: Grounded situations that mirror actual real-world challenges
Comprehensive Evaluation: The Gaia2 benchmark with 800 scenarios across multiple domains

Getting Started


Quick Start: Get up and running with your first scenario in just a few minutes with step-by-step instructions.
Gaia2 Evaluation: Build and evaluate your agents on the Gaia2 benchmark, a comprehensive suite of 800 dynamic scenarios across 10 universes.
Gaia2 Blog Post: Learn more about Gaia2 on the Hugging Face blog.
Paper: Read the research paper detailing the Gaia2 benchmark and evaluation methodology.
Demo: Try the ARE Demo on Hugging Face and play around with the agent platform directly in your browser, no installation required.
Gaia2 Leaderboard: Check the self-published results from Gaia2 Benchmark runs.
Learn More: Dive deeper into the core concepts of agents, environments, apps, events, and scenarios.

Install

For complete installation instructions and setup options, see the Installation Guide.

Prerequisites

First, install uv, a fast Python package installer and resolver.

Quick Start with uvx

The fastest way to get started is using uvx to run commands directly:

# Run Gaia2 benchmark scenarios
uvx --from meta-agents-research-environments are-benchmark gaia2-run --hf meta-agents-research-environments/gaia2 --hf_split validation -l 1

# Run custom scenarios
uvx --from meta-agents-research-environments are-run -s scenario_tutorial -a default

All the commands in this README and the documentation are available through uvx.
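
For scripted runs, the uvx prefix shown above can be added programmatically. The following is a minimal Python sketch and not part of ARE: `uvx_command` is a hypothetical helper that only assembles the argument list, using the package name and commands quoted in this README.

```python
import shlex
from typing import List

PACKAGE = "meta-agents-research-environments"  # distribution name from this README


def uvx_command(tool: str, *args: str) -> List[str]:
    """Prefix an ARE command-line tool with `uvx --from <package>`.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    return ["uvx", "--from", PACKAGE, tool, *args]


if __name__ == "__main__":
    cmd = uvx_command("are-run", "-s", "scenario_tutorial", "-a", "default")
    # Print a copy-pasteable shell line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a string avoids shell quoting problems if it is later handed to `subprocess.run`.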

Traditional Installation

Alternatively, install the package directly with different dependency sets:

# Minimal install (core dependencies only)
# Good for basic benchmarking and running most scenarios
pip install meta-agents-research-environments

# With GUI (recommended for interactive exploration)
pip install "meta-agents-research-environments[gui]"

Which installation should I choose?

Minimal (meta-agents-research-environments): For running benchmarks and scenarios via the CLI
With GUI ([gui]): Adds a web interface for interactive exploration (recommended for local development)
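
To confirm which version, if any, is already present in the current Python environment, the standard library's `importlib.metadata` can be queried. This is a minimal sketch: `installed_version` is a hypothetical helper, and it cannot tell whether the `[gui]` extra was installed.

```python
from importlib import metadata
from typing import Optional

PACKAGE = "meta-agents-research-environments"  # distribution name from this README


def installed_version(dist_name: str = PACKAGE) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print(f"{PACKAGE} is not installed; try: pip install {PACKAGE}")
    else:
        print(f"{PACKAGE} {version} is installed")
```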

Usage

Basic Commands

After installation, these command-line tools are available:

Run Individual Scenarios

are-run -s scenario_find_image_file -a default

Benchmark Evaluation

are-benchmark run -d /path/to/scenarios --agent default --limit 10

Gaia2 Evaluation

are-benchmark gaia2-run --hf meta-agents-research-environments/gaia2 --hf_split validation -l 5

Interactive GUI

are-gui -s scenario_find_image_file

The GUI provides a web-based interface for interactive scenario exploration and real-time agent monitoring. When started, it typically runs at http://localhost:8080. The interface supports different view modes:

Playground Mode: Chat-like interface for direct agent interaction
Scenarios Mode: Structured task execution and evaluation with DAG visualization
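
Before opening the browser, it can help to check that something is actually listening on the GUI port. The sketch below uses only the Python standard library. `is_listening` is a hypothetical helper, and the default port 8080 is taken from the typical address mentioned above, so adjust it if your setup differs.

```python
import socket


def is_listening(host: str = "localhost", port: int = 8080, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    state = "reachable" if is_listening() else "not reachable"
    print(f"http://localhost:8080 is {state}")
```

A successful connection only shows that some server is bound to the port; it does not prove that the server is the ARE GUI.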

For detailed information about the GUI features, navigation, and workspace usage, see the Understanding UI Guide.

Model Configuration

ARE supports multiple AI model providers through LiteLLM:

# Llama API
export LLAMA_API_KEY="your-api-key"
are-benchmark run --hf meta-agents-research-environments/gaia2 --hf_split validation \
  --model Llama-3.1-70B-Instruct --provider llama-api --agent default

# Local deployment
are-benchmark run --hf meta-agents-research-environments/gaia2 --hf_split validation \
  --model your-local-model --provider local \
  --endpoint "http://localhost:8000" --agent default

For detailed information on configuring different model providers, environment variables, and advanced options, see the LLM Configuration Guide.

Run any command with --help to see all available options.
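
When switching between providers often, it can be convenient to assemble the command in a script. The sketch below is not part of ARE: `benchmark_args` is a hypothetical helper that only builds an argument list from the flags shown in the examples above (`--hf`, `--hf_split`, `--model`, `--provider`, `--endpoint`, `--agent`). Check `are-benchmark run --help` for the authoritative option list.

```python
import os
import shlex
from typing import List, Optional


def benchmark_args(
    model: str,
    provider: str,
    endpoint: Optional[str] = None,
    hf_dataset: str = "meta-agents-research-environments/gaia2",
    hf_split: str = "validation",
    agent: str = "default",
) -> List[str]:
    """Assemble an `are-benchmark run` command from the flags shown above."""
    args = [
        "are-benchmark", "run",
        "--hf", hf_dataset, "--hf_split", hf_split,
        "--model", model, "--provider", provider,
    ]
    if endpoint is not None:
        # In the examples above, only the local deployment passes an endpoint.
        args += ["--endpoint", endpoint]
    return args + ["--agent", agent]


if __name__ == "__main__":
    if "LLAMA_API_KEY" not in os.environ:
        print("warning: LLAMA_API_KEY is not set")
    print(shlex.join(benchmark_args("Llama-3.1-70B-Instruct", "llama-api")))
```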

Example: Gaia2 Benchmark

# Set up your model configuration
export LLAMA_API_KEY="your-api-key"

# Run on the validation split to test your setup
are-benchmark run --hf meta-agents-research-environments/gaia2 --hf_split validation \
  --model meta-llama/Llama-3.3-70B-Instruct --model_provider novita \
  --agent default --limit 10 --output_dir ./validation_results

# Run complete Gaia2 evaluation for leaderboard submission
are-benchmark gaia2-run --hf meta-agents-research-environments/gaia2 \
  --model Llama-3.1-70B-Instruct --provider llama-api \
  --agent default --output_dir ./gaia2_results \
  --hf_upload my-org/gaia2-results

API

Core Concepts

Agents: AI entities that interact with the environment using the ReAct (Reasoning + Acting) framework
Apps: Interactive applications (email, calendar, file system) that provide APIs for agent interaction
Events: Dynamic elements that make environments evolve over time
Scenarios: Complete tasks combining apps, events, and validation logic
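
To make the relationship between these concepts concrete, here is a toy model in plain Python. It is purely illustrative and does not use or mirror the ARE API: `ToyApp`, `ToyEvent`, and `run_scenario` are hypothetical names, and no agent is modeled. The point is only that events change app state while time advances, and that validation inspects the resulting state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyApp:
    """Hypothetical stand-in for an app: a named bag of state."""
    name: str
    state: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class ToyEvent:
    """Hypothetical event: at `time`, apply `action` to the app."""
    time: int
    action: Callable[[ToyApp], None]


def run_scenario(app: ToyApp, events: List[ToyEvent], steps: int,
                 validate: Callable[[ToyApp], bool]) -> bool:
    """Advance a clock, fire the events that are due, then check the final state."""
    for now in range(steps):
        for event in events:
            if event.time == now:
                event.action(app)  # the environment changes mid-run
    return validate(app)


if __name__ == "__main__":
    inbox = ToyApp("email", {"inbox": []})
    events = [ToyEvent(2, lambda a: a.state["inbox"].append("meeting moved to 3pm"))]
    ok = run_scenario(inbox, events, steps=5,
                      validate=lambda a: len(a.state["inbox"]) == 1)
    print("scenario passed:", ok)
```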

Documentation

Comprehensive documentation is available at:

Main Documentation: docs/index.rst
Tutorials: docs/tutorials/
API Reference: docs/api_reference/

Key documentation sections:

Core Concepts - Understanding agents, apps, events, and scenarios
Benchmarking Guide - Complete benchmarking and evaluation reference
Gaia2 Evaluation - Detailed Gaia2 benchmark submission guide
Scenario Development - Creating custom scenarios
CLI Reference - Complete command-line interface documentation

Quick Links

Installation Guide: docs/user_guide/installation.rst
Quickstart Tutorial: docs/quickstart.rst

Contributing

We welcome contributions! Please see our Contributing Guide for details on:

Setting up the development environment
Running tests and linting
Submitting pull requests
Creating new scenarios and apps

License

This project is licensed under the MIT License. See the LICENSE file for details.

Citation

If you use Meta Agents Research Environments in your work, please cite:

@misc{andrews2025arescalingagentenvironments,
      title={ARE: Scaling Up Agent Environments and Evaluations},
      author={Pierre Andrews and Amine Benhalloum and Gerard Moreno-Torres Bertran and Matteo Bettini and Amar Budhiraja and Ricardo Silveira Cabral and Virginie Do and Romain Froger and Emilien Garreau and Jean-Baptiste Gaya and Hugo Laurençon and Maxime Lecanu and Kunal Malkan and Dheeraj Mekala and Pierre Ménard and Grégoire Mialon and Ulyana Piterbarg and Mikhail Plekhanov and Mathieu Rita and Andrey Rusakov and Thomas Scialom and Vladislav Vorotilov and Mengjue Wang and Ian Yu},
      year={2025},
      eprint={2509.17158},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2509.17158},
}

Release history

1.2.0 (this version): Nov 11, 2025
1.1.0: Oct 14, 2025
1.0.1: Sep 23, 2025
1.0.0: Sep 22, 2025

Download files

Source Distribution

meta_agents_research_environments-1.2.0.tar.gz (22.0 MB)

Uploaded Nov 11, 2025 Source

Built Distribution

meta_agents_research_environments-1.2.0-py3-none-any.whl (1.4 MB)

Uploaded Nov 11, 2025 Python 3

File details

Details for the file meta_agents_research_environments-1.2.0.tar.gz.

File metadata

Download URL: meta_agents_research_environments-1.2.0.tar.gz
Upload date: Nov 11, 2025
Size: 22.0 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.19

File hashes

Hashes for meta_agents_research_environments-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`ad99047aef6d597ed9c4fd74fd634b7a648fd20c104212a5686958f7a9a4bcd4`
MD5	`cbee42f7aa22cb2a651de5355819ce3c`
BLAKE2b-256	`341719cb42484e2faab08ad0896ed4a1696028f2788a71069c1836a3535d75e4`

File details

Details for the file meta_agents_research_environments-1.2.0-py3-none-any.whl.

File metadata

Download URL: meta_agents_research_environments-1.2.0-py3-none-any.whl
Upload date: Nov 11, 2025
Size: 1.4 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.19

File hashes

Hashes for meta_agents_research_environments-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c16caa85abb36f42172ee4cb9e1df478d953d610f05442120f9b4066db85e5b2`
MD5	`2e5aa18f0d7a8a51cc84c91554934cff`
BLAKE2b-256	`eddc6c8741faf4144227b7d21cede49999f35c163c9ebd7a47f025f0e8df7faa`
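
A downloaded file can be checked against the SHA256 digests listed above with the Python standard library. This is a minimal sketch: `sha256_of` and `matches` are hypothetical helper names, and the example assumes the wheel was saved into the current directory under its published file name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    wheel = Path("meta_agents_research_environments-1.2.0-py3-none-any.whl")
    # SHA256 value copied from the hash table for the wheel above.
    expected = "c16caa85abb36f42172ee4cb9e1df478d953d610f05442120f9b4066db85e5b2"
    if wheel.exists():
        print("hash OK" if matches(wheel, expected) else "hash MISMATCH")
    else:
        print(f"{wheel} not found")
```

Reading in chunks keeps memory use flat, which matters for the 22.0 MB source archive more than for the wheel.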
