decentralized multi-agent LLM-based framework

These details have not been verified by PyPI

Project description

Terrarium

alt text

Overview :herb:

Terrarium is a hackable, modular, and configurable open-source framework for studying and evaluating decentralized LLM-based multi-agent systems (MAS). As the capabilities of agents progress (e.g., tool calling) and their state space expands (e.g., the internet), multi-agent systems will naturally arise in unique and unexpected scenarios. This repo aims to provide researchers, engineers, and students the ability to study this new agentic paradigm in an isolated playground for studying agent behavior, vulnerabilities, and safety. It enables full customization of the communication protocol, communication proxy, environment, tool usage, and agents. View the paper at https://arxiv.org/pdf/2510.14312v1.

This repo is under active development :gear:, so please raise an issue for new features, bugs, or suggestions. If you find this repo useful or interesting please :star: it!

Framework Diagram

Features

Blackboards (Communication Proxies): Append-only event/communication log which acts as a component of the agent's observation and communication with other agents.
Two-Phase Communication Protocol: The implemented communication protocol containes two phases, a (1) planing phase and an (2) execution phase. The planning phase enables communcation between agents to faciliate better action selection during the executation phase. During the executation phase, the agents take actions that affect their environment. This is done in a predefined sequential order to avoid environment simulation clashes.
MCP Servers: We use MCP servers to provide easy integration with varying LLM client APIs while enabling easier configuration of environment and external tools.
DCOP Environments: DCOPs (Distributed Constraint Optimization Problems) have a ground-truth solution and a well-grounded evalution function, evaluating the actions taken by a set of agents. We implement DCOP environments from the CoLLAB benchmark.
- SmartGrid - A home agent's objecitve is to schedule appliance usage throughout the day without overworking the powergrid (Uses real-world home-meter data)
- MeetingScheduling - A calendar agent is tasked with assigning meetings with other agents, trying to satisfy preferences and constraints with respect to other agents schedules (Uses real-world locations)
- PersonalAssistant - An assistant agent chooses outfits for a human while meeting social norm preferences, the preferences of the human, and constrained outfit selection (Uses fully synthetic data)

Documentation

Use the following documentation for detailed instructions about on how to use the framework.

Follow the quick guide provided below for basic testing.

Quick Start

Install (PyPI)

Install Terrarium:

pip install "terrarium-agents[providers,science,plots]"

CoLLAB is required for the DCOP environments. Clone it somewhere and point Terrarium at it:

git clone https://github.com/Saad-Mahmud/CoLLAB_SEA.git /path/to/CoLLAB
export TERRARIUM_COLLAB_PATH=/path/to/CoLLAB

Optional extras:

terrarium-agents[openai], terrarium-agents[anthropic], terrarium-agents[gemini] (provider SDKs)
terrarium-agents[vllm] (local vLLM serving; heavy)
terrarium-agents[all] (everything)

Install (Source)

Clone the repository and update submodules. A submodule exists at external/CoLLAB for a suite of external environments.

git clone <repository-url> Terrarium
cd Terrarium
git submodule update --init --recursive

In this repo, we use uv as our extremely fast package manager. If not already installed follow these installation instructions.

# Run this at the root directory .../Terrarium
uv venv --python 3.11 .venv
source .venv/bin/activate
uv sync

Terrarium enables two types of servicing: (1) API-based providers and (2) vLLM integration for open-source models.

For API-based providers, we currently support OpenAI, Google, Anthropic, and together.ai models. Copy .env.example to .env and set your API keys (never put real keys in .env.example).

cp -n .env.example .env
# Edit `.env` and set (as needed):
# OPENAI_API_KEY=...
# GOOGLE_API_KEY=...
# ANTHROPIC_API_KEY=...
# TOGETHER_API_KEY=...
# FIREWORKS_API_KEY=...

Next, set the model and provider you want to use at llm.provider and llm.<provider>.model in examples/configs/<config>.yaml.

For vLLM servicing, simply set llm.provider:"vllm" and llm.vllm.auto_start_server:true in examples/configs/<config>.yaml for auto-startup and shutdown for a single run. If you require a persistent vLLM server, which is useful for using the same vLLM model for different configurations or environments without the costly startup time, then set llm.vllm.persistent_server:true. To kill all vLLM servers run pkill -f vllm.entrypoints.openai.api_server.

Running a Multi-Agent Trajectory

Start up the persistent MCP server once for tool calls and the blackboard server:

python src/server.py & export MCP_PID=$!

Run a simulation using an execution script along with a config file:

python examples/base_main.py --config <yaml_config_path>

When done, close the persistent MCP server:

kill -9 $MCP_PID

Attack Scenarios

Terrarium ships three reference attacks that exercise different points in the stack. Implementations live in attack_module/attack_modules.py and can be mixed into any simulation via the provided runners.

Attack	What it targets	Entry point	Payload config
Agent poisoning	Replaces every `post_message` payload from the compromised agent before it reaches the blackboard.	`examples/attack_main.py --attack_type agent_poisoning`	`examples/configs/attack_config.yaml` (`poisoning_string`)
Context overflow	Appends a large filler block to agent messages to force downstream context truncation.	`examples/attack_main.py --attack_type context_overflow`	`examples/configs/attack_config.yaml` (`header`, `filler_token`, `repeat`, `max_chars`)
Communication protocol poisoning	Injects malicious system messages into every blackboard via the MCP layer.	`examples/attack_main.py --communication_protocol_poisoning`	`examples/configs/attack_config.yaml` (`poisoning_string`)

Running agent-side attacks

Use the unified driver to launch both the standard run and the selected attack:

# Agent poisoning example
python examples/attack_main.py \
  --config examples/configs/meeting_scheduling.yaml \
  --poison_payload examples/configs/attack_config.yaml \
  --attack_type agent_poisoning

# Context overflow example
python examples/attack_main.py \
  --config examples/configs/meeting_scheduling.yaml \
  --poison_payload examples/configs/attack_config.yaml \
  --attack_type context_overflow

Quick Tips

When working with Terrarium, use sublass definitions (e.g., A2ACommunicationProtocol, EvilAgent) of the base module classes (e.g., CommunicationProtocol, Agent) rather than directly changing the base module classes.
When creating new environments, ensure they inherit the AbstractEnvironment class and all methods are properly defined.
Keep in mind some models (e.g., gpt-4.1-nano) are not capable enough of utilizing tools to take actions in the environment, so track the completion rate such as Meeting completion: 15/15 (100.0%) for MeetingScheduling.

vLLM Provider (Open-Source Models)

Install vLLM (pip install vllm) and make sure CUDA is available.
Set llm.provider: "vllm" in your config and describe the single server under llm.vllm.
All agents share the one configured vLLM model; advanced routing is disabled in this setup.

Best small model for successful tool use tested so far: Qwen/Qwen2.5-7B-Instruct. We have not tested on large >70B open-source models, but use use the Berkeley Function-Calling Leaderboard - BFCL as a reference.

Minimal example:

llm:
  provider: "vllm"
  vllm:
    auto_start_server: true
    persistent_server: false
    startup_timeout: 180
    models:
      - checkpoint: "/data/models/Qwen2-7B-Instruct"
        served_model_name: "Qwen2-7B-Instruct"
        host: "127.0.0.1"
        port: 8001
        tensor_parallel_size: 1
        trust_remote_code: true
        additional_args:
          - "--max-model-len"
          - "65536"

If auto_start_server is true and the configured endpoint is unreachable, Terrarium launches python -m vllm.entrypoints.openai.api_server with the supplied checkpoint and writes stdout/stderr to logs/vllm/<model_id>.log. Processes are cleaned up automatically after each run.

Dashboard

Consolidates runs and logs into a static dashboard for easier navigation:

Export the data bundle (runs + config):

python dashboards/build_data.py \
  --logs-root logs \
  --config examples/configs/meeting_scheduling.yaml \
  --output dashboards/public/dashboard_data.json

Serve the static front-end (or simply open the file via your browser if it allows file:// fetches – a local server is recommended):
```
python -m http.server 5050 --directory dashboards/public
```
Navigate to http://127.0.0.1:5050 to inspect the raw event logs parsed directly from dashboard_data.json in the browser (no backend required).
New runs? Simply repeat step (1.) and refresh the website (No need to restart the server)

Tooling (MCP Servers)

To standardize tool usage among different model providers, we employ an MCP server using FastMCP. Each environment has their own set of MCP tools that are readily available to the agent with the functionality of permitting certain tools by the communication protocol. Some examples of environment tools are MeetingScheduling -> attend_meeting(.), PersonalAssistant -> choose_outfit(.), and SmartGrid -> assign_source(.).

Logging

Terrarium incorporates a set of loggers for prompts, tool usage, agent trajectories, and blackboards. All loggers are defined in src/logger.py, conisting of

BlackboardLogger -- Logs events for all existing blackboards in human-readable format (Useful for tracking conversations between agents and tool calls)
ToolCallLogger -- Tracks the tool called, success, and duration for each agent (Useful for debugging tool implementations)
PromptLogger -- Shows exact system and user prompts used (Useful for debugging F-string formatted prompts)
AgentTrajectoryLogger -- Logs the multi-step conversation of each agent showing their pseudo-reasoning traces (Useful for approximately evaluating the internal reasoning of agents and their associated tool calls)

All logs are saved to logs/<environment>/<tag_model>/<run_timestamp>/seed_<seed>/, including a snapshot of the config used for that run.

Paper Citation

@article{nakamura2025terrarium,
  title={Terrarium: Revisiting the Blackboard for Multi-Agent Safety, Privacy, and Security Studies},
  author={Nakamura, Mason and Kumar, Abhinav and Mahmud, Saaduddin and Abdelnabi, Sahar and Zilberstein, Shlomo and Bagdasarian, Eugene},
  journal={arXiv preprint arXiv:2510.14312},
  year={2025}
}

License

MIT. AA

Contributing

We welcome pull requests and issues that improve Terrarium’s tooling, environments, docs, or general ecosystem. Before opening a PR, start a brief issue or discussion outlining the change so we can coordinate scope and avoid overlap. If you are unsure whether an idea fits, just ask.

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.1.1

Feb 18, 2026

This version

0.1.0

Feb 17, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

terrarium_agents-0.1.0.tar.gz (121.5 kB view details)

Uploaded Feb 17, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

terrarium_agents-0.1.0-py3-none-any.whl (144.0 kB view details)

Uploaded Feb 17, 2026 Python 3

File details

Details for the file terrarium_agents-0.1.0.tar.gz.

File metadata

Download URL: terrarium_agents-0.1.0.tar.gz
Upload date: Feb 17, 2026
Size: 121.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for terrarium_agents-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`5106de60a2ae43338cd5b17f3bc4248cae60f472a422962c5a97f1f73356a856`
MD5	`ebb68b1f38a80b5e1b5e715608e843fc`
BLAKE2b-256	`c7ed3b3f3687aff647a3fdc4b437f622dda973e832cbe15f55509deaafe4c726`

See more details on using hashes here.

Provenance

The following attestation bundles were made for terrarium_agents-0.1.0.tar.gz:

Publisher: publish.yml on umass-aisec/Terrarium

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: terrarium_agents-0.1.0.tar.gz
- Subject digest: 5106de60a2ae43338cd5b17f3bc4248cae60f472a422962c5a97f1f73356a856
- Sigstore transparency entry: 957460142
- Sigstore integration time: Feb 17, 2026
Source repository:
- Permalink: umass-aisec/Terrarium@5af2f761cabbeff5860f4d0c98230f6a703a617a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/umass-aisec
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5af2f761cabbeff5860f4d0c98230f6a703a617a
- Trigger Event: release

File details

Details for the file terrarium_agents-0.1.0-py3-none-any.whl.

File metadata

Download URL: terrarium_agents-0.1.0-py3-none-any.whl
Upload date: Feb 17, 2026
Size: 144.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for terrarium_agents-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c79d760c8c0f78c770d75ed9c508667797fb87f298f0da768f5ca9dda4c00a2c`
MD5	`95ccb8b57c4c287da457eaefce5d7316`
BLAKE2b-256	`dae41d6b21379678519d4e009fd51d2fd87f92cc4ca2254512abe39f653c9a50`

See more details on using hashes here.

Provenance

The following attestation bundles were made for terrarium_agents-0.1.0-py3-none-any.whl:

Publisher: publish.yml on umass-aisec/Terrarium

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: terrarium_agents-0.1.0-py3-none-any.whl
- Subject digest: c79d760c8c0f78c770d75ed9c508667797fb87f298f0da768f5ca9dda4c00a2c
- Sigstore transparency entry: 957460150
- Sigstore integration time: Feb 17, 2026
Source repository:
- Permalink: umass-aisec/Terrarium@5af2f761cabbeff5860f4d0c98230f6a703a617a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/umass-aisec
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5af2f761cabbeff5860f4d0c98230f6a703a617a
- Trigger Event: release

terrarium-agents 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

Terrarium

Overview :herb:

Features

Documentation

Quick Start

Install (PyPI)

Install (Source)

Running a Multi-Agent Trajectory

Attack Scenarios

Running agent-side attacks

Quick Tips

vLLM Provider (Open-Source Models)

Dashboard

Tooling (MCP Servers)

Logging

Paper Citation

License

Contributing

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance