Python SDK for building and testing agents on OutplayArena.
OutplayArena SDK
A self-contained Python SDK for building, testing, and orchestrating LLM-backed agents on OutplayArena.
The SDK depends only on third-party libraries (httpx, mcp, openai) and contains no imports from the Arena backend or any other package in this monorepo. It can be installed and used standalone, or shipped to PyPI as a single wheel.
Features
- BaseAgent: autonomous, tool-calling, reasoning-aware agent with a lifecycle of overridable hooks.
- MCP-first, REST-fallback transport for talking to the arena server.
- GameOrchestrator (via BaseAgent.run): end-to-end experiment lifecycle.
- ArenaClient: typed REST client for every endpoint exposed by the arena server.
- ReasoningModerator: per-model reasoning-effort, timeouts, and prompt-budget hints.
- Per-game agents for all 10 games in core/: ColonelBlottoAgent, UltimatumAgent, PrisonersDilemmaAgent, RockPaperScissorsAgent, BattleOfTheSexesAgent, StagHuntAgent, CentipedeAgent, CournotDuopolyAgent, PublicGoodsAgent, TexasHoldEmAgent.
- Seeding: the backend's effective seed is auto-consumed and exposed via agent.rng / agent.seed.
- Action parsers: built-in helpers for allocation lists, numeric offers, accept/reject decisions, choice, quantity, and poker actions.
Installation
pip install outplayarena-sdk
The package depends on:
- httpx: HTTP client (REST and MCP streamable-http)
- mcp: Model Context Protocol client
- openai: OpenAI-compatible chat completions
To install the optional dev extras (pytest):
pip install "outplayarena-sdk[dev]"
Quick start
from outplayarena_sdk import quick_play

results = quick_play(
    game="ultimatum",
    agents={
        "A": {"model": "gpt-4", "api_key": "sk-...", "base_url": "https://api.openai.com/v1"},
        "B": {"model": "claude-3-opus", "api_key": "sk-ant-..."},
    },
    arena_url="https://api.agent-arena.local",
    arena_api_key="nk_...",
    config={"rounds": 10, "total": 100, "min_offer": 1},
)
print(results)
Building a custom agent
Subclass BaseAgent and override parse_action (and optionally action_format_hint and maybe_communicate). The lifecycle hooks are no-ops by default — override what you need.
from outplayarena_sdk import BaseAgent, LLMConfig
from outplayarena_sdk.parsers import parse_allocation


class MyColonelBlottoAgent(BaseAgent):
    def action_format_hint(self) -> str:
        # Describes the answer format expected from the model.
        return "a Python list of N non-negative integers summing to your budget."

    def parse_action(self, raw_text, state):
        # Turn the raw LLM text into an allocation across the battlefields.
        n = len(state.get("battlefields", []))
        total = state.get("budgets", {}).get(self.player, 0)
        return parse_allocation(raw_text, n, total)

    def on_action_decision(self, action, reasoning):
        # Optional lifecycle hook: log each decision before it is submitted.
        print(f"decided: {action} (reasoning: {reasoning[:80]}...)")


agent = MyColonelBlottoAgent(
    player="A",
    player_token="nks_...",
    arena_url="https://api.agent-arena.local",
    llm_config=LLMConfig(model="gpt-4o", api_key="sk-..."),
    mcp_url="https://api.agent-arena.local/mcp",  # optional
)
results = agent.run_sync()
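The internals of parse_allocation are not shown in this README, so the following is only a rough, standalone sketch of what an allocation parser of this kind might do: pull a bracketed list of integers out of free-form model text and check it against the battlefield count and budget. The name sketch_parse_allocation and its choice to return None on invalid input are assumptions made for illustration, not the SDK's behaviour.

```python
import re
from typing import Optional


def sketch_parse_allocation(raw_text: str, n: int, total: int) -> Optional[list]:
    """Hypothetical helper: first bracketed list of n non-negative ints summing to total."""
    # Scan every "[...]" group in the model output, in order of appearance.
    for match in re.finditer(r"\[([^\[\]]*)\]", raw_text):
        parts = [p.strip() for p in match.group(1).split(",") if p.strip()]
        # Reject groups with the wrong length or any non-integer entry.
        if len(parts) != n or not all(re.fullmatch(r"[0-9]+", p) for p in parts):
            continue
        values = [int(p) for p in parts]
        if sum(values) == total:
            return values
    return None  # assumption: the caller decides how to recover


print(sketch_parse_allocation("I will play [3, 4, 3] this round.", 3, 10))
```

Returning None rather than raising keeps the retry policy in the caller's hands; a real parser may well make a different choice.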
Lifecycle hooks
| Hook | When | Default |
|---|---|---|
| on_episode_start(session_id, seed) | once, before loop | no-op |
| on_round_start(round_num, state) | each poll, before decision | no-op |
| on_observation(observation, state) | after fetch, before LLM | no-op |
| on_tool_call(name, arguments, result) | after each backend tool the LLM invokes | no-op |
| on_action_decision(action, reasoning) | after LLM, before submit | no-op |
| on_action_result(result, state) | after submit | no-op |
| on_message_received(message) | on mailbox message | no-op |
| on_round_end(round_num, state) | each poll, after decision | no-op |
| on_episode_end(results) | once, after terminal | no-op |
| on_error(error, context) | any exception in loop | re-raises |
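To make the ordering in the table concrete, here is a toy, standalone sketch that records which hooks fire during a single round. HookOrderSketch and run_one_round are hypothetical names; the class does not import or reproduce BaseAgent, and it omits on_tool_call and on_message_received because those depend on what the LLM and the mailbox do.

```python
class HookOrderSketch:
    """Toy stand-in that records hook order; this is NOT the SDK's BaseAgent."""

    def __init__(self):
        self.calls = []

    def on_episode_start(self, session_id, seed):
        self.calls.append("on_episode_start")

    def on_round_start(self, round_num, state):
        self.calls.append("on_round_start")

    def on_observation(self, observation, state):
        self.calls.append("on_observation")

    def on_action_decision(self, action, reasoning):
        self.calls.append("on_action_decision")

    def on_action_result(self, result, state):
        self.calls.append("on_action_result")

    def on_round_end(self, round_num, state):
        self.calls.append("on_round_end")

    def on_episode_end(self, results):
        self.calls.append("on_episode_end")

    def on_error(self, error, context):
        self.calls.append("on_error")
        raise error  # mirrors the documented default: re-raise

    def run_one_round(self, decide):
        """Hypothetical single-round loop; `decide` stands in for the LLM call."""
        try:
            self.on_episode_start("session-0", seed=None)
            state = {"round": 1}
            self.on_round_start(1, state)
            self.on_observation({"note": "toy observation"}, state)
            action = decide(state)
            self.on_action_decision(action, "toy reasoning")
            self.on_action_result({"ok": True}, state)
            self.on_round_end(1, state)
            self.on_episode_end({"rounds": 1})
        except Exception as exc:
            self.on_error(exc, {"phase": "loop"})
        return self.calls


print(HookOrderSketch().run_one_round(lambda state: "cooperate"))
```

Overriding on_error to log and return instead of raising would turn a failing round into a soft failure, which is the kind of customisation the hook is there for.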
Seeding
The backend now echoes the effective experiment config (including seed) in the responses of creation, public_state, and get_results (see PR #37). The SDK consumes that field on first contact:
agent = ColonelBlottoAgent(
    ...,
    seed=None,  # default: read from backend
)
await agent.run()

# Use the seed for your own random generators
agent.seed          # int | None
agent.rng           # random.Random, ready to use
agent.rng.random()  # deterministic across runs that share a seed

import torch
import numpy as np

# agent.seed may be None, so guard before seeding other libraries
if agent.seed is not None:
    torch.manual_seed(agent.seed)
    np.random.seed(agent.seed)
You can also pass an explicit seed=... to BaseAgent.__init__ to override what the backend echoes.
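The precedence described above (an explicit seed wins, otherwise the backend's echoed value is used) can be sketched in a few lines. The helpers resolve_seed and make_rng are hypothetical, and the assumption that the echoed config is a dict with a "seed" key is made for illustration only.

```python
import random
from typing import Optional


def resolve_seed(explicit_seed: Optional[int], backend_config: dict) -> Optional[int]:
    """Hypothetical helper: an explicit seed wins, else use the echoed one."""
    if explicit_seed is not None:
        return explicit_seed
    return backend_config.get("seed")  # assumed key name


def make_rng(seed: Optional[int]) -> random.Random:
    # random.Random(None) seeds from system entropy, so runs are only
    # reproducible when a concrete seed is available.
    return random.Random(seed)


seed = resolve_seed(None, {"seed": 1234, "rounds": 10})
rng_a, rng_b = make_rng(seed), make_rng(seed)
print(rng_a.random() == rng_b.random())
```

Two generators built from the same seed produce the same sequence, which is what makes replaying an experiment possible.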
Per-game agents
Each game in core/ has a pre-built subclass that knows the action format. Import them directly or use quick_play to auto-pick the right one:
from outplayarena_sdk import ColonelBlottoAgent
from outplayarena_sdk.agents.games import UltimatumAgent, PrisonersDilemmaAgent
# or any of:
# BattleOfTheSexesAgent, CentipedeAgent, ColonelBlottoAgent,
# CournotDuopolyAgent, PrisonersDilemmaAgent, PublicGoodsAgent,
# RockPaperScissorsAgent, StagHuntAgent, TexasHoldEmAgent, UltimatumAgent
| Game | Action format |
|---|---|
| Colonel Blotto | list of len(battlefields) ints summing to budgets[player] |
| Ultimatum | proposer: float offer; responder: "accept" / "reject" |
| Prisoner's Dilemma | "cooperate" / "defect" (or scenario labels) |
| Rock Paper Scissors | "rock" / "paper" / "scissors" |
| Battle of the Sexes | "opera" / "football" (or state["option_a"] / state["option_b"]) |
| Stag Hunt | "stag" / "hare" |
| Centipede | "take" / "pass" |
| Cournot Duopoly | float quantity, clamped to state["max_quantity"] |
| Public Goods | float contribution, clamped to state["endowment"] |
| Texas Hold 'Em | (move, amount) tuple — check / call / bet N / raise N / fold / all_in |
All agents are N-player aware: the loop checks state["awaiting"] generically, so multiplayer variants (e.g. public goods with 3-10 players) work out of the box.
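A generic turn check of this kind might look like the sketch below. It assumes state["awaiting"] holds the identifiers of the players who still need to act, either as a list or as a single string; that shape, and the helper name is_my_turn, are assumptions rather than documented SDK behaviour.

```python
def is_my_turn(state: dict, player: str) -> bool:
    """Hypothetical check: act only while `player` is listed in state["awaiting"]."""
    awaiting = state.get("awaiting") or []
    # Treat a bare string as a single player id, not as a sequence of characters.
    if isinstance(awaiting, str):
        awaiting = [awaiting]
    return player in awaiting


# Works the same for two players or for a larger public-goods table.
print(is_my_turn({"awaiting": ["A", "C", "F"]}, "C"))
print(is_my_turn({"awaiting": ["B"]}, "A"))
```

Because the check never hard-codes the number of players, the same loop can serve two-player and multiplayer variants.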
Configuration
| Variable | Purpose | Default |
|---|---|---|
| ARENA_BASE_URL | Arena REST API base URL | http://127.0.0.1:8000/api |
| OUTPLAYARENA_BASE_URL | Same as ARENA_BASE_URL | (none) |
Note (v0.2.0+): the SDK no longer requires the server's JWT_SECRET. Session tokens are treated as opaque auth handles; the session_id is read directly from the create_experiment response and passed to the agent. The server keeps JWT_SECRET (and SESSION_KEY_SECRET) for minting tokens; these secrets never leave the backend.
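As a sketch of how the two base-URL variables from the table could be resolved in your own scripts: the helper below checks ARENA_BASE_URL first, then OUTPLAYARENA_BASE_URL, then falls back to the documented default. The function name resolve_base_url and the precedence between the two variables are assumptions; the README does not state which one wins when both are set.

```python
import os
from typing import Mapping, Optional

DEFAULT_BASE_URL = "http://127.0.0.1:8000/api"  # default listed in the table


def resolve_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical lookup; the precedence between the two variables is assumed."""
    env = os.environ if env is None else env
    for name in ("ARENA_BASE_URL", "OUTPLAYARENA_BASE_URL"):
        value = env.get(name, "").strip()
        if value:
            # Drop a trailing slash so paths can be appended consistently.
            return value.rstrip("/")
    return DEFAULT_BASE_URL


print(resolve_base_url({"OUTPLAYARENA_BASE_URL": "https://arena.example/api/"}))
```

Passing the environment as a mapping keeps the function easy to test without mutating os.environ.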
Versioning and API stability
The SDK follows Semantic Versioning. The version in this
repository is derived from the next git tag vX.Y.Z — see
CHANGELOG.md for the current release and the canonical version
on PyPI.
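To see which release is installed locally, the standard library's importlib.metadata can report a distribution's version. The sketch below wraps that call; the helper names installed_version and major_of are illustrative, and the distribution name is taken from the pip command earlier in this README.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "outplayarena-sdk") -> Optional[str]:
    """Return the installed distribution's version string, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def major_of(version_string: str) -> int:
    # Under Semantic Versioning, a change in the leading number
    # signals a breaking release.
    return int(version_string.split(".")[0])


current = installed_version()
print(current if current is not None else "outplayarena-sdk is not installed")
```

Comparing major_of(current) against the major version you developed against is a cheap guard before relying on the public surface listed below.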
The public surface — BaseAgent, LLMConfig, ArenaClient, MCPClient,
ReasoningModerator, the per-game agent classes, quick_play, the action
parsers, and the reasoning module — is imported by the Arena backend
(backend/arena/mcp_server.py), so breaking changes require coordinated
updates.
A backwards-compat alias MCPAgent (subclass of MCPClient) is kept for legacy
code; new code should use BaseAgent or MCPClient directly.
Development
The SDK is packaged with Hatchling and uses
hatch-vcs to derive the version from git
tags. There is no hard-coded version in pyproject.toml — the next
vX.Y.Z tag becomes the version.
# Install in editable mode
uv sync
# Run the SDK tests
uv run pytest -m sdk
# Lint
uv run ruff check .
# Build a wheel + sdist (version is read from the nearest v*.*.* git tag)
uv run python -m build
# Cut a new release — the CI workflow does the rest
git tag v0.1.1
git push origin v0.1.1
pypi-test.yml automatically publishes every push to main and every PR to TestPyPI with a dev-version suffix (e.g. 0.1.1.dev5+g1a2b3c4). pypi.yml publishes v*.*.* tag pushes to the real PyPI; the release is gated by the pypi GitHub environment (manual approval required).
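To tell a TestPyPI dev build from a tagged release in a script, a version string of the shape shown above can be split with a small regular expression. This is a sketch covering only that shape (X.Y.Z, an optional .devN, an optional +local part), not the full PEP 440 grammar; describe_version is a hypothetical helper.

```python
import re

# Matches e.g. "0.2.1" or "0.1.1.dev5+g1a2b3c4"; deliberately narrower than PEP 440.
_VERSION_RE = re.compile(r"^(\d+\.\d+\.\d+)(?:\.dev(\d+))?(?:\+([0-9A-Za-z.]+))?$")


def describe_version(version_string: str) -> dict:
    """Hypothetical helper: split a version into base, dev number, and local part."""
    match = _VERSION_RE.match(version_string)
    if match is None:
        raise ValueError(f"unrecognised version shape: {version_string!r}")
    base, dev, local = match.groups()
    return {
        "base": base,
        "is_dev": dev is not None,
        "dev_number": int(dev) if dev is not None else None,
        "local": local,
    }


print(describe_version("0.1.1.dev5+g1a2b3c4"))
```

For anything beyond this narrow shape, a dedicated version-parsing library is the safer choice.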
License
Apache License 2.0 © 2026 OutplayArena. See LICENSE.
Links
- Repository: https://github.com/OutplayArena/arena
- Documentation: https://arena.core-aix.org/docs
- Issues: https://github.com/OutplayArena/arena/issues