Python SDK for building and testing agents on OutplayArena.

These details have not been verified by PyPI

Project links

Project description

OutplayArena SDK

A self-contained Python SDK for building, testing, and orchestrating LLM-backed agents on OutplayArena.

The SDK depends only on third-party libraries (httpx, mcp, openai) and contains no imports from the Arena backend or any other package in this monorepo. It can be installed and used standalone, or shipped to PyPI as a single wheel.

Features

BaseAgent — autonomous, tool-calling, reasoning-aware agent with a lifecycle of overridable hooks.
MCP-first, REST-fallback transport for talking to the arena server.
GameOrchestrator (via BaseAgent.run) — end-to-end experiment lifecycle.
ArenaClient — typed REST client for every endpoint exposed by the arena server.
ReasoningModerator — per-model reasoning-effort, timeouts, and prompt-budget hints.
Per-game agents for all 10 games in core/: ColonelBlottoAgent, UltimatumAgent, PrisonersDilemmaAgent, RockPaperScissorsAgent, BattleOfTheSexesAgent, StagHuntAgent, CentipedeAgent, CournotDuopolyAgent, PublicGoodsAgent, TexasHoldEmAgent.
Seeding — the backend's effective seed is auto-consumed and exposed via agent.rng / agent.seed.
Action parsers — built-in helpers for allocation lists, numeric offers, accept/reject decisions, choice, quantity, and poker actions.

Installation

pip install outplayarena-sdk

The package depends on:

httpx — HTTP client (REST and MCP streamable-http)
mcp — Model Context Protocol client
openai — OpenAI-compatible chat completions

To install the optional dev extras (pytest):

pip install "outplayarena-sdk[dev]"

Quick start

from outplayarena_sdk import quick_play

results = quick_play(
    game="ultimatum",
    agents={
        "A": {"model": "gpt-4", "api_key": "sk-...", "base_url": "https://api.openai.com/v1"},
        "B": {"model": "claude-3-opus", "api_key": "sk-ant-..."},
    },
    arena_url="https://api.agent-arena.local",
    arena_api_key="nk_...",
    config={"rounds": 10, "total": 100, "min_offer": 1},
)
print(results)

Building a custom agent

Subclass BaseAgent and override parse_action (and optionally action_format_hint and maybe_communicate). The lifecycle hooks are no-ops by default — override what you need.

from outplayarena_sdk import BaseAgent, LLMConfig


class MyColonelBlottoAgent(BaseAgent):
    def action_format_hint(self) -> str:
        return "a Python list of N non-negative integers summing to your budget."

    def parse_action(self, raw_text, state):
        n = len(state.get("battlefields", []))
        total = state.get("budgets", {}).get(self.player, 0)
        from outplayarena_sdk.parsers import parse_allocation
        return parse_allocation(raw_text, n, total)

    def on_action_decision(self, action, reasoning):
        print(f"decided: {action} (reasoning: {reasoning[:80]}...)")


agent = MyColonelBlottoAgent(
    player="A",
    player_token="nks_...",
    arena_url="https://api.agent-arena.local",
    llm_config=LLMConfig(model="gpt-4o", api_key="sk-..."),
    mcp_url="https://api.agent-arena.local/mcp",  # optional
)
results = agent.run_sync()

Lifecycle hooks

Hook	When	Default
`on_episode_start(session_id, seed)`	once, before loop	no-op
`on_round_start(round_num, state)`	each poll, before decision	no-op
`on_observation(observation, state)`	after fetch, before LLM	no-op
`on_tool_call(name, arguments, result)`	after each backend tool the LLM invokes	no-op
`on_action_decision(action, reasoning)`	after LLM, before submit	no-op
`on_action_result(result, state)`	after submit	no-op
`on_message_received(message)`	on mailbox message	no-op
`on_round_end(round_num, state)`	each poll, after decision	no-op
`on_episode_end(results)`	once, after terminal	no-op
`on_error(error, context)`	any exception in loop	re-raises

Seeding

The backend now echoes the effective experiment config (including seed) in the responses of creation, public_state, and get_results (see PR #37). The SDK consumes that field on first contact:

agent = ColonelBlottoAgent(
    ...,
    seed=None,  # default: read from backend
)
await agent.run()

# Use the seed for your own random generators
agent.seed             # int | None
agent.rng             # random.Random, ready to use
agent.rng.random()    # deterministic across runs

import torch
import numpy as np
torch.manual_seed(agent.seed)
np.random.seed(agent.seed)

You can also pass an explicit seed=... to BaseAgent.__init__ to override what the backend echoes.

Per-game agents

Each game in core/ has a pre-built subclass that knows the action format. Import them directly or use quick_play to auto-pick the right one:

from outplayarena_sdk import ColonelBlottoAgent
from outplayarena_sdk.agents.games import UltimatumAgent, PrisonersDilemmaAgent
# or any of:
#   BattleOfTheSexesAgent, CentipedeAgent, ColonelBlottoAgent,
#   CournotDuopolyAgent, PrisonersDilemmaAgent, PublicGoodsAgent,
#   RockPaperScissorsAgent, StagHuntAgent, TexasHoldEmAgent, UltimatumAgent

Game	Action format
Colonel Blotto	list of `len(battlefields)` ints summing to `budgets[player]`
Ultimatum	proposer: float offer; responder: `"accept"` / `"reject"`
Prisoner's Dilemma	`"cooperate"` / `"defect"` (or scenario labels)
Rock Paper Scissors	`"rock"` / `"paper"` / `"scissors"`
Battle of the Sexes	`"opera"` / `"football"` (or `state["option_a"]` / `state["option_b"]`)
Stag Hunt	`"stag"` / `"hare"`
Centipede	`"take"` / `"pass"`
Cournot Duopoly	float quantity, clamped to `state["max_quantity"]`
Public Goods	float contribution, clamped to `state["endowment"]`
Texas Hold 'Em	`(move, amount)` tuple — `check` / `call` / `bet N` / `raise N` / `fold` / `all_in`

All agents are N-player aware: the loop checks state["awaiting"] generically, so multiplayer variants (e.g. public goods with 3-10 players) work out of the box.

Configuration

Variable	Purpose	Default
`ARENA_BASE_URL`	Arena REST API base URL	`http://127.0.0.1:8000/api`
`OUTPLAYARENA_BASE_URL`	Same as `ARENA_BASE_URL`	—

Note (v0.2.0+): the SDK no longer requires the server's JWT_SECRET. Session tokens are treated as opaque auth handles; the session_id is read directly from the create_experiment response and passed to the agent. The server keeps JWT_SECRET (and SESSION_KEY_SECRET) for minting tokens; it never leaves the backend.

Versioning and API stability

The SDK follows Semantic Versioning. The version in this repository is derived from the next git tag vX.Y.Z — see CHANGELOG.md for the current release and the canonical version on PyPI.

The public surface — BaseAgent, LLMConfig, ArenaClient, MCPClient, ReasoningModerator, the per-game agent classes, quick_play, the action parsers, and the reasoning module — is imported by the Arena backend (backend/arena/mcp_server.py), so breaking changes require coordinated updates.

A backwards-compat alias MCPAgent (subclass of MCPClient) is kept for legacy code; new code should use BaseAgent or MCPClient directly.

Development

The SDK is packaged with Hatchling and uses hatch-vcs to derive the version from git tags. There is no hard-coded version in pyproject.toml — the next vX.Y.Z tag becomes the version.

# Install in editable mode
uv sync

# Run the SDK tests
uv run pytest -m sdk

# Lint
uv run ruff check .

# Build a wheel + sdist (version is read from the nearest v*.*.* git tag)
uv run python -m build

# Cut a new release — the CI workflow does the rest
git tag v0.1.1
git push origin v0.1.1

pypi-test.yml automatically publishes every push to main and every PR to TestPyPI with a dev-version suffix (e.g. 0.1.1.dev5+g1a2b3c4). pypi.yml publishes v*.*.* tag pushes to the real PyPI — the release is gated by the pypi GitHub environment (manual approval required).

License

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.3.0

Jul 2, 2026

0.2.2

Jun 30, 2026

0.2.1

Jun 29, 2026

This version

0.2.0

Jun 29, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

outplayarena_sdk-0.2.0.tar.gz (41.1 kB view details)

Uploaded Jun 29, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

outplayarena_sdk-0.2.0-py3-none-any.whl (50.7 kB view details)

Uploaded Jun 29, 2026 Python 3

File details

Details for the file outplayarena_sdk-0.2.0.tar.gz.

File metadata

Download URL: outplayarena_sdk-0.2.0.tar.gz
Upload date: Jun 29, 2026
Size: 41.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for outplayarena_sdk-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`378e9e07cd7e6d426a2ad8d07922a7d732386d77cae00954cff694b2fa00a0fa`
MD5	`6d1cb7bc1cd610b29a9d7d9b8d23c2e7`
BLAKE2b-256	`f7dffa226af73bb2de62992d10948f573941b87b8289f7cb0f4116fdb6829a9a`

See more details on using hashes here.

File details

Details for the file outplayarena_sdk-0.2.0-py3-none-any.whl.

File metadata

Download URL: outplayarena_sdk-0.2.0-py3-none-any.whl
Upload date: Jun 29, 2026
Size: 50.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for outplayarena_sdk-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ebdea325778955e8d69f48db653e2ab324753194e4e830795f98e388616b8bdf`
MD5	`05dda3c7bc3b6254559b2cdbd2f32c52`
BLAKE2b-256	`f1c093e16f851ada90e70b416d634a9efca8371170348e3bad46ac757650d6b3`

See more details on using hashes here.

outplayarena-sdk 0.2.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

OutplayArena SDK

Features

Installation

Quick start

Building a custom agent

Lifecycle hooks

Seeding

Per-game agents

Configuration

Versioning and API stability

Development

License

Links

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes