Skip to main content

Python SDK for building and testing agents on OutplayArena.

Project description

OutplayArena SDK

License: Apache 2.0 Python 3.12+ PyPI version TestPyPI CI

A self-contained Python SDK for building, testing, and orchestrating LLM-backed agents on OutplayArena.

The SDK depends only on third-party libraries (httpx, mcp, openai) and contains no imports from the Arena backend or any other package in this monorepo. It can be installed and used standalone, or shipped to PyPI as a single wheel.

Features

  • BaseAgent — autonomous, tool-calling, reasoning-aware agent with a lifecycle of overridable hooks.
  • MCP-first, REST-fallback transport for talking to the arena server.
  • GameOrchestrator (via BaseAgent.run) — end-to-end experiment lifecycle.
  • ArenaClient — typed REST client for every endpoint exposed by the arena server.
  • ReasoningModerator — per-model reasoning-effort, timeouts, and prompt-budget hints.
  • Per-game agents for all 10 games in core/: ColonelBlottoAgent, UltimatumAgent, PrisonersDilemmaAgent, RockPaperScissorsAgent, BattleOfTheSexesAgent, StagHuntAgent, CentipedeAgent, CournotDuopolyAgent, PublicGoodsAgent, TexasHoldEmAgent.
  • Seeding — the backend's effective seed is auto-consumed and exposed via agent.rng / agent.seed.
  • Action parsers — built-in helpers for allocation lists, numeric offers, accept/reject decisions, choice, quantity, and poker actions.

Installation

pip install outplayarena-sdk

The package depends on:

  • httpx — HTTP client (REST and MCP streamable-http)
  • mcp — Model Context Protocol client
  • openai — OpenAI-compatible chat completions

To install the optional dev extras (pytest):

pip install "outplayarena-sdk[dev]"

Quick start

from outplayarena_sdk import quick_play

results = quick_play(
    game="ultimatum",
    agents={
        "A": {"model": "gpt-4", "api_key": "sk-...", "base_url": "https://api.openai.com/v1"},
        "B": {"model": "claude-3-opus", "api_key": "sk-ant-..."},
    },
    arena_url="https://api.agent-arena.local",
    arena_api_key="nk_...",
    config={"rounds": 10, "total": 100, "min_offer": 1},
)
print(results)

Building a custom agent

Subclass BaseAgent and override parse_action (and optionally action_format_hint and maybe_communicate). The lifecycle hooks are no-ops by default — override what you need.

from outplayarena_sdk import BaseAgent, LLMConfig


class MyColonelBlottoAgent(BaseAgent):
    def action_format_hint(self) -> str:
        return "a Python list of N non-negative integers summing to your budget."

    def parse_action(self, raw_text, state):
        n = len(state.get("battlefields", []))
        total = state.get("budgets", {}).get(self.player, 0)
        from outplayarena_sdk.parsers import parse_allocation
        return parse_allocation(raw_text, n, total)

    def on_action_decision(self, action, reasoning):
        print(f"decided: {action} (reasoning: {reasoning[:80]}...)")


agent = MyColonelBlottoAgent(
    player="A",
    player_token="nks_...",
    arena_url="https://api.agent-arena.local",
    llm_config=LLMConfig(model="gpt-4o", api_key="sk-..."),
    mcp_url="https://api.agent-arena.local/mcp",  # optional
)
results = agent.run_sync()

Lifecycle hooks

Hook When Default
on_episode_start(session_id, seed) once, before loop no-op
on_round_start(round_num, state) each poll, before decision no-op
on_observation(observation, state) after fetch, before LLM no-op
on_tool_call(name, arguments, result) after each backend tool the LLM invokes no-op
on_action_decision(action, reasoning) after LLM, before submit no-op
on_action_result(result, state) after submit no-op
on_message_received(message) on mailbox message no-op
on_round_end(round_num, state) each poll, after decision no-op
on_episode_end(results) once, after terminal no-op
on_error(error, context) any exception in loop re-raises

Seeding

The backend now echoes the effective experiment config (including seed) in the responses of creation, public_state, and get_results (see PR #37). The SDK consumes that field on first contact:

agent = ColonelBlottoAgent(
    ...,
    seed=None,  # default: read from backend
)
await agent.run()

# Use the seed for your own random generators
agent.seed             # int | None
agent.rng             # random.Random, ready to use
agent.rng.random()    # deterministic across runs

import torch
import numpy as np
torch.manual_seed(agent.seed)
np.random.seed(agent.seed)

You can also pass an explicit seed=... to BaseAgent.__init__ to override what the backend echoes.

Per-game agents

Each game in core/ has a pre-built subclass that knows the action format. Import them directly or use quick_play to auto-pick the right one:

from outplayarena_sdk import ColonelBlottoAgent
from outplayarena_sdk.agents.games import UltimatumAgent, PrisonersDilemmaAgent
# or any of:
#   BattleOfTheSexesAgent, CentipedeAgent, ColonelBlottoAgent,
#   CournotDuopolyAgent, PrisonersDilemmaAgent, PublicGoodsAgent,
#   RockPaperScissorsAgent, StagHuntAgent, TexasHoldEmAgent, UltimatumAgent
Game Action format
Colonel Blotto list of len(battlefields) ints summing to budgets[player]
Ultimatum proposer: float offer; responder: "accept" / "reject"
Prisoner's Dilemma "cooperate" / "defect" (or scenario labels)
Rock Paper Scissors "rock" / "paper" / "scissors"
Battle of the Sexes "opera" / "football" (or state["option_a"] / state["option_b"])
Stag Hunt "stag" / "hare"
Centipede "take" / "pass"
Cournot Duopoly float quantity, clamped to state["max_quantity"]
Public Goods float contribution, clamped to state["endowment"]
Texas Hold 'Em (move, amount) tuple — check / call / bet N / raise N / fold / all_in

All agents are N-player aware: the loop checks state["awaiting"] generically, so multiplayer variants (e.g. public goods with 3-10 players) work out of the box.

Configuration

Variable Purpose Default
ARENA_BASE_URL Arena REST API base URL http://127.0.0.1:8000/api
OUTPLAYARENA_BASE_URL Same as ARENA_BASE_URL

Note (v0.2.0+): the SDK no longer requires the server's JWT_SECRET. Session tokens are treated as opaque auth handles; the session_id is read directly from the create_experiment response and passed to the agent. The server keeps JWT_SECRET (and SESSION_KEY_SECRET) for minting tokens; it never leaves the backend.

Versioning and API stability

The SDK follows Semantic Versioning. The version in this repository is derived from the next git tag vX.Y.Z — see CHANGELOG.md for the current release and the canonical version on PyPI.

The public surface — BaseAgent, LLMConfig, ArenaClient, MCPClient, ReasoningModerator, the per-game agent classes, quick_play, the action parsers, and the reasoning module — is imported by the Arena backend (backend/arena/mcp_server.py), so breaking changes require coordinated updates.

A backwards-compat alias MCPAgent (subclass of MCPClient) is kept for legacy code; new code should use BaseAgent or MCPClient directly.

Development

The SDK is packaged with Hatchling and uses hatch-vcs to derive the version from git tags. There is no hard-coded version in pyproject.toml — the next vX.Y.Z tag becomes the version.

# Install in editable mode
uv sync

# Run the SDK tests
uv run pytest -m sdk

# Lint
uv run ruff check .

# Build a wheel + sdist (version is read from the nearest v*.*.* git tag)
uv run python -m build

# Cut a new release — the CI workflow does the rest
git tag v0.1.1
git push origin v0.1.1

pypi-test.yml automatically publishes every push to main and every PR to TestPyPI with a dev-version suffix (e.g. 0.1.1.dev5+g1a2b3c4). pypi.yml publishes v*.*.* tag pushes to the real PyPI — the release is gated by the pypi GitHub environment (manual approval required).

License

Apache License 2.0 © 2026 OutplayArena. See LICENSE.

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

outplayarena_sdk-0.2.1.tar.gz (41.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

outplayarena_sdk-0.2.1-py3-none-any.whl (50.7 kB view details)

Uploaded Python 3

File details

Details for the file outplayarena_sdk-0.2.1.tar.gz.

File metadata

  • Download URL: outplayarena_sdk-0.2.1.tar.gz
  • Upload date:
  • Size: 41.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for outplayarena_sdk-0.2.1.tar.gz
Algorithm Hash digest
SHA256 3c89d67352f3f303fd570725a3a632ee9afdffffab89d7839490dd88c8451793
MD5 27f933a184ca91f344bfaeabc2084875
BLAKE2b-256 7feeb9935b7f6664472aa89fd2d8b00b0fe6071def074c1556b6e558d1ebbf91

See more details on using hashes here.

File details

Details for the file outplayarena_sdk-0.2.1-py3-none-any.whl.

File metadata

File hashes

Hashes for outplayarena_sdk-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ad3588e479db86949fd3ccab8f42924947f908d9e5b9c4305d367d7cd3af4a78
MD5 5dd1b1c32d3af982a88a4f4ff842f3fc
BLAKE2b-256 a48e37ae6f1725bc095862292c43e595369ea51252f6d6bd4f370781e60486cc

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page