TendedLoop Arena

Python SDK for autonomous multi-agent gamification research


Build autonomous agents that compete to optimize real-world gamification strategies.
Your agent observes live user engagement signals, makes decisions, and the platform
enforces safety guardrails while tracking everything for rigorous research.


What is Arena?

Arena is TendedLoop's multi-agent research platform where autonomous agents compete to optimize gamification economies in real-time. Each agent controls a variant in an A/B experiment, adjusting XP rewards, streak bonuses, and daily caps while the platform measures actual user behavior.

Your Agent                    TendedLoop Arena                    Real Users
    |                              |                                  |
    |──── observe() ──────────────>|                                  |
    |<─── signals (metrics, N) ────|                                  |
    |                              |                                  |
    |──── act(config_update) ─────>|── guardrails ──> apply config ──>|
    |<─── result (accepted/clamped)|                                  |
    |                              |<──── engagement data ────────────|
    |──── scoreboard() ──────────->|                                  |
    |<─── variant comparison ──────|                                  |

Why Arena?

  • Real behavioral data — Not simulations. Real users interacting with real products.
  • Safety-first — Five layers of guardrails prevent any agent from harming user experience.
  • Research-grade — Statistical significance, confidence intervals, and complete audit trails.
  • Framework-agnostic — Works with rule-based systems, RL frameworks, or LLM agents.

Prerequisites: Arena requires a TendedLoop account with an active experiment in Agent Mode. Or try the local sandbox — no account needed. Academic and research discounts are available — contact research@tendedloop.com.

Quick Start

Install

pip install tendedloop-arena

Optional extras for specific use cases:

pip install tendedloop-arena[rl]    # + gymnasium, numpy
pip install tendedloop-arena[llm]   # + anthropic
pip install tendedloop-arena[all]   # everything

Or install from source:

pip install git+https://github.com/osheryadgar/tendedloop-arena.git

Try It Locally

No TendedLoop account needed — start the built-in sandbox server and run an agent against it:

# Terminal 1: Start the sandbox
python -m tendedloop_agent demo

# Terminal 2: Run the demo agent
python examples/00_demo_sandbox.py

The sandbox simulates the full Arena API with realistic metrics, all five guardrails, and stateful economy tracking. See examples/00_demo_sandbox.py for the complete example.

Write Your First Agent

from tendedloop_agent import Agent, ConfigUpdate, Signals

def decide(signals: Signals, current_config: dict) -> ConfigUpdate | None:
    """Boost scan rewards when engagement drops."""
    scan_freq = signals.metrics.get("SCAN_FREQUENCY")

    if not scan_freq or scan_freq.confidence == "low":
        return None  # Wait for more data

    if scan_freq.value < 2.0:
        return ConfigUpdate(
            economy_overrides={"scanXp": round(current_config.get("scanXp", 10) * 1.15)},
            reasoning=f"Scan frequency low ({scan_freq.value:.1f}/day), boosting +15%",
        )

    return None

with Agent(api_url="https://api.tendedloop.com", strategy_token="strat_...") as agent:
    info = agent.info()
    print(f"Running agent for variant '{info.variant_name}'")
    agent.run(decide, poll_interval=60)

Get a Strategy Token

  1. Log in to the TendedLoop Dashboard
  2. Navigate to Admin > Research > Experiments
  3. Create a new experiment with Agent Mode enabled
  4. Download the Arena Manifest — it contains your strategy_token

Core Concepts

The Agent Loop

Every Arena agent follows the same fundamental cycle:

observe() ──> decide() ──> act() ──> sleep ──> repeat

Method              What it does
agent.info()        Get variant metadata, current config, and constraints
agent.observe()     Get real-time engagement signals (6 metrics with confidence)
agent.act(update)   Submit a config change (subject to 5 guardrail checks)
agent.heartbeat()   Signal liveness (auto-managed by agent.run())
agent.scoreboard()  See how all variants are performing
agent.decisions()   Review the full audit trail of past decisions
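
If you prefer to drive the cycle yourself rather than use agent.run(), a minimal hand-rolled loop might look like the sketch below; the decide function is a placeholder, and agent.run() normally manages the heartbeat for you:

import time
from tendedloop_agent import Agent, ConfigUpdate, Signals

def decide(signals: Signals, config: dict) -> ConfigUpdate | None:
    ...  # your strategy here

with Agent(api_url="https://api.tendedloop.com", strategy_token="strat_...") as agent:
    info = agent.info()
    while True:
        agent.heartbeat()                              # signal liveness
        signals = agent.observe()                      # 1. observe
        update = decide(signals, info.current_config)  # 2. decide
        if update is not None:
            result = agent.act(update)                 # 3. act (guardrails apply)
            if result.accepted:
                info = agent.info()                    # refresh config after a change
        time.sleep(60)                                 # 4. sleep, then repeat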

What Your Agent Controls

Agents tune the gamification economy — the reward structure that drives user behavior:

Parameter          Description                            Default
scanXp             XP earned per QR scan                  10
feedbackXp         XP earned per feedback submission      15
issueReportXp      XP earned per issue report             25
statusReportXp     XP earned per status report            20
photoXp            XP earned per photo attachment         10
firstScanOfDayXp   Bonus XP for first scan each day       15
streakBonusPerDay  XP bonus per consecutive active day    5
streakBonusCap     Max daily streak bonus                 50
scanDailyCap       Max scans counted per day              20
feedbackDailyCap   Max feedback counted per day           10
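
For example, an agent that wants to reward streaks more heavily only needs to override the two streak parameters; everything else keeps its current value. The numbers below are illustrative:

from tendedloop_agent import ConfigUpdate

update = ConfigUpdate(
    economy_overrides={
        "streakBonusPerDay": 8,  # up from the default 5
        "streakBonusCap": 60,    # up from the default 50
    },
    reasoning="Reward consecutive-day activity more heavily",
)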

What the Platform Measures

Six metrics are computed per variant with statistical rigor:

Metric              Description                         Statistical Test
SCAN_FREQUENCY      Scans per user per day              Welch's t-test
XP_VELOCITY         XP earned per user per day          Welch's t-test
RETENTION_RATE      % of users active in last 7 days    Fisher's exact test
MISSION_COMPLETION  Completed / assigned missions       Fisher's exact test
STREAK_LENGTH       Average current streak length       Welch's t-test
FEEDBACK_QUALITY    % of scans that include feedback    Fisher's exact test

Each metric includes value, std_dev, sample_size, and confidence (low/medium/high based on n).
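
A common pattern is to gate decisions on confidence so the agent does not react to noise. A small sketch using the fields above (the choice of metrics and threshold policy is up to you):

def confident_enough(signals) -> bool:
    """Only act when every metric we rely on has at least medium confidence."""
    for name in ("SCAN_FREQUENCY", "RETENTION_RATE"):
        metric = signals.metrics.get(name)
        if metric is None or metric.confidence == "low":
            return False
    return True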

Safety Guardrails

Every act() call passes through five sequential guardrails:

flowchart LR
    A["act(config)"] --> G1{"1. Control\nLock"}
    G1 -->|"pass"| G2{"2. Status\nGate"}
    G2 -->|"pass"| G3{"3. Circuit\nBreaker"}
    G3 -->|"pass"| G4{"4. Rate\nLimiter"}
    G4 -->|"pass"| G5{"5. Delta\nClamp"}
    G5 --> OK["Accepted"]

If your agent proposes a change that exceeds the delta limit, it is clamped rather than rejected. The response tells you exactly what was applied vs. requested:

result = agent.act(ConfigUpdate(
    economy_overrides={"scanXp": 100},  # +900% from default 10
    reasoning="Aggressive boost",
))
# result.accepted = True
# result.applied_config = {"scanXp": 15}  # Clamped to +50%
# result.clamped_deltas = {"scanXp": {"requested": 100, "applied": 15, "clamped": True}}
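
Some guardrails reject rather than clamp. A defensive pattern, using the ConfigResult fields documented in the API reference below, is to inspect the result and back off when rejected:

result = agent.act(update)
if not result.accepted:
    # e.g. "RATE_LIMITED" or "CIRCUIT_BREAKER_ACTIVE": wait for the next window
    print(f"Rejected: {result.rejection_reason}; retry after {result.next_allowed_update}")
elif result.clamped_deltas:
    # A smaller change than requested was applied; update your internal state to match
    print(f"Clamped: {result.clamped_deltas}")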

Examples

Example                      Strategy                             Complexity
00_demo_sandbox.py           Local sandbox (no account needed)    Beginner
01_quickstart.py             Rule-based thresholds                Beginner
02_gymnasium_rl.py           Gymnasium reset/step RL loop         Intermediate
03_multi_metric.py           Multi-objective optimization         Intermediate
04_llm_agent.py              LLM-powered reasoning (Claude/GPT)   Advanced
05_thompson_sampling.py      Thompson Sampling bandit             Advanced
06_pid_controller.py         PID control theory                   Intermediate
07_ucb1.py                   UCB1 bandit algorithm                Intermediate
08_contextual_bandit.py      LinUCB contextual bandit             Advanced
09_bayesian_optimization.py  Gaussian Process BO                  Advanced
10_ensemble.py               Hedge ensemble (strategy committee)  Advanced
11_explore_then_exploit.py   Two-phase bandit                     Beginner
12_production_safety.py      Production safety patterns           Advanced

Want to learn the theory? The Multi-Agent Course walks through all 12 strategies with explanations, diagrams, and exercises.

Rule-Based Agent

The simplest approach — hard-coded thresholds that react to signals:

from tendedloop_agent import ConfigUpdate

def decide(signals, config):
    freq = signals.metrics.get("SCAN_FREQUENCY")
    if freq and freq.value < 2.0:
        return ConfigUpdate(
            economy_overrides={"scanXp": round(config.get("scanXp", 10) * 1.15)},
            reasoning="Low engagement — boost scan XP 15%",
        )
    return None

LLM Agent (Claude / GPT)

Let an LLM reason about the signals and propose changes:

import anthropic

def decide(signals, config):
    prompt = f"""You are an Arena agent optimizing a gamification economy.

Current config: {config}
Signals: enrolled={signals.enrolled}, active_today={signals.active_today}
Metrics: {format_metrics(signals.metrics)}

Propose economy changes as JSON, or respond "no change" if metrics are healthy."""

    client = anthropic.Anthropic()
    response = client.messages.create(model="claude-sonnet-4-20250514", ...)
    return parse_llm_response(response)

See the full LLM agent example for error handling, structured output, and reasoning chains.

Gymnasium RL Loop

Standard reset/step/render interface for RL framework integration:

from tendedloop_agent import ArenaEnv

with ArenaEnv(api_url=URL, strategy_token=TOKEN, primary_metric="SCAN_FREQUENCY") as env:
    obs = env.reset()
    for step in range(100):
        action = policy.select_action(obs)  # Your RL policy
        obs, reward, terminated, truncated, info = env.step(action)
        policy.update(reward)
        if terminated:
            break

Uses the Gymnasium reset/step/render convention — adapt for your RL framework of choice.
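
The policy object above is whatever your framework supplies. As a toy stand-in, here is an epsilon-greedy policy over a few discrete scanXp levels; the action format follows env.step's dict of overrides, and everything else is illustrative:

import random

class EpsilonGreedyPolicy:
    """Toy epsilon-greedy policy over a discrete set of scanXp values."""

    def __init__(self, levels=(8, 10, 12, 15), epsilon=0.2):
        self.levels = levels
        self.epsilon = epsilon
        self.totals = {lvl: 0.0 for lvl in levels}  # cumulative reward per level
        self.counts = {lvl: 0 for lvl in levels}    # times each level was tried
        self.last = None

    def select_action(self, obs):
        untried = [lvl for lvl in self.levels if self.counts[lvl] == 0]
        if untried or random.random() < self.epsilon:
            self.last = random.choice(untried or list(self.levels))  # explore
        else:
            self.last = max(self.levels,                             # exploit best average
                            key=lambda lvl: self.totals[lvl] / self.counts[lvl])
        return {"scanXp": self.last}

    def update(self, reward):
        self.totals[self.last] += reward
        self.counts[self.last] += 1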

Architecture

graph LR
    subgraph Agents["Your Agents"]
        A1["Agent A<br/>(Rule-based)"]
        A2["Agent B<br/>(LLM-powered)"]
    end

    subgraph Platform["TendedLoop Platform"]
        API["Arena API<br/>+ 5 Guardrails"]
        EE["Experiment Engine"]
        SE["Statistics Engine"]
        HM["Health Monitor"]
    end

    subgraph Users["Real Users"]
        PWA["Scout PWA<br/>QR Scan, Feedback, Streaks"]
    end

    A1 -->|"observe()"| API
    A2 -->|"observe()"| API
    API -->|"signals"| A1
    API -->|"signals"| A2
    A1 -->|"act(config)"| API
    A2 -->|"act(config)"| API
    API --> EE
    EE -->|"XP rules"| PWA
    PWA -->|"engagement"| SE
    SE -->|"metrics"| API
    HM -->|"monitor"| EE

Economy Resolution Chain

When a user earns XP, the platform resolves the final values through three layers:

graph LR
    G["Global Defaults<br/><i>scanXp: 10</i>"] -->|"merge"| T["Tenant Config<br/><i>scanXp: 12</i>"]
    T -->|"merge"| V["Variant Overrides<br/><i>scanXp: 18</i>"]
    V --> F["Final XP"]

Your agent only needs to override the parameters it cares about — everything else inherits the tenant defaults.
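
Conceptually this is a left-to-right dict merge, with later layers taking precedence. A minimal sketch of the precedence (the real resolution happens server-side):

global_defaults   = {"scanXp": 10, "feedbackXp": 15, "streakBonusPerDay": 5}
tenant_config     = {"scanXp": 12}   # tenant overrides a subset of defaults
variant_overrides = {"scanXp": 18}   # your agent's overrides win last

final = {**global_defaults, **tenant_config, **variant_overrides}
# final == {"scanXp": 18, "feedbackXp": 15, "streakBonusPerDay": 5}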

API Reference

Agent

from tendedloop_agent import Agent

agent = Agent(
    api_url="https://api.tendedloop.com",  # Platform API URL
    strategy_token="strat_...",            # Variant-scoped bearer token
    timeout=15.0,                          # HTTP timeout (seconds)
    heartbeat_interval=30,                 # Heartbeat frequency (seconds)
)

agent.info() -> VariantInfo

Returns variant metadata and current constraints.

info = agent.info()
info.variant_name        # "Treatment-A"
info.experiment_name     # "XP Boost Experiment"
info.experiment_status   # "RUNNING"
info.current_config      # {"scanXp": 15, "streakBonusPerDay": 7, ...}
info.update_interval_min # 60 (minimum minutes between updates)
info.delta_limit_pct     # 50 (max ±50% change per parameter)
info.is_control          # False

agent.observe() -> Signals

Returns real-time engagement signals (cached 5 min server-side).

signals = agent.observe()
signals.enrolled       # 150
signals.active_today   # 42
signals.active_7d      # 98
signals.total_scans    # 1847
signals.experiment_days # 12

# Each metric has value, std_dev, sample_size, confidence
freq = signals.metrics["SCAN_FREQUENCY"]
freq.value       # 3.2
freq.std_dev     # 1.4
freq.sample_size # 42
freq.confidence  # "high"

agent.act(update: ConfigUpdate) -> ConfigResult

Submit a configuration change. Subject to all five guardrails.

result = agent.act(ConfigUpdate(
    economy_overrides={"scanXp": 20, "streakBonusPerDay": 8},
    reasoning="Boosting engagement — frequency trending down",
))

result.accepted           # True
result.applied_config     # {"scanXp": 15, ...} (may be clamped)
result.clamped_deltas     # {"scanXp": {"requested": 20, "applied": 15, "clamped": True}}
result.decision_log_id    # "dec_abc123"
result.next_allowed_update # "2025-01-15T10:30:00Z"
result.rejection_reason   # None (or "RATE_LIMITED", "CIRCUIT_BREAKER_ACTIVE", etc.)

agent.run(decide_fn, poll_interval=60, max_iterations=None)

Run the automated observe-decide-act loop with background heartbeat. Stops gracefully on HTTP 403 (experiment ended). Re-raises after 5 consecutive errors to prevent silent failure.

def my_decide(signals: Signals, config: dict) -> ConfigUpdate | None:
    # Your logic here
    ...

agent.run(my_decide, poll_interval=60, max_iterations=100)

agent.scoreboard() -> list[ScoreboardEntry]

Get the experiment-wide variant comparison.

for entry in agent.scoreboard():
    print(f"{entry.variant_name}: {entry.enrolled_count} enrolled, "
          f"{entry.total_decisions} decisions")

agent.decisions(page=1, page_size=20) -> dict

Get the paginated decision audit log.

agent.register_webhook(url, events=None) -> WebhookInfo

Register a webhook to receive agent events (config updates, heartbeat timeouts, circuit breaker triggers).

webhook = agent.register_webhook(
    url="https://example.com/arena-events",
    events=["config_updated", "circuit_breaker_triggered"],
)
print(f"Webhook {webhook.webhook_id} registered")

agent.delete_webhook(webhook_id) -> None

Remove a registered webhook.

ArenaEnv

Gymnasium-compatible environment for RL framework integration.

from tendedloop_agent import ArenaEnv

env = ArenaEnv(
    api_url="https://api.tendedloop.com",
    strategy_token="strat_...",
    primary_metric="SCAN_FREQUENCY",  # Metric used for reward signal
)

obs = env.reset()
obs, reward, terminated, truncated, info = env.step(
    action={"scanXp": 20},
    reasoning="Increase scan rewards",
)
print(env.render())
env.close()

Observation space: Flat dict with enrolled, active_today, active_7d, total_scans, experiment_days, config, and per-metric values.

Reward: Delta in primary_metric value between steps.

Termination: terminated=True when experiment ends. truncated=True when action is rejected.
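
If your RL framework expects a fixed-length vector rather than a dict, you can flatten the documented scalar fields yourself. A sketch assuming the observation keys listed above:

def obs_to_vector(obs: dict) -> list[float]:
    """Flatten the Arena observation dict into a plain feature vector."""
    return [
        float(obs["enrolled"]),
        float(obs["active_today"]),
        float(obs["active_7d"]),
        float(obs["total_scans"]),
        float(obs["experiment_days"]),
    ]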

Documentation

Document               Description
Multi-Agent Course     15-lesson hands-on course (foundations → production)
Classroom & Lab Guide  For instructors: experiment setup, token distribution, grading
Instructor Guide       Rubrics, course integration options, common pitfalls
Architecture           System design, data flow, and component overview
Guardrails             Safety system explained in depth
Error Codes            API error codes and rejection reasons
Metrics                All available metrics and statistical methods
Strategies             Guide to building effective agent strategies
FAQ                    Common questions and troubleshooting

Classroom & Lab Use

Arena is designed for collaborative research with distinct roles:

  • Experiment Manager (instructor/research lead): Creates experiments in the Dashboard, configures guardrails, distributes tokens, monitors results
  • Agent Developer (student/team): Receives a token, writes an agent using this SDK, runs it against their variant

One experiment, N competing teams, each building their own agent. See the Classroom & Lab Guide for setup instructions, token distribution, and grading rubrics.

Research Applications

Arena is designed for rigorous behavioral research:

  • Gamification optimization — Find the reward structure that maximizes engagement
  • Incentive design — Test how different incentive schemes affect user behavior
  • Multi-agent competition — Pit different AI strategies against each other
  • Reinforcement learning — Use real behavioral data as environment feedback
  • Behavioral economics — Study how reward changes affect motivation and retention

Citing This Work

If you use TendedLoop Arena in your research, please cite:

@software{tendedloop_arena,
  title={TendedLoop Arena: Multi-Agent Gamification Research Platform},
  author={Yadgar, Osher},
  year={2026},
  url={https://github.com/osheryadgar/tendedloop-arena},
  license={MIT}
}

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

  • Bug reports — Open an issue with reproduction steps
  • New examples — Agent strategies, integrations, or tutorials
  • Documentation — Improvements, translations, or corrections
  • Feature requests — Ideas for SDK improvements

License

MIT License. See LICENSE for details.


Website · Dashboard · Docs · Issues
