A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning

A suite of 100+ {single,two,multi}-player text-based games for benchmarking and training LLMs.

Updates

  • 31/07/2025 We added SettlersOfCatan to TextArena!
  • 14/07/2025 Announcing MindGames, a NeurIPS 2025 competition for training LLMs on various TextArena games that require theory of mind.
  • 01/07/2025 Release of v0.6.9 with 100 games and simplified states, new observation wrappers for training and default wrappers for environments.
  • 01/07/2025 Release of SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning, which introduces RL via self-play on TextArena games as a potential new training paradigm.
  • 22/06/2025 Release of UnstableBaselines, a lightweight async online RL library for training LLMs on TextArena games.
  • 16/04/2025 Release of the TextArena paper
  • 14/02/2025 Release of the new, stable version for both pip and the website
  • 31/01/2025 Initial demo release highlighted by Andrej Karpathy (crashing all our servers)

Introduction

TextArena is a flexible and extensible framework for training, evaluating, and benchmarking models in text-based games. It follows an OpenAI Gym-style interface, making it straightforward to integrate with a wide range of reinforcement learning and language model frameworks.

Getting Started

Installation

Install TextArena directly from PyPI:

pip install textarena
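
To confirm that the install succeeded, one option is to query the package metadata from Python. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of TextArena.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if textarena is installed, otherwise None.
print(installed_version("textarena"))
```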

Offline Play

The only requirement an agent needs to fulfill is to be callable: it must implement a __call__ method that accepts a string observation and returns a string action. We have implemented a number of basic agents, available under ta.agents. In this example, we show how to let GPT-4o-mini play against anthropic/claude-3.5-haiku in a game of TicTacToe.

We will be using the OpenRouterAgent, so first you need to set your OpenRouter API key:

export OPENROUTER_API_KEY="YOUR_OPENROUTER_API_KEY"
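
A script can fail early with a readable message if the key was never exported. The following is a minimal sketch; require_api_key is a hypothetical helper, not a TextArena function, and it only assumes that the key is read from the OPENROUTER_API_KEY environment variable as described above.

```python
import os
from typing import Mapping, Optional


def require_api_key(
    name: str = "OPENROUTER_API_KEY",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the value of the environment variable `name`, or raise if unset."""
    # `env` can be overridden for testing; by default the real environment is used.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating agents.")
    return key
```

Calling require_api_key() once at the top of a script surfaces a missing key before any agent is constructed.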

Now we can build the models and let them play:

import textarena as ta

# Initialize agents
agents = {
    0: ta.agents.OpenRouterAgent(model_name="openai/gpt-4o-mini"),
    1: ta.agents.OpenRouterAgent(model_name="anthropic/claude-3.5-haiku"),
}

# Initialize the environment
env = ta.make(env_id="TicTacToe-v0")

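# Wrap it for additional visualizations
env = ta.wrappers.SimpleRenderWrapper(env=env)

# Start a new game with one seat per agent
env.reset(num_players=len(agents))

# Main game loop: ask the environment whose turn it is, let that agent act
done = False
while not done:
    player_id, observation = env.get_observation()
    action = agents[player_id](observation)
    done, step_info = env.step(action=action)

# Close the environment and collect the final rewards
rewards, game_info = env.close()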
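
Because agents only need to be callable, the same loop can be exercised without any API key. The sketch below is illustrative only: ScriptedAgent and StubEnv are hypothetical classes written for this example (they are not part of textarena), and StubEnv merely mimics the method names used in the loop above. The action strings and the values returned by its close() are dummies and say nothing about what real TextArena environments expect or return.

```python
from typing import Dict, List, Tuple


class ScriptedAgent:
    """Hypothetical agent: replays a fixed list of actions, ignoring observations."""

    def __init__(self, actions: List[str]) -> None:
        self._actions = list(actions)
        self._turn = 0

    def __call__(self, observation: str) -> str:
        # Cycle through the script so the agent can always answer.
        action = self._actions[self._turn % len(self._actions)]
        self._turn += 1
        return action


class StubEnv:
    """Hypothetical stand-in environment (NOT part of textarena).

    Players alternate turns and the game simply ends after `max_turns` steps.
    """

    def __init__(self, max_turns: int = 4) -> None:
        self.max_turns = max_turns
        self.num_players = 0
        self.history: List[Tuple[int, str]] = []

    def reset(self, num_players: int) -> None:
        self.num_players = num_players
        self.history = []

    def get_observation(self) -> Tuple[int, str]:
        player_id = len(self.history) % self.num_players
        return player_id, f"Turn {len(self.history)}: player {player_id} to move"

    def step(self, action: str) -> Tuple[bool, dict]:
        player_id = len(self.history) % self.num_players
        self.history.append((player_id, action))
        return len(self.history) >= self.max_turns, {}

    def close(self) -> Tuple[Dict[int, int], dict]:
        # Dummy result: every player gets reward 0.
        rewards = {pid: 0 for pid in range(self.num_players)}
        return rewards, {"turns": len(self.history)}


def play_episode(env, agents):
    """Run one game using the same loop structure as the example above."""
    env.reset(num_players=len(agents))
    done = False
    while not done:
        player_id, observation = env.get_observation()
        action = agents[player_id](observation)
        done, step_info = env.step(action=action)
    return env.close()


agents = {0: ScriptedAgent(["[4]", "[0]"]), 1: ScriptedAgent(["[1]"])}
env = StubEnv(max_turns=4)
rewards, game_info = play_episode(env, agents)
print(env.history)
```

Swapping StubEnv for an environment created with ta.make, and ScriptedAgent for any other callable, should leave play_episode unchanged as long as the environment follows the interface shown in the example above.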

Citation

If you use TextArena in your research, please cite:

@misc{guertler2025textarena,
    title={TextArena}, 
    author={Leon Guertler and Bobby Cheng and Simon Yu and Bo Liu and Leshem Choshen and Cheston Tan},
    year={2025},
    eprint={2504.11442},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2504.11442}, 
}

How to Contribute:

If you have any questions at all, feel free to reach out on Discord. The issues below are great starting points if you want to contribute:

  • Transfer the 'How to Contribute' from here to individual issues
  • Make RushHour board generation algorithmic
  • Extend FifteenPuzzle to arbitrary sizes
  • Add a nice end-of-game screen to the SimpleRenderWrapper visualizations

Download files

Source Distribution

textarena-0.7.4.tar.gz (955.0 kB)

Uploaded Source

Built Distribution

textarena-0.7.4-py3-none-any.whl (1.1 MB)

Uploaded Python 3

File details

Details for the file textarena-0.7.4.tar.gz.

File metadata

  • Download URL: textarena-0.7.4.tar.gz
  • Upload date:
  • Size: 955.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for textarena-0.7.4.tar.gz

  • SHA256: 28bb9170d7718f2ae05e4515bea82262422731e563fc7318a9e7983de0cadd4f
  • MD5: 1623c7326b0256451015f1d595b47654
  • BLAKE2b-256: ba044a3ca42093d0be2a9c377ae3335a6c6baac1d278ae932562ec69f339d172
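
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard hashlib module; the helper names sha256_of and matches are hypothetical, and the file path in the commented usage assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example usage (assumes the file was downloaded to the current directory):
# matches("textarena-0.7.4.tar.gz",
#         "28bb9170d7718f2ae05e4515bea82262422731e563fc7318a9e7983de0cadd4f")
```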

File details

Details for the file textarena-0.7.4-py3-none-any.whl.

File metadata

  • Download URL: textarena-0.7.4-py3-none-any.whl
  • Upload date:
  • Size: 1.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for textarena-0.7.4-py3-none-any.whl

  • SHA256: 684784e78278e518066f67557ee93b47c238d16cbbd15d3abdaa3147562d3024
  • MD5: 381d3792efc8357ad2d4d7a14ab23f2f
  • BLAKE2b-256: 26b49a9ba65154aff853c75b3d7324319d168ad9c69c6097f4aa3c16da7d9ef3
