
Play the Squadro Board Game against Someone Else or an AI

Project description

Squadro


Documentation

Squadro is a two-player board game on a 5x5 board. The goal is to have four of your pawns perform a return trip before the opponent. Each pawn has its own speed, given by the number of dots (1–3) at its starting position. If an opponent's pawn crosses one of your pawns, your pawn returns to the side of the board.

Visit my website for a visual and qualitative description.

Demo


Other games?

The code is modular enough to be easily applied to other games. To do so, you must implement its state in state.py, and make a few other changes in the code base depending on your needs. Please raise an issue if discussion is needed.

Installation

[!TIP] If you are running on a Linux machine and do not intend to use a GPU, run this beforehand to install only the CPU version of the pytorch library:

pip install torch --index-url https://download.pytorch.org/whl/cpu

The most straightforward way is to simply install it from PyPI via:

pip install squadro

If you want to install it from source, which is necessary for development, follow the instructions here.

If a dependency releases a change that breaks the code, you can install the project from its lock file, which pins the dependency versions to ensure reproducibility:

pip install -r requirements.txt

Usage

This package can be used in the following ways:

Play

You can play against someone else or many different types of computer algorithms. See the Agents section below for more details.

[!TIP] If you run into the following error on a Linux machine when launching the game:

libGL error: failed to load driver

Then try setting the following environment variable beforehand:

export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6

Play against another human

To play the game with someone else, run the following command:

import squadro

squadro.GamePlay(n_pawns=5, first=None).run()

To access all the parameters to play, see the doc:

help(squadro.GamePlay.__init__)  # lists all the arguments to GamePlay

Play against the computer

To play against the computer, set agent_1 to one of the squadro.AVAILABLE_AGENTS.

For instance:

squadro.GamePlay(agent_1='random').run()

[!TIP] To play against our best algorithm, run:

squadro.GamePlay(agent_1='best').run()

Let us know if you ever beat it!

Play against your trained AI

After training your AI as described in the Training section, you can play against it using:

import squadro

agent = squadro.MonteCarloDeepQLearningAgent(model_path='path/to/model')
squadro.GamePlay(agent_1=agent).run()

Agents

Most computer algorithms discretize the game into states and actions. Here, the state is the position of the pawns and the available actions are the possible moves of the pawns.

Squadro is a finite state machine, meaning that the next state of the game is completely determined by the current state and the action played. With this definition, one can see that the game is a Markov Decision Process (MDP). At each state, the current player can play different actions, which lead to different states. Then the next player can play different actions from any of those new states, etc. The future of the game can be represented as a tree, whose branches are the actions that lead to different states.

An algorithm can explore that space of possibilities to infer the best move to play now. Since the tree is huge, it is not possible to explore every path until the end of the game. Instead, algorithms typically explore only a small fraction of the tree and then use the information gathered from those states to make a decision. More precisely, those two phases are:

  • State exploration: exploring the space of states by a careful choice of actions. The most common exploration methods are Minimax and Monte Carlo Tree Search (MCTS). Minimax explores all the states up to a specific depth, while MCTS navigates until it finds a state that has not been visited yet. Minimax can be sped up by skipping the search in the parts of the tree that won't affect the final decision; this method is called alpha-beta pruning.
  • State evaluation: evaluating a state. If we have a basic understanding of the game and how to win, one can design a heuristic (state evaluation function) that gives an estimate of how good it is to be in that state / position. Otherwise, it can often be better to use a computer algorithm to evaluate the state.
    • The simplest algorithm to estimate the state is to randomly let the game play until it is over (i.e., pick random actions for both players). When played enough times, it can give the probability to win in that state.
    • More complex, and hence more accurate, algorithms use reinforcement learning (AI). They learn from experience by storing information about each state/action in one of:
      • Q value function, a lookup table for each state and action;
      • deep Q network (DQN), a neural network that approximates the Q value function, which is necessary when the state space is huge (i.e., cannot be stored in memory).
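The exploration phase above can be sketched generically. Below is a minimal minimax with alpha-beta pruning over a hand-written toy game tree; the dictionary-based tree and the leaf values are illustrative stand-ins, not part of the squadro API:

```python
# Minimal minimax with alpha-beta pruning over an explicit toy game tree.
# The tree layout and leaf values are illustrative, not squadro internals.

def alphabeta(node, depth, alpha, beta, maximizing):
    """Return the minimax value of `node`, skipping (pruning) branches
    that cannot affect the final decision."""
    children = node.get("children", [])
    if depth == 0 or not children:
        return node["value"]  # leaf: use the heuristic evaluation
    if maximizing:
        best = float("-inf")
        for child in children:
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if beta <= alpha:
                break  # the opponent will never let the game reach here
        return best
    best = float("inf")
    for child in children:
        best = min(best, alphabeta(child, depth - 1, alpha, beta, True))
        beta = min(beta, best)
        if beta <= alpha:
            break
    return best

# Toy two-ply tree: the maximizing player picks the branch whose
# worst-case (minimized) leaf value is best.
tree = {"children": [
    {"children": [{"value": 3}, {"value": 5}]},  # opponent's best reply: 3
    {"children": [{"value": 2}, {"value": 9}]},  # opponent's best reply: 2
]}
print(alphabeta(tree, 2, float("-inf"), float("inf"), True))  # -> 3
```

Note how the second branch is cut short: once the opponent can reach a leaf worth 2, which is already worse than the 3 guaranteed by the first branch, the leaf worth 9 is never examined.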

List of available agents:

  • human: another local human player (i.e., both playing on the same computer)
  • random: a computer that plays randomly among all available moves
  • advancement: a computer that lists the possible moves from the current position and evaluates them directly (i.e., it "thinks" only one move ahead), where the evaluation function is the player's advancement
  • relative_advancement: a computer that lists the possible moves from the current position and evaluates them directly (i.e., it "thinks" only one move ahead), where the evaluation function is the player's advancement compared to the other player
  • ab_relative_advancement: a computer that plays minimax with alpha-beta pruning (depth ~4), where the evaluation function is the player's advancement compared to the other player
  • mcts_advancement: Monte Carlo tree search, where the evaluation function is the player's advancement compared to the other player
  • mcts_rollout: Monte Carlo tree search, where the evaluation function is determined by a random playout until the end of the game
  • mcts_q_learning: Monte Carlo tree search, where the evaluation function is determined by a lookup table
  • mcts_deep_q_learning: Monte Carlo tree search, where the evaluation function is determined by a convolutional neural network

You can also access the most updated list of available agents with:

import squadro

print(squadro.AVAILABLE_AGENTS)

Training

One can train a model using reinforcement learning (RL) algorithms. Currently, Squadro supports two such algorithms:

Q-Learning

One needs to train a lookup table mapping each state-action pair to its value.

import squadro

squadro.logger.setup(section='training')

trainer = squadro.QLearningTrainer(
    n_pawns=3,
    lr=0.3,
    eval_steps=100,
    eval_interval=300,
    n_steps=100_000,
    parallel=8,
    model_path='path/to/model'
)
trainer.run()

It should take a few hours to train on a typical CPU (8–16 cores).

Note that there are many more parameters to tweak, if desired. See all of them in the doc:

help(squadro.QLearningTrainer)
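The update performed by a tabular trainer can be sketched with the textbook Q-learning rule; this is the standard update, not squadro's internal code, and the state/action names below are hypothetical:

```python
from collections import defaultdict

# Textbook tabular Q-learning update; illustrative, not squadro internals.
Q = defaultdict(float)  # lookup table: (state, action) -> estimated value
lr, gamma = 0.3, 1.0    # learning rate and discount factor

def q_update(state, action, reward, next_state, next_actions):
    """Move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a')."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    Q[(state, action)] += lr * (target - Q[(state, action)])

# One update from a terminal winning move (reward 1, no successor actions):
q_update("s0", "move_pawn_2", 1.0, "terminal", [])
print(Q[("s0", "move_pawn_2")])  # -> 0.3
```

Repeating such updates over many self-play games is what fills the lookup table that the mcts_q_learning agent later queries.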

Deep Q-Learning

Here the state-action value is approximated by a neural network.

import squadro

squadro.logger.setup(section=['training', 'benchmark'])

trainer = squadro.DeepQLearningTrainer(
    eval_games=50,
    eval_interval=300,
    backprop_interval=20,
    model_path='path/to/model',
    model_config=squadro.ModelConfig(),
    init_from=None,
    n_pawns=5,
)
trainer.run()

For three pawns, it should take a few hours to train on a typical CPU (8–16 cores), and it is much faster on a GPU. For five pawns, it may take a few days.

Once done, one can use the model; see the Play against your trained AI section above (setting the appropriate value for model_path, e.g., '...').
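To make the "neural network approximation" concrete, here is a tiny convolutional network over a board encoding. The architecture, the 2-channel 5x5 input layout, and the value/policy heads are all assumptions for illustration, not squadro's ModelConfig:

```python
import torch
from torch import nn

# Illustrative Q-network over a board encoding; the architecture and the
# 2-channel 5x5 input layout are assumptions, not squadro's ModelConfig.
class TinyQNet(nn.Module):
    def __init__(self, n_actions=5):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1),  # one channel per player
            nn.ReLU(),
            nn.Flatten(),
        )
        self.value = nn.Linear(16 * 5 * 5, 1)           # how good the position is
        self.policy = nn.Linear(16 * 5 * 5, n_actions)  # one logit per pawn move

    def forward(self, board):
        h = self.body(board)
        return self.value(h), self.policy(h)

net = TinyQNet()
board = torch.zeros(1, 2, 5, 5)  # batch of one (empty) board encoding
value, logits = net(board)
print(value.shape, logits.shape)  # torch.Size([1, 1]) torch.Size([1, 5])
```

A network like this replaces the lookup table when the state space is too large to store in memory: instead of retrieving Q(s, a), the agent runs a forward pass on the encoded position.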

Simulations

You can simulate a game between two computer algorithms. Set agent_0 and agent_1 to any of the AVAILABLE_AGENTS above and run:

game = squadro.Game(agent_0='random', agent_1='random')
game.run()
print(game)
game.to_file('game_results.json')

Animations

You can render an animation of a game between two computer algorithms. Press the left and right keys to navigate through the game.

game = squadro.Game(agent_0='random', agent_1='random')
squadro.GameAnimation(game).show()

Tests

pytest squadro

Feedback

For any issue / bug report / feature request, open an issue.

Contributions

To provide upgrades or fixes, open a pull request.

Contributors

