Hex board game AI with self-play learning based on the AlphaZero algorithm
Azalea
playing to learn to play
Azalea is a reinterpretation of the AlphaZero game AI learning algorithm for the Hex board game.
Features
- Straightforward reimplementation of the AlphaZero algorithm except for MCTS parallelization (see below)
- Pre-trained model for the Hex board game
- Fast MCTS implementation through Numba JIT acceleration.
- Fast Hex game move generation implementation through Numba.
- Parallelized self-play to saturate an Nvidia V100 GPU during training
- AI policy evaluation through a round-robin tournament, also parallelized
- Tested on Ubuntu 16.04
- Requires Python 3.6 and PyTorch 0.4
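As a rough illustration of the round-robin evaluation mentioned above, the sketch below schedules every pair of models and tallies wins. It is not taken from the Azalea source: the function names are hypothetical, and playing each pairing twice with the first move swapped is an assumption made here to offset the first-player advantage in Hex.

```python
from itertools import combinations


def round_robin_pairings(players):
    """Schedule every pair of players twice, once with each side moving first.

    Assumption: swapping the first move is our choice for this sketch,
    not something the project documents.
    """
    games = []
    for a, b in combinations(players, 2):
        games.append((a, b))  # a moves first
        games.append((b, a))  # b moves first
    return games


def tally(results):
    """Count wins per player from (first, second, winner) records.

    Hex has no draws, so every game contributes exactly one win.
    """
    wins = {}
    for first, second, winner in results:
        wins.setdefault(first, 0)
        wins.setdefault(second, 0)
        wins[winner] += 1
    return wins
```

Because the games in such a schedule are independent of each other, they can be distributed over worker processes in the same way as self-play games.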
Differences from published AlphaZero
- Single-GPU implementation only: tested on an Nvidia V100, with 8 CPUs for move generation and MCTS and 1 GPU for the policy network.
- Only the Hex game is implemented, though the code supports adding more games. Two components are needed for a new game: a move generator and a policy network, with the board input and move output adjusted to the new game.
- MCTS simulations are not run in parallel threads, but instead, self-play games are played in parallel processes. This is to avoid the need for a multi-threaded MCTS implementation while still maintaining fast training speed and saturating the GPU.
- MCTS simulation and board evaluations are batched according to the search_batch_size config parameter. "Virtual loss" is used as in AlphaZero, to increase search diversity.
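To make the role of virtual loss in batched search more concrete, here is a minimal sketch, not taken from the Azalea source. It is simplified to a single tree level, and the names Edge, puct_score, select_batch and backup, as well as the c_puct value, are hypothetical. While a batch is being collected, each selected edge is temporarily charged a loss, which lowers its score and pushes later selections in the same batch toward other moves. Once the network has evaluated the batch, the virtual loss is replaced by the real value.

```python
import math


class Edge:
    """Search statistics for one move out of a node."""

    def __init__(self, prior):
        self.prior = prior      # move probability from the policy network
        self.visits = 0
        self.value_sum = 0.0


def puct_score(edge, parent_visits, c_puct=1.5):
    """PUCT-style score: mean value plus a prior-weighted exploration bonus."""
    q = edge.value_sum / edge.visits if edge.visits else 0.0
    u = c_puct * edge.prior * math.sqrt(parent_visits + 1) / (1 + edge.visits)
    return q + u


def select_batch(edges, batch_size, virtual_loss=1.0):
    """Pick batch_size edges for evaluation, applying a virtual loss to each pick."""
    picked = []
    for _ in range(batch_size):
        parent_visits = sum(e.visits for e in edges)
        best = max(range(len(edges)),
                   key=lambda i: puct_score(edges[i], parent_visits))
        edges[best].visits += 1                # count the pending visit
        edges[best].value_sum -= virtual_loss  # pretend it was a loss for now
        picked.append(best)
    return picked


def backup(edges, picked, values, virtual_loss=1.0):
    """Undo the virtual loss and add the evaluated value for each picked edge."""
    for i, value in zip(picked, values):
        edges[i].value_sum += virtual_loss + value
```

In a full search the same bookkeeping would be applied along the whole path from the root to the selected leaf, and batch_size would correspond to search_batch_size.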
Installation
Clone the repository and install dependencies with Conda:
git clone https://github.com/jseppanen/azalea.git
cd azalea
conda env create -n azalea
source activate azalea
The default environment.yml installs GPU packages, but you can choose environment-cpu.yml for testing on a laptop.
Playing against pretrained model
python play.py models/hex11-20180712-3362.policy.pth
This will load the model and start playing, asking for your move. The columns are labeled a–k and rows 1–11. The first player, playing X's, is trying to draw a vertical connected path through the board, while the second player, with O's, is drawing a horizontal path.
O O O O X . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . X . . . . . .
. . . . . X . . . . .
. . . . . . . . . . .
. . . . X . . . . . .
. . . . . . . . . . .
. . . X . . . . . . .
x . . . . . . . . . . .
o\\ . . . . . . . . . . .
last move: e1
Your move?
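The coordinate scheme and winning condition described above can be sketched as follows. This is an illustration only, not the code in play.py: parse_move and has_vertical_path are hypothetical names, and the six-neighbour offsets assume one particular skew of the rhombic board, which may differ from the one Azalea uses.

```python
import string

BOARD_SIZE = 11
COLUMNS = string.ascii_lowercase[:BOARD_SIZE]  # 'abcdefghijk'

# Assumed neighbour offsets (row, col) for a rhombic Hex board.
NEIGHBORS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, 1), (1, -1)]


def parse_move(text):
    """Convert a move such as 'e1' into zero-based (row, col) indices."""
    text = text.strip().lower()
    if len(text) < 2 or text[0] not in COLUMNS or not text[1:].isdigit():
        raise ValueError("expected a column a-k followed by a row 1-11")
    col = COLUMNS.index(text[0])
    row = int(text[1:]) - 1
    if not 0 <= row < BOARD_SIZE:
        raise ValueError("row out of range")
    return row, col


def has_vertical_path(stones):
    """Return True if the stones connect the top row to the bottom row.

    stones is a set of (row, col) cells held by one player; the search is
    a plain flood fill from the stones on the top row.
    """
    frontier = [cell for cell in stones if cell[0] == 0]
    seen = set(frontier)
    while frontier:
        row, col = frontier.pop()
        if row == BOARD_SIZE - 1:
            return True
        for d_row, d_col in NEIGHBORS:
            nxt = (row + d_row, col + d_col)
            if nxt in stones and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False
```

The horizontal check for the second player follows by swapping the roles of rows and columns.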
Model training
python train.py --config config/hex11_train_config.yml --rundir runs/train
Model comparison
python compare.py --config config/hex11_eval_config.yml --rundir runs/compare <model1> <model2> [model3] ...
Model selection
python tune.py