AlphaZero implementation for a triangle puzzle game.
AlphaTriangle Project
Overview
AlphaTriangle is a project implementing an artificial intelligence agent based on AlphaZero principles to learn and play a custom puzzle game involving placing triangular shapes onto a grid. The agent learns through self-play reinforcement learning, guided by Monte Carlo Tree Search (MCTS) and a deep neural network (PyTorch).
The project includes:
- A playable version of the triangle puzzle game using Pygame.
- An implementation of the MCTS algorithm tailored for the game.
- A deep neural network (policy and value heads) implemented in PyTorch, featuring convolutional layers and optional Transformer Encoder layers.
- A reinforcement learning pipeline coordinating parallel self-play (using Ray), data storage, and network training, managed by the alphatriangle.training module.
- Visualization tools for interactive play, debugging, and monitoring training progress (with near real-time plot updates).
- Experiment tracking using MLflow.
- Unit tests for core components.
- A command-line interface for easy execution.
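The overview describes MCTS guided by a policy and value network. As a rough illustration of how AlphaZero-style search commonly chooses which child to explore, here is a minimal PUCT selection sketch. The `Child` class, `puct_score`, `select_child`, and the `c_puct` constant are hypothetical names for this example and are not taken from the alphatriangle code, which may differ in detail.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    """Hypothetical per-action statistics kept by a search node."""

    prior: float            # P(s, a), as produced by a policy head
    visit_count: int = 0    # N(s, a)
    value_sum: float = 0.0  # sum of values backed up through this child

    @property
    def q_value(self) -> float:
        # Mean backed-up value; unvisited children default to 0.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(child: Child, parent_visits: int, c_puct: float = 1.5) -> float:
    """Q + U, where U favours high-prior, rarely visited actions."""
    exploration = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visit_count)
    return child.q_value + exploration


def select_child(children: dict[int, Child], c_puct: float = 1.5) -> int:
    """Return the action whose child maximises the PUCT score."""
    parent_visits = max(1, sum(c.visit_count for c in children.values()))
    return max(children, key=lambda a: puct_score(children[a], parent_visits, c_puct))


# Three candidate actions with made-up statistics.
children = {
    0: Child(prior=0.6, visit_count=10, value_sum=2.0),
    1: Child(prior=0.3, visit_count=0),
    2: Child(prior=0.1, visit_count=1, value_sum=0.9),
}
best_action = select_child(children)
```

With these made-up numbers the unvisited action 1 is selected: its exploration bonus outweighs the higher mean value of action 2, which is the behaviour the prior term is meant to produce early in a search.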
Core Technologies
- Python 3.10+
- Pygame: For game visualization and interactive modes.
- PyTorch: For the deep learning model (CNNs, optional Transformers, Distributional Value Head) and training, with CUDA/MPS support.
- NumPy: For numerical operations, especially state representation.
- Ray: For parallelizing self-play data generation and statistics collection across multiple CPU cores/processes.
- Numba: (Optional, used in features.grid_features) For performance optimization of specific grid calculations.
- Cloudpickle: For serializing the experience replay buffer and training checkpoints.
- MLflow: For logging parameters, metrics, and artifacts (checkpoints, buffers) during training runs.
- Pydantic: For configuration management and data validation.
- Typer: For the command-line interface.
- Pytest: For running unit tests.
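The PyTorch entry above mentions a distributional value head. One common way such a head is read out, assumed here and not confirmed by this README, is to apply a softmax to logits over a fixed support of value atoms and take the expectation. The sketch below uses plain Python to stay dependency-free; `expected_value`, `v_min`, and `v_max` are hypothetical names.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_value(logits: list[float], v_min: float, v_max: float) -> float:
    """Collapse a categorical value distribution to a scalar.

    The support is assumed to be len(logits) atoms spaced evenly
    between v_min and v_max (an assumption for this sketch).
    """
    n = len(logits)
    if n < 2:
        raise ValueError("need at least two atoms")
    step = (v_max - v_min) / (n - 1)
    atoms = [v_min + i * step for i in range(n)]
    probs = softmax(logits)
    return sum(p * z for p, z in zip(probs, atoms))


# Equal logits put equal mass on every atom, so the mean sits mid-support.
uniform_value = expected_value([0.0, 0.0, 0.0], v_min=-1.0, v_max=1.0)
# A large logit on the last atom pulls the expectation towards v_max.
skewed_value = expected_value([0.0, 0.0, 5.0], v_min=-1.0, v_max=1.0)
```

A scalar like this is what a tree search would typically back up, while training can still use the full distribution as the target.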
Project Structure
.
├── .github/workflows/ # GitHub Actions CI/CD
│ └── ci_cd.yml
├── .alphatriangle_data/ # Root directory for ALL persistent data (GITIGNORED)
│ ├── mlruns/ # MLflow tracking data
│ └── runs/ # Stores temporary/local artifacts per run
│ └── <run_name>/
│ ├── checkpoints/
│ ├── buffers/
│ ├── logs/
│ └── configs.json
├── alphatriangle/ # Source code for the project package
│ ├── __init__.py
│ ├── app.py
│ ├── cli.py # CLI logic
│ ├── config/ # Pydantic configuration models
│ │ └── README.md
│ ├── data/ # Data saving/loading logic
│ │ └── README.md
│ ├── environment/ # Game rules, state, actions
│ │ └── README.md
│ ├── features/ # Feature extraction logic
│ │ └── README.md
│ ├── interaction/ # User input handling
│ │ └── README.md
│ ├── mcts/ # Monte Carlo Tree Search
│ │ └── README.md
│ ├── nn/ # Neural network definition and wrapper
│ │ └── README.md
│ ├── rl/ # RL components (Trainer, Buffer, Worker)
│ │ └── README.md
│ ├── stats/ # Statistics collection and plotting
│ │ └── README.md
│ ├── structs/ # Core data structures (Triangle, Shape)
│ │ └── README.md
│ ├── training/ # Training orchestration (Loop, Setup, Runners)
│ │ └── README.md
│ ├── utils/ # Shared utilities and types
│ │ └── README.md
│ └── visualization/ # Pygame rendering components
│ └── README.md
├── tests/ # Unit tests
│ ├── ...
├── .gitignore
├── .python-version
├── LICENSE # License file (MIT)
├── MANIFEST.in # Specifies files for source distribution
├── pyproject.toml # Build system & package configuration
├── README.md # This file
├── requirements.txt # List of dependencies (also in pyproject.toml)
├── run_interactive.py # Legacy script to run interactive modes
├── run_shape_editor.py # Script to run the interactive shape definition tool
├── run_training_headless.py # Legacy script for headless training
└── run_training_visual.py # Legacy script for visual training
Key Modules (alphatriangle)
- cli: Defines the command-line interface using Typer. (alphatriangle/cli.py)
- config: Centralized Pydantic configuration classes. (alphatriangle/config/README.md)
- structs: Defines core, low-level data structures (Triangle, Shape) and constants. (alphatriangle/structs/README.md)
- environment: Defines the game rules, GameState, action encoding/decoding, and grid/shape logic. (alphatriangle/environment/README.md)
- features: Contains logic to convert GameState objects into numerical features (StateType). (alphatriangle/features/README.md)
- nn: Contains the PyTorch nn.Module definition (AlphaTriangleNet) and a wrapper class (NeuralNetwork). (alphatriangle/nn/README.md)
- mcts: Implements the Monte Carlo Tree Search algorithm (Node, run_mcts_simulations). (alphatriangle/mcts/README.md)
- rl: Contains RL components: Trainer (network updates), ExperienceBuffer (data storage, supports PER), and SelfPlayWorker (Ray actor for parallel self-play). (alphatriangle/rl/README.md)
- training: Orchestrates the training process using TrainingLoop, managing workers, data flow, logging, and checkpoints. Includes runners.py for callable training functions. (alphatriangle/training/README.md)
- stats: Contains the StatsCollectorActor (Ray actor) for asynchronous statistics collection and the Plotter class for rendering plots. (alphatriangle/stats/README.md)
- visualization: Uses Pygame to render the game state, previews, HUD, plots, etc. DashboardRenderer handles the training visualization layout. (alphatriangle/visualization/README.md)
- interaction: Handles keyboard/mouse input for interactive modes via InputHandler. (alphatriangle/interaction/README.md)
- data: Manages saving and loading of training artifacts (DataManager) using Pydantic schemas and cloudpickle. (alphatriangle/data/README.md)
- utils: Provides common helper functions, shared type definitions, and geometry helpers. (alphatriangle/utils/README.md)
- app: Integrates components for interactive modes (run_interactive.py). (alphatriangle/app.py)
Setup
- Clone the repository (for development):
git clone https://github.com/lguibr/alphatriangle.git
cd alphatriangle
- Create a virtual environment (recommended):
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
- Install the package:
- For users:
pip install alphatriangle
# Or:
pip install git+https://github.com/lguibr/alphatriangle.git
- For developers (editable install):
pip install -e .
# Install dev dependencies (optional, for running tests/linting)
pip install pytest pytest-cov pytest-mock ruff mypy codecov twine build
- (Optional) Add data directory to .gitignore: Create or edit the .gitignore file in your project root and add the line:
.alphatriangle_data/
Running the Code (CLI)
Use the alphatriangle command:
- Show Help:
alphatriangle --help
- Interactive Play Mode:
alphatriangle play [--seed 42] [--log-level INFO]
- Interactive Debug Mode:
alphatriangle debug [--seed 42] [--log-level DEBUG]
- Run Training (Visual Mode):
alphatriangle train [--seed 42] [--log-level INFO]
- Run Training (Headless Mode):
alphatriangle train --headless [--seed 42] [--log-level INFO]
# or
alphatriangle train -H [--seed 42] [--log-level INFO]
- Shape Editor (Run directly):
python run_shape_editor.py
- Monitoring Training (MLflow UI): While training (headless or visual), or after runs have completed, open a separate terminal in the project root and run:
mlflow ui --backend-store-uri file:./.alphatriangle_data/mlruns
Then navigate to http://localhost:5000 (or the specified port) in your browser.
- Running Unit Tests (Development):
pytest tests/
Configuration
All major parameters are defined in the Pydantic classes within the alphatriangle/config/ directory. Modify these files to experiment with different settings. The alphatriangle/config/validation.py script performs basic checks on startup.
Data Storage
All persistent data, including MLflow tracking data and run-specific artifacts, is stored within the .alphatriangle_data/ directory in the project root, managed by the DataManager and MLflow.
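To make the layout from the Project Structure tree concrete, the sketch below derives the per-run paths under .alphatriangle_data/ with pathlib. The directory and file names come from the tree above; the helper names `run_paths` and `ensure_run_dirs` and the run name "example_run" are hypothetical, and the real DataManager may organize things differently.

```python
import tempfile
from pathlib import Path

DATA_ROOT_NAME = ".alphatriangle_data"  # root directory named in this README


def run_paths(project_root: Path, run_name: str) -> dict[str, Path]:
    """Build the paths for one run, mirroring the documented tree."""
    data_root = project_root / DATA_ROOT_NAME
    run_dir = data_root / "runs" / run_name
    return {
        "mlruns": data_root / "mlruns",
        "run": run_dir,
        "checkpoints": run_dir / "checkpoints",
        "buffers": run_dir / "buffers",
        "logs": run_dir / "logs",
        "config": run_dir / "configs.json",
    }


def ensure_run_dirs(paths: dict[str, Path]) -> None:
    """Create the directories (not the config file) if they are missing."""
    for key in ("mlruns", "checkpoints", "buffers", "logs"):
        paths[key].mkdir(parents=True, exist_ok=True)


# Demonstrate against a temporary directory instead of a real project root.
with tempfile.TemporaryDirectory() as tmp:
    paths = run_paths(Path(tmp), "example_run")
    ensure_run_dirs(paths)
    created = sorted(p.name for p in paths["run"].iterdir())
```

Keeping every path derived from one root makes it straightforward to gitignore, back up, or delete all persistent data in one place.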
Maintainability
This project includes README files within each major alphatriangle submodule. Please keep these READMEs updated when making changes to the code's structure, interfaces, or core logic.