AlphaZero implementation for a triangle puzzle game.

AlphaTriangle Project

Overview

AlphaTriangle implements an AlphaZero-style agent that learns to play a custom puzzle game in which triangular shapes are placed onto a grid. The agent learns through self-play reinforcement learning, guided by Monte Carlo Tree Search (MCTS) and a deep neural network implemented in PyTorch.

The project includes:

  • A playable version of the triangle puzzle game using Pygame.
  • An implementation of the MCTS algorithm tailored for the game.
  • A deep neural network (policy and value heads) implemented in PyTorch, featuring convolutional layers and optional Transformer Encoder layers.
  • A reinforcement learning pipeline coordinating parallel self-play (using Ray), data storage, and network training, managed by the alphatriangle.training module.
  • Visualization tools for interactive play, debugging, and monitoring training progress (with near real-time plot updates).
  • Experiment tracking using MLflow.
  • Unit tests for core components.
  • A command-line interface for easy execution.
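
To make the "MCTS guided by a neural network" idea concrete, here is a minimal sketch of the PUCT selection rule commonly used in AlphaZero-style search. It is an illustration only: the `Node` dataclass, `puct_score`, `select_child`, and the `c_puct` value are assumptions made for this example and are not taken from the alphatriangle.mcts module, whose actual implementation may differ.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical search-tree node; not the project's actual Node class."""
    prior: float                      # P(s, a) supplied by the policy head
    visit_count: int = 0              # N(s, a)
    value_sum: float = 0.0            # sum of values backed up through this node
    children: dict[int, "Node"] = field(default_factory=dict)

    def mean_value(self) -> float:
        # Q(s, a); unvisited nodes default to 0.0
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Q + U, where U favours high-prior, rarely visited children."""
    exploration = (
        c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    )
    return child.mean_value() + exploration


def select_child(parent: Node) -> int:
    """Return the action whose child maximises the PUCT score."""
    return max(parent.children, key=lambda a: puct_score(parent, parent.children[a]))
```

Early in the search the prior dominates, so moves the network likes are explored first; as visit counts grow, the exploration term shrinks and the averaged value takes over.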

Core Technologies

  • Python 3.10+
  • Pygame: For game visualization and interactive modes.
  • PyTorch: For the deep learning model (CNNs, optional Transformers) and training, with CUDA/MPS support.
  • NumPy: For numerical operations, especially state representation.
  • Ray: For parallelizing self-play data generation and statistics collection across multiple CPU cores/processes.
  • Numba (optional): For performance optimization of specific grid calculations in features.grid_features.
  • Cloudpickle: For serializing the experience replay buffer and training checkpoints.
  • MLflow: For logging parameters, metrics, and artifacts (checkpoints, buffers) during training runs.
  • Pydantic: For configuration management and data validation.
  • Typer: For the command-line interface.
  • Pytest: For running unit tests.

Project Structure

.
├── .github/workflows/      # GitHub Actions CI/CD
│   └── ci_cd.yml
├── .alphatriangle_data/    # Root directory for ALL persistent data (GITIGNORED)
│   ├── mlruns/             # MLflow tracking data
│   └── runs/               # Stores temporary/local artifacts per run
│       └── <run_name>/
│           ├── checkpoints/
│           ├── buffers/
│           ├── logs/
│           └── configs.json
├── alphatriangle/                    # Source code for the project package
│   ├── alphatriangle/      # (Implicit package name after install)
│   │   ├── __init__.py
│   │   ├── app.py
│   │   ├── cli.py          # CLI logic
│   │   ├── config/
│   │   ├── data/
│   │   ├── environment/
│   │   ├── features/
│   │   ├── interaction/
│   │   ├── mcts/
│   │   ├── nn/
│   │   ├── rl/
│   │   ├── stats/
│   │   ├── structs/
│   │   ├── training/
│   │   ├── utils/
│   │   └── visualization/
├── tests/                  # Unit tests
│   └── ...
├── .gitignore
├── .python-version
├── LICENSE                 # License file (e.g., MIT)
├── MANIFEST.in             # Specifies files for source distribution
├── pyproject.toml          # Build system & package configuration
├── README.md               # This file
├── requirements.txt        # List of dependencies (also in pyproject.toml)
├── run_interactive.py      # Legacy script to run interactive modes
├── run_shape_editor.py     # Script to run the interactive shape definition tool
├── run_training_headless.py # Legacy script for headless training
└── run_training_visual.py  # Legacy script for visual training

Key Modules (alphatriangle)

  • cli: Defines the command-line interface using Typer.
  • config: Centralized Pydantic configuration classes.
  • structs: Defines core, low-level data structures (Triangle, Shape) and constants.
  • environment: Defines the game rules, GameState, action encoding/decoding, and grid/shape logic.
  • features: Contains logic to convert GameState objects into numerical features (StateType).
  • nn: Contains the PyTorch nn.Module definition (AlphaTriangleNet) and a wrapper class (NeuralNetwork).
  • mcts: Implements the Monte Carlo Tree Search algorithm (Node, run_mcts_simulations).
  • rl: Contains RL components: Trainer (network updates), ExperienceBuffer (data storage, supports PER), and SelfPlayWorker (Ray actor for parallel self-play).
  • training: Orchestrates the training process using TrainingPipeline and TrainingLoop, managing workers, data flow, logging, and checkpoints. Includes runners.py for callable training functions.
  • stats: Contains the StatsCollectorActor (Ray actor) for asynchronous statistics collection and the Plotter class for rendering plots.
  • visualization: Uses Pygame to render the game state, previews, HUD, plots, etc. DashboardRenderer handles the training visualization layout.
  • interaction: Handles keyboard/mouse input for interactive modes via InputHandler.
  • data: Manages saving and loading of training artifacts (DataManager) using Pydantic schemas and cloudpickle.
  • utils: Provides common helper functions, shared type definitions, and geometry helpers.
  • app: Integrates components for interactive modes (run_interactive.py).
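
The rl entry above mentions that ExperienceBuffer supports PER (prioritized experience replay). As a rough illustration of the idea, the sketch below samples stored items with probability proportional to priority raised to a power `alpha`. The class `TinyPrioritizedBuffer`, its methods, and the default values are hypothetical and are not the project's API; importance-sampling weights, which a full PER implementation usually includes, are left out for brevity.

```python
import random


class TinyPrioritizedBuffer:
    """Illustrative stand-in; not the project's ExperienceBuffer."""

    def __init__(self, capacity: int, alpha: float = 0.6) -> None:
        self.capacity = capacity
        self.alpha = alpha            # 0 = uniform sampling, 1 = fully proportional
        self.items: list[object] = []
        self.priorities: list[float] = []

    def add(self, item: object, priority: float = 1.0) -> None:
        if len(self.items) >= self.capacity:
            # Drop the oldest experience once the buffer is full
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(item)
        self.priorities.append(priority)

    def sample(self, batch_size: int, rng: random.Random) -> list[int]:
        # Sampling probability is proportional to priority ** alpha
        # (indices are drawn with replacement)
        weights = [p ** self.alpha for p in self.priorities]
        return rng.choices(range(len(self.items)), weights=weights, k=batch_size)

    def update_priority(self, index: int, td_error: float, eps: float = 1e-6) -> None:
        # Larger error means the item is sampled more often; eps keeps weights positive
        self.priorities[index] = abs(td_error) + eps
```

A trainer would typically call `sample`, compute the loss per item, and feed the resulting errors back through `update_priority` so that poorly predicted positions are revisited sooner.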

Setup

  1. Clone the repository (for development):
    git clone https://github.com/lguibr/alphatriangle.git
    cd alphatriangle
    
  2. Create a virtual environment (recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
    
  3. Install the package:
    • For users:
      pip install alphatriangle # Or pip install git+https://github.com/lguibr/alphatriangle.git
      
    • For developers (editable install):
      pip install -e .
      # Install dev dependencies (optional, for running tests/linting)
      pip install pytest pytest-cov pytest-mock ruff mypy codecov twine build
      
    Note: Ensure you have the correct PyTorch version installed for your system (CPU/CUDA/MPS). See pytorch.org. Ray may have specific system requirements.
  4. (Optional) Add data directory to .gitignore: Create or edit the .gitignore file in your project root and add the line:
    .alphatriangle_data/
    

Running the Code (CLI)

Use the alphatriangle command:

  • Show Help:
    alphatriangle --help
    
  • Interactive Play Mode:
    alphatriangle play [--seed 42] [--log-level INFO]
    
  • Interactive Debug Mode:
    alphatriangle debug [--seed 42] [--log-level DEBUG]
    
  • Run Training (Visual Mode):
    alphatriangle train [--seed 42] [--log-level INFO]
    
  • Run Training (Headless Mode):
    alphatriangle train --headless [--seed 42] [--log-level INFO]
    # or
    alphatriangle train -H [--seed 42] [--log-level INFO]
    
  • Shape Editor (Run directly):
    python run_shape_editor.py
    
  • Monitoring Training (MLflow UI): While training (headless or visual), or after runs have completed, open a separate terminal in the project root and run:
    mlflow ui --backend-store-uri file:./.alphatriangle_data/mlruns
    
    Then navigate to http://localhost:5000 (or the specified port) in your browser.
  • Running Unit Tests (Development):
    pytest tests/
    

Configuration

All major parameters are defined in the Pydantic classes within the alphatriangle/config/ directory. Modify these files to experiment with different settings. The alphatriangle/config/validation.py script performs basic checks on startup.
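
The sketch below illustrates the general pattern of a validated configuration class that fails fast on bad values. The real field names live in alphatriangle/config/ and are not listed in this README, so every name and default here is hypothetical. To stay dependency-free, the example uses a standard-library dataclass as a stand-in for a Pydantic model; with Pydantic, the same checks would normally be expressed as field constraints or validators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleTrainConfig:
    """Hypothetical settings; the real classes are Pydantic models in alphatriangle/config/."""
    batch_size: int = 64
    learning_rate: float = 1e-3
    num_self_play_workers: int = 4

    def __post_init__(self) -> None:
        # Fail fast on startup rather than midway through a training run
        if self.batch_size <= 0:
            raise ValueError("batch_size must be positive")
        if not 0.0 < self.learning_rate < 1.0:
            raise ValueError("learning_rate must be in (0, 1)")
        if self.num_self_play_workers < 1:
            raise ValueError("num_self_play_workers must be at least 1")
```

Constructing `ExampleTrainConfig(batch_size=0)` raises a `ValueError` immediately, which is the kind of early feedback a startup validation step is meant to give.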

Data Storage

All persistent data, including MLflow tracking data and run-specific artifacts, is stored within the .alphatriangle_data/ directory in the project root, managed by the DataManager and MLflow.
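
For scripts that need to locate a run's artifacts, the directory layout from the Project Structure section can be expressed with pathlib. The function `run_paths` and the constant `DATA_ROOT` are hypothetical helpers written for this example; the project's DataManager may resolve paths differently.

```python
from pathlib import Path

DATA_ROOT = Path(".alphatriangle_data")  # root directory named in this README


def run_paths(run_name: str, root: Path = DATA_ROOT) -> dict[str, Path]:
    """Map artifact kinds to their locations for one run, following the documented tree."""
    run_dir = root / "runs" / run_name
    return {
        "mlruns": root / "mlruns",
        "run_dir": run_dir,
        "checkpoints": run_dir / "checkpoints",
        "buffers": run_dir / "buffers",
        "logs": run_dir / "logs",
        "configs": run_dir / "configs.json",
    }
```

The function only builds paths relative to the project root; it does not create directories or check that they exist.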

Maintainability

This project includes README files within each major alphatriangle submodule. Please keep these READMEs updated when making changes to the code's structure, interfaces, or core logic.
