AlphaZero implementation for a triangle puzzle game (uses trianglengin).

Maintainers

lgpelin92

AlphaTriangle

Overview

AlphaTriangle is a project implementing an artificial intelligence agent based on AlphaZero principles to learn and play a custom puzzle game involving placing triangular shapes onto a grid. The agent learns through headless self-play reinforcement learning, guided by Monte Carlo Tree Search (MCTS) and a deep neural network (PyTorch). It uses the trianglengin library for core game logic.

The project includes:

- An implementation of the MCTS algorithm tailored for the game.
- A deep neural network (policy and value heads) implemented in PyTorch, featuring convolutional layers and optional Transformer Encoder layers.
- A reinforcement learning pipeline coordinating parallel self-play (using Ray), data storage, and network training, managed by the alphatriangle.training module.
- Experiment tracking and visualization using MLflow and TensorBoard.
- Unit tests for RL components.
- A command-line interface for running the headless training pipeline.

🎮 The Triangle Puzzle Game Guide 🧩

This project trains an agent to play the game defined by the trianglengin library. Here's a detailed explanation of the game rules:

1. Introduction: Your Mission! 🎯

The goal is to place colorful shapes onto a special triangular grid. By filling up lines of triangles, you make them disappear and score points! Keep placing shapes and clearing lines for as long as possible to get the highest score before the grid fills up and you run out of moves.

2. The Playing Field: The Grid 🗺️

Triangle Cells: The game board is a grid made of many small triangles. Some point UP (🔺) and some point DOWN (🔻). They alternate like a checkerboard pattern based on their row and column index (specifically, (row + col) % 2 != 0 means UP).
Shape: The grid itself is rectangular overall, but the playable area within it is typically shaped like a triangle or hexagon, wider in the middle and narrower at the top and bottom.
Playable Area: You can only place shapes within the designated playable area.
Death Zones 💀: Around the edges of the playable area (often at the start and end of rows), some triangles are marked as "Death Zones". You cannot place any part of a shape onto these triangles. They are off-limits! Think of them as the boundaries within the rectangular grid.
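
The checkerboard rule above can be sketched in a few lines of Python. This is only an illustration of the stated `(row + col) % 2 != 0` rule; the helper names `is_up` and `render_row` are made up for this example and are not claimed to be part of trianglengin.

```python
def is_up(row: int, col: int) -> bool:
    """Return True when the cell at (row, col) points UP (odd row + col)."""
    return (row + col) % 2 != 0


def render_row(row: int, cols: int) -> str:
    """Render one grid row as 'U' (up) and 'D' (down) characters."""
    return "".join("U" if is_up(row, c) else "D" for c in range(cols))


# Print the orientation pattern of a small 3 x 6 grid.
for r in range(3):
    print(render_row(r, 6))
```

Neighboring cells in a row always alternate, and each row starts with the opposite orientation of the row above it.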

3. Your Tools: The Shapes 🟦🟥🟩

Shape Formation: Each shape is a collection of connected small triangles (🔺 and 🔻). They come in different colors and arrangements. Some might be a single triangle, others might be long lines, L-shapes, or more complex patterns.
Relative Positions: The triangles within a shape have fixed positions relative to each other. When you move the shape, all its triangles move together as one block.
Preview Area: You will always have three shapes available to choose from at any time. These are shown in a special "preview area".

4. Making Your Move: Placing Shapes 🖱️➡️▦

This is the core action! Here's exactly how to place a shape:

Step 4a: Select a Shape: Choose one of the three shapes available in the preview area.
Step 4b: Aim on the Grid: Select a target coordinate (row, col) on the main grid. This coordinate represents the anchor point for placing the shape.
Step 4c: The Placement Rules (MUST Follow!)
- 📏 Rule 1: Fit Inside Playable Area: ALL triangles of your chosen shape must land within the playable grid area. No part of the shape can land in a Death Zone 💀.
- 🧱 Rule 2: No Overlap: ALL triangles of your chosen shape must land on currently empty spaces on the grid. You cannot place a shape on top of triangles that are already filled with color from previous shapes.
- 📐 Rule 3: Orientation Match! This is crucial!
  - If a part of your shape is an UP triangle (🔺), it MUST land on an UP space (🔺) on the grid.
  - If a part of your shape is a DOWN triangle (🔻), it MUST land on a DOWN space (🔻) on the grid.
  - 🔺➡️🔺 (OK!)
  - 🔻➡️🔻 (OK!)
  - 🔺➡️🔻 (INVALID! ❌)
  - 🔻➡️🔺 (INVALID! ❌)
Step 4d: Confirm Placement: If the chosen shape can be placed at the target coordinate according to ALL three rules, the placement is valid. The shape is now placed permanently on the grid! ✨
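
As a rough sketch, the three placement rules can be checked one triangle at a time. The data layout here is an assumption made for illustration: a shape is a list of hypothetical `Tri` offsets relative to the anchor, `playable` is the set of non-Death-Zone coordinates, and `occupied` is the set of filled coordinates. The real engine may represent these differently.

```python
from typing import NamedTuple


class Tri(NamedTuple):
    """One triangle of a shape (hypothetical layout for this sketch)."""
    dr: int    # row offset from the anchor
    dc: int    # column offset from the anchor
    up: bool   # orientation this triangle requires


def is_up(row: int, col: int) -> bool:
    return (row + col) % 2 != 0


def can_place(shape, anchor, playable, occupied) -> bool:
    """Return True only if every triangle passes all three rules."""
    ar, ac = anchor
    for tri in shape:
        r, c = ar + tri.dr, ac + tri.dc
        if (r, c) not in playable:      # Rule 1: inside the playable area
            return False
        if (r, c) in occupied:          # Rule 2: no overlap
            return False
        if is_up(r, c) != tri.up:       # Rule 3: orientation match
            return False
    return True


# A tiny 2 x 4 playable area with one filled cell.
playable = {(r, c) for r in range(2) for c in range(4)}
occupied = {(0, 2)}
pair = [Tri(0, 0, False), Tri(0, 1, True)]  # a DOWN triangle, then an UP one

print(can_place(pair, (0, 0), playable, occupied))  # fits
print(can_place(pair, (0, 1), playable, occupied))  # orientation mismatch
```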

5. Scoring Points: How You Win! 🏆

You score points in two main ways:

Placing Triangles: You get a small number of points for every single small triangle that makes up the shape you just placed. (e.g., placing a 3-triangle shape might give you 3 * tiny_score points).
Clearing Lines: This is where the BIG points come from! You get a much larger number of points for every single small triangle that disappears when you clear a line (or multiple lines at once!). See the next section for details!
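
To make the arithmetic concrete, here is a minimal sketch of the two score sources. The point values are placeholders chosen for this example only; the actual values are defined by the game engine and are not stated in this guide.

```python
# Placeholder values for illustration only (not the engine's real settings).
POINTS_PER_PLACED_TRIANGLE = 1
POINTS_PER_CLEARED_TRIANGLE = 2


def move_score(placed_triangles: int, cleared_triangles: int) -> int:
    """Score for one move: a small reward per placed triangle plus a larger one per cleared triangle."""
    return (
        placed_triangles * POINTS_PER_PLACED_TRIANGLE
        + cleared_triangles * POINTS_PER_CLEARED_TRIANGLE
    )


# Placing a 3-triangle shape that clears nothing, then one that clears 7 triangles.
print(move_score(3, 0))
print(move_score(3, 7))
```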

6. Line Clearing Magic! ✨ (The Key to High Scores!)

This is the most exciting part! When you place a shape, the game immediately checks if you've completed any lines. This section explains how the game finds and clears these lines.

What Lines Can Be Cleared? There are three types of lines the game looks for:
- Horizontal Lines ↔️: A straight, unbroken line of filled triangles going across a single row.
- Diagonal Lines (Top-Left to Bottom-Right) ↘️: An unbroken diagonal line of filled triangles stepping down and to the right.
- Diagonal Lines (Bottom-Left to Top-Right) ↗️: An unbroken diagonal line of filled triangles stepping up and to the right.
How Lines are Found: Pre-calculation of Maximal Lines
- The Idea: Instead of checking every possible line combination all the time, the game pre-calculates all maximal continuous lines of playable triangles when it starts. A maximal line is the longest possible straight segment of playable triangles (not in a Death Zone) in one of the three directions (Horizontal, Diagonal ↘️, Diagonal ↗️).
- Tracing: For every playable triangle on the grid, the game traces outwards in each of the three directions to find the full extent of the continuous playable line passing through that triangle in that direction.
- Storing Maximal Lines: Only the complete maximal lines found are stored. For example, if tracing finds a playable sequence A-B-C-D, only the line (A,B,C,D) is stored, not the sub-segments like (A,B,C) or (B,C,D). These maximal lines represent the potential lines that can be cleared.
- Coordinate Map: The game also builds a map linking each playable triangle coordinate (r, c) to the set of maximal lines it belongs to. This allows for quick lookup.
Defining the Paths (Neighbor Logic): How does the game know which triangle is "next" when tracing? It depends on the current triangle's orientation (🔺 or 🔻) and the direction being traced:
- Horizontal ↔️:
  - Left Neighbor: (r, c-1) (Always in the same row)
  - Right Neighbor: (r, c+1) (Always in the same row)
- Diagonal ↘️ (TL-BR):
  - If current is 🔺 (Up): Next is (r+1, c) (Down triangle directly below)
  - If current is 🔻 (Down): Next is (r, c+1) (Up triangle to the right)
- Diagonal ↗️ (BL-TR):
  - If current is 🔻 (Down): Next is (r-1, c) (Up triangle directly above)
  - If current is 🔺 (Up): Next is (r, c+1) (Down triangle to the right)
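
The pre-calculation described above can be sketched as follows. This is an illustrative reconstruction from the neighbor rules in this guide, not the engine's actual code: the direction labels (`"h"`, `"d1"`, `"d2"`) and the function names are assumptions, and this sketch chooses to drop single-triangle lines at storage time.

```python
from collections import defaultdict


def is_up(r, c):
    return (r + c) % 2 != 0


def next_cell(r, c, direction):
    """Forward neighbor of (r, c) along one of the three line directions."""
    if direction == "h":                       # Horizontal: step right
        return (r, c + 1)
    if direction == "d1":                      # Diagonal TL-BR
        return (r + 1, c) if is_up(r, c) else (r, c + 1)
    if direction == "d2":                      # Diagonal BL-TR
        return (r, c + 1) if is_up(r, c) else (r - 1, c)
    raise ValueError(direction)


def maximal_lines(playable):
    """Trace every maximal run of playable cells in each direction."""
    lines = []
    for direction in ("h", "d1", "d2"):
        # A cell starts a line when no playable cell steps forward onto it.
        successors = {next_cell(r, c, direction) for (r, c) in playable}
        for start in sorted(cell for cell in playable if cell not in successors):
            line = [start]
            nxt = next_cell(*start, direction)
            while nxt in playable:
                line.append(nxt)
                nxt = next_cell(*nxt, direction)
            if len(line) >= 2:                 # single triangles are not clearable
                lines.append(tuple(line))
    return lines


def build_coord_map(lines):
    """Map each coordinate to the set of maximal lines that contain it."""
    coord_map = defaultdict(set)
    for line in lines:
        for cell in line:
            coord_map[cell].add(line)
    return coord_map


playable = {(r, c) for r in range(2) for c in range(4)}  # toy 2 x 4 area
lines = maximal_lines(playable)
coord_map = build_coord_map(lines)
for line in lines:
    print(line)
```

Because every line is traced from a cell that has no playable predecessor, sub-segments such as (A,B,C) of a longer line (A,B,C,D) are never produced.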

Visualizing the Paths:

Horizontal ↔️:

... [🔻][🔺][🔻][🔺][🔻][🔺] ...  (Moves left/right in the same row)

Diagonal ↘️ (TL-BR): (Connects via shared horizontal edges)

...[🔺]...
...[🔻][🔺] ...
...     [🔻][🔺] ...
...         [🔻] ...
(Path alternates row/col increments depending on orientation)

Diagonal ↗️ (BL-TR): (Connects via shared horizontal edges)

...           [🔺]  ...
...      [🔺][🔻]   ...
... [🔺][🔻]        ...
... [🔻]            ...
(Path alternates row/col increments depending on orientation)

The "Full Line" Rule: After you place a piece, the game looks at the coordinates (r, c) of the triangles you just placed. Using the pre-calculated map, it finds all the maximal lines that contain any of those coordinates. For each of those maximal lines (that have at least 2 triangles), it checks: "Is every single triangle coordinate in this maximal line now occupied?" If yes, that line is complete! (Note: Single isolated triangles don't count as clearable lines).
The Poof! 💨:
- If placing your shape completes one or MORE maximal lines (of any type, length >= 2) simultaneously, all the triangles in ALL completed lines vanish instantly!
- The spaces become empty again.
- You score points for every single triangle that vanished. Clearing multiple lines at once is the best way to rack up points! 🥳
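
A minimal sketch of the check-and-clear step, assuming a coordinate map shaped like the one described earlier (coordinate to set of maximal lines, each line a tuple of coordinates). The function name and the two toy lines are invented for this example. Counting a triangle shared by two completed lines only once is also an assumption of this sketch.

```python
def clear_completed_lines(placed, occupied, coord_map):
    """Clear every maximal line completed by the newly placed cells.

    Mutates `occupied` and returns the set of coordinates that vanished.
    """
    # Only lines touching the newly placed cells can have become complete.
    candidates = set()
    for cell in placed:
        candidates.update(coord_map.get(cell, ()))

    cleared = set()
    for line in candidates:
        if len(line) >= 2 and all(cell in occupied for cell in line):
            cleared.update(line)

    occupied.difference_update(cleared)  # the spaces become empty again
    return cleared


# Two hand-written toy lines that share the cell (0, 1).
row = ((0, 0), (0, 1), (0, 2))
diag = ((0, 1), (1, 1))
coord_map = {(0, 0): {row}, (0, 1): {row, diag}, (0, 2): {row}, (1, 1): {diag}}

occupied = {(0, 0), (0, 2), (1, 1)}
placed = [(0, 1)]
occupied.update(placed)          # the shape lands on the grid first
cleared = clear_completed_lines(placed, occupied, coord_map)
print(len(cleared), "triangles cleared")
```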

7. Getting New Shapes: The Refill 🪄

The Trigger: The game only gives you new shapes when a specific condition is met.
The Condition: New shapes appear only when all three of your preview slots are empty at the same time.
How it Happens: This usually occurs right after you place your last available shape (the third one).
The Refill: As soon as the third slot becomes empty, BAM! 🪄 Three brand new, randomly generated shapes instantly appear in the preview slots.
Important: If you place a shape and only one or two slots are empty, you do not get new shapes yet. You must use up all three before the refill happens.
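
The refill rule can be sketched like this. The slot list, the `use_slot` helper, and the `make_shape` callback are all assumptions for illustration; in the real game the new shapes are randomly generated, while this example uses a simple counter so the result is easy to follow.

```python
import itertools


def use_slot(slots, index, make_shape):
    """Empty one preview slot; refill all slots only when every slot is empty."""
    if slots[index] is None:
        raise ValueError(f"slot {index} is already empty")
    slots[index] = None
    if all(shape is None for shape in slots):
        for i in range(len(slots)):
            slots[i] = make_shape()
    return slots


# Stand-in shape generator (the real game creates random shapes).
counter = itertools.count()


def make_shape():
    return f"shape-{next(counter)}"


slots = ["a", "b", "c"]
print(use_slot(slots, 0, make_shape))  # one slot empty: no refill yet
print(use_slot(slots, 2, make_shape))  # two slots empty: still no refill
print(use_slot(slots, 1, make_shape))  # third slot emptied: all three refilled
```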

8. The End of the Road: Game Over 😭

So, how does the game end?

The Condition: The game is over when you cannot legally place any of the three shapes currently available in your preview slots anywhere on the grid.
The Check: After every move (placing a shape and any resulting line clears), and after any potential shape refill, the game checks: "Is there at least one valid spot on the grid for Shape 1? OR for Shape 2? OR for Shape 3?"
No More Moves: If the answer is "NO" for all three shapes (meaning none of them can be placed anywhere according to the Placement Rules), then the game immediately ends.
Strategy: This means you need to be careful! Don't fill up the grid in a way that leaves no room for the types of shapes you might get later. Always try to keep options open! 🤔
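
The game-over check amounts to a brute-force search over shapes and anchors. The sketch below assumes a hypothetical shape layout of `(row_offset, col_offset, is_up)` tuples and assumes it is called after any refill, so at least one slot holds a shape. It is not claimed to match how the engine performs the check.

```python
def is_up(r, c):
    return (r + c) % 2 != 0


def fits(shape, anchor, playable, occupied):
    """Apply the three placement rules to every triangle of the shape."""
    ar, ac = anchor
    return all(
        (ar + dr, ac + dc) in playable
        and (ar + dr, ac + dc) not in occupied
        and is_up(ar + dr, ac + dc) == up
        for dr, dc, up in shape
    )


def is_game_over(slots, rows, cols, playable, occupied):
    """True when no available shape fits at any anchor on the grid."""
    for shape in slots:
        if shape is None:            # slot already used
            continue
        for r in range(rows):
            for c in range(cols):
                if fits(shape, (r, c), playable, occupied):
                    return False     # at least one legal move remains
    return True


# Toy 1 x 3 grid with the middle cell filled.
playable = {(0, 0), (0, 1), (0, 2)}
occupied = {(0, 1)}
pair = [(0, 0, False), (0, 1, True)]   # DOWN then UP, side by side
single_down = [(0, 0, False)]

print(is_game_over([pair, None, None], 1, 3, playable, occupied))
print(is_game_over([pair, single_down, None], 1, 3, playable, occupied))
```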

Core Technologies

- Python 3.10+
- trianglengin: Core game engine (state, actions, rules).
- PyTorch: For the deep learning model (CNNs, optional Transformers, Distributional Value Head) and training, with CUDA/MPS support.
- NumPy: For numerical operations, especially state representation (used by trianglengin and features).
- Ray: For parallelizing self-play data generation and statistics collection across multiple CPU cores/processes.
- Numba: (Optional, used in features.grid_features) For performance optimization of specific grid calculations.
- Cloudpickle: For serializing the experience replay buffer and training checkpoints.
- MLflow: For logging parameters, metrics, and artifacts (checkpoints, buffers) during training runs. Provides the primary web UI dashboard for experiment management.
- TensorBoard: For visualizing metrics during training (e.g., detailed loss curves). Provides a secondary web UI dashboard, often with faster graph updates.
- Pydantic: For configuration management and data validation.
- Typer: For the command-line interface.
- Pytest: For running unit tests.

Project Structure

.
├── .github/workflows/      # GitHub Actions CI/CD
│   └── ci_cd.yml
├── .alphatriangle_data/    # Root directory for ALL persistent data (GITIGNORED)
│   ├── mlruns/             # MLflow internal tracking data & artifact store (for UI)
│   └── runs/               # Local artifacts per run (checkpoints, buffers, TB logs, configs)
│       └── <run_name>/
│           ├── checkpoints/ # Saved model weights & optimizer states
│           ├── buffers/     # Saved experience replay buffers
│           ├── logs/        # Plain text log files for the run
│           ├── tensorboard/ # TensorBoard log files (scalars, etc.)
│           └── configs.json # Copy of run configuration
├── alphatriangle/          # Source code for the AlphaZero agent package
│   ├── __init__.py
│   ├── cli.py              # CLI logic (train command - headless only)
│   ├── config/             # Pydantic configuration models (MCTS, Model, Train, Persistence)
│   │   └── README.md
│   ├── data/               # Data saving/loading logic (DataManager, Schemas)
│   │   └── README.md
│   ├── features/           # Feature extraction logic (operates on trianglengin.GameState)
│   │   └── README.md
│   ├── mcts/               # Monte Carlo Tree Search (operates on trianglengin.GameState)
│   │   └── README.md
│   ├── nn/                 # Neural network definition and wrapper
│   │   └── README.md
│   ├── rl/                 # RL components (Trainer, Buffer, Worker)
│   │   └── README.md
│   ├── stats/              # Statistics collection actor (StatsCollectorActor)
│   │   └── README.md
│   ├── training/           # Training orchestration (Loop, Setup, Runner)
│   │   └── README.md
│   └── utils/              # Shared utilities and types (specific to AlphaTriangle)
│       └── README.md
├── tests/                  # Unit tests (for alphatriangle components)
│   ├── conftest.py
│   ├── mcts/
│   ├── nn/
│   ├── rl/
│   ├── stats/
│   └── training/
├── .gitignore
├── .python-version
├── LICENSE                 # License file (MIT)
├── MANIFEST.in             # Specifies files for source distribution
├── pyproject.toml          # Build system & package configuration (depends on trianglengin)
├── README.md               # This file
└── requirements.txt        # List of dependencies (includes trianglengin)

Key Modules (`alphatriangle`)

- cli: Defines the command-line interface using Typer (only train command, headless). (alphatriangle/cli.py)
- config: Centralized Pydantic configuration classes (excluding EnvConfig and DisplayConfig). (alphatriangle/config/README.md)
- features: Contains logic to convert trianglengin.GameState objects into numerical features (StateType). (alphatriangle/features/README.md)
- nn: Contains the PyTorch nn.Module definition (AlphaTriangleNet) and a wrapper class (NeuralNetwork). (alphatriangle/nn/README.md)
- mcts: Implements the Monte Carlo Tree Search algorithm (Node, run_mcts_simulations), operating on trianglengin.GameState. (alphatriangle/mcts/README.md)
- rl: Contains RL components: Trainer (network updates), ExperienceBuffer (data storage, supports PER), and SelfPlayWorker (Ray actor for parallel self-play using trianglengin.GameState). (alphatriangle/rl/README.md)
- training: Orchestrates the headless training process using TrainingLoop, managing workers, data flow, logging (to console, file, MLflow, TensorBoard), and checkpoints. Includes runner.py for the callable training function. (alphatriangle/training/README.md)
- stats: Contains the StatsCollectorActor (Ray actor) for asynchronous statistics collection. (alphatriangle/stats/README.md)
- data: Manages saving and loading of training artifacts (DataManager) using Pydantic schemas and cloudpickle. (alphatriangle/data/README.md)
- utils: Provides common helper functions and shared type definitions specific to the AlphaZero implementation. (alphatriangle/utils/README.md)

Setup

Clone the repository (for development):

git clone https://github.com/lguibr/alphatriangle.git
cd alphatriangle

Create a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install the package (including trianglengin):

For users:

# This will automatically install trianglengin from PyPI if available
pip install alphatriangle
# Or install directly from Git (installs trianglengin from PyPI)
# pip install git+https://github.com/lguibr/alphatriangle.git

For developers (editable install):

First, ensure trianglengin is installed (ideally in editable mode from its own directory if developing both):
```
# From the trianglengin directory:
# pip install -e .
```

Then, install alphatriangle in editable mode:

# From the alphatriangle directory:
pip install -e .
# Install dev dependencies (optional, for running tests/linting)
pip install -e ".[dev]"  # Installs dev deps from pyproject.toml (quotes keep shells like zsh from expanding the brackets)

Note: Ensure you have the correct PyTorch version installed for your system (CPU/CUDA/MPS). See pytorch.org. Ray may have specific system requirements.

(Optional) Add data directory to .gitignore: Create or edit the .gitignore file in your project root and add the line:
```
.alphatriangle_data/
```

Running the Code (CLI)

Use the alphatriangle command for training:

Show Help:
```
alphatriangle --help
```

Run Training (Headless Only):

alphatriangle train [--seed 42] [--log-level INFO]

Interactive Play/Debug (use the trianglengin CLI): Interactive modes are part of the trianglengin library, not this alphatriangle package.

# Ensure trianglengin is installed
trianglengin play [--seed 42] [--log-level INFO]
trianglengin debug [--seed 42] [--log-level DEBUG]

Monitoring Training (Web Dashboards): This project uses MLflow and TensorBoard to provide web-based dashboards for monitoring. It's recommended to run both concurrently for the best experience:
- MLflow UI (Experiment Overview & Artifacts): Provides the main dashboard for comparing runs, viewing parameters, high-level metrics, and accessing saved artifacts (checkpoints, buffers). Updates occur as data is logged, but may require a browser refresh for the latest points.
```
# Run from the project root directory
mlflow ui --backend-store-uri file:./.alphatriangle_data/mlruns
```
  Access via http://localhost:5000.
- TensorBoard (Near Real-Time Graphs): Offers more frequently updated graphs of scalar metrics (losses, rates, etc.) during a run, making it ideal for closely monitoring training progress.
```
# Run from the project root directory, pointing to the *specific run's* TB logs
tensorboard --logdir .alphatriangle_data/runs/<your_run_name>/tensorboard
# Replace <your_run_name> with the actual name (e.g., train_20240101_120000)
# You can also point to the parent 'runs' directory to see all runs:
# tensorboard --logdir .alphatriangle_data/runs
```
  Access via http://localhost:6006.
Running Unit Tests (Development):
```
pytest tests/
```

Configuration

All major parameters for the AlphaZero agent (MCTS, Model, Training, Persistence) are defined in the Pydantic classes within the alphatriangle/config/ directory. Modify these files to experiment with different settings. Environment configuration (EnvConfig) is defined within the trianglengin library.

Data Storage

All persistent data is stored within the .alphatriangle_data/ directory in the project root.

.alphatriangle_data/mlruns/: Managed by MLflow. Contains MLflow's internal tracking data (parameters, metrics) and its own copy of logged artifacts. This is the source for the MLflow UI.
.alphatriangle_data/runs/: Managed by DataManager. Contains locally saved artifacts for each run (checkpoints, buffers, TensorBoard logs, configs) before/during logging to MLflow. This directory is used for auto-resuming and direct access to TensorBoard logs during a run.
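
For scripting against this layout (for example, to locate a run's TensorBoard logs), the paths can be assembled with pathlib. The `run_layout` helper below is hypothetical and only mirrors the directory tree shown in the Project Structure section; it is not part of DataManager's API.

```python
from pathlib import Path


def run_layout(project_root: str, run_name: str) -> dict[str, Path]:
    """Build the expected paths for one run under .alphatriangle_data/."""
    data_root = Path(project_root) / ".alphatriangle_data"
    run_dir = data_root / "runs" / run_name
    return {
        "mlruns": data_root / "mlruns",          # MLflow tracking data
        "run": run_dir,
        "checkpoints": run_dir / "checkpoints",  # model weights and optimizer states
        "buffers": run_dir / "buffers",          # experience replay buffers
        "logs": run_dir / "logs",                # plain text log files
        "tensorboard": run_dir / "tensorboard",  # TensorBoard log files
        "config": run_dir / "configs.json",      # copy of the run configuration
    }


layout = run_layout(".", "train_20240101_120000")
print(layout["tensorboard"].as_posix())
```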

Maintainability

This project includes README files within each major alphatriangle submodule. Please keep these READMEs updated when making changes to the code's structure, interfaces, or core logic.

Project details

Release history

- 1.0.0 (this version): Apr 20, 2025
- 0.4.0: Apr 18, 2025
- 0.3.3: Apr 18, 2025
- 0.3.2: Apr 18, 2025
- 0.1.0: Apr 14, 2025

Download files

Source Distribution

alphatriangle-1.0.0.tar.gz (117.0 kB view details)

Uploaded Apr 20, 2025 Source

Built Distribution

alphatriangle-1.0.0-py3-none-any.whl (141.6 kB view details)

Uploaded Apr 20, 2025 Python 3

File details

Details for the file alphatriangle-1.0.0.tar.gz.

File metadata

Download URL: alphatriangle-1.0.0.tar.gz
Upload date: Apr 20, 2025
Size: 117.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for alphatriangle-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`b6c287b5c8e16540fe7be85aa8f15662d205d84e89c34e7ccb810ba5838d4114`
MD5	`d6e07995202404287acacbbdcdbbdf01`
BLAKE2b-256	`cf720e91a565fa6627920fa761fc62872f5e5cd031aaf784b3fd053f04d7d097`

Provenance

The following attestation bundles were made for alphatriangle-1.0.0.tar.gz:

Publisher: ci_cd.yml on lguibr/alphatriangle

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: alphatriangle-1.0.0.tar.gz
- Subject digest: b6c287b5c8e16540fe7be85aa8f15662d205d84e89c34e7ccb810ba5838d4114
- Sigstore transparency entry: 199962173
- Sigstore integration time: Apr 20, 2025
Source repository:
- Permalink: lguibr/alphatriangle@99509e8fa4a8e03dcdad3cc8dd55affab48d54b3
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/lguibr
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci_cd.yml@99509e8fa4a8e03dcdad3cc8dd55affab48d54b3
- Trigger Event: push

File details

Details for the file alphatriangle-1.0.0-py3-none-any.whl.

File metadata

Download URL: alphatriangle-1.0.0-py3-none-any.whl
Upload date: Apr 20, 2025
Size: 141.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for alphatriangle-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cdbd275c97d9e9aecc07297fe2118da4efc2b22db0e5477d71712ef6c4903a46`
MD5	`9050662e25aefed635b7e20845ad22fa`
BLAKE2b-256	`021649409d6a8914fd05502bd6f3c16a13edb01f368a6d2e7bd69fca16188643`

Provenance

The following attestation bundles were made for alphatriangle-1.0.0-py3-none-any.whl:

Publisher: ci_cd.yml on lguibr/alphatriangle

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: alphatriangle-1.0.0-py3-none-any.whl
- Subject digest: cdbd275c97d9e9aecc07297fe2118da4efc2b22db0e5477d71712ef6c4903a46
- Sigstore transparency entry: 199962174
- Sigstore integration time: Apr 20, 2025
Source repository:
- Permalink: lguibr/alphatriangle@99509e8fa4a8e03dcdad3cc8dd55affab48d54b3
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/lguibr
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci_cd.yml@99509e8fa4a8e03dcdad3cc8dd55affab48d54b3
- Trigger Event: push
