Multi-game puzzle server for telnet - LLMs with MCP solvers welcome!

Puzzle Arcade Server

A multi-game puzzle server and LLM reasoning benchmark arcade hosting 24 different logic puzzle types, built using the chuk-protocol-server framework.

Perfect for:

🤖 LLM Agent Testing - Benchmark reasoning capabilities across constraint types
🎯 CP-SAT Education - Learn constraint programming through progressive puzzles
💼 Business Demos - Map puzzle patterns to real scheduling, optimization, and allocation problems
🔧 MCP Tool Integration - Showcase CHUK + constraint solver workflows

Each puzzle demonstrates specific constraint patterns (AllDifferent, Optimization, Connectivity, Boolean SAT, etc.) and maps to business use cases (scheduling, resource allocation, routing, etc.).

Try It Now

A live demo server is running on Fly.io. Try it instantly:

# Connect via Telnet (IPv6)
telnet 2a09:8280:1::b8:79f4:0 8023

# WebSocket connections
ws://puzzle-arcade-server.fly.dev:8025/ws

Once connected, type help to see available games, or sudoku easy to start playing!

Features

24 Puzzle Games with three difficulty levels each (easy, medium, hard)
- 7 Classic Logic Puzzles - Sudoku, KenKen, Kakuro, Binary, Futoshiki, Nonogram, Logic Grid
- 7 Advanced CP-SAT Puzzles - Killer Sudoku, Lights Out, Mastermind, Slitherlink, Bridges, Hitori, Shikaku
- 5 Specialized Constraint Puzzles - Hidato, Tents and Trees, Fillomino, Star Battle, Sokoban
- 2 Optimization Challenges - Knapsack, Task Scheduler
- 3 Advanced Reasoning Puzzles - Nurikabe, Einstein's Puzzle, Minesweeper
Agent-Friendly Mode - Structured output with clear markers for AI agents and tools
- Enable with mode agent command
- Machine-parseable grid format with clear start/end markers
- Compact output optimized for LLM tool integration
Evaluation Harness (puzzle-arcade-eval) - Built-in benchmarking CLI
- Batch evaluation with configurable episodes
- Multiple output formats (JSON, CSV, Markdown)
- Metrics: moves, invalid moves, hints, solve time
- Reproducible with deterministic seeds
Multiple transport protocols:
- Telnet (port 8023) - Classic telnet protocol
- TCP (port 8024) - Raw TCP connections
- WebSocket (port 8025) - Modern WebSocket protocol
- WebSocket-Telnet (port 8026) - WebSocket with telnet negotiation
Interactive menu-driven interface with game selection
Hint system for when you're stuck
Solution checker and auto-solver for all games
Clean ASCII art grids - perfectly aligned for easy parsing
Deterministic seeding - Replay any puzzle with the same seed
Gymnasium-compatible RL Environment (PuzzleEnv) for training agents
Comprehensive test suite (1067 tests, 94% coverage)
Modern Python best practices:
- Pydantic v2 native - All models use ConfigDict for type safety
- Async native - Full async/await support throughout
- Type-safe - No dict["key"] patterns, only typed models
- Enum-based - No magic strings, proper enum constants
Modern Python packaging with pyproject.toml
Docker and Fly.io deployment ready

Available Games

Classic Logic Puzzles

Game	Grid Size	Constraint Types	Status
Sudoku	9×9	AllDifferent (rows, cols, boxes)	✅ Complete
KenKen	4×4 to 6×6	Arithmetic cages + AllDifferent	✅ Complete
Kakuro	5×5 to 8×8	Sum constraints + AllDifferent	✅ Complete
Binary Puzzle	6×6 to 10×10	Adjacency limits + Equal counts	✅ Complete
Futoshiki	4×4 to 6×6	Inequalities + AllDifferent	✅ Complete
Nonogram	5×5 to 10×10	Line sum constraints + Blocks	✅ Complete
Logic Grid	Variable	Category associations + Logic	✅ Complete
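
The AllDifferent pattern that recurs in this table can be made concrete with a short sketch. It illustrates the constraint only and is not the server's validation code; the helper names are invented for this example, and 0 is assumed to mark an empty cell.

```python
def all_different(values: list[int]) -> bool:
    """True if no non-zero value repeats (0 is assumed to mean 'empty')."""
    filled = [v for v in values if v != 0]
    return len(filled) == len(set(filled))


def sudoku_units_ok(grid: list[list[int]]) -> bool:
    """Check AllDifferent over every row, column, and 3x3 box of a 9x9 grid."""
    rows = [list(row) for row in grid]
    cols = [[grid[r][c] for r in range(9)] for c in range(9)]
    boxes = [
        [grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)]
        for br in range(0, 9, 3)
        for bc in range(0, 9, 3)
    ]
    return all(all_different(unit) for unit in rows + cols + boxes)
```

A partially filled grid passes as long as no unit already contains a duplicate, which is the same check a move validator would need after each placement.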

Advanced CP-SAT Puzzles

Game	Grid Size	Constraint Types	Status
Killer Sudoku	9×9	Linear constraints + AllDifferent + Cages	✅ Complete
Lights Out	5×5 to 7×7	Boolean XOR constraints (SAT)	✅ Complete
Mastermind	4-6 pegs	Deduction + Feedback constraints	✅ Complete
Slitherlink	5×5 to 10×10	Global loop + Edge constraints	✅ Complete
Bridges	7×7 to 11×11	Connectivity + Degree constraints	✅ Complete
Hitori	5×5 to 9×9	AllDifferent + Adjacency + Connectivity	✅ Complete
Shikaku	6×6 to 10×10	Area partitioning + Rectangle covering	✅ Complete
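
To show why Lights Out is listed as a Boolean XOR puzzle, here is a minimal sketch of the toggle rule. It assumes a square grid of 0/1 values and 0-based indices; the function name is hypothetical and this is not the server's implementation.

```python
def press(grid: list[list[int]], row: int, col: int) -> list[list[int]]:
    """Return a new grid with (row, col) and its orthogonal neighbours toggled."""
    size = len(grid)
    new_grid = [line[:] for line in grid]
    for r, c in ((row, col), (row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)):
        if 0 <= r < size and 0 <= c < size:
            new_grid[r][c] ^= 1  # XOR with 1 flips 0 <-> 1
    return new_grid
```

Because XOR is its own inverse, pressing the same cell twice restores the grid, and the order of presses does not matter. That is what lets the puzzle be stated as a system of Boolean equations.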

Specialized Constraint Puzzles

Game	Grid Size	Constraint Types	Status
Hidato	5×5 to 9×9	Sequential adjacency + Hamiltonian path	✅ Complete
Tents and Trees	6×6 to 10×10	Bipartite matching + Adjacency avoidance	✅ Complete
Fillomino	6×6 to 10×10	Region growth + Self-referential constraints	✅ Complete
Star Battle	6×6 to 10×10	Multi-region placement + Adjacency avoidance	✅ Complete
Sokoban	6×6 to 10×10	Spatial planning + Irreversible actions (optimization)	✅ Complete

Optimization Challenges

Game	Problem Size	Constraint Types	Status
Knapsack	5-12 items	Value maximization + Capacity constraint	✅ Complete
Task Scheduler	4-8 tasks	Makespan minimization + Dependencies + Resources	✅ Complete
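
The "value maximization + capacity constraint" pattern of the Knapsack game corresponds to the classic 0/1 knapsack problem. The sketch below uses the textbook dynamic-programming solution with integer weights; it is an illustration of the problem, not the solver or generator used by the server.

```python
def knapsack_max_value(values: list[int], weights: list[int], capacity: int) -> int:
    """Best total value of a subset of items whose total weight fits the capacity."""
    best = [0] * (capacity + 1)  # best[c] = best value achievable with capacity c
    for value, weight in zip(values, weights):
        # Iterate capacities downwards so each item is used at most once.
        for cap in range(capacity, weight - 1, -1):
            best[cap] = max(best[cap], best[cap - weight] + value)
    return best[capacity]


print(knapsack_max_value([60, 100, 120], [10, 20, 30], 50))  # → 220
```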

Advanced Reasoning Puzzles

Game	Grid Size	Constraint Types	Status
Nurikabe	6×6 to 10×10	Connectivity + Island sizes + No 2×2 blocks	✅ Complete
Einstein's Puzzle	5 houses × 5 attributes	Multi-attribute deduction + Logic chains	✅ Complete
Minesweeper	6×6 to 10×10	Probabilistic reasoning + Safe deduction	✅ Complete

Solver Profiles & Business Mapping

Each game includes metadata for constraint types, business analogies, and complexity profiles, making it easy to:

Select puzzles by constraint pattern - Need to demonstrate Boolean SAT? → Lights Out
Map to business use cases - Task Scheduler → Sprint Planning, Knapsack → Portfolio Selection
Benchmark LLM reasoning - Compare model performance across different constraint densities

Example: Query Games by Profile

from puzzle_arcade_server.games import AVAILABLE_GAMES

# Find all optimization problems
optimization_games = [
    name for name, game_class in AVAILABLE_GAMES.items()
    if "optimization" in game_class().constraint_types
]
# → ['knapsack', 'scheduler']

# Find games that model resource allocation
resource_games = [
    name for name, game_class in AVAILABLE_GAMES.items()
    if "resource_allocation" in game_class().business_analogies
]
# → ['knapsack', 'scheduler']  (same registry order as above)
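
If several tags need to be queried, the same idea extends to building an inverse index once. The sketch below is self-contained, so it uses stand-in classes instead of the real registry; the stub classes and every tag except "optimization" are assumptions made for the example.

```python
from collections import defaultdict


class KnapsackStub:
    constraint_types = ["optimization", "capacity"]  # "capacity" is a made-up tag


class SchedulerStub:
    constraint_types = ["optimization", "precedence"]  # "precedence" is a made-up tag


class LightsOutStub:
    constraint_types = ["boolean_sat"]  # made-up tag


def index_by_constraint(registry: dict[str, type]) -> dict[str, list[str]]:
    """Map each constraint tag to the game names that declare it."""
    index: dict[str, list[str]] = defaultdict(list)
    for name, game_class in registry.items():
        for tag in game_class().constraint_types:
            index[tag].append(name)
    return dict(index)


stub_registry = {
    "knapsack": KnapsackStub,
    "scheduler": SchedulerStub,
    "lights_out": LightsOutStub,
}
print(index_by_constraint(stub_registry)["optimization"])  # → ['knapsack', 'scheduler']
```

With the real package, passing AVAILABLE_GAMES in place of stub_registry should work the same way, provided every game class can be constructed with default arguments as in the queries above.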

Quick Reference: Constraint Types to Business Problems

Constraint Pattern	Puzzle Examples	Business Use Cases
Optimization	Knapsack, Scheduler	Portfolio selection, Sprint planning, Budget allocation
Precedence	Scheduler	Project dependencies, Workflow sequencing
Sequential Adjacency	Hidato	Path planning, Route sequencing, Tour optimization
Hamiltonian Path	Hidato	Traveling salesman, Circuit design
Bipartite Matching	Tents and Trees	Job assignment, Resource pairing
Region Growth	Fillomino	Territory expansion, Cluster formation
Spatial Planning	Sokoban	Warehouse logistics, Movement planning
Connectivity	Nurikabe, Slitherlink	Network design, Routing, Zone planning
Global Loop	Slitherlink	Circuit design, Path finding
Boolean SAT	Lights Out	Feature dependencies, Toggle systems
Cage Sums	Killer Sudoku, Kakuro	Team budgets, Grouped constraints
AllDifferent	Sudoku, KenKen	Resource uniqueness, Assignment problems

Quick Start

Prerequisites

Python 3.11 or higher
UV (recommended) or pip

Installation

From Source (Development)

Using UV (Recommended)

# Clone the repository
git clone https://github.com/chrishayuk/puzzle-arcade-server.git
cd puzzle-arcade-server

# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install development dependencies
make dev-install

# Run the server
make run

Using pip

# Clone the repository
git clone https://github.com/chrishayuk/puzzle-arcade-server.git
cd puzzle-arcade-server

# Install in development mode with dev dependencies
pip install -e ".[dev]"

# Run the server (note: this launcher command itself goes through uv, so uv must be installed)
PYTHONPATH=. uv run --with chuk-protocol-server chuk-protocol-server server-launcher -c config.yaml

Using Make (All Commands)

# See all available commands
make help

# Development workflow
make dev-install      # Install dev dependencies
make run              # Run the server
make test             # Run tests
make test-cov         # Run tests with coverage report
make check            # Run linting and type checking
make format           # Format code with ruff
make security         # Run security checks

# Docker workflow
make docker-build     # Build Docker image
make docker-run       # Run in Docker container

# Examples
make example-telnet              # Browse games via telnet
make example-telnet-sudoku       # Sudoku demo
make example-telnet-kenken       # KenKen demo
make example-ws                  # WebSocket tour
make example-ws-interactive      # Interactive WebSocket mode

# Deployment
make fly-deploy       # Deploy to Fly.io
make fly-logs         # View Fly.io logs

Docker Setup

Build and run with Docker:

# Using Make
make docker-run

# Or manually
docker build -t puzzle-arcade-server .
docker run -p 8023:8023 -p 8024:8024 -p 8025:8025 -p 8026:8026 puzzle-arcade-server

Connecting to the Server

Local Development

Via Telnet:

telnet localhost 8023

Via Netcat (TCP):

nc localhost 8024

Via WebSocket:

ws://localhost:8025/ws
ws://localhost:8026/ws
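
For scripted access, the raw TCP port can also be driven from Python with the standard library. The following is a minimal sketch, not an official client: the function name, the CRLF line ending, and the "read until the server goes quiet" strategy are assumptions made for this example.

```python
import asyncio


async def send_commands(
    host: str, port: int, commands: list[str], idle_timeout: float = 0.5
) -> str:
    """Send each command on its own line and return everything the server printed."""
    reader, writer = await asyncio.open_connection(host, port)
    chunks: list[str] = []

    async def read_until_idle() -> None:
        # Collect output until the server has been quiet for idle_timeout seconds.
        while True:
            try:
                data = await asyncio.wait_for(reader.read(4096), timeout=idle_timeout)
            except asyncio.TimeoutError:
                return
            if not data:  # connection closed by the server
                return
            chunks.append(data.decode("utf-8", errors="replace"))

    try:
        await read_until_idle()  # initial menu, if the server sends one on connect
        for command in commands:
            writer.write(command.encode("utf-8") + b"\r\n")  # line ending is an assumption
            await writer.drain()
            await read_until_idle()
    finally:
        writer.close()
        await writer.wait_closed()
    return "".join(chunks)
```

With a server running locally, asyncio.run(send_commands("localhost", 8024, ["mode agent", "sudoku easy", "show"])) would return the collected transcript as one string.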

Game Menu

When you connect, you'll see the main menu:

==================================================
       WELCOME TO THE PUZZLE ARCADE!
==================================================

CLASSIC LOGIC PUZZLES:
  1) Sudoku          - Classic logic puzzle - fill 9x9 grid with digits 1-9
  2) KenKen          - Arithmetic cage puzzle - combine math and logic
  3) Kakuro          - Crossword math puzzle - fill runs with unique digits that sum to clues
  4) Binary Puzzle   - Fill grid with 0s and 1s - no three in a row, equal counts
  5) Futoshiki       - Inequality number puzzle - fill grid with constraints
  6) Nonogram        - Picture logic puzzle - reveal image from number clues
  7) Logic Grid      - Deductive reasoning puzzle - match attributes using logic

ADVANCED CP-SAT PUZZLES:
  8) Killer Sudoku   - Sudoku + Kakuro - regions must sum to targets
  9) Lights Out      - Toggle lights to turn all off - XOR constraint puzzle
 10) Mastermind      - Code-breaking with logical deduction and feedback
 11) Slitherlink     - Draw a single loop - numbers show edge counts
 12) Bridges         - Connect islands with bridges - satisfy all numbers
 13) Hitori          - Shade cells to eliminate duplicates - no adjacent shading
 14) Shikaku         - Divide grid into rectangles matching areas

SPECIALIZED CONSTRAINT PUZZLES:
 15) Hidato          - Sequential path puzzle - connect numbers adjacently
 16) Tents           - Place tents next to trees - bipartite matching puzzle
 17) Fillomino       - Fill regions with numbers matching region size
 18) Star Battle     - Place stars avoiding adjacency - multi-region placement
 19) Sokoban         - Push boxes to targets - spatial planning puzzle

OPTIMIZATION CHALLENGES:
 20) Knapsack        - Maximize value within capacity constraints
 21) Task Scheduler  - Minimize makespan with dependencies and resources

ADVANCED REASONING PUZZLES:
 22) Nurikabe        - Island and sea puzzle - connectivity constraints
 23) Einstein's Puzzle - Who owns the fish? Multi-attribute deduction
 24) Minesweeper     - Find all mines using logical deduction

Commands:
  <number>  - Select game by number
  <name>    - Select game by name (e.g., 'sudoku')
  help      - Show this menu again
  quit      - Exit the server
==================================================

Agent-Friendly Mode

The server includes a special agent mode designed for AI tools and LLM integration:

Enabling Agent Mode

> mode agent
Output mode set to: agent

Agent Mode Features

Structured Output - Grid data is wrapped with clear start/end markers:

---GAME-START---
GAME: Sudoku
DIFFICULTY: medium
MOVES: 3
---GRID-START---
  | 1 2 3 | 4 5 6 | 7 8 9 |
  -------------------------
1 | . . 3 | . 2 . | 6 . . |
...
---GRID-END---
---GAME-END---

Benefits for AI Agents:

Easy parsing with regex: ---GRID-START---(.*?)---GRID-END--- (use a dot-matches-newline flag such as re.DOTALL, because the grid spans several lines)
Consistent metadata format (GAME, DIFFICULTY, MOVES)
No decorative text or banners to filter out
Minimal token usage compared to normal mode
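
As an illustration of how an agent might consume this format, the sketch below extracts the metadata and the grid lines from one agent-mode response. The function name and the returned shape are choices made for this example; only the markers and the GAME, DIFFICULTY, and MOVES keys come from the format shown above.

```python
import re

GRID_RE = re.compile(r"---GRID-START---(.*?)---GRID-END---", re.DOTALL)
META_RE = re.compile(r"^(GAME|DIFFICULTY|MOVES): (.+)$", re.MULTILINE)


def parse_agent_output(text: str) -> tuple[dict[str, str], list[str]]:
    """Return (metadata, grid_lines) parsed from one agent-mode response."""
    # strip() also removes a trailing carriage return left by telnet-style line endings
    meta = {key.lower(): value.strip() for key, value in META_RE.findall(text)}
    match = GRID_RE.search(text)
    grid_lines = match.group(1).strip("\r\n").splitlines() if match else []
    return meta, grid_lines
```

The grid rows are returned as raw strings because their layout differs per game; turning them into cells is left to game-specific code.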

Switching Modes:

mode normal - Human-friendly output (default)
mode agent - Machine-parseable structured output
mode compact - Reserved for future use

Gymnasium-Compatible RL Environment

The project includes a Gymnasium-compatible environment for training reinforcement learning agents:

Quick Start

import asyncio

from puzzle_arcade_server.gym_env import PuzzleEnv


async def main() -> None:
    # Create environment for any of the 24 games
    env = PuzzleEnv("sudoku", difficulty="easy", seed=42)

    # Reset to start a new episode
    obs, info = await env.reset()

    # Take actions (text commands or tuples)
    obs, reward, terminated, truncated, info = await env.step("place 1 1 5")

    # Or use tuple format
    obs, reward, terminated, truncated, info = await env.step(("place", 1, 1, 5))

    # Get available games
    games = PuzzleEnv.available_games()
    # → ['sudoku', 'kenken', 'minesweeper', ...]


# reset() and step() are awaited, so they must run inside an event loop
asyncio.run(main())

Features

All 24 games accessible through unified API
Configurable rewards for correct moves, invalid attempts, completion bonuses
Hint system with optional budget limits
Solver-free mode for pure reasoning benchmarks
Efficiency scoring based on optimal step counts
Deterministic seeding for reproducible experiments
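
A typical way to use such an environment is an episode loop that stops on terminated or truncated. The sketch below is written against the async reset/step shape shown in the quick start, but to stay self-contained it runs on a toy stand-in environment; CountdownEnv, run_episode, and the policy signature are all hypothetical names for this example.

```python
import asyncio
from typing import Any, Callable


class CountdownEnv:
    """Toy stand-in that mimics the async reset/step shape (not the real PuzzleEnv)."""

    def __init__(self, target: int = 3) -> None:
        self.target = target
        self.moves = 0

    async def reset(self) -> tuple[dict[str, Any], dict[str, Any]]:
        self.moves = 0
        return {"moves": 0, "is_complete": False}, {}

    async def step(self, action: Any) -> tuple[dict[str, Any], float, bool, bool, dict[str, Any]]:
        self.moves += 1
        done = self.moves >= self.target
        return {"moves": self.moves, "is_complete": done}, 1.0, done, False, {}


async def run_episode(env: Any, policy: Callable[[dict[str, Any]], Any], max_steps: int = 100) -> float:
    """Run one episode and return the summed reward."""
    obs, _info = await env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, _info = await env.step(policy(obs))
        total += reward
        if terminated or truncated:
            break
    return total


print(asyncio.run(run_episode(CountdownEnv(target=3), lambda obs: "place 1 1 5")))  # → 3.0
```

The max_steps cap is a safeguard for policies that never finish a puzzle; whether the real environment truncates episodes on its own is not stated here.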

Observation Space

obs = {
    "game": "sudoku",
    "difficulty": "easy",
    "seed": 42,
    "moves": 5,
    "invalid_moves": 1,
    "hints_used": 2,
    "is_complete": False,
    "grid": [[4, 0, 8, ...], ...]  # Game-specific state
}

Reward Configuration

from puzzle_arcade_server.gym_env import PuzzleEnv

env = PuzzleEnv("kenken", reward_config={
    "correct_placement": 1.0,      # Reward for valid moves
    "invalid_attempt": -0.5,       # Penalty for invalid moves
    "completion_bonus": 10.0,      # Bonus for solving
    "hint_penalty": -0.1,          # Penalty for using hints
    "efficiency_multiplier": 2.0,  # Scales completion bonus by efficiency
})
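
To make the roles of these keys easier to picture, here is one plausible way they could combine into an episode return. This is purely an illustrative reading of the comments above, not the formula PuzzleEnv uses: the RewardConfig dataclass, the episode_return function, and the definition of efficiency as optimal steps divided by actual valid moves are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class RewardConfig:
    """Hypothetical mirror of the reward_config keys shown above."""

    correct_placement: float = 1.0
    invalid_attempt: float = -0.5
    completion_bonus: float = 10.0
    hint_penalty: float = -0.1
    efficiency_multiplier: float = 2.0


def episode_return(
    cfg: RewardConfig,
    valid_moves: int,
    invalid_moves: int,
    hints_used: int,
    solved: bool,
    optimal_steps: int,
) -> float:
    """Sum per-move rewards, then add an efficiency-scaled bonus if solved."""
    total = (
        valid_moves * cfg.correct_placement
        + invalid_moves * cfg.invalid_attempt
        + hints_used * cfg.hint_penalty
    )
    if solved and valid_moves > 0:
        # Assumed definition: 1.0 when the solve used the optimal number of steps.
        efficiency = min(1.0, optimal_steps / valid_moves)
        total += cfg.completion_bonus * cfg.efficiency_multiplier * efficiency
    return total
```

Under these assumptions, a solve that takes twice the optimal number of moves earns half of the scaled completion bonus.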

Solver Configuration

from puzzle_arcade_server.gym_env import PuzzleEnv
from puzzle_arcade_server.models import SolverConfig

# Solver-free mode (no hints allowed)
config = SolverConfig.solver_free()
env = PuzzleEnv("sudoku", solver_config=config)

# Limited hints
config = SolverConfig(hint_budget=5, hint_penalty=0.1)
env = PuzzleEnv("sudoku", solver_config=config)

Evaluation Harness

The project includes a built-in evaluation harness for benchmarking puzzle-solving agents:

Quick Start

# List all available games
puzzle-arcade-eval --list-games

# Evaluate a specific game (10 episodes, medium difficulty)
puzzle-arcade-eval sudoku -d medium -n 10 -v

# Evaluate all games (5 episodes each)
puzzle-arcade-eval --all -d easy -n 5

# Output as JSON for analysis
puzzle-arcade-eval sudoku -n 20 -o json > results.json

Using Make Targets

make eval           # Quick evaluation (3 episodes per game)
make eval-sudoku    # Evaluate Sudoku (10 episodes)
make eval-all       # Evaluate all games (10 episodes each)
make eval-json      # Output as JSON
make list-games     # List available games

Sample Output

Sudoku Medium Evaluation (10 episodes)
==================================================
Solved:     10/10 (100.0%)
Avg Moves:  45.3
Avg Invalid: 0.0
Avg Time:   12ms

Output Formats

text (default) - Human-readable summary
json - Structured JSON for programmatic analysis
csv - Spreadsheet-compatible format
markdown - Documentation-ready tables

Metrics Collected

Metric	Description
`solved`	Whether the puzzle was solved
`moves_made`	Number of valid moves
`invalid_moves`	Number of rejected moves
`hints_used`	Number of hints requested
`wall_time_ms`	Time to solve in milliseconds
`seed`	Puzzle seed for reproducibility
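
Per-episode records with these metrics can be aggregated in a few lines. The sketch below assumes each episode is a dict keyed by the metric names in the table; the actual layout of the JSON emitted by puzzle-arcade-eval may differ, so treat the input shape and the summarize function as assumptions.

```python
from statistics import mean


def summarize(episodes: list[dict]) -> dict:
    """Aggregate per-episode metrics into a small summary."""
    if not episodes:
        return {"episodes": 0, "solve_rate": 0.0}
    return {
        "episodes": len(episodes),
        "solve_rate": sum(1 for e in episodes if e["solved"]) / len(episodes),
        "avg_moves": mean(e["moves_made"] for e in episodes),
        "avg_invalid": mean(e["invalid_moves"] for e in episodes),
        "avg_time_ms": mean(e["wall_time_ms"] for e in episodes),
        "seeds": [e["seed"] for e in episodes],  # keep seeds so runs can be replayed
    }
```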

Universal Game Commands

All games support these commands:

Starting and Managing Games

<number> [difficulty] - Select game by number (e.g., 1 medium)
<name> [difficulty] - Select game by name (e.g., sudoku hard)
show - Display the current grid
mode <normal|agent|compact> - Set output mode
help - Show game-specific commands and rules
menu - Return to main menu
quit - Exit the server

Playing Games

place <row> <col> <value> - Place a number/value on the grid
- Example: place 1 5 7 (places 7 at row 1, column 5)
clear <row> <col> - Clear a cell you've filled
hint - Get a hint for the next move
check - Check your progress
solve - Show the solution (ends current game)
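
For tools that build these commands programmatically, a pair of small helpers is usually enough. The sketch below converts an action tuple to the text form and parses a place command back into integers; both function names are hypothetical and the server's own parser may accept more variations.

```python
def format_command(action: tuple) -> str:
    """Turn ("place", 1, 5, 7) into the text command "place 1 5 7"."""
    return " ".join(str(part) for part in action)


def parse_place(command: str) -> tuple[int, int, int]:
    """Parse "place <row> <col> <value>" into (row, col, value)."""
    parts = command.split()
    if len(parts) != 4 or parts[0].lower() != "place":
        raise ValueError(f"not a place command: {command!r}")
    row, col, value = (int(p) for p in parts[1:])
    return row, col, value


print(format_command(("place", 1, 5, 7)))  # → place 1 5 7
print(parse_place("place 1 5 7"))          # → (1, 5, 7)
```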

Special Commands (Game-Specific)

Logic Grid: connect and exclude commands for associations
See in-game help for game-specific commands

Example Gameplay Sessions

Sudoku

> sudoku medium

==================================================
SUDOKU - MEDIUM MODE
==================================================
Fill the grid so that every row, column, and 3x3 box
contains the digits 1-9 without repetition.

Type 'help' for commands or 'hint' for a clue.
==================================================

  | 1 2 3 | 4 5 6 | 7 8 9 |
  -------------------------
1 | . . 3 | . 2 . | 6 . . |
2 | 9 . . | 3 . 5 | . . 1 |
3 | . . 1 | 8 . 6 | 4 . . |
  -------------------------
4 | . . 8 | 1 . 2 | 9 . . |
5 | 7 . . | . . . | . . 8 |
6 | . . 6 | 7 . 8 | 2 . . |
  -------------------------
7 | . . 2 | 6 . 9 | 5 . . |
8 | 8 . . | 2 . 3 | . . 9 |
9 | . . 5 | . 1 . | 3 . . |
  -------------------------
Moves made: 0
==================================================

> hint
Hint: Try placing 4 at row 1, column 1

> place 1 1 4
Number placed successfully!

> check
Puzzle not yet complete. Keep going!
Moves made: 1

KenKen

> kenken easy

==================================================
KENKEN - EASY MODE
==================================================
KENKEN RULES:
- Fill 4x4 grid with 1-4
- No repeats in rows or columns
- Satisfy cage arithmetic constraints
- Operations: + - * /
==================================================

  | 1  | 2  | 3  | 4  |
  +----+----+----+----+
1 | .8+| .  | .3 | .2 |
  +----+----+----+----+
2 | .  | .6+| .  | .3-|
  +----+----+----+----+
3 | .2 | .6+| .8+| .  |
  +----+----+----+----+
4 | .  | .  | .  | .  |
  +----+----+----+----+

Cages:
  8+: (1,1), (1,2), (2,1)
  3: (1,3)
  2: (1,4)
  ...

> place 1 3 3
Number placed successfully!

Architecture

This server is built on the chuk-protocol-server framework, which provides:

Multiple transport protocol support (Telnet, TCP, WebSocket, WS-Telnet)
Telnet protocol negotiation (IAC, WILL, WONT, DO, DONT)
WebSocket handling with ping/pong keepalive
Connection management and monitoring
Asynchronous I/O with Python asyncio

Game Architecture

Each game is a self-contained module with all logic co-located:

games/
├── _base/              # Base classes
│   ├── game.py         # PuzzleGame ABC
│   └── commands.py     # GameCommandHandler ABC
├── sudoku/
│   ├── __init__.py     # Exports SudokuGame
│   ├── game.py         # Game logic
│   ├── config.py       # SudokuConfig
│   └── commands.py     # Command handler
├── minesweeper/
│   ├── __init__.py
│   ├── game.py
│   └── config.py
└── ... (24 games total)

All games extend the PuzzleGame abstract base class with deterministic seeding:

# Simplified excerpt of games/_base/game.py
import random
from abc import ABC, abstractmethod

from puzzle_arcade_server.models import MoveResult

class PuzzleGame(ABC):
    def __init__(self, difficulty: str = "easy", seed: int | None = None):
        self.seed = seed if seed is not None else random.randint(0, 2**32 - 1)
        self._rng = random.Random(self.seed)  # Deterministic RNG
        # ...

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def constraint_types(self) -> list[str]: ...

    @property
    @abstractmethod
    def business_analogies(self) -> list[str]: ...

    @abstractmethod
    async def generate_puzzle(self) -> None: ...

    @abstractmethod
    async def validate_move(self, *args) -> MoveResult: ...

    @abstractmethod
    def is_complete(self) -> bool: ...

    @abstractmethod
    def render_grid(self) -> str: ...

Handler Architecture

The ArcadeHandler class manages:

Menu-driven game selection
Command parsing and routing (delegating to game-specific handlers)
Grid display with proper formatting
Game state management per connection
Multi-game support

Development

Setup Development Environment

# Clone the repository
git clone https://github.com/chrishayuk/puzzle-arcade-server.git
cd puzzle-arcade-server

# Install development dependencies (with UV)
make dev-install

# Or with pip
pip install -e ".[dev]"

Testing

The project has comprehensive test coverage (94%, 1067 tests):

# Run all tests
make test

# Run tests with coverage report
make test-cov

# Run tests in watch mode
make test-watch

# View coverage report in browser
make serve-coverage

Coverage by Module

src/puzzle_arcade_server/games/_base/             86%   # Base classes (abstract defaults)
src/puzzle_arcade_server/games/sudoku/            92%   # Sudoku module
src/puzzle_arcade_server/games/kenken/            90%   # KenKen module
src/puzzle_arcade_server/games/minesweeper/       96%   # Minesweeper module
src/puzzle_arcade_server/games/sokoban/           83%   # Sokoban (complex pathfinding)
src/puzzle_arcade_server/games/.../               90%+  # All other games
src/puzzle_arcade_server/gym_env.py               90%   # Gymnasium environment
src/puzzle_arcade_server/models/                  90%+  # Pydantic models
------------------------------------------------------
TOTAL                                              94%  🎯

Most modules meet the 90%+ coverage threshold. The remaining gaps are in abstract base class defaults and complex pathfinding algorithms.

Code Quality

The project follows modern Python best practices with a 9.8/10 compliance score:

Tooling

Ruff: Fast linter and formatter (replaces black + flake8)
MyPy: Static type checking
Pytest: Testing framework with async support
Bandit: Security vulnerability scanning

Code Standards

✅ Pydantic v2 Native (10/10) - All models use ConfigDict, zero deprecation warnings
✅ Async Native (9.5/10) - All I/O operations use async/await properly
✅ Type-Safe (10/10) - No dict["key"] patterns, only typed Pydantic models
✅ No Magic Strings (10/10) - All constants use enums or typed constants
✅ Test Coverage (9.5/10) - 94% overall, most files ≥90%

Quality Metrics

1067 tests - All passing ✅
94% coverage - Exceeds 90% threshold ✅
Zero linting errors - Clean codebase ✅
Full type safety - MyPy passes ✅
Deterministic seeding - Reproducible puzzles ✅

# Run all checks (lint + typecheck + test + security)
make check

# Run linter
make lint

# Format code
make format

# Type checking
make typecheck

# Security scanning
make security

Running Example Clients

# Telnet client examples
make example-telnet              # Browse all games
make example-telnet-sudoku       # Sudoku demo
make example-telnet-kenken       # KenKen demo
make example-telnet-interactive  # Interactive mode

# WebSocket client examples
make example-ws                  # Tour all games
make example-ws-sudoku           # Sudoku demo
make example-ws-binary           # Binary puzzle demo
make example-ws-solve            # Solve with hints
make example-ws-interactive      # Interactive mode

CI/CD

The project includes GitHub Actions workflows:

test.yml: Runs tests on Ubuntu, Windows, macOS with Python 3.11, 3.12, 3.13
publish.yml: Publishes to PyPI on release
release.yml: Creates GitHub releases
fly-deploy.yml: Auto-deploys to Fly.io on main branch push

Coverage threshold is set to 90% - builds fail if coverage drops below this.

Deployment to Fly.io

Using Make (Recommended)

# Deploy to Fly.io
make fly-deploy

# Check status
make fly-status

# View logs
make fly-logs

Manual Deployment

Install the Fly CLI: https://fly.io/docs/hands-on/install-flyctl/
Login to Fly:

fly auth login

Create and deploy the app:

# First deployment (creates the app)
fly launch --config fly.toml --now

# Subsequent deployments
fly deploy

Important: Allocate a public IPv6 address for TCP services:

# Allocate IPv6 (free)
fly ips allocate-v6

# Verify IP is allocated
fly ips list

Check the status:

fly status

View logs:

fly logs

Connect to your Puzzle Arcade server:

# Get your app's IPv6 address
fly ips list

# Connect via telnet using IPv6 (free tier)
telnet <your-ipv6> 8023

# WebSocket connections work with hostname
# ws://<your-app>.fly.dev:8025/ws

Note: TCP services (Telnet, raw TCP) require a public IP address on Fly.io. We use IPv6 which is free. IPv4 costs $2/month and is not needed for most users.

Project Structure

puzzle-arcade-server/
├── src/
│   └── puzzle_arcade_server/
│       ├── __init__.py           # Package initialization
│       ├── server.py             # Main arcade handler
│       ├── constants.py          # Game constants
│       ├── models/               # Pydantic models
│       │   ├── __init__.py
│       │   ├── base.py           # GridPosition, MoveResult
│       │   ├── config.py         # Base GameConfig
│       │   ├── enums.py          # DifficultyLevel, GameCommand, etc.
│       │   └── games.py          # Game-specific models (Cage, Task, etc.)
│       └── games/                # Self-contained game modules
│           ├── __init__.py       # AVAILABLE_GAMES registry
│           ├── _base/            # Base classes
│           │   ├── __init__.py
│           │   ├── game.py       # PuzzleGame ABC
│           │   └── commands.py   # GameCommandHandler ABC
│           ├── sudoku/           # Example game module
│           │   ├── __init__.py   # Exports SudokuGame
│           │   ├── game.py       # SudokuGame class
│           │   ├── config.py     # SudokuConfig
│           │   └── commands.py   # SudokuCommandHandler
│           ├── minesweeper/      # Each game is self-contained
│           │   ├── __init__.py
│           │   ├── game.py
│           │   └── config.py
│           └── ... (24 games total)
├── tests/
│   ├── test_puzzle_game.py       # Base class tests
│   ├── test_deterministic_seeding.py  # Seeding tests
│   ├── test_sudoku_game.py       # Sudoku tests
│   ├── test_minesweeper.py       # Minesweeper tests
│   └── ... (tests for all 24 games)
├── examples/
│   ├── simple_client.py          # Telnet client example
│   ├── websocket_client.py       # WebSocket client example
│   └── README.md                 # Example usage guide
├── .github/workflows/            # CI/CD workflows
├── pyproject.toml                # Modern Python project config
├── config.yaml                   # Multi-transport server configuration
├── Dockerfile                    # Docker build instructions
├── fly.toml                      # Fly.io deployment config
├── Makefile                      # Development commands (50+ targets)
└── README.md                     # This file

Key Statistics

Test Coverage: 94% overall (1067 tests, all passing)
Code Quality Score: 9.8/10 (near perfect compliance)
Games Implemented: 24 complete puzzle types
- 7 Classic Logic Puzzles
- 7 Advanced CP-SAT Puzzles
- 5 Specialized Constraint Puzzles
- 2 Optimization Challenges
- 3 Advanced Reasoning Puzzles
Supported Transports: 4 (Telnet, TCP, WebSocket, WS-Telnet)
Agent-Friendly Mode: Structured output for AI tools
Gymnasium API: RL-compatible environment for all games
Deterministic Seeding: Reproducible puzzles for testing

Use Cases

1. LLM Reasoning Demonstration

Perfect for demonstrating LLM reasoning capabilities:

LLM connects via telnet: telnet localhost 8023
Selects a puzzle: sudoku hard
Receives puzzle in clean ASCII format
Analyzes constraints and generates solution
Submits moves: place 1 5 7
Server validates each move
Puzzle solved! Proof of reasoning capability

2. Constraint Solver Testing

Test the generality of constraint solvers (like MCP solvers):

Different puzzle types → Same underlying solver
Clean ASCII output → Easy for solver parsing
Simple interface → Focus on solving, not UI
Pure validation → Server checks each submitted move; the solving is left to the client (avoid the hint and solve commands when benchmarking)

3. Educational Tool

Learn about constraint satisfaction problems:

24 different puzzle types demonstrating various constraint types:
- AllDifferent constraints (Sudoku, KenKen, Futoshiki)
- Arithmetic constraints (KenKen, Kakuro, Killer Sudoku)
- Boolean/SAT constraints (Lights Out, Binary Puzzle)
- Loop/Edge constraints (Slitherlink)
- Deduction constraints (Mastermind, Logic Grid, Einstein's Puzzle)
- Optimization objectives (Knapsack, Task Scheduler)
- Temporal reasoning (Task Scheduler)
- Connectivity constraints (Nurikabe, Slitherlink)
- Probabilistic reasoning (Minesweeper)
- And more!
Well-documented code showing puzzle generation algorithms
Comprehensive tests (1067 tests, 94% coverage) demonstrating validation
Deterministic seeding - Reproduce any puzzle for debugging/testing
Production-ready - 9.8/10 code quality score
Type-safe - Full Pydantic v2 and MyPy compliance
Modular architecture - Each game is self-contained in its own folder

Adding New Puzzle Games

Create a new game folder in src/puzzle_arcade_server/games/:

games/
└── my_puzzle/
    ├── __init__.py     # Export the game class
    ├── game.py         # Game logic
    └── config.py       # Game configuration

Create the config in config.py:

from pydantic import Field
from ...models import DifficultyLevel, GameConfig

class MyPuzzleConfig(GameConfig):
    grid_size: int = Field(default=5, description="Grid size")

    @classmethod
    def from_difficulty(cls, difficulty: DifficultyLevel) -> "MyPuzzleConfig":
        sizes = {DifficultyLevel.EASY: 5, DifficultyLevel.MEDIUM: 7, DifficultyLevel.HARD: 9}
        return cls(difficulty=difficulty, grid_size=sizes[difficulty])

Create the game in game.py:

from .._base import PuzzleGame
from ...models import MoveResult
from .config import MyPuzzleConfig

class MyPuzzleGame(PuzzleGame):
    def __init__(self, difficulty: str = "easy", seed: int | None = None):
        super().__init__(difficulty, seed)
        self.config = MyPuzzleConfig.from_difficulty(self.difficulty)
        # Use self._rng for all randomness (deterministic seeding)

    @property
    def name(self) -> str:
        return "My Puzzle"

    @property
    def constraint_types(self) -> list[str]:
        return ["all_different", "sum_constraint"]

    @property
    def business_analogies(self) -> list[str]:
        return ["resource_allocation", "scheduling"]

    async def generate_puzzle(self) -> None:
        # Use self._rng.randint(), self._rng.choice(), etc.
        self.game_started = True

    async def validate_move(self, row: int, col: int, num: int) -> MoveResult:
        # Validate and apply move
        return MoveResult(success=True, message="Number placed!")

    def is_complete(self) -> bool:
        return all(cell != 0 for row in self.grid for cell in row)

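    def render_grid(self) -> str:
        # Header uses "  |" so it lines up with the "N |" row prefix
        header = "  | " + " | ".join(str(c + 1) for c in range(len(self.grid))) + " |"
        rows = [
            f"{r + 1} | " + " | ".join(str(v) if v else "." for v in row) + " |"
            for r, row in enumerate(self.grid)
        ]
        return "\n".join([header, *rows])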

    def get_stats(self) -> str:
        return f"Moves: {self.moves_made} | Seed: {self.seed}"

Export in __init__.py:

from .game import MyPuzzleGame
__all__ = ["MyPuzzleGame"]

Register the game in src/puzzle_arcade_server/games/__init__.py:

from .my_puzzle import MyPuzzleGame

AVAILABLE_GAMES = {
    # ... other games
    "mypuzzle": MyPuzzleGame,
}

Add tests in tests/test_my_puzzle_game.py:

from puzzle_arcade_server.games.my_puzzle import MyPuzzleGame

class TestMyPuzzleGame:
    async def test_deterministic_seeding(self):
        game1 = MyPuzzleGame("easy", seed=12345)
        game2 = MyPuzzleGame("easy", seed=12345)
        await game1.generate_puzzle()
        await game2.generate_puzzle()
        assert game1.render_grid() == game2.render_grid()

    def test_seed_in_stats(self):
        game = MyPuzzleGame("easy", seed=42)
        assert "Seed: 42" in game.get_stats()

Run tests and verify:

make test-cov
make check

Contributing

Contributions are welcome! Please follow these guidelines:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-puzzle)
Make your changes
Run tests and checks (make check)
Ensure coverage stays above 90% (make test-cov)
Commit your changes (git commit -m 'Add amazing puzzle')
Push to the branch (git push origin feature/amazing-puzzle)
Open a Pull Request

Development Guidelines

Follow PEP 8 style guide (enforced by ruff)
Add type hints to all functions
Write tests for new features (>90% coverage)
Update documentation as needed
Ensure all grid headers align properly with rows

Troubleshooting

Server won't start

Ensure chuk-protocol-server is installed: uv pip install chuk-protocol-server
Check ports aren't already in use: lsof -i :8023,8024,8025,8026
Verify Python version is 3.11+: python --version
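Where lsof is not available, the port and version checks above can be approximated from Python itself. This is a minimal sketch using only the standard library. `port_is_free` and `preflight` are hypothetical helper names, and the check only probes TCP on the IPv4 loopback address, so it may miss conflicts on other interfaces.

```python
import socket
import sys


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permissions error.
            return False
        return True


def preflight(ports: tuple[int, ...] = (8023, 8024, 8025, 8026)) -> dict[str, object]:
    """Collect the same checks as the troubleshooting list in one report."""
    return {
        "python_ok": sys.version_info >= (3, 11),
        "busy_ports": [p for p in ports if not port_is_free(p)],
    }
```

Running `preflight()` before starting the server gives a quick view of which of the four ports are taken and whether the interpreter meets the 3.11+ requirement.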

Tests failing

Install dev dependencies: make dev-install
Clear cache: make clean
Check Python version compatibility

Coverage too low

Run coverage report: make test-cov
View HTML report: make serve-coverage
Add tests for uncovered code

Grid alignment issues

All grid headers must align with row pipes
Use the format "  |" (two spaces, then a pipe) for headers to match the row format "N |"
Test visually: make example-telnet-kenken
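Alignment can also be checked programmatically before testing visually. The sketch below compares the column positions of the pipe characters in the header with those in every following row. `pipe_columns` and `misaligned_rows` are hypothetical helpers and do not come from the project. The check assumes single-digit row labels and ignores lines without any pipe, such as separator rules.

```python
def pipe_columns(line: str) -> list[int]:
    """Return the column index of every '|' in a rendered line."""
    return [i for i, ch in enumerate(line) if ch == "|"]


def misaligned_rows(rendered: str) -> list[int]:
    """Return positions (1-based, counted among lines that contain a pipe)
    of rows whose pipes do not line up with the header's pipes."""
    lines = [ln for ln in rendered.splitlines() if "|" in ln]
    if not lines:
        return []
    expected = pipe_columns(lines[0])
    return [
        n
        for n, ln in enumerate(lines[1:], start=2)
        if pipe_columns(ln) != expected
    ]
```

An empty result from `misaligned_rows(game.render_grid())` means every row shares the header's pipe columns. A test can assert exactly that for each new game.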

Roadmap

See ROADMAP.md for the full development roadmap.

Highlights

Benchmarking & Metrics

Puzzle complexity metrics (constraint count, variable count, branching factor)
Episode model for tracking game sessions
Trace logging for offline analysis

Agent Evaluation Tools

Batch evaluation harness CLI
Solver vs Model comparison mode
JSON protocol for structured agent communication

Learning & Curriculum

Constraint concept progression graph
Tagged puzzle sets for educators
Difficulty scaling based on constraint complexity

Ecosystem Integrations

MCP native mode for agent frameworks
Python client library
REST/WebSocket API documentation

UX & Community

Interactive web viewer with replay mode
Public benchmark packs (versioned, citable)
Community leaderboards

License

MIT License - see the main chuk-protocol-server project for details.

Credits

Built using the chuk-protocol-server framework
Puzzle generation algorithms based on backtracking and constraint propagation
Uses modern Python tooling: UV, Ruff, MyPy, Pytest

Release history

0.8 - Dec 22, 2025
0.7 - Dec 21, 2025 (this version)
0.6 - Dec 21, 2025
0.5 - Dec 20, 2025
0.4.3 - Dec 14, 2025
0.4.2 - Dec 14, 2025
0.4.1 - Dec 12, 2025
0.4 - Dec 12, 2025
0.3 - Dec 11, 2025
0.2 - Dec 10, 2025
0.1.1 - Dec 9, 2025

Download files

Source Distribution

puzzle_arcade_server-0.7.tar.gz (207.0 kB)

Uploaded Dec 21, 2025 Source

Built Distribution

puzzle_arcade_server-0.7-py3-none-any.whl (180.8 kB)

Uploaded Dec 21, 2025 Python 3

File details

Details for the file puzzle_arcade_server-0.7.tar.gz.

File metadata

Download URL: puzzle_arcade_server-0.7.tar.gz
Upload date: Dec 21, 2025
Size: 207.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for puzzle_arcade_server-0.7.tar.gz
Algorithm	Hash digest
SHA256	`039682b5f347463f2f23ed54d61bfe9a1a51146e6ce0807a1df9abe48ee6baf3`
MD5	`3cd1df816d232f3ec347d02bea7d9949`
BLAKE2b-256	`c36306fcdde49cbb94a41998984f91ac51b366995d59d74b46a8ced2ae74207a`

File details

Details for the file puzzle_arcade_server-0.7-py3-none-any.whl.

File metadata

Download URL: puzzle_arcade_server-0.7-py3-none-any.whl
Upload date: Dec 21, 2025
Size: 180.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for puzzle_arcade_server-0.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2f4a170a0b891597baf14a7fecf52ba9edf002d3af188b302e684543c2522685`
MD5	`4d2d064aa285d25ac375153584435b94`
BLAKE2b-256	`2411643f4ad1575db1127a16c8d59560dbc2b559a8550a9fd6e569f48b054866`
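The published digests can be compared against a downloaded file with the standard library. This is a minimal sketch, not an official verification tool. `sha256_of` and `matches_published` are hypothetical helper names, and the file is assumed to be saved locally under its published name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a local file's SHA256 with a digest copied from the listing above."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published(Path("puzzle_arcade_server-0.7-py3-none-any.whl"), "2f4a170a0b891597baf14a7fecf52ba9edf002d3af188b302e684543c2522685")` should return `True` for an intact copy of the wheel.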
