Pickomino (Heckmeck) Gymnasium environment

Project description

Pickomino-Env

Animated demo of the Pickomino game played manually.

Description

An environment conforming to the Gymnasium API for the dice game Pickomino (Heckmeck am Bratwurmeck). Goal: train a Reinforcement Learning agent for optimal play, that is, to decide which die face to collect, when to roll, and when to stop.

Differences from the Physical Game

If you know the physical game, note the following simplifications:

Failed Attempt: the highest tile on the table is removed, not turned face-down.
Tile selection: the best reachable tile is always taken automatically; you cannot choose a lower-valued tile as you can in the physical game.
Stealing: always performed when possible; you cannot decline.
Win condition: determined correctly when playing manually with the GUI (most worms wins, ties broken by the highest tile). When training without a renderer, no winner is declared; use total reward as your metric. Note that stolen tiles do not reduce your reward, so total reward can exceed your final score.
Stack height: not included in the observation (visible in the physical game).

Action Space

The action space is MultiDiscrete([6, 2]). The step() method accepts both the ndarray returned by action_space.sample() and a plain Python tuple.

action = (die_face (0–5), action_type (0=roll, 1=stop))

Component	Values	Meaning
die_face	0–5	Die face to collect: 0→1 eye, 1→2 eyes, 2→3 eyes, 3→4 eyes, 4→5 eyes, 5→worm
action_type	0–1	0 = roll again, 1 = stop and take a tile
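
As a quick illustration of the encoding above, the following sketch turns an action into a readable description. The names FACE_NAMES, ACTION_TYPES, and describe_action are hypothetical helpers for this example and are not part of the package.

```python
# Hypothetical helper names; only the index meanings come from the table above.
FACE_NAMES = ("1 eye", "2 eyes", "3 eyes", "4 eyes", "5 eyes", "worm")
ACTION_TYPES = ("roll again", "stop and take a tile")


def describe_action(action):
    """Turn a (die_face, action_type) pair into a readable string."""
    # int() makes this work for plain tuples and for NumPy integer arrays alike.
    die_face, action_type = (int(x) for x in action)
    return f"collect {FACE_NAMES[die_face]}, then {ACTION_TYPES[action_type]}"


print(describe_action((5, 0)))  # collect worm, then roll again
print(describe_action((2, 1)))  # collect 3 eyes, then stop and take a tile
```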

Observation Space

The observation is a dict with four keys:

Key	Max	Shape
dice_collected	8	(6,)
dice_rolled	8	(6,)
tiles_table	1	(16,)
tile_players	36	(number_of_players,)

There are eight dice to roll and collect, each with faces 1–5 plus a worm instead of a six. The worm is a sixth distinct die face, but it scores 5 points (not six!), the same as the 5-eye face, so it is not a sixth distinct point value.

Note: The 16 tiles are numbered 21 to 36 and carry worm values from one to four, spread over four groups. The game is for two to seven players. Here your Reinforcement Learning agent is the first player; the other players are computer bots that play according to a heuristic. When you create the environment, you have to define the number of bots.
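
The point values described above can be applied to a count vector such as dice_collected. This is a minimal sketch; FACE_VALUES, dice_total, and has_worm are illustrative names, not functions exported by the package.

```python
# Points per face index; index 5 is the worm, which is worth 5 like the 5-eye face.
FACE_VALUES = (1, 2, 3, 4, 5, 5)
WORM_INDEX = 5


def dice_total(counts):
    """Sum of points for a length-6 vector of per-face dice counts."""
    return sum(count * value for count, value in zip(counts, FACE_VALUES))


def has_worm(counts):
    """True if at least one worm is among the counted dice."""
    return counts[WORM_INDEX] > 0


collected = [0, 0, 1, 2, 0, 2]  # one 3, two 4s, two worms
print(dice_total(collected), has_worm(collected))  # 21 True
```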

For a more detailed description of the rules, see the file pickomino-rulebook.pdf. You can play the game online here: https://www.maartenpoirot.com/pickomino/. The heuristic used by the bots is described here: https://frozenfractal.com/blog/2015/5/3/how-to-win-at-pickomino/.

Rewards

The goal is to collect tiles in a stack. The winner is the player who has the most worms on their tiles at the end of the game. The Reinforcement Learning agent receives a reward equal to the worm value of a tile when the tile is picked. For a failed attempt (see rulebook), a corresponding negative reward is given. When a bot steals your tile, no negative reward is given. Hence, the total reward at the end of the game can be greater than the score.
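
To make the gap between total reward and final score concrete, here is a small sketch with a made-up sequence of events. It assumes the worm grouping of the physical game (tiles 21–24 carry one worm, 25–28 two, 29–32 three, 33–36 four); the text above only says the values are spread over four groups. The function worms_on_tile and the events list are hypothetical.

```python
def worms_on_tile(tile):
    """Worm value of a tile, assuming groups 21-24, 25-28, 29-32, 33-36."""
    if not 21 <= tile <= 36:
        raise ValueError(f"tile must be in 21..36, got {tile}")
    return (tile - 21) // 4 + 1


# Made-up episode: the agent picks two tiles, then a bot steals one of them.
events = [("picked", 25), ("picked", 33), ("stolen", 33)]

total_reward = 0
stack = []
for event, tile in events:
    if event == "picked":
        stack.append(tile)
        total_reward += worms_on_tile(tile)  # reward equals the tile's worms
    elif event == "stolen":
        stack.remove(tile)  # the tile is lost, but no negative reward is given

score = sum(worms_on_tile(tile) for tile in stack)
print(total_reward, score)  # 6 2
```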

To try the environment manually, see Play manually.

Info Dictionary

The info dictionary is returned at every step. It is intended for debugging and logging, not for learning.

Key	Type	Description
`dice_collected`	`list[int]`	Counts of each die face collected this turn
`dice_rolled`	`list[int]`	Counts of each die face in the current roll
`terminated`	`bool`	Whether the episode has terminated
`truncated`	`bool`	Whether the game was truncated due to the last action
`tiles_table_vec`	`numpy.ndarray[int8]`, shape `(16,)`	Binary vector of tiles currently available on the table
`smallest_tile`	`int`	Lowest-numbered tile still on the table
`explanation`	`str`	Reason for the last termination, truncation, or failed attempt
`player_stack`	`list[int]`	All tiles currently held by the agent
`player_score`	`int`	Agent's current score (sum of worm values)
`current_player_index`	`int`	Index of the player whose turn it is
`bot_scores`	`list[int]`	Scores of all bots, in order
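
For logging, a few of the keys above can be condensed into a single line. The sketch below uses a hand-written sample dictionary rather than real environment output, and summarize_info is a hypothetical helper.

```python
def summarize_info(info):
    """Build a one-line log message from selected info keys."""
    return (
        f"player {info['current_player_index']} to move | "
        f"agent score {info['player_score']} | "
        f"bot scores {info['bot_scores']} | "
        f"smallest tile {info['smallest_tile']}"
    )


# Hand-written example values, only to show the shape of the data.
sample_info = {
    "current_player_index": 0,
    "player_score": 3,
    "bot_scores": [2, 0],
    "smallest_tile": 21,
}
print(summarize_info(sample_info))
```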

Starting State

dice_collected = [0, 0, 0, 0, 0, 0].
dice_rolled = [3, 0, 1, 2, 0, 2] (example; the roll is random, and the counts always sum to 8).
tiles_table = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1].
tile_players = [0, 0, 0] (with number_of_bots = 2).
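
The starting state listed above can be checked with a few assertions, for example in a test right after reset(). This is only a sketch: looks_like_start is a hypothetical helper, and the observation here is written by hand instead of coming from the environment.

```python
def looks_like_start(obs, number_of_bots):
    """Check an observation dict against the documented starting state."""
    return (
        sum(obs["dice_collected"]) == 0          # nothing collected yet
        and sum(obs["dice_rolled"]) == 8         # all eight dice were rolled
        and list(obs["tiles_table"]) == [1] * 16  # every tile is on the table
        and list(obs["tile_players"]) == [0] * (number_of_bots + 1)
    )


example_obs = {
    "dice_collected": [0, 0, 0, 0, 0, 0],
    "dice_rolled": [3, 0, 1, 2, 0, 2],
    "tiles_table": [1] * 16,
    "tile_players": [0, 0, 0],
}
print(looks_like_start(example_obs, number_of_bots=2))  # True
```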

Episode End

Termination occurs when there are no more tiles to take on the table — Game Over.

Truncation

Truncation occurs when the agent attempts an illegal action during dice selection or rolling (for example, selecting a face that was not rolled, selecting a face already collected this turn, or choosing to roll when no dice remain). The game continues, and a new valid action is required.
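
An agent can screen its actions before calling step() to avoid the cases listed above. The sketch below is an approximation written from that description, not the environment's own check. In particular, it assumes that "no dice remain" refers to the dice left over after the chosen face has been set aside. The name illegal_reason is hypothetical.

```python
def illegal_reason(dice_collected, dice_rolled, action):
    """Return why an action would be illegal, or None if it looks legal."""
    die_face, action_type = action
    if dice_rolled[die_face] == 0:
        return "face was not rolled"
    if dice_collected[die_face] > 0:
        return "face already collected this turn"
    # Assumption: dice left to roll are the rolled dice minus the chosen face.
    remaining = sum(dice_rolled) - dice_rolled[die_face]
    if action_type == 0 and remaining == 0:
        return "no dice remain to roll"
    return None


collected = [0, 0, 0, 0, 0, 0]
rolled = [3, 0, 1, 2, 0, 2]
print(illegal_reason(collected, rolled, (1, 0)))  # face was not rolled
print(illegal_reason(collected, rolled, (0, 0)))  # None
```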

Invalid Actions

Out-of-range actions (outside [0–5] or [0–1]) raise a ValueError and do not affect the episode state.

Failed Attempt

A Failed Attempt occurs when the agent fails to secure a tile. If the agent has a stack of already picked tiles, then the top tile is returned to the table, and a negative reward is applied. If the stack is empty, nothing happens, and the reward is zero. The game continues — the episode does not end.
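
The stack update described above can be sketched as follows. Several details are assumptions made for this example: the penalty is taken to equal the worm value of the returned tile, worm values follow the physical game's grouping (21–24 one worm, up to 33–36 four worms), and index 0 of tiles_table stands for tile 21. The removal of the highest table tile mentioned under Differences from the Physical Game is not modelled. The name apply_failed_attempt is hypothetical.

```python
def apply_failed_attempt(stack, tiles_table):
    """Return the top tile to the table and give the (assumed) reward."""
    if not stack:
        return 0  # empty stack: nothing happens, reward is zero
    tile = stack.pop()            # top tile leaves the agent's stack
    tiles_table[tile - 21] = 1    # assumed layout: index 0 is tile 21
    return -((tile - 21) // 4 + 1)  # assumed penalty: the tile's worm value


stack = [25, 30]
table = [1] * 16
table[25 - 21] = 0  # tile 25 is held by the agent
table[30 - 21] = 0  # tile 30 is held by the agent
print(apply_failed_attempt(stack, table), stack)  # -3 [25]
```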

Arguments

Pass these as keyword arguments to gym.make().

Parameter	Type	Default	Description
`number_of_bots`	int	1	Number of bot opponents (1–6)
`render_mode`	str or None	None	Visualization mode: None (training), "human" (display), or "rgb_array" (recording)

Bot Heuristic

The bots use the following heuristic, inspired by Frozen Fractal's strategy:

Take the highest-contributing face. Select the die face where count × face value is greatest. Worms count as 5.
Tie-breaking. When two faces contribute equally, prefer worms over 5s. If still tied, prefer the face with the fewest dice, which keeps more dice available for future rolls. For example, three 4s are preferred over four 3s.
Worm priority on early rolls. If no dice have been collected yet and this is the third roll or later, take worms if available, regardless of contribution.
Stop as soon as a tile is reachable. Once the running total meets or exceeds the lowest available tile value, and a worm has been collected, the bot stops.
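
The face-selection and stopping rules above can be sketched as below. This is an illustration written from the description, not the package's bot code: choose_face and should_stop are hypothetical names, the sketch assumes a face already collected this turn cannot be chosen again, and the worm-priority rule for later rolls is left out.

```python
FACE_VALUES = (1, 2, 3, 4, 5, 5)  # index 5 is the worm, counted as 5
WORM = 5


def choose_face(dice_rolled, dice_collected):
    """Pick the face index with the greatest count x value, or None."""
    candidates = [
        face for face in range(6)
        if dice_rolled[face] > 0 and dice_collected[face] == 0
    ]
    if not candidates:
        return None
    # Sort key: contribution first, then worms over 5s, then fewest dice.
    return max(
        candidates,
        key=lambda f: (dice_rolled[f] * FACE_VALUES[f], f == WORM, -dice_rolled[f]),
    )


def should_stop(dice_collected, smallest_tile):
    """Stop once a worm is held and the total reaches the lowest tile."""
    total = sum(c * v for c, v in zip(dice_collected, FACE_VALUES))
    return dice_collected[WORM] > 0 and total >= smallest_tile


print(choose_face([0, 0, 4, 3, 0, 0], [0] * 6))  # 3 (three 4s beat four 3s)
print(choose_face([0, 0, 0, 0, 2, 2], [0] * 6))  # 5 (worms beat 5s)
```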

Setup

Python 3.10–3.14

Installation

We recommend installing in a virtual environment:

# Create the virtual environment
python -m venv .venv

# Activate it (run only the line for your shell)
# macOS/Linux
source .venv/bin/activate

# Windows PowerShell
.venv\Scripts\Activate.ps1

# Windows cmd.exe
.venv\Scripts\activate.bat

# Windows Git Bash
source .venv/Scripts/activate

# Install the package
pip install pickomino-env

Verify your installation:

pickomino-play

Play manually

Playing a few games manually is a great way to understand the rules and game dynamics before training a Reinforcement Learning agent. Launch the game with the pygame GUI:

pickomino-play

To play against more bots:

pickomino-play --number-of-bots=3

Valid range: 1–6 bots.

To change the bot play speed, adjust the RENDER_DELAY constant in constants.py. A higher value slows the bots down, a lower value speeds them up.

RENDER_DELAY: Final[float] = 2

Usage example

import gymnasium as gym

# render_mode options:
#   None         — no rendering, fastest (default, recommended for training)
#   "human"      — pygame window, requires a display
#   "rgb_array"  — returns RGB array, useful for recording
env = gym.make("Pickomino-v0", render_mode="human", number_of_bots=2)

# Reset and get initial observation
obs, info = env.reset(seed=42)

# Run one episode
terminated = False
truncated = False
total_reward = 0

while not terminated and not truncated:
    # Agent selects action: (die_face, action_type)
    action = env.action_space.sample()  # Random action for demo
    # Step environment
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if truncated:
        print(f"Invalid action: {info['explanation']}")

print(f"Episode finished. Total reward: {total_reward}")
env.close()

Security & Bug Bounty

Found a bug? Valid reports are rewarded with a physical copy of the Pickomino board game. See SECURITY.md for scope, timelines, and how to report.

Resources

Game Rules: Pickomino Rulebook
Play Online: Maarten Poirot's Pickomino
Play on Board Game Arena: Pickomino with Elo system
Bot Strategy: How to Win at Pickomino
Repository: smallgig/Pickomino
Gymnasium: https://gymnasium.farama.org/

License

MIT License. See LICENSE for details.

Project details

Release history

Version	Release date
1.4.1 (this version)	May 1, 2026
1.4.0	Apr 24, 2026
1.3.0	Mar 9, 2026
1.2.0	Feb 16, 2026
1.1.1	Jan 19, 2026
1.0.7	Jan 14, 2026

Download files


Source Distribution

pickomino_env-1.4.1.tar.gz (264.9 kB)

Uploaded May 1, 2026 Source

Built Distribution


pickomino_env-1.4.1-py3-none-any.whl (266.0 kB)

Uploaded May 1, 2026 Python 3

File details

Details for the file pickomino_env-1.4.1.tar.gz.

File metadata

Download URL: pickomino_env-1.4.1.tar.gz
Upload date: May 1, 2026
Size: 264.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pickomino_env-1.4.1.tar.gz
Algorithm	Hash digest
SHA256	`8e956ebc2079b4e2d91dcde441c056e019c571f343804e32f785b6b16b0923d2`
MD5	`2b889e4b175f581dd07d3c5aa09c1d08`
BLAKE2b-256	`a439ae48788cd6e8887331f3373f93d59aa8c54a9921814ada3bbb37f17db24f`


Provenance

The following attestation bundles were made for pickomino_env-1.4.1.tar.gz:

Publisher: python-publish.yml on smallgig/Pickomino

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pickomino_env-1.4.1.tar.gz
- Subject digest: 8e956ebc2079b4e2d91dcde441c056e019c571f343804e32f785b6b16b0923d2
- Sigstore transparency entry: 1420375281
- Sigstore integration time: May 1, 2026
Source repository:
- Permalink: smallgig/Pickomino@774f11cd27e852a6f384c72e023e81a3afa4bfb7
- Branch / Tag: refs/tags/v1.4.1
- Owner: https://github.com/smallgig
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@774f11cd27e852a6f384c72e023e81a3afa4bfb7
- Trigger Event: release

File details

Details for the file pickomino_env-1.4.1-py3-none-any.whl.

File metadata

Download URL: pickomino_env-1.4.1-py3-none-any.whl
Upload date: May 1, 2026
Size: 266.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pickomino_env-1.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c52d907273fb4611b610a93051059cbcad9f3fcb7a07397b8d07174339abcebf`
MD5	`c6c4021a8e5cb05aff2627cd33841b42`
BLAKE2b-256	`095ded7982be4556bd9e843cc2225a8c5eb784df448e8d281de1ff7a338a34cc`


Provenance

The following attestation bundles were made for pickomino_env-1.4.1-py3-none-any.whl:

Publisher: python-publish.yml on smallgig/Pickomino

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pickomino_env-1.4.1-py3-none-any.whl
- Subject digest: c52d907273fb4611b610a93051059cbcad9f3fcb7a07397b8d07174339abcebf
- Sigstore transparency entry: 1420375326
- Sigstore integration time: May 1, 2026
Source repository:
- Permalink: smallgig/Pickomino@774f11cd27e852a6f384c72e023e81a3afa4bfb7
- Branch / Tag: refs/tags/v1.4.1
- Owner: https://github.com/smallgig
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@774f11cd27e852a6f384c72e023e81a3afa4bfb7
- Trigger Event: release
