Ants! On your GPU! With JAX!
Thanks ants!
Thants
Multi-agent and multi-team reinforcement-learning environment modelling ant foraging
Introduction
Thants is an RL environment library based on models of ant colonies foraging for food. It also supports environments with multiple competing colonies.
Thants is implemented in JAX, so environments can run on GPU for large-scale, performant simulation, and can be used alongside JAX and PyTorch ML tools.
The environment is implemented using the Jumanji RL environment API, with some modification for the multi-colony case.
Usage
Installation
Thants can be installed from PyPI using
pip install thants
Examples
Single Colony
The single colony environment follows the Jumanji environment API, with actions provided as an array of individual actions:
from thants.envs import ThantsMono
import jax
env = ThantsMono(dims=(50, 50))
key = jax.random.PRNGKey(101)
state, obs = env.reset(key)
state_history = [state]
for _ in range(50):
    key, action_key = jax.random.split(key, 2)
    actions = jax.random.choice(
        action_key, env.num_actions, (env.num_agents,)
    )
    state, obs = env.step(state, actions)
    state_history.append(state)
env.animate(state_history, 100, "mono_colony.gif")
Multi-Colony
In the multi-colony case each colony is treated independently (and colonies can be different sizes), so actions, observations, and timesteps are lists or tuples of arrays or structs, one per colony:
from thants.envs import Thants
import jax
import jax.numpy as jnp
env = Thants((50, 100))
key = jax.random.PRNGKey(101)
state, obs = env.reset(key)
state_history = [state]
for _ in range(50):
    key, k1, k2 = jax.random.split(key, 3)
    # List of action arrays per colony
    actions = [
        jax.random.choice(k1, env.num_actions, (env.num_agents[0],)),
        jax.random.choice(k2, env.num_actions, (env.num_agents[1],)),
    ]
    state, obs = env.step(state, actions)
    state_history.append(state)
env.animate(state_history, 100, "multi_colony.gif")
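With more than two colonies, splitting one key per colony by hand becomes repetitive. The following is a minimal sketch of a helper that samples one random action array per colony. The function name `sample_colony_actions`, the colony sizes, and the action count of 8 are illustrative assumptions, not part of the Thants API; only `jax.random` calls are used.

```python
import jax


def sample_colony_actions(key, num_actions, colony_sizes):
    """Sample one random action array per colony (sizes may differ)."""
    # One independent key per colony.
    keys = jax.random.split(key, len(colony_sizes))
    return [
        jax.random.randint(k, (n,), 0, num_actions)
        for k, n in zip(keys, colony_sizes)
    ]


# Hypothetical setup: 3 colonies of different sizes, 8 discrete actions.
actions = sample_colony_actions(jax.random.PRNGKey(0), 8, (10, 25, 5))
```

The result is a plain Python list, matching the per-colony list of action arrays used in the example above.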
Preset simple environments can be imported from thants.envs.ThantsDual and thants.envs.ThantsQuad, with 2 and 4 colonies respectively.
Environment
A Thants environment with two competing colonies.
The environment is modelled as a grid, wrapped at the boundaries. Ants (the agents) occupy individual cells on the grid (and cannot overlap). Ants can pick up, carry, and deposit food, or deposit persistent signals that can be observed by other ants in the same colony.
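To illustrate what wrapping at the boundaries means for grid indices, here is a small sketch using a modulo on each axis. The helper `wrap_positions` is a hypothetical name used for illustration only and is not taken from the library.

```python
import jax.numpy as jnp


def wrap_positions(positions, dims):
    """Map 2d grid indices onto a wrapped (toroidal) grid of shape `dims`."""
    # jnp.mod returns a non-negative result for a positive divisor,
    # so stepping off one edge re-enters at the opposite edge.
    return jnp.mod(positions, jnp.asarray(dims))


pos = jnp.array([[0, 0], [49, 99], [50, 100], [-1, -1]])
wrapped = wrap_positions(pos, (50, 100))
```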
State
Colonies
The state of the ant colonies is represented by a single struct:
- Ants: Individual ants themselves have several components:
  - Positions: 2d indices of ant positions on the environment grid.
  - Carrying: The amount of food being carried by each ant.
  - Health: Ant health (currently unused).
  - Colony Index: The index of the colony each ant belongs to.
- Nests: 2d array indicating the index of the colony each cell belongs to (0 in the case a cell is not the nest of any colony).
- Signals: 4d array of signal deposits at each cell for each colony. Signals have multiple channels to facilitate communication between ants (i.e. the 2nd dimension of the array is the signal channel).
Environment
The state of the environment then consists of the colonies and the state shared by all the colonies:
- Colonies: Ant colonies state
- Food: 2d array representing the amount of food deposited at each cell
- Terrain: Array of flags indicating if a cell can be occupied by an ant. This allows obstacles to be placed on the environment.
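The layout described above can be pictured as nested structs of arrays. The sketch below is an illustration only: the class names, field names, dtypes, and the exact axis order of the signal array (colony first, channel second) are assumptions, and the library's actual types may differ.

```python
from typing import NamedTuple

import jax
import jax.numpy as jnp


class Ants(NamedTuple):
    pos: jnp.ndarray         # [n_ants, 2] grid indices
    carrying: jnp.ndarray    # [n_ants] food carried by each ant
    health: jnp.ndarray      # [n_ants]
    colony_idx: jnp.ndarray  # [n_ants] colony of each ant


class Colonies(NamedTuple):
    ants: Ants
    nests: jnp.ndarray    # [width, height], 0 where a cell is not a nest
    signals: jnp.ndarray  # [n_colonies, n_channels, width, height] (assumed order)


class State(NamedTuple):
    colonies: Colonies
    food: jnp.ndarray     # [width, height] food amounts
    terrain: jnp.ndarray  # [width, height], True where a cell can be occupied


def empty_state(dims, n_ants, n_colonies, n_channels):
    """Build a placeholder state with the shapes described above."""
    ants = Ants(
        pos=jnp.zeros((n_ants, 2), dtype=jnp.int32),
        carrying=jnp.zeros((n_ants,)),
        health=jnp.ones((n_ants,)),
        colony_idx=jnp.zeros((n_ants,), dtype=jnp.int32),
    )
    colonies = Colonies(
        ants=ants,
        nests=jnp.zeros(dims, dtype=jnp.int32),
        signals=jnp.zeros((n_colonies, n_channels, *dims)),
    )
    return State(colonies, jnp.zeros(dims), jnp.ones(dims, dtype=bool))


state = empty_state((50, 100), n_ants=20, n_colonies=2, n_channels=3)
```

Named tuples are handled by JAX as pytrees, so a struct like this can be passed through `jax.jit` or `jax.vmap` without extra registration.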
Updates
Each step of the environment performs the following update to the state:
- Convert integer action choices into state updates
- Apply ant position updates
- Apply food pick-up and deposit actions
- Drop any new food deposits
- Dissipate and propagate signals
- Apply signal deposit actions
- Clear any food that has been deposited on a nest (i.e. the food is consumed by the colony)
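As an illustration of the final step in the list, the sketch below clears food from nest cells, relying on the convention that a nest value of 0 marks a non-nest cell. The function name and the returned `consumed` array are assumptions made for this example, not the library's implementation.

```python
import jax.numpy as jnp


def consume_food_on_nests(food, nests):
    """Zero the food on nest cells and report how much was consumed per cell."""
    on_nest = nests > 0  # 0 marks cells that belong to no colony
    consumed = jnp.where(on_nest, food, 0.0)
    remaining = jnp.where(on_nest, 0.0, food)
    return remaining, consumed


food = jnp.array([[1.0, 2.0], [3.0, 4.0]])
nests = jnp.array([[0, 1], [0, 0]])
remaining, consumed = consume_food_on_nests(food, nests)
```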
Signal dynamics can be customised by implementing the thants.signals.SignalPropagator base class and passing it when initialising the environment.
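To give a feel for what a dissipate-and-propagate rule might compute, here is a standalone sketch of one such rule on a wrapped grid. It does not implement the SignalPropagator interface (its method signatures are not shown here); the function name and the `decay` and `spread` parameters are hypothetical.

```python
import jax.numpy as jnp


def diffuse_signals(signals, decay=0.9, spread=0.2):
    """One diffusion step for an array whose last two axes are the grid.

    A fraction `spread` of each cell's signal is shared equally between its
    four neighbours (wrapping at the edges), then everything is scaled by `decay`.
    """
    neighbours = (
        jnp.roll(signals, 1, axis=-2)
        + jnp.roll(signals, -1, axis=-2)
        + jnp.roll(signals, 1, axis=-1)
        + jnp.roll(signals, -1, axis=-1)
    ) / 4.0
    mixed = (1.0 - spread) * signals + spread * neighbours
    return decay * mixed


# A single unit deposit in the corner of a 5x5 grid, one colony, one channel.
signals = jnp.zeros((1, 1, 5, 5)).at[0, 0, 0, 0].set(1.0)
out = diffuse_signals(signals)
```

Because `jnp.roll` wraps around, the deposit in the corner also spreads to the cells on the opposite edges, consistent with the wrapped grid.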
State Generators
The initialisation of the environment can be customised by implementing the relevant base classes and passing them to the environment.
Actions
Ants can select from several discrete actions, indicated by an integer value:
- 0: Null action (i.e. no change to the environment)
- 1 - 4: Move in one of the four cardinal directions (if possible)
- 5: Take a fixed amount of food from the ant's location (if possible)
- 6: Deposit a fixed amount of food at the ant's location (if possible)
- 7+: Deposit a fixed amount of signal at the ant's location
Note that actions can be selected even when they are not possible, e.g. attempting to move to an occupied cell or taking food from an empty cell. In this case the chosen action causes no change in state.
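The sketch below shows one way integer actions could be decoded into position updates, with impossible moves leaving the ant where it is. The `MOVES` table, the order in which actions 1 to 4 map to directions, and the function name are assumptions; for brevity it only checks terrain and does not resolve moves into cells occupied by other ants.

```python
import jax.numpy as jnp

# Row i is the displacement for action i. Row 0 is the null action;
# the direction order for actions 1-4 is an assumption.
MOVES = jnp.array([[0, 0], [-1, 0], [0, 1], [1, 0], [0, -1]])


def propose_moves(positions, actions, terrain):
    """Return new positions, keeping ants in place when a move is blocked."""
    is_move = (actions >= 1) & (actions <= 4)
    # Non-movement actions (0, 5, 6, 7+) map to a zero displacement.
    deltas = MOVES[jnp.where(is_move, actions, 0)]
    targets = jnp.mod(positions + deltas, jnp.array(terrain.shape))
    passable = terrain[targets[:, 0], targets[:, 1]]
    return jnp.where(passable[:, None], targets, positions)


terrain = jnp.ones((3, 3), dtype=bool).at[0, 1].set(False)  # one obstacle
positions = jnp.array([[0, 0], [1, 1], [2, 2], [2, 0]])
actions = jnp.array([2, 4, 3, 5])
new_positions = propose_moves(positions, actions, terrain)
```

Here the first ant is blocked by the obstacle, the second moves freely, the third wraps around the grid edge, and the fourth chose a non-movement action and so stays put.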
Observations
Individual agent observations also consist of several components. Observations are individually made for the local neighbourhood of each ant, i.e. the 8 surrounding cells on the environment grid, and their own cell:
- ants: Flag indicating if a cell in the neighbourhood is occupied by an ant, with the shape [n-ants, n-colonies, 9], where the second dimension indexes the individual colonies. The ants from the same colony will always be on the first row.
- signals: Signal deposits in the neighbourhood (across all channels); signals are observed individually for each colony.
- food: Food deposits within the neighbourhood.
- nest: Flag indicating if a neighbouring cell is designated as a nest for the ant's colony.
- terrain: Flag indicating if a neighbouring cell can be occupied.
- carrying: Amount of food currently being carried by each ant.
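The 9-cell neighbourhood behind each observation component can be gathered with a fixed table of offsets. This is a minimal sketch, assuming a row-major ordering of the 9 cells (so index 4 is the ant's own cell); the ordering, the `OFFSETS` table, and the function name `local_view` are illustrative and may not match the library.

```python
import jax.numpy as jnp

# Offsets of the 3x3 neighbourhood in row-major order; index 4 is (0, 0).
OFFSETS = jnp.array([(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)])


def local_view(grid, positions):
    """Gather the 9 cells around each position on a wrapped 2d grid.

    Returns an array of shape [n_ants, 9].
    """
    dims = jnp.array(grid.shape)
    cells = jnp.mod(positions[:, None, :] + OFFSETS[None, :, :], dims)
    return grid[cells[..., 0], cells[..., 1]]


grid = jnp.arange(12).reshape(3, 4)
positions = jnp.array([[0, 0], [1, 2]])
view = local_view(grid, positions)
```

The same gather could be applied to the food, nest, and terrain grids in turn to build each component of the observation.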
Rewards
By default, rewards are granted to ants when they deposit food on their colony's nest.
Reward signals can be customised by implementing the thants.rewards.RewardFn base class.
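The default rule can be pictured as a masked lookup: an ant is rewarded for food it deposits only if the cell it stands on is marked as its own colony's nest. The sketch below is a standalone function, not an implementation of the RewardFn interface. It assumes deposits land on the ant's own cell, that ant colony indices use the same non-zero labels as the nest array, and that a per-ant `deposited` amount is available; all names are hypothetical.

```python
import jax.numpy as jnp


def deposit_rewards(positions, colony_idx, deposited, nests):
    """Reward each ant by the food it deposited, if it is on its own nest."""
    nest_here = nests[positions[:, 0], positions[:, 1]]
    on_own_nest = nest_here == colony_idx
    return jnp.where(on_own_nest, deposited, 0.0)


nests = jnp.zeros((4, 4), dtype=jnp.int32).at[0, 0].set(1).at[3, 3].set(2)
positions = jnp.array([[0, 0], [3, 3], [1, 1]])
colony_idx = jnp.array([1, 1, 2])
deposited = jnp.array([1.0, 1.0, 1.0])
rewards = deposit_rewards(positions, colony_idx, deposited, nests)
```

Only the first ant is rewarded: the second is standing on another colony's nest and the third is not on a nest at all.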