
Ants! On your GPU! With JAX!



Thanks ants!

Thants

Multi-agent and multi-team reinforcement-learning environment modelling ant foraging

Introduction

Thants is an RL environment library based on models of ant colonies foraging for food. It also supports environments with multiple competing colonies.

Thants is implemented in JAX, so environments can run on GPU for large-scale, performant simulation, and can run alongside JAX and PyTorch ML tools.

The environment is implemented using the Jumanji RL environment API, with some modification for the multi-colony case.

Usage

Installation

Thants can be installed from PyPI using

pip install thants

Examples

Single Colony

The single-colony environment follows the Jumanji environment API, with actions provided as an array containing one action per ant:

from thants.envs import ThantsMono
import jax

# Single-colony environment on a 50 x 50 grid
env = ThantsMono(dims=(50, 50))
key = jax.random.PRNGKey(101)
state, obs = env.reset(key)
state_history = [state]

for _ in range(50):
    key, action_key = jax.random.split(key, 2)
    # Sample a random action for each ant
    actions = jax.random.choice(
        action_key, env.num_actions, (env.num_agents,)
    )
    state, obs = env.step(state, actions)
    state_history.append(state)

# Render the recorded states to an animated GIF
env.animate(state_history, 100, "mono_colony.gif")

Multi-Colony

In the multi-colony case each colony is treated independently (and colonies can be different sizes), so actions, observations, and timesteps are lists or tuples of arrays or structs, with one entry per colony:

from thants.envs import Thants
import jax
import jax.numpy as jnp

env = Thants((50, 100))
key = jax.random.PRNGKey(101)
state, obs = env.reset(key)
state_history = [state]

for _ in range(50):
    key, k1, k2 = jax.random.split(key, 3)
    # List of action arrays per colony
    actions = [
        jax.random.choice(k1, env.num_actions, (env.num_agents[0],)),
        jax.random.choice(k2, env.num_actions, (env.num_agents[1],)),
    ]
    state, obs = env.step(state, actions)
    state_history.append(state)

env.animate(state_history, 100, "multi_colony.gif")

Simple preset environments are available as thants.envs.ThantsDual and thants.envs.ThantsQuad, with 2 and 4 colonies respectively.

Environment


A Thants environment with two competing colonies.

The environment is modelled as a grid, wrapped at the boundaries. Ants (the agents) occupy individual cells on the grid (and cannot overlap). Ants can pick up, carry, and deposit food, or deposit persistent signals that can be observed by other ants in the same colony.
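
The wrapped boundary can be pictured as modular arithmetic on grid indices. The sketch below is plain Python for illustration only, not Thants code; `wrap_position` is a hypothetical helper.

```python
def wrap_position(pos, dims):
    """Wrap a (row, col) index onto a grid of shape dims (toroidal boundary)."""
    return (pos[0] % dims[0], pos[1] % dims[1])


dims = (50, 50)
# Stepping off the bottom edge re-enters at the top
below_edge = wrap_position((50, 10), dims)
# Stepping off the left edge re-enters on the right
left_of_edge = wrap_position((3, -1), dims)
print(below_edge, left_of_edge)
```

Python's `%` operator returns a non-negative result for a positive modulus, so negative indices wrap without a special case.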

State

Colonies

The state of the ant colonies is represented by a single struct:

  • Ants: Individual ants themselves have several components:
    • Positions: 2D indices of ant positions on the environment grid.
    • Carrying: The amount of food being carried by each ant.
    • Health: Ant health (currently unused).
  • Colony Index: The index of the colony each ant belongs to.
  • Nests: 2D array indicating the index of the colony each cell belongs to (0 if a cell is not the nest of any colony).
  • Signals: 4D array of signal deposits at each cell for each colony. Signals have multiple channels to facilitate communication between ants (i.e. the 2nd dimension of the array is the signal channel).

Environment

The state of the environment then consists of the colonies and the state shared by all the colonies:

  • Colonies: Ant colonies state
  • Food: 2d array representing the amount of food deposited at each cell
  • Terrain: Array of flags indicating if a cell can be occupied by an ant. This allows obstacles to be placed on the environment.
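
To make the nesting of the state concrete, here is a rough sketch using plain Python dataclasses. All class and field names are hypothetical and chosen to mirror the description above; the real Thants structs hold JAX arrays and may be named and laid out differently. The signal axis order is an assumption consistent with the channel being the 2nd dimension.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Ants:
    positions: List[Tuple[int, int]]  # (row, col) grid index of each ant
    carrying: List[float]  # food carried by each ant
    health: List[float]  # per-ant health (described as currently unused)


@dataclass
class Colonies:
    ants: Ants
    colony_idx: List[int]  # colony each ant belongs to
    nests: List[List[int]]  # per-cell colony index, 0 = not a nest
    signals: List[List[List[List[float]]]]  # assumed [colony][channel][row][col]


@dataclass
class State:
    colonies: Colonies
    food: List[List[float]]  # food deposited at each cell
    terrain: List[List[bool]]  # True where an ant may occupy the cell


def empty_grid(rows, cols, value):
    """Build a rows x cols nested list filled with value."""
    return [[value] * cols for _ in range(rows)]


rows, cols, n_colonies, n_channels = 4, 5, 1, 2
state = State(
    colonies=Colonies(
        ants=Ants(positions=[(0, 0), (2, 3)], carrying=[0.0, 0.0], health=[1.0, 1.0]),
        colony_idx=[1, 1],  # assumed to match the nest labels; convention not confirmed
        nests=empty_grid(rows, cols, 0),
        signals=[
            [empty_grid(rows, cols, 0.0) for _ in range(n_channels)]
            for _ in range(n_colonies)
        ],
    ),
    food=empty_grid(rows, cols, 0.0),
    terrain=empty_grid(rows, cols, True),
)
```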

Updates

Each step of the environment performs the following updates to the state:

  • Convert integer action choices into state updates
  • Apply ant position updates
  • Apply food pick-up and deposit actions
  • Drop any new food deposits
  • Dissipate and propagate signals
  • Apply signal deposit actions
  • Clear any food that has been deposited on a nest (i.e. the food is consumed by the colony)
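
As an illustration of the final step in the list above, the sketch below clears food from nest cells and tallies what each colony consumed. It is plain Python rather than the library's JAX implementation, and `consume_food_on_nests` is a hypothetical name. It follows the nest convention described earlier, where 0 marks a cell that is not a nest.

```python
def consume_food_on_nests(food, nests):
    """Return (remaining food grid, food consumed per colony index)."""
    remaining = [row[:] for row in food]  # copy so the input grid is untouched
    consumed = {}
    for r, nest_row in enumerate(nests):
        for c, colony in enumerate(nest_row):
            if colony != 0 and food[r][c] > 0:
                consumed[colony] = consumed.get(colony, 0.0) + food[r][c]
                remaining[r][c] = 0.0
    return remaining, consumed


food = [[1.0, 0.0], [2.0, 3.0]]
nests = [[1, 0], [0, 2]]
remaining, consumed = consume_food_on_nests(food, nests)
print(remaining, consumed)
```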

Signal dynamics can be customised by implementing the thants.signals.SignalPropagator base class and passing the implementation when initialising the environment.
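
For intuition about what "dissipate and propagate" could mean, here is one possible rule for a single signal channel on a wrapped grid: each deposit decays by a fixed factor, then a fraction spreads equally to the four adjacent cells. This is an assumed toy model in plain Python, not the default Thants dynamics, and it does not implement the SignalPropagator interface. The `decay` and `spread` parameters are hypothetical.

```python
def propagate(signal, decay=0.9, spread=0.2):
    """Decay each deposit, then share a fraction with the 4 adjacent cells."""
    rows, cols = len(signal), len(signal[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            amount = signal[r][c] * decay  # dissipation
            out[r][c] += amount * (1.0 - spread)  # part that stays in place
            share = amount * spread / 4.0  # part sent to each neighbour
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                out[(r + dr) % rows][(c + dc) % cols] += share
    return out


signal = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
updated = propagate(signal)
total = sum(sum(row) for row in updated)
```

With these parameters the total signal shrinks by the decay factor each step, while the spread term only redistributes what remains.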

State Generators

The initialisation of the environment can be customised by implementing the relevant base classes and passing them to the environment.

Actions

Ants can select from several discrete actions, indicated by an integer value:

  • 0: Null action (i.e. no change to the environment)
  • 1 - 4: Move in one of the four cardinal directions (if possible)
  • 5: Take a fixed amount of food from the ant's location (if possible)
  • 6: Deposit a fixed amount of food at the ant's location (if possible)
  • 7+: Deposit a fixed amount of signal at the ant's location

Note that an action can be selected even when it is not possible, e.g. attempting to move to an occupied cell, or taking food from an empty cell. In this case the chosen action causes no change in state.
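
The integer encoding above can be read as a small decoding table. The sketch below is illustrative only: `ActionKind` and `decode_action` are hypothetical names, the mapping from values 1 - 4 to specific directions is not given here so only a direction index is returned, and treating each value from 7 upwards as one signal channel is an assumption.

```python
from enum import Enum


class ActionKind(Enum):
    NULL = "null"
    MOVE = "move"
    TAKE_FOOD = "take_food"
    DEPOSIT_FOOD = "deposit_food"
    DEPOSIT_SIGNAL = "deposit_signal"


def decode_action(action):
    """Map an integer action to (kind, argument)."""
    if action < 0:
        raise ValueError("actions are non-negative integers")
    if action == 0:
        return (ActionKind.NULL, None)
    if action <= 4:
        return (ActionKind.MOVE, action - 1)  # direction index 0..3
    if action == 5:
        return (ActionKind.TAKE_FOOD, None)
    if action == 6:
        return (ActionKind.DEPOSIT_FOOD, None)
    # Assumption: one action per signal channel, starting at 7
    return (ActionKind.DEPOSIT_SIGNAL, action - 7)


decoded = [decode_action(a) for a in (0, 3, 5, 6, 9)]
```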

Observations

Individual agent observations also consist of several components. Observations are individually made for the local neighbourhood of each ant, i.e. the 8 surrounding cells on the environment grid, and their own cell:

  • ants: Flag indicating if a cell in the neighbourhood is occupied by an ant, with the shape [n-ants, n-colonies, 9], where the second dimension indexes the individual colonies. The ants from the same colony will always be on the first row.
  • signals: Signal deposits in the neighbourhood (across all channels). Signals are observed individually for each colony.
  • food: Food deposits within the neighbourhood.
  • nest: Flag indicating if a neighbouring cell is designated as a nest for the ant's colony.
  • terrain: Flag indicating if a neighbouring cell can be occupied.
  • carrying: Amount of food currently being carried by each ant.
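
A 9-cell local view of this kind can be sketched by gathering the 3 x 3 block around an ant, wrapping at the grid edges. The helper `local_view` is hypothetical and written in plain Python; the row-major ordering of the 9 cells is an assumption and may not match the ordering Thants uses.

```python
def local_view(grid, pos):
    """Return the 9 values around pos (own cell included), wrapping at edges."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = pos
    return [
        grid[(r0 + dr) % rows][(c0 + dc) % cols]
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
    ]


# 3 x 4 grid where each cell holds 10 * row + col, so values identify cells
grid = [[10 * r + c for c in range(4)] for r in range(3)]
view = local_view(grid, (0, 0))
print(view)
```

With this ordering the ant's own cell sits at index 4 of the view.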

Rewards

By default, rewards are granted to ants when they deposit food on their colony's nest. Reward signals can be customised by implementing the base class thants.rewards.RewardFn.
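
The default rule can be pictured with the sketch below, which rewards an ant only when it deposits food on a cell belonging to its own colony's nest. This is plain Python for illustration and does not implement the RewardFn interface. `deposit_rewards` is a hypothetical name, and setting the reward equal to the amount deposited is an assumption.

```python
def deposit_rewards(positions, deposited, colony_ids, nests):
    """Per-ant reward: food deposited, counted only on the ant's own nest."""
    rewards = []
    for (r, c), amount, colony in zip(positions, deposited, colony_ids):
        on_own_nest = nests[r][c] == colony
        rewards.append(amount if on_own_nest else 0.0)
    return rewards


nests = [[1, 0], [0, 1]]  # colony 1 owns two cells, 0 = not a nest
positions = [(0, 0), (0, 1), (1, 1)]
deposited = [1.0, 1.0, 0.5]
colony_ids = [1, 1, 2]
rewards = deposit_rewards(positions, deposited, colony_ids, nests)
print(rewards)
```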
