Ants! On your GPU! With JAX!


Thanks ants!

Thants

Multi-agent and multi-team reinforcement-learning environment modelling ant foraging

Introduction

Thants is a reinforcement-learning (RL) environment library based on models of ant colonies foraging for food. It also supports environments with multiple competing colonies.

Thants is implemented in JAX, so environments can run on a GPU for large-scale, performant simulation, and can run alongside JAX and PyTorch ML tools.

The environment is implemented using the Jumanji RL environment API, with some modifications for the multi-colony case.

Usage

Installation

Thants can be installed from PyPI using:

pip install thants

Examples

Single Colony

The single colony environment follows the Jumanji environment API, with actions provided as an array of individual actions:

from thants.envs import ThantsMono
import jax

# Single colony on a 50 x 50 grid
env = ThantsMono(dims=(50, 50))
key = jax.random.PRNGKey(101)
# Initialise the environment state from a random key
state, obs = env.reset(key)
state_history = [state]

for _ in range(50):
    key, action_key = jax.random.split(key, 2)
    # Sample a uniformly random action for every ant
    actions = jax.random.choice(
        action_key, env.num_actions, (env.num_agents,)
    )
    state, obs = env.step(state, actions)
    state_history.append(state)

# Save an animation of the recorded states
env.animate(state_history, 100, "mono_colony.gif")
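
Since the environment follows the Jumanji API, the Python loop above can probably be replaced by a compiled jax.lax.scan rollout. The following is a minimal sketch under that assumption: random_rollout is a hypothetical helper (not part of Thants), and reset_fn and step_fn stand in for env.reset and env.step, assumed to be pure functions returning a (state, timestep) pair.

```python
import jax
import jax.numpy as jnp


def random_rollout(reset_fn, step_fn, num_actions, num_agents, key, n_steps):
    """Roll out uniformly random actions with jax.lax.scan.

    Assumes reset_fn(key) -> (state, timestep) and
    step_fn(state, actions) -> (state, timestep), as in the Jumanji API.
    Returns the visited states stacked along a leading time axis.
    """
    key, reset_key = jax.random.split(key)
    state, _ = reset_fn(reset_key)

    def body(carry, _):
        key, state = carry
        key, action_key = jax.random.split(key)
        # One random action per agent, as in the loop above
        actions = jax.random.choice(action_key, num_actions, (num_agents,))
        state, _ = step_fn(state, actions)
        return (key, state), state

    _, states = jax.lax.scan(body, (key, state), None, length=n_steps)
    return states
```

With a real environment this might be called as random_rollout(env.reset, env.step, env.num_actions, env.num_agents, key, 50). The result is a single stacked state rather than a list, so it may need unstacking before being passed to env.animate.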

Multi-Colony

In the multi-colony case each colony is treated independently (and colonies can be different sizes), so actions, observations, and timesteps are lists or tuples of arrays or structs, one per colony:

from thants.envs import Thants
import jax
import jax.numpy as jnp

env = Thants((50, 100))
key = jax.random.PRNGKey(101)
# Initialise the environment state from a random key
state, obs = env.reset(key)
state_history = [state]

for _ in range(50):
    # One fresh key per colony, plus one to carry forward
    key, k1, k2 = jax.random.split(key, 3)
    # List of action arrays per colony
    actions = [
        jax.random.choice(k1, env.num_actions, (env.num_agents[0],)),
        jax.random.choice(k2, env.num_actions, (env.num_agents[1],)),
    ]
    state, obs = env.step(state, actions)
    state_history.append(state)

# Save an animation of the recorded states
env.animate(state_history, 100, "multi_colony.gif")
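
The per-colony action list above is written out by hand for two colonies. For an arbitrary number of colonies, the same pattern could be generalised as in the sketch below. sample_colony_actions is a hypothetical helper, not a Thants function, and it assumes num_agents is a sequence of colony sizes, as env.num_agents appears to be in the example above.

```python
import jax


def sample_colony_actions(key, num_actions, num_agents):
    """Sample one array of random actions per colony.

    num_agents is a sequence of colony sizes, e.g. (36, 25).
    """
    # Use an independent key for each colony
    keys = jax.random.split(key, len(num_agents))
    return [
        jax.random.choice(k, num_actions, (n,))
        for k, n in zip(keys, num_agents)
    ]
```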

Simple preset environments with 2 and 4 colonies can be imported as thants.envs.ThantsDual and thants.envs.ThantsQuad respectively.

Environment


A Thants environment with two competing colonies.

The environment is modelled as a grid, wrapped at the boundaries. Ants (the agents) occupy individual cells on the grid (and cannot overlap). Ants can pick up, carry, and deposit food, or deposit persistent signals that can be observed by other ants in the same colony.
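
To illustrate what wrapping at the boundaries means for ant positions, here is a minimal sketch in plain JAX. wrap_positions is an illustrative name and is not taken from the Thants source.

```python
import jax.numpy as jnp


def wrap_positions(positions, dims):
    """Map [n_ants, 2] grid indices back onto a grid of shape dims.

    Indices that step off one edge re-enter from the opposite edge.
    """
    return jnp.mod(positions, jnp.asarray(dims))
```

On a 50 x 100 grid, for example, an ant stepping to row -1 lands on row 49, and one stepping to column 100 lands on column 0.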

State

Colonies

The state of the ant colonies is represented by a single struct:

  • Ants: Individual ants themselves have several components:
    • Positions: 2d indices of ant positions on the environment grid.
    • Carrying: The amount of food being carried by each ant.
    • Health: Ant health (currently unused).
  • Colony Index: The index of the colony each ant belongs to.
  • Nests: 2d array giving, for each cell, the index of the colony whose nest it belongs to (0 if the cell is not part of any nest).
  • Signals: 4d array of signal deposits at each cell for each colony. Signals have multiple channels to facilitate communication between ants (i.e. the 2nd dimension of the array is the signal channel).
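
The layout described above could be pictured as the following pair of structs. This is a sketch only: the class names, field names, dtypes, and the choice of colony as the first signal dimension are assumptions for illustration, and may not match the actual Thants types.

```python
from typing import NamedTuple

import jax.numpy as jnp


class Ants(NamedTuple):
    pos: jnp.ndarray       # [n_ants, 2] integer grid indices
    carrying: jnp.ndarray  # [n_ants] food carried by each ant
    health: jnp.ndarray    # [n_ants] ant health


class Colonies(NamedTuple):
    ants: Ants
    colony_idx: jnp.ndarray  # [n_ants] colony of each ant
    nests: jnp.ndarray       # [width, height] colony index, 0 = no nest
    signals: jnp.ndarray     # [n_colonies, n_channels, width, height]


def empty_colonies(dims, n_ants, n_colonies, n_channels):
    """Build a zero-filled Colonies struct with the shapes listed above."""
    ants = Ants(
        pos=jnp.zeros((n_ants, 2), dtype=jnp.int32),
        carrying=jnp.zeros((n_ants,)),
        health=jnp.ones((n_ants,)),
    )
    return Colonies(
        ants=ants,
        colony_idx=jnp.zeros((n_ants,), dtype=jnp.int32),
        nests=jnp.zeros(dims, dtype=jnp.int32),
        signals=jnp.zeros((n_colonies, n_channels, *dims)),
    )
```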

Environment

The state of the environment then consists of the colonies and the state shared by all colonies:

  • Colonies: Ant colonies state
  • Food: 2d array representing the amount of food deposited at each cell
  • Terrain: 2d array of flags indicating if a cell can be occupied by an ant. This allows obstacles to be placed on the environment.

Updates

Each step of the environment applies the following updates to the state:

  • Convert integer action choices into state updates
  • Apply ant position updates
  • Apply food pick-up and deposit actions
  • Drop any new food deposits
  • Dissipate and propagate signals
  • Apply signal deposit actions
  • Clear any food that has been deposited on a nest (i.e. the food is consumed by the colony)
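
As an illustration of the last step in the list, the sketch below clears food from nest cells. It assumes the food and nest grids are laid out as described in the State section; clear_nest_food is a hypothetical name, and the real update may differ in detail.

```python
import jax.numpy as jnp


def clear_nest_food(food, nests):
    """Remove food lying on nest cells.

    food:  [width, height] array of food amounts.
    nests: [width, height] array of colony indices, 0 meaning "no nest".
    Returns the cleared food grid and the total amount consumed.
    """
    on_nest = nests > 0
    consumed = jnp.sum(jnp.where(on_nest, food, 0.0))
    return jnp.where(on_nest, 0.0, food), consumed
```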

Signal dynamics can be customised by implementing the thants.signals.SignalPropagator base class and passing the implementation when initialising the environment.
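
One plausible dissipate-and-propagate rule is sketched below in plain JAX. It does not implement the thants.signals.SignalPropagator interface, whose methods are not described here. The function name and the decay and spread parameters are assumptions chosen for illustration.

```python
import jax.numpy as jnp


def diffuse_signals(signals, decay=0.9, spread=0.2):
    """Spread signals to the four adjacent cells, then decay them.

    signals: array whose last two axes are the (wrapped) grid axes.
    spread:  fraction of each deposit shared equally with its neighbours.
    decay:   fraction of signal remaining after dissipation.
    """
    # jnp.roll wraps around, matching the wrapped grid boundaries
    neighbours = (
        jnp.roll(signals, 1, axis=-2)
        + jnp.roll(signals, -1, axis=-2)
        + jnp.roll(signals, 1, axis=-1)
        + jnp.roll(signals, -1, axis=-1)
    )
    propagated = (1.0 - spread) * signals + 0.25 * spread * neighbours
    return decay * propagated
```

Propagation here conserves the total signal, so each call reduces the total by exactly the decay factor.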

State Generators

The initialisation of the environment can be customised by implementing the relevant base classes and passing them to the environment.

Actions

Ants can select from several discrete actions, indicated by an integer value:

  • 0: Null action (i.e. no change to the environment)
  • 1 - 4: Move in one of the four cardinal directions (if possible)
  • 5: Take a fixed amount of food from the ant's location (if possible)
  • 6: Deposit a fixed amount of food at the ant's location (if possible)
  • 7+: Deposit a fixed amount of signal at the ant's location

Note that an ant may select an action that is not possible, e.g. attempting to move to an occupied cell, or taking food from an empty cell. In this case the chosen action leaves the state unchanged.
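
The "no change if not possible" rule for movement can be sketched as a mask over proposed moves. The mapping from action codes 1 - 4 to grid offsets below is an assumption, as are the names MOVES and apply_moves. The sketch only checks terrain and currently occupied cells, and does not resolve two ants targeting the same empty cell.

```python
import jax.numpy as jnp

# Hypothetical offsets for action codes 0-4; code 0 is the null action
MOVES = jnp.array([[0, 0], [0, 1], [1, 0], [0, -1], [-1, 0]])


def apply_moves(positions, actions, terrain):
    """Move ants where the target cell is passable and unoccupied.

    positions: [n_ants, 2] integer grid indices.
    actions:   [n_ants] integer action codes.
    terrain:   [width, height] boolean grid, True where ants can stand.
    """
    dims = jnp.array(terrain.shape)
    # Non-movement actions (5 and above) leave the ant in place
    move_idx = jnp.where(actions < 5, actions, 0)
    targets = jnp.mod(positions + MOVES[move_idx], dims)
    occupied = (
        jnp.zeros(terrain.shape, dtype=bool)
        .at[positions[:, 0], positions[:, 1]]
        .set(True)
    )
    free = (
        terrain[targets[:, 0], targets[:, 1]]
        & ~occupied[targets[:, 0], targets[:, 1]]
    )
    # Infeasible moves fall back to the current position
    return jnp.where(free[:, None], targets, positions)
```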

Observations

Individual agent observations also consist of several components. Each observation covers the local neighbourhood of an ant, i.e. its own cell and the surrounding cells on the environment grid:

  • ants: Flags indicating if each cell in the neighbourhood is occupied by an ant, with shape [n-ants, n-colonies, 9], where the second dimension indexes the colonies. Ants from the observing ant's own colony are always on the first row.
  • signals: Signal deposits in the neighbourhood (across all channels); signals are observed individually for each colony.
  • food: Food deposits within the neighbourhood.
  • nest: Flag indicating if a neighbouring cell is designated as a nest for the ant's colony.
  • terrain: Flag indicating if a neighbouring cell can be occupied.
  • carrying: Amount of food currently being carried by each ant.

The number of local cells observed by each agent can be customised with the view_distance environment parameter.
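
A local neighbourhood on a wrapped grid can be gathered as in the sketch below. local_view is an illustrative helper rather than the Thants implementation, and the ordering of cells within each neighbourhood is an assumption. A view distance of 1 gives the 9 cells mentioned above.

```python
import jax.numpy as jnp


def local_view(grid, positions, view_distance=1):
    """Gather the wrapped neighbourhood of each ant from a 2d grid.

    Returns an array of shape [n_ants, (2 * view_distance + 1) ** 2].
    """
    offsets = jnp.arange(-view_distance, view_distance + 1)
    dx, dy = jnp.meshgrid(offsets, offsets, indexing="ij")
    # Broadcast each ant's position against every offset, wrapping at edges
    xs = jnp.mod(positions[:, 0, None] + dx.ravel()[None, :], grid.shape[0])
    ys = jnp.mod(positions[:, 1, None] + dy.ravel()[None, :], grid.shape[1])
    return grid[xs, ys]
```

The same gather could be applied to the food, nest, and terrain grids, or to each signal channel in turn.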

Rewards

By default, rewards are granted to ants when they deposit food on their colony's nest. Rewards can be customised by implementing the base class thants.rewards.RewardFn.
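
The default rule could look roughly like the following per-ant reward. This is a sketch under several assumptions: it is a plain function rather than a thants.rewards.RewardFn subclass (whose interface is not described here), colony indices are taken to match the values stored in the nest grid, food is deposited on the ant's own cell, and the amount parameter is hypothetical.

```python
import jax.numpy as jnp

DEPOSIT_FOOD = 6  # action code from the list of actions above


def nest_deposit_rewards(
    positions, actions, carrying, nests, colony_idx, amount=1.0
):
    """Reward each ant for food it deposits on its own colony's nest.

    positions:  [n_ants, 2] integer grid indices.
    actions:    [n_ants] integer action codes.
    carrying:   [n_ants] food carried before the deposit.
    nests:      [width, height] colony index per cell, 0 = no nest.
    colony_idx: [n_ants] colony of each ant, using the same indices as nests.
    """
    on_own_nest = nests[positions[:, 0], positions[:, 1]] == colony_idx
    # An ant cannot deposit more food than it is carrying
    deposited = jnp.where(
        actions == DEPOSIT_FOOD, jnp.minimum(carrying, amount), 0.0
    )
    return jnp.where(on_own_nest, deposited, 0.0)
```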
