A strongly typed Multi-Agent Reinforcement Learning framework

marlenv - A unified framework for multi-agent reinforcement learning

Documentation: https://yamoling.github.io/multi-agent-rlenv

marlenv is a strongly typed library for multi-agent and multi-objective reinforcement learning.

Install the library with:

$ pip install multi-agent-rlenv      # Basics
$ pip install multi-agent-rlenv[all] # With all optional dependencies
$ pip install multi-agent-rlenv[smac,overcooked] # Only SMAC & Overcooked

It aims to provide a simple and consistent interface for reinforcement learning environments through abstractions such as Observations and Episodes. marlenv ships adapters for popular libraries such as gym and pettingzoo, along with utility wrappers that add functionality such as video recording or a limit on the number of steps.

Most classes are dataclasses, which makes serialization straightforward (for example with orjson).
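To illustrate why dataclasses make serialization straightforward, here is a minimal sketch using only the standard library. It swaps in the built-in json module for orjson, and EpisodeSummary is a hypothetical record defined for this example, not a marlenv class. Array-valued fields would likely need converting first (for example with .tolist()).

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EpisodeSummary:
    """Hypothetical record used for illustration; not part of marlenv."""

    n_agents: int
    rewards: list[float] = field(default_factory=list)
    truncated: bool = False


def to_json(summary: EpisodeSummary) -> str:
    # asdict() recursively turns a dataclass into plain dicts and lists,
    # which the json module can encode directly.
    return json.dumps(asdict(summary), sort_keys=True)


def from_json(payload: str) -> EpisodeSummary:
    # Field names double as keyword arguments when rebuilding the object.
    return EpisodeSummary(**json.loads(payload))


summary = EpisodeSummary(n_agents=2, rewards=[0.0, 1.0], truncated=True)
payload = to_json(summary)
restored = from_json(payload)
```

The round trip works because the dataclass fields define both the serialized keys and the constructor arguments.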

Fundamentals

States & Observations

MARLEnv.reset() returns an (Observation, State) pair, and MARLEnv.step() returns a Step.

  • Observation contains:
    • data: shape [n_agents, *observation_shape]
    • available_actions: boolean mask [n_agents, n_actions]
    • extras: extra features per agent (default shape (n_agents, 0))
  • State represents the environment state and can also carry extras.
  • Step bundles obs, state, reward, done, truncated, and info.
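The available_actions mask is typically used to restrict action selection to valid choices. The following sketch shows one way to sample a valid action per agent from a mask shaped [n_agents, n_actions]. The helper sample_available_actions is hypothetical and written for this example; it uses nested lists so that it runs without any dependency, but it iterates over rows in the same way a NumPy boolean array would be iterated.

```python
import random


def sample_available_actions(available_actions, rng):
    """Pick one valid action index per agent from a boolean mask.

    `available_actions` is a nested sequence shaped [n_agents, n_actions].
    Hypothetical helper for illustration; not part of marlenv.
    """
    actions = []
    for agent_id, mask in enumerate(available_actions):
        # Keep only the indices whose mask entry is truthy.
        valid = [a for a, ok in enumerate(mask) if ok]
        if not valid:
            raise ValueError(f"agent {agent_id} has no available action")
        actions.append(rng.choice(valid))
    return actions


mask = [
    [True, False, False],  # agent 0 can only take action 0
    [False, True, True],   # agent 1 can take action 1 or 2
]
actions = sample_available_actions(mask, random.Random(0))
```

Passing an explicit random.Random instance keeps the sampling reproducible without touching global state.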

Rewards are stored as np.float32 arrays. Multi-objective envs use reward vectors with reward_space.size > 1.
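The reset/step contract described above suggests an interaction loop along the following lines. This is a sketch under assumptions: it relies only on reset() returning an (obs, state) pair and on the step result exposing obs, reward, done, and truncated attributes, as listed above. FakeStep and CountdownEnv are stand-ins defined here so that the example runs without marlenv installed.

```python
from dataclasses import dataclass


def run_episode(env, policy):
    """Run one episode and return (total_reward, n_steps)."""
    obs, _state = env.reset()
    total_reward, n_steps = 0.0, 0
    while True:
        step = env.step(policy(obs))
        # Plain addition also works if the reward is an array.
        total_reward = total_reward + step.reward
        n_steps += 1
        # Stop on termination (done) or on truncation (e.g. a time limit).
        if step.done or step.truncated:
            return total_reward, n_steps
        obs = step.obs


@dataclass
class FakeStep:
    """Stand-in for a step result; mirrors the field names listed above."""

    obs: list
    state: list
    reward: float
    done: bool
    truncated: bool = False


class CountdownEnv:
    """Stand-in environment that terminates after `horizon` steps."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0], [0]

    def step(self, action):
        self.t += 1
        return FakeStep([self.t], [self.t], reward=1.0, done=self.t >= self.horizon)


total, steps = run_episode(CountdownEnv(horizon=3), policy=lambda obs: [0])
```

Checking both done and truncated keeps the loop correct whether the episode ends naturally or is cut short by a wrapper.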

Extras

Extras are auxiliary features appended by wrappers (agent id, last action, time ratio, available actions, ...). Wrappers that add extras must update both extras_shape and extras_meanings so downstream users can interpret them. State extras should stay in sync with Observation extras when applicable.
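The bookkeeping rule for extras can be sketched as follows: whenever features are appended, the shape and the meanings are updated in the same place. The helper add_agent_id_extras and the label strings are hypothetical and written for this example; they are not marlenv's implementation, and treating extras_shape as a per-agent shape is an assumption.

```python
def add_agent_id_extras(extras, extras_meanings):
    """Append a one-hot agent id to each agent's extras row.

    Returns new (extras, extras_shape, extras_meanings) so that all three
    stay consistent. Hypothetical helper; not part of marlenv.
    """
    n_agents = len(extras)
    new_extras = []
    for agent_id, row in enumerate(extras):
        one_hot = [1.0 if i == agent_id else 0.0 for i in range(n_agents)]
        new_extras.append(list(row) + one_hot)
    # One label per appended feature, in the same order as the features.
    new_meanings = list(extras_meanings) + [f"Agent ID-{i}" for i in range(n_agents)]
    # Assumed convention: the shape describes a single agent's extras.
    new_shape = (len(new_meanings),)
    return new_extras, new_shape, new_meanings


extras, shape, meanings = add_agent_id_extras([[0.5], [0.25]], ["Time ratio"])
```

Deriving the shape from the list of meanings makes it harder for the two to drift apart.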

Environment catalog

marlenv.catalog exposes curated environments and lazily imports optional dependencies.

from marlenv import catalog

env1 = catalog.overcooked().from_layout("scenario4")
env2 = catalog.lle().level(6)
env3 = catalog.DeepSea(max_depth=5)
env4 = catalog.connect_n()(width=7, height=6, n=4)

Catalog entries require their corresponding extras at install time (e.g., multi-agent-rlenv[overcooked], multi-agent-rlenv[lle]).

Wrappers & builders

Wrappers are composable through RLEnvWrapper and can be chained via Builder for fluent configuration.

from marlenv import Builder
from marlenv.adapters import SMAC

env = (
    Builder(SMAC("3m"))
    .agent_id()
    .time_limit(20)
    .available_actions()
    .build()
)

Common wrappers include time limits, delayed rewards, masking available actions, and video recording.
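To make the wrapper idea concrete, here is a minimal sketch of how a time limit could be layered over an environment. It is not marlenv's implementation: ToyStep, ToyEnv, and ToyTimeLimit are stand-ins defined for this example, and the choice to flag the final step as truncated (rather than done) is an assumption.

```python
from dataclasses import dataclass, replace


@dataclass
class ToyStep:
    """Stand-in step result with only the fields this example needs."""

    reward: float
    done: bool
    truncated: bool = False


class ToyEnv:
    """Stand-in environment that never terminates on its own."""

    def reset(self):
        return None

    def step(self, action):
        return ToyStep(reward=1.0, done=False)


class ToyTimeLimit:
    """Marks a step as truncated once `limit` steps have elapsed."""

    def __init__(self, env, limit):
        self.env = env
        self.limit = limit
        self.t = 0

    def reset(self):
        # The step counter restarts with every episode.
        self.t = 0
        return self.env.reset()

    def step(self, action):
        self.t += 1
        step = self.env.step(action)
        if self.t >= self.limit and not step.done:
            # Copy the step instead of mutating the wrapped env's result.
            step = replace(step, truncated=True)
        return step


env = ToyTimeLimit(ToyEnv(), limit=3)
env.reset()
flags = [env.step(0).truncated for _ in range(3)]
```

Because the wrapper exposes the same reset and step methods as the wrapped environment, several wrappers of this kind can be stacked.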

Using the library

Adapters for existing libraries

Adapters normalize external APIs into MARLEnv:

import marlenv

gym_env = marlenv.make("CartPole-v1", seed=25)

from marlenv.adapters import SMAC
smac_env = SMAC("3m", debug=True, difficulty="9")

from pettingzoo.sisl import pursuit_v4
from marlenv.adapters import PettingZoo
env = PettingZoo(pursuit_v4.parallel_env())

For deterministic behavior, seed the environment:

env.seed(123)
obs, state = env.reset()

Designing a custom environment

Create a custom environment by inheriting from MARLEnv and implementing reset, step, get_observation, and get_state.

import numpy as np
from marlenv import MARLEnv, DiscreteSpace, MultiDiscreteSpace, Observation, State, Step

class CustomEnv(MARLEnv[MultiDiscreteSpace]):
    def __init__(self):
        super().__init__(
            n_agents=3,
            action_space=DiscreteSpace.action(5).repeat(3),
            observation_shape=(4,),
            state_shape=(2,),
        )
        self.t = 0

    def reset(self, *, seed: int | None = None):
        if seed is not None:
            self.seed(seed)
        self.t = 0
        return self.get_observation(), self.get_state()

    def step(self, action):
        self.t += 1
        return Step(self.get_observation(), self.get_state(), reward=0.0, done=False)

    def get_observation(self):
        return Observation(np.zeros((3, 4), dtype=np.float32), self.available_actions())

    def get_state(self):
        return State(np.array([self.t, 0], dtype=np.float32))

Related projects

This version

4.2.4
