Marlenv

Marlenv is a multi-agent environment for reinforcement learning, based on the OpenAI gym convention.

Function names such as reset() and step() follow the gym convention, but the return format is different. Unlike single-agent environments, the multi-agent environments in this repo return every value as a list, where each element corresponds to one agent in the environment. The same rule applies to the input: the action must be a list of actions whose length equals the number of agents.
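The list convention can be sketched with a toy stand-in environment. ToyListEnv below is hypothetical and not part of marlenv; it only mimics the list-shaped returns described above so the loop runs without the package installed. The four-value step() return follows the classic gym convention, and returning info as a list is an assumption here.

```python
import random


class ToyListEnv:
    """Hypothetical stand-in that mimics list-formatted returns (not marlenv code)."""

    def __init__(self, num_agents=2, max_steps=3):
        self.num_agents = num_agents
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        # One observation per agent.
        return [0 for _ in range(self.num_agents)]

    def step(self, actions):
        # The action must be a list with one entry per agent.
        assert len(actions) == self.num_agents
        self.t += 1
        obs = [self.t for _ in range(self.num_agents)]
        rewards = [float(a) for a in actions]
        dones = [self.t >= self.max_steps] * self.num_agents
        infos = [{} for _ in range(self.num_agents)]  # assumption: info is a list too
        return obs, rewards, dones, infos


def run_episode(env, seed=0):
    """Step with random actions until every agent is done; return per-agent totals."""
    rng = random.Random(seed)
    env.reset()
    totals = [0.0] * env.num_agents
    dones = [False] * env.num_agents
    while not all(dones):
        actions = [rng.randint(0, 2) for _ in range(env.num_agents)]
        _, rewards, dones, _ = env.step(actions)
        totals = [t + r for t, r in zip(totals, rewards)]
    return totals
```

Every quantity is indexed by agent, so per-agent bookkeeping reduces to element-wise operations over equal-length lists.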

Marlenv is an ongoing project and modifications and new environments are expected in the future.

Installation

Clone the marlenv repo and install it with pip:

git clone https://github.com/kc-ml2/marlenv.git
cd marlenv
pip install -e .

Rules

Snake Game

Multiple snakes battle on a fixed size grid map.

On reset(), each snake is spawned at a random location on the map with a random pose and direction.

The map may be initialized with different walls when the environment is instantiated.

A snake dies when its head hits a wall or the body of another snake. When it hits another snake's body, that snake receives the kill reward ('kill'); the dead snake receives the death reward ('lose').

When multiple snakes collide head to head, all of them die and none receives the kill reward.

When there is only one snake remaining, it receives a win reward for every unit time of survival.

The snake grows by one pixel when it eats a fruit.
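As a rough illustration of the collision rules above, the sketch below resolves one time step on a grid. It is not marlenv's implementation: the function name, the data layout (dicts keyed by snake id) and the order in which rules are checked are all assumptions. Self-collision is left out because the rules above do not mention it.

```python
def resolve_collisions(heads, bodies, walls):
    """Apply the collision rules to one time step (illustrative only).

    heads:  dict snake_id -> (row, col) of the head after moving
    bodies: dict snake_id -> set of (row, col) body cells, head excluded
    walls:  set of (row, col) wall cells
    Returns (dead, kills): the set of dead snake ids and a dict mapping
    a snake id to the number of kills it is credited with.
    """
    dead = set()
    kills = {}

    # Head-to-head: every snake sharing a head cell dies, nobody scores a kill.
    cell_to_ids = {}
    for sid, cell in heads.items():
        cell_to_ids.setdefault(cell, []).append(sid)
    for ids in cell_to_ids.values():
        if len(ids) > 1:
            dead.update(ids)

    for sid, cell in heads.items():
        if sid in dead:
            continue  # already removed by a head-to-head collision
        if cell in walls:
            dead.add(sid)  # wall hit: death, no kill credit for anyone
            continue
        for other, cells in bodies.items():
            if other != sid and cell in cells:
                dead.add(sid)
                kills[other] = kills.get(other, 0) + 1
                break
    return dead, kills
```

For example, with two snakes whose heads land on the same cell, both ids end up in dead and kills stays empty.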

Observation Types

Image grid: the axis order is 'NHWC' (batch, height, width, channels).
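Some frameworks expect channels-first input instead. The sketch below reorders a single frame from height-width-channel to channel-height-width using plain nested lists, so it runs without any dependency. The helper name hwc_to_chw is hypothetical and not part of marlenv.

```python
def hwc_to_chw(frame):
    """Reorder one frame from [H][W][C] nested lists to [C][H][W]."""
    height = len(frame)
    width = len(frame[0])
    channels = len(frame[0][0])
    return [
        [[frame[h][w][c] for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


# A 2x3 frame with 2 channels.
frame = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
]
chw = hwc_to_chw(frame)
```

With NumPy arrays, the same reordering for a whole NHWC batch would be numpy.transpose(obs, (0, 3, 1, 2)).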

Example Input Arguments

Snake Game

Creating an environment

import gym
import marlenv
env = gym.make(
    'Snake-v1',
    height=20,       # Height of the grid map
    width=20,        # Width of the grid map
    num_snakes=4,    # Number of snakes to spawn on grid
    snake_length=3,  # Initial length of the snake at spawn time
    vision_range=5,  # Vision range (both width and height); the full map is returned if None
    frame_stack=1,   # Number of observations to stack on return
)

Single-agent wrapper

env = gym.make('Snake-v1', num_snakes=1)
env = marlenv.wrappers.SingleAgent(env)

This unwraps the returned observation, reward, etc. from their lists.
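The effect of such a wrapper can be sketched as follows. Both classes are hypothetical stand-ins written for illustration; they are not the code behind marlenv.wrappers.SingleAgent, and the four-value step() return is an assumption.

```python
class OneAgentListEnv:
    """Hypothetical one-agent environment with list-formatted returns."""

    def reset(self):
        return [[0, 0]]

    def step(self, actions):
        assert len(actions) == 1
        return [[actions[0], 0]], [1.0], [True], [{}]


class UnwrapSingle:
    """Illustrative wrapper: strips the length-1 list from inputs and outputs."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()[0]

    def step(self, action):
        # Re-wrap the bare action, then take element 0 of every return value.
        obs, rewards, dones, infos = self.env.step([action])
        return obs[0], rewards[0], dones[0], infos[0]


env = UnwrapSingle(OneAgentListEnv())
first_obs = env.reset()
obs, reward, done, info = env.step(2)
```

After wrapping, the environment can be driven like an ordinary single-agent gym environment, with a bare action in and bare values out.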

Using the make_snake() function

# Automatically chooses wrappers to handle single agent, multi-agent, vector_env, etc.
env, observation_space, action_space, properties = marlenv.wrappers.make_snake(
    num_envs=1,  # Number of environments. Used to decide whether to use a vector env
    num_snakes=1,  # Number of players. Used to determine single/multi agent
    **kwargs  # Other input parameters to the environment
)

The returned values are

  • env : The environment object
  • observation_space : The processed observation space (according to env type)
  • action_space : The processed action space
  • properties : A dict that includes
    • high: highest value that observation can have
    • low: lowest value that the observation can have
    • num_envs: number of environments
    • num_snakes: number of snakes to be spawned
    • discrete: True if action space is discrete, categorical
    • action_info
      • {action_high, action_low} if continuous action or {action_n} if discrete
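One way the properties dict might be consumed is to size a policy's output layer. The sketch below uses only the keys listed above; the helper name and the example values are hypothetical, and treating action_high as a sequence with one entry per action dimension is an assumption.

```python
def action_output_size(properties):
    """Number of policy outputs implied by a properties dict (illustrative)."""
    info = properties['action_info']
    if properties['discrete']:
        # Discrete case: one output per available action.
        return info['action_n']
    # Continuous case (assumed): one output per action dimension.
    return len(info['action_high'])


# Hypothetical values; real ones come from make_snake().
example_properties = {
    'high': 255,
    'low': 0,
    'num_envs': 1,
    'num_snakes': 1,
    'discrete': True,
    'action_info': {'action_n': 3},
}
n_outputs = action_output_size(example_properties)
```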

Custom reward function

The user can change the reward structure of the snake game upon instantiation.

The reward function is defined as a Python dictionary, as follows:

custom_reward_func = {
    'fruit': 1.0,
    'kill': 0.0,
    'lose': 0.0,
    'time': 0.0,
    'win': 0.0
}
env = gym.make('Snake-v1', reward_func=custom_reward_func)

Each key represents:

  • fruit : reward received when the snake eats a fruit
  • kill : reward received when the snake kills another snake
  • lose : reward (or penalty) received when the snake dies
  • time : reward received for each unit of time of survival
  • win : reward received during the snake's time of survival as the last one standing

Each reward can be a positive or negative float.
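To make the role of each key concrete, the sketch below shows how a per-step reward could be assembled from such a dictionary. This is an assumed weighted-sum scheme for illustration, not marlenv's internal computation; the function step_reward and the events format are hypothetical.

```python
def step_reward(events, reward_func):
    """Sum the weights of the events that occurred for one snake in one step.

    events:      dict event_name -> number of occurrences this step
    reward_func: dict event_name -> reward weight (positive or negative float)
    """
    unknown = set(events) - set(reward_func)
    if unknown:
        raise KeyError(f"unknown reward keys: {sorted(unknown)}")
    return sum(reward_func[name] * count for name, count in events.items())


example_reward_func = {
    'fruit': 1.0,
    'kill': 2.0,
    'lose': -5.0,
    'time': 0.5,
    'win': 0.0,
}
ate_fruit = step_reward({'fruit': 1, 'time': 1}, example_reward_func)
died = step_reward({'lose': 1, 'time': 1}, example_reward_func)
```

A negative 'lose' weight acts as a death penalty, while a small positive 'time' weight rewards survival on every step.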

Testing

pytest

Citation

@MISC{marlenv2021,
author =   {ML2},
title =    {Marlenv, Multi-agent Reinforcement Learning Environment},
howpublished = {\url{http://github.com/kc-ml2/marlenv}},
year = {2021}
}

Updates

Currently, the only environment is the multi-agent snake game.
