
An OpenAI gym / Gymnasium environment to seamlessly create discrete MDPs from matrices.


Matrix MDP


Easily generate an MDP from transition and reward matrices.

Want to learn more on the story behind this repo? Check the blog post here!

Installation

Install the package from PyPI with pip:

pip install matrix-mdp-gym

Usage

import gymnasium as gym
import matrix_mdp
env = gym.make('matrix_mdp/MatrixMDP-v0')

Environment documentation

Description

A flexible environment that exposes a gym API for discrete MDPs with N_s states and N_a actions, given:

  • An initial state distribution vector P_0(S)
  • A transition probability matrix P(S' | S, A)
  • A reward matrix R(S', S, A) of the reward for reaching S' after having taken action A in state S
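
As an illustration of the three inputs listed above, here is a minimal sketch that builds them with NumPy for a hypothetical 3-state, 2-action MDP. The numbers are made up, and the index order `[s_next, s, a]` is an assumption read off the notation P(S' | S, A) and R(S', S, A).

```python
import numpy as np

n_states, n_actions = 3, 2  # hypothetical sizes, chosen for illustration

# Initial state distribution P_0(S): always start in state 0.
p_0 = np.array([1.0, 0.0, 0.0])

# Transition tensor, assumed to be indexed as p[s_next, s, a] = P(s_next | s, a).
p = np.zeros((n_states, n_states, n_actions))
p[1, 0, 0] = 1.0   # action 0 in state 0 always leads to state 1
p[0, 0, 1] = 0.5   # action 1 in state 0 stays in state 0 half of the time...
p[2, 0, 1] = 0.5   # ...and jumps to state 2 otherwise
p[2, 1, 0] = 1.0   # both actions in state 1 lead to state 2
p[2, 1, 1] = 1.0
# State 2 has no outgoing transitions: p[:, 2, :] is left at zero.

# Reward tensor, assumed to be indexed as r[s_next, s, a] = R(s_next, s, a).
r = np.zeros((n_states, n_states, n_actions))
r[2, 1, 0] = 1.0
r[2, 0, 1] = 0.5

# Sanity checks: p_0 sums to 1, and every (s, a) pair either defines a
# full probability distribution over s_next or has no transitions at all.
assert np.isclose(p_0.sum(), 1.0)
col_sums = p.sum(axis=0)  # shape (n_states, n_actions)
assert np.all(np.isclose(col_sums, 1.0) | np.isclose(col_sums, 0.0))
```

Checking the column sums before creating the environment catches most matrix construction mistakes early.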

Action Space

The action is an ndarray with shape (1,) representing the index of the action to execute.

Observation Space

The observation is an ndarray with shape (1,) representing the index of the state the agent is in.

Rewards

The reward function is defined according to the reward matrix given at the creation of the environment.
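
To make the role of the reward matrix concrete, the sketch below samples one transition and looks up its reward with plain NumPy. It does not reproduce the package's internals: `sample_transition` is a hypothetical helper, and the `[s_next, s, a]` index order is an assumption based on the notation R(S', S, A).

```python
import numpy as np

def sample_transition(p, r, state, action, rng):
    """Sample s_next from P(. | state, action) and look up the matching reward.

    Hypothetical helper; assumes p and r are indexed as [s_next, s, a].
    """
    next_state = int(rng.choice(p.shape[0], p=p[:, state, action]))
    reward = float(r[next_state, state, action])
    return next_state, reward

# Tiny deterministic example: action 0 in state 0 always leads to state 1
# and yields a reward of 5.
p = np.zeros((2, 2, 1))
p[1, 0, 0] = 1.0
r = np.zeros((2, 2, 1))
r[1, 0, 0] = 5.0

rng = np.random.default_rng(0)
next_state, reward = sample_transition(p, r, state=0, action=0, rng=rng)
```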

Starting State

The starting state is a random state sampled from $P_0$.
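
Sampling a start state from $P_0$ can be sketched with NumPy's random generator. The distribution below is made up, and this only illustrates the idea rather than the environment's actual reset code.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical initial distribution over three states; state 2 is never a start state.
p_0 = np.array([0.2, 0.8, 0.0])

# Draw a single start state index according to p_0.
start_state = int(rng.choice(len(p_0), p=p_0))

# Drawing many samples shows that the empirical frequencies approach p_0.
samples = rng.choice(len(p_0), size=10_000, p=p_0)
freq = np.bincount(samples, minlength=len(p_0)) / samples.size
```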

Episode Truncation

The episode truncates when a terminal state is reached. Terminal states are inferred from the transition probability matrix: a state $s$ is terminal when it has no outgoing transitions, i.e. $\sum_{s' \in S} \sum_{a \in A} P(s' | s, a) = 0$.
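
Reading the criterion as "a state with no outgoing probability mass is terminal", the set of terminal states can be computed directly from the transition tensor. This is a sketch under that reading, not the package's own code: `find_terminal_states` is a hypothetical helper and the `[s_next, s, a]` index order is assumed.

```python
import numpy as np

def find_terminal_states(p):
    """Return the indices of states with no outgoing transitions.

    Hypothetical helper; assumes p is indexed as [s_next, s, a].
    """
    # Sum over next states (axis 0) and actions (axis 2): one total per state s.
    outgoing = p.sum(axis=(0, 2))
    return np.flatnonzero(np.isclose(outgoing, 0.0))

# Made-up chain 0 -> 1 -> 2 with a single action; state 2 has no way out.
p = np.zeros((3, 3, 1))
p[1, 0, 0] = 1.0
p[2, 1, 0] = 1.0

terminal_states = find_terminal_states(p)
```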

Arguments

  • p_0: ndarray of shape (n_states, ) representing the initial state probability distribution.
  • p: ndarray of shape (n_states, n_states, n_actions) representing the transition dynamics $P(S' | S, A)$.
  • r: ndarray of shape (n_states, n_states, n_actions) representing the reward matrix $R(S', S, A)$.

import gymnasium as gym
import matrix_mdp

# p_0, p and r are the arrays described above
env = gym.make('matrix_mdp/MatrixMDP-v0', p_0=p_0, p=p, r=r)

Version History

  • v0: Initial version release

Acknowledgements

Thanks to Will Dudley for his help on learning how to put a Python package together.

