mdpsolver: An efficient solver for Markov Decision Processes

mdpsolver is a Python package for computing optimal policies of infinite-horizon Markov Decision Processes (MDPs) under the expected total discounted reward optimality criterion.
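
Here, the expected total discounted reward criterion is the standard one: with discount factor γ (the discount parameter used below), the value of a policy π in state s is

V^\pi(s) = E\left[ \sum_{t=0}^{\infty} \gamma^t \, r(s_t, \pi(s_t)) \mid s_0 = s \right]

and an optimal policy maximizes this value in every state.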

Features

  • Available on PyPI.
  • Solver engine developed in C++.
  • Optimization algorithms: value iteration, policy iteration, and modified policy iteration (a plain-Python sketch of value iteration follows this list).
  • Several input formats for the transition probabilities, including sparse representations.
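
For orientation, value iteration repeatedly applies the Bellman optimality update until the value function stops changing, and the optimal policy is then read off greedily. The sketch below is plain Python operating on lists shaped like the rewards and tranMatWithZeros inputs from the quick start guide; it is only an illustration, not mdpsolver's C++ implementation.

#Textbook value iteration (illustration only, not the mdpsolver engine)
#rewards[s][a]: immediate reward, tranMat[s][a][s2]: transition probability
def value_iteration(rewards, tranMat, discount, tol=1e-9):
    n_states = len(rewards)
    V = [0.0] * n_states
    while True:
        #Bellman optimality update: best action value in each state
        V_new = [
            max(
                rewards[s][a]
                + discount * sum(p * V[s2] for s2, p in enumerate(tranMat[s][a]))
                for a in range(len(rewards[s]))
            )
            for s in range(n_states)
        ]
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new
        V = V_new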

Quick start guide

The following shows how to get started quickly with mdpsolver.

Installation

Download and install mdpsolver directly from PyPI.

pip install mdpsolver

Usage

Start by specifying the reward function and the transition probabilities as lists. The following example is a simple MDP with three states and two actions in each state.

#Import packages
import mdpsolver

#Rewards (3 states x 2 actions)
#e.g. choosing second action in first state gives reward=-1
rewards = [[5,-1],
           [1,-2],
           [50,0]]

#Transition probabilities (3 from_states x 2 actions x 3 to_states)
#e.g. choosing first action in third state gives a probability of 0.6 of staying in third state
tranMatWithZeros = [[[0.9,0.1,0.0],[0.1,0.9,0.0]],
                    [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                    [[0.2,0.2,0.6],[0.5,0.5,0.0]]]
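
Each row of the transition matrix (one per state-action pair) must be a probability distribution over the next state, so its entries should sum to 1. A quick plain-Python sanity check on the list above:

#Check that every (state, action) row sums to one
for s, actions in enumerate(tranMatWithZeros):
    for a, row in enumerate(actions):
        assert abs(sum(row) - 1.0) < 1e-9, f"state {s}, action {a}: row sums to {sum(row)}"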

Now, create the model object and insert the problem parameters.

#Create model object
mdl = mdpsolver.model()

#Insert the problem parameters
mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatWithZeros=tranMatWithZeros)

We can now optimize the policy.

mdl.solve()

The optimized policy can be returned in a variety of ways. Here, we return the policy as a list and print it directly in the terminal.

print(mdl.getPolicy())
#[1, 1, 0]
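
Entry i of the returned list is the index of the action chosen in state i, so the policy above picks the second action in states 0 and 1 and the first action in state 2:

#Entry i of the policy is the action index chosen in state i
for state, action in enumerate(mdl.getPolicy()):
    print(f"state {state} -> action {action}")
#state 0 -> action 1
#state 1 -> action 1
#state 2 -> action 0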

Large transition matrix?

mdpsolver has three alternative formats for large and highly sparse transition probability matrices.

(1) Elementwise representation (excluding elements containing zeros):

#[from_state,action,to_state,probability]
tranMatElementwise = [[0,0,0,0.9],
                      [0,0,1,0.1],
                      [0,1,0,0.1],
                      [0,1,1,0.9],
                      [1,0,0,0.4],
                      [1,0,1,0.5],
                      [1,0,2,0.1],
                      [1,1,0,0.3],
                      [1,1,1,0.5],
                      [1,1,2,0.2],
                      [2,0,0,0.2],
                      [2,0,1,0.2],
                      [2,0,2,0.6],
                      [2,1,0,0.5],
                      [2,1,1,0.5]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatElementwise=tranMatElementwise)
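
In practice, the elementwise list would typically be generated rather than written by hand. For example, a small helper (plain Python, not part of the mdpsolver API) can build it from the dense matrix used earlier:

#Build [from_state,action,to_state,probability] rows, skipping zero entries
def to_elementwise(tranMat):
    elements = []
    for s, actions in enumerate(tranMat):
        for a, row in enumerate(actions):
            for s2, p in enumerate(row):
                if p != 0.0:
                    elements.append([s, a, s2, p])
    return elements

tranMatElementwise = to_elementwise(tranMatWithZeros)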

(2) Probabilities and column (to_state) indices in separate lists:

tranMatProbs = [[[0.9,0.1],[0.1,0.9]],
                [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                [[0.2,0.2,0.6],[0.5,0.5]]]

tranMatColumns = [[[0,1],[0,1]],
                  [[0,1,2],[0,1,2]],
                  [[0,1,2],[0,1]]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatProbs=tranMatProbs,
        tranMatColumns=tranMatColumns)
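
These two lists can likewise be derived from the dense matrix, again as a plain-Python sketch:

#Split the dense matrix into non-zero probabilities and their to_state indices
def to_probs_and_columns(tranMat):
    probs = [[[p for p in row if p != 0.0] for row in actions]
             for actions in tranMat]
    columns = [[[s2 for s2, p in enumerate(row) if p != 0.0] for row in actions]
               for actions in tranMat]
    return probs, columns

tranMatProbs, tranMatColumns = to_probs_and_columns(tranMatWithZeros)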

(3) Load the elementwise representation from a file:

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatFromFile="transitions.csv")
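
The exact file format expected by tranMatFromFile is described in the wiki; assuming it mirrors the elementwise representation above (one from_state,action,to_state,probability row per line), such a file could be written like this:

import csv

#Assumed layout: one from_state,action,to_state,probability row per line,
#mirroring the elementwise representation above (see the wiki for details)
with open("transitions.csv", "w", newline="") as f:
    csv.writer(f).writerows(tranMatElementwise)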

User manual

Further information can be found in the mdpsolver wiki on GitHub.

How to cite

A DOI badge for citing mdpsolver is provided on the project page.

