
A fast solver for Markov Decision Processes


MDPSolver


MDPSolver is a Python package for solving large infinite-horizon Markov Decision Processes (MDPs).

Features

  • Fast solver: Our C++-based solver is substantially faster than other MDP packages available for Python. See details in the documentation.
  • Two optimality criteria: Discounted and Average reward.
  • Three optimization algorithms: Value iteration, Policy iteration, and Modified policy iteration.
  • Three value-update methods: Standard, Gauss–Seidel, and Successive over-relaxation.
  • Supports sparse matrices.
  • Employs parallel computing.

Installation

Linux

Install directly from PyPI with:

pip install mdpsolver

MDPSolver works out of the box on Linux.

Windows

Requires Visual Studio 2022 (17.9) with the MSVC C++ compiler and libraries installed.

Once Visual Studio and the MSVC components are in place, install directly from PyPI with:

pip install mdpsolver

Quick start guide

The following shows how to get started quickly with mdpsolver.

Usage

Start by specifying the reward function and transition probabilities as lists. The following is an example of a simple MDP containing three states and two actions in each state.

#Import packages
import mdpsolver

#Rewards (3 states x 2 actions)
#e.g. choosing second action in first state gives reward=-1
rewards = [[5,-1],
           [1,-2],
           [50,0]]

#Transition probabilities (3 from_states x 2 actions x 3 to_states)
#e.g. choosing first action in third state gives a probability of 0.6 of staying in third state
tranMatWithZeros = [[[0.9,0.1,0.0],[0.1,0.9,0.0]],
                    [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                    [[0.2,0.2,0.6],[0.5,0.5,0.0]]]
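
Before handing the lists to the solver, it can help to confirm that they are well formed. The following is a minimal sketch in plain Python; `check_mdp` is a hypothetical helper written for this guide and is not part of mdpsolver. It assumes the dense format shown above, where every action has one probability per to_state.

```python
import math

# Same example data as above (3 states x 2 actions).
rewards = [[5, -1],
           [1, -2],
           [50, 0]]
tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def check_mdp(rewards, tran_mat, tol=1e-9):
    """Hypothetical helper (not part of mdpsolver): raise ValueError on malformed input."""
    n_states = len(tran_mat)
    if len(rewards) != n_states:
        raise ValueError("rewards and transitions disagree on the number of states")
    for s, (reward_row, action_rows) in enumerate(zip(rewards, tran_mat)):
        if len(reward_row) != len(action_rows):
            raise ValueError(f"state {s}: number of rewards and actions differ")
        for a, probs in enumerate(action_rows):
            if len(probs) != n_states:
                raise ValueError(f"state {s}, action {a}: expected {n_states} probabilities")
            if any(p < 0 for p in probs):
                raise ValueError(f"state {s}, action {a}: negative probability")
            # Each (from_state, action) row must be a probability distribution.
            if not math.isclose(sum(probs), 1.0, abs_tol=tol):
                raise ValueError(f"state {s}, action {a}: probabilities do not sum to 1")
    return True


check_mdp(rewards, tranMatWithZeros)
```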

Now, create the model object and insert the problem parameters.

#Create model object
mdl = mdpsolver.model()

#Insert the problem parameters
mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatWithZeros=tranMatWithZeros)

We can now optimize the policy.

mdl.solve()

The optimized policy can be returned in a variety of ways. Here, we return the policy as a list and print it directly to the terminal.

print(mdl.getPolicy())
#[1, 1, 0]
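
To see where this policy comes from, the example is small enough to solve with a few lines of plain Python. The sketch below is an independent, textbook-style value iteration with the standard update. It is not the mdpsolver implementation, and the function name, tolerance, and iteration cap are our own choices for illustration.

```python
rewards = [[5, -1],
           [1, -2],
           [50, 0]]
tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def value_iteration(rewards, tran_mat, discount, tol=1e-10, max_iter=10_000):
    """Illustrative value iteration (standard update); not mdpsolver's code."""
    n_states = len(rewards)

    def action_values(s, values):
        # Expected discounted return of each action in state s.
        return [r + discount * sum(p * v for p, v in zip(probs, values))
                for r, probs in zip(rewards[s], tran_mat[s])]

    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = [max(action_values(s, values)) for s in range(n_states)]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = []
    for s in range(n_states):
        q = action_values(s, values)
        policy.append(q.index(max(q)))
    return policy, values


policy, values = value_iteration(rewards, tranMatWithZeros, discount=0.8)
print(policy)
#[1, 1, 0]
```

The third state pays a reward of 50 for the first action, so the greedy policy steers the first two states toward it even though their immediate rewards are negative.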

Sparse transition matrix?

mdpsolver has three alternative formats for large and highly sparse transition probability matrices.

(1) Elementwise representation (excluding elements containing zeros):

#[from_state,action,to_state,probability]
tranMatElementwise = [[0,0,0,0.9],
                      [0,0,1,0.1],
                      [0,1,0,0.1],
                      [0,1,1,0.9],
                      [1,0,0,0.4],
                      [1,0,1,0.5],
                      [1,0,2,0.1],
                      [1,1,0,0.3],
                      [1,1,1,0.5],
                      [1,1,2,0.2],
                      [2,0,0,0.2],
                      [2,0,1,0.2],
                      [2,0,2,0.6],
                      [2,1,0,0.5],
                      [2,1,1,0.5]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatElementwise=tranMatElementwise)
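
If the transition probabilities already exist in the dense format, the elementwise list can be derived rather than typed by hand. The following is a small sketch; `dense_to_elementwise` is a hypothetical helper of our own and is not provided by mdpsolver.

```python
tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def dense_to_elementwise(tran_mat):
    """Hypothetical helper: list [from_state, action, to_state, probability], skipping zeros."""
    return [[s, a, t, p]
            for s, action_rows in enumerate(tran_mat)
            for a, probs in enumerate(action_rows)
            for t, p in enumerate(probs)
            if p != 0.0]


tranMatElementwise = dense_to_elementwise(tranMatWithZeros)
print(len(tranMatElementwise))
#15
```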

(2) Probabilities and column (to_state) indices in separate lists:

tranMatProbs = [[[0.9,0.1],[0.1,0.9]],
                [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                [[0.2,0.2,0.6],[0.5,0.5]]]

tranMatColumns = [[[0,1],[0,1]],
                  [[0,1,2],[0,1,2]],
                  [[0,1,2],[0,1]]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatProbs=tranMatProbs,
        tranMatColumns=tranMatColumns)
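
The two lists must stay aligned: the k-th probability of a (from_state, action) pair belongs to the k-th column index of the same pair. One way to guarantee this is to build both from the dense matrix in a single pass. The sketch below does that; `dense_to_probs_and_columns` is a hypothetical helper of our own, not an mdpsolver function.

```python
tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def dense_to_probs_and_columns(tran_mat):
    """Hypothetical helper: split a dense matrix into non-zero probabilities and their to_state indices."""
    probs_out, columns_out = [], []
    for action_rows in tran_mat:
        state_probs, state_columns = [], []
        for probs in action_rows:
            # Keep only the non-zero entries and remember which to_state each belongs to.
            nonzero = [(t, p) for t, p in enumerate(probs) if p != 0.0]
            state_columns.append([t for t, _ in nonzero])
            state_probs.append([p for _, p in nonzero])
        probs_out.append(state_probs)
        columns_out.append(state_columns)
    return probs_out, columns_out


tranMatProbs, tranMatColumns = dense_to_probs_and_columns(tranMatWithZeros)
```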

(3) Load the elementwise representation from a file:

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatFromFile="transitions.csv")
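
The exact file layout that mdpsolver expects is described in the documentation and is not restated here. As a rough sketch only, the code below assumes one comma-separated `from_state,action,to_state,probability` row per line and no header row; verify that assumption against the wiki before relying on it. Both functions are hypothetical helpers built on Python's standard `csv` module.

```python
import csv


def write_elementwise_csv(path, rows):
    """Hypothetical helper: write [from_state, action, to_state, probability] rows.

    ASSUMPTION: comma-separated values, one transition per line, no header.
    """
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


def read_elementwise_csv(path):
    """Read the rows back, e.g. to sanity-check the file before solving."""
    with open(path, newline="") as handle:
        return [[int(s), int(a), int(t), float(p)]
                for s, a, t, p in csv.reader(handle)]


# Example usage (writes a file in the current directory):
# write_elementwise_csv("transitions.csv", tranMatElementwise)
```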

Documentation

The documentation can be found in the wiki for MDPSolver (https://github.com/areenberg/MDPSolver/wiki).

How to cite

DOI

