A fast solver for Markov Decision Processes

MDPSolver

MDPSolver is a Python package for solving large Markov Decision Processes (MDPs) with infinite horizons.

Features

  • Fast solver: Our C++-based solver is substantially faster than other MDP packages available for Python. See details in the documentation.
  • Two optimality criteria: Discounted reward and average reward.
  • Three optimization algorithms: Value iteration, Policy iteration, and Modified policy iteration.
  • Three value-update methods: Standard, Gauss–Seidel, and Successive over-relaxation.
  • Supports sparse matrices.
  • Employs parallel computing.
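
To make the three value-update methods concrete, here is a minimal pure-Python sketch. It is not MDPSolver's C++ implementation: the small MDP, the function names (`q_value`, `sweep`, `solve`) and the relaxation factor `omega` are illustrative assumptions. The standard update reads only values from the previous sweep, Gauss–Seidel reuses values already updated in the current sweep, and successive over-relaxation blends the Gauss–Seidel update with the old value.

```python
# Illustrative sketch only; names and data are assumptions, not MDPSolver's API.
REWARDS = [[5, -1], [1, -2], [50, 0]]
TRANSITIONS = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
               [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
               [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]
DISCOUNT = 0.8


def q_value(state, action, values):
    """Immediate reward plus discounted expected value of the next state."""
    expected = sum(p * values[j] for j, p in enumerate(TRANSITIONS[state][action]))
    return REWARDS[state][action] + DISCOUNT * expected


def sweep(values, method="standard", omega=1.0):
    """Run one sweep over all states and return the new value list."""
    new = list(values)
    for s in range(len(values)):
        # Standard reads last sweep's values; the others read in-place updates.
        source = values if method == "standard" else new
        best = max(q_value(s, a, source) for a in range(len(REWARDS[s])))
        if method == "sor":
            # Blend the Gauss-Seidel update with the old value of this state.
            best = (1.0 - omega) * new[s] + omega * best
        new[s] = best
    return new


def solve(method="standard", omega=1.0, tol=1e-8, max_sweeps=10_000):
    """Repeat sweeps until the largest change falls below tol."""
    values = [0.0] * len(REWARDS)
    for sweeps in range(1, max_sweeps + 1):
        new = sweep(values, method, omega)
        if max(abs(x - y) for x, y in zip(new, values)) < tol:
            return new, sweeps
        values = new
    raise RuntimeError("no convergence within max_sweeps")


for name, omega in [("standard", 1.0), ("gauss-seidel", 1.0), ("sor", 1.1)]:
    values, sweeps = solve(name, omega)
    print(name, sweeps, [round(v, 4) for v in values])
```

All three variants converge to the same values here; they differ only in how many sweeps they need.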

Installation

Linux

Install directly from PyPI with:

pip install mdpsolver

MDPSolver works out of the box on Ubuntu 22 and newer.

GLIBC not found

Some users may encounter the error version 'GLIBC_2.32' not found when importing MDPSolver in Python. In this case, it may help to manually compile the optimization module and replace its shared-object (.so) file in the MDPSolver package. The steps are described in the documentation.
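
Before recompiling, it can be useful to check which glibc version the Python interpreter is linked against. The following is a small sketch using only the standard library. The helper names are hypothetical, and the required version (2.32) is taken from the error message above rather than from the package's documentation.

```python
import platform

REQUIRED_GLIBC = (2, 32)  # version named in the error message (assumption)


def parse_version(text):
    """Turn a string such as '2.35' into a tuple of ints, e.g. (2, 35)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_is_new_enough(required=REQUIRED_GLIBC):
    """Return True/False on glibc systems, or None if glibc is not detected."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return None
    return parse_version(version) >= required


print(platform.libc_ver(), glibc_is_new_enough())
```

A result of False suggests the system's glibc is older than the one the prebuilt module expects.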

Windows

Requires Visual Studio 2022 (17.9) with the MSVC C++ compiler and libraries installed.

After installing Visual Studio (including the MSVC C++ compiler and libraries), install directly from PyPI with:

pip install mdpsolver

Quick start guide

The following shows how to get started quickly with mdpsolver.

Usage

Start by specifying the reward function and transition probabilities as lists. The following is an example of a simple MDP containing three states and two actions in each state.

#Import packages
import mdpsolver

#Rewards (3 states x 2 actions)
#e.g. choosing second action in first state gives reward=-1
rewards = [[5,-1],
           [1,-2],
           [50,0]]

#Transition probabilities (3 from_states x 2 actions x 3 to_states)
#e.g. choosing first action in third state gives a probability of 0.6 of staying in third state
tranMatWithZeros = [[[0.9,0.1,0.0],[0.1,0.9,0.0]],
                    [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                    [[0.2,0.2,0.6],[0.5,0.5,0.0]]]
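
Malformed inputs are easier to catch before they reach the solver. The sketch below checks that the two lists agree on the number of states and actions and that every probability row is non-negative and sums to one. `check_mdp` is a hypothetical helper written for this example, not part of mdpsolver.

```python
import math

rewards = [[5, -1],
           [1, -2],
           [50, 0]]

tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def check_mdp(rewards, tran_mat, tol=1e-9):
    """Raise ValueError if shapes disagree or a probability row is invalid."""
    n_states = len(rewards)
    if len(tran_mat) != n_states:
        raise ValueError("rewards and transitions disagree on the number of states")
    for s, (reward_row, action_rows) in enumerate(zip(rewards, tran_mat)):
        if len(reward_row) != len(action_rows):
            raise ValueError(f"state {s}: number of actions differs")
        for a, probs in enumerate(action_rows):
            if len(probs) != n_states:
                raise ValueError(f"state {s}, action {a}: expected {n_states} entries")
            if any(p < 0 for p in probs):
                raise ValueError(f"state {s}, action {a}: negative probability")
            if not math.isclose(sum(probs), 1.0, abs_tol=tol):
                raise ValueError(f"state {s}, action {a}: row does not sum to 1")


check_mdp(rewards, tranMatWithZeros)  # passes silently for the example above
```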

Now, create the model object and insert the problem parameters.

#Create model object
mdl = mdpsolver.model()

#Insert the problem parameters
mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatWithZeros=tranMatWithZeros)

We can now optimize the policy.

mdl.solve()

The optimized policy can be returned in a variety of ways. Here, we return the policy as a list and print it directly to the terminal. Each entry is the zero-based index of the action chosen in the corresponding state.

print(mdl.getPolicy())
#[1, 1, 0]
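
As a sanity check, the same policy can be reproduced without the package. The following is a plain-Python sketch of modified policy iteration, written for this example and not taken from MDPSolver's source. The function names and the number of evaluation sweeps are assumptions. Starting from a lower bound on the optimal values keeps the iterates increasing toward the solution.

```python
# Plain-Python cross-check; not the algorithm code used inside MDPSolver.
rewards = [[5, -1], [1, -2], [50, 0]]
tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]
discount = 0.8


def q_value(s, a, v):
    """Reward of action a in state s plus discounted expected next value."""
    return rewards[s][a] + discount * sum(
        p * v[j] for j, p in enumerate(tranMatWithZeros[s][a]))


def greedy(v):
    """Best action (lowest index on ties) in every state."""
    return [max(range(len(rewards[s])), key=lambda a: q_value(s, a, v))
            for s in range(len(rewards))]


def modified_policy_iteration(eval_sweeps=5, tol=1e-8):
    n = len(rewards)
    # Lower bound on the optimal values: worst reward received forever.
    lowest = min(min(row) for row in rewards) / (1.0 - discount)
    v = [lowest] * n
    while True:
        policy = greedy(v)                                       # improvement
        improved = [q_value(s, policy[s], v) for s in range(n)]
        if max(abs(x - y) for x, y in zip(improved, v)) < tol:
            return policy, improved
        v = improved
        for _ in range(eval_sweeps):                             # partial evaluation
            v = [q_value(s, policy[s], v) for s in range(n)]


policy, values = modified_policy_iteration()
print(policy)
# [1, 1, 0]
```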

Sparse transition matrix?

mdpsolver has three alternative formats for large and highly sparse transition probability matrices.

(1) Elementwise representation (excluding elements containing zeros):

#[from_state,action,to_state,probability]
tranMatElementwise = [[0,0,0,0.9],
                      [0,0,1,0.1],
                      [0,1,0,0.1],
                      [0,1,1,0.9],
                      [1,0,0,0.4],
                      [1,0,1,0.5],
                      [1,0,2,0.1],
                      [1,1,0,0.3],
                      [1,1,1,0.5],
                      [1,1,2,0.2],
                      [2,0,0,0.2],
                      [2,0,1,0.2],
                      [2,0,2,0.6],
                      [2,1,0,0.5],
                      [2,1,1,0.5]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatElementwise=tranMatElementwise)
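
If the transition probabilities already exist as a dense nested list, the elementwise form can be derived from it. The helper below is a hypothetical convenience function for this example, not part of mdpsolver. It walks the dense list and keeps only the non-zero entries.

```python
tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def to_elementwise(dense):
    """Return [from_state, action, to_state, probability] rows for non-zero entries."""
    rows = []
    for s, action_rows in enumerate(dense):
        for a, probs in enumerate(action_rows):
            for j, p in enumerate(probs):
                if p != 0:
                    rows.append([s, a, j, p])
    return rows


tranMatElementwise = to_elementwise(tranMatWithZeros)
print(len(tranMatElementwise))
# 15
```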

(2) Probabilities and column (to_state) indices in separate lists:

tranMatProbs = [[[0.9,0.1],[0.1,0.9]],
                [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                [[0.2,0.2,0.6],[0.5,0.5]]]

tranMatColumns = [[[0,1],[0,1]],
                  [[0,1,2],[0,1,2]],
                  [[0,1,2],[0,1]]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatProbs=tranMatProbs,
        tranMatColumns=tranMatColumns)
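
The two lists must stay aligned: the k-th probability of a state-action pair belongs to the k-th column index. One way to guarantee this is to build both from the same dense list, as in this sketch. `to_probs_and_columns` is a hypothetical helper, not part of mdpsolver.

```python
tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def to_probs_and_columns(dense):
    """Split a dense list into non-zero probabilities and their to_state indices."""
    probs_out, cols_out = [], []
    for action_rows in dense:
        probs_out.append([[p for p in probs if p != 0] for probs in action_rows])
        cols_out.append([[j for j, p in enumerate(probs) if p != 0]
                         for probs in action_rows])
    return probs_out, cols_out


tranMatProbs, tranMatColumns = to_probs_and_columns(tranMatWithZeros)
print(tranMatColumns[2])
# [[0, 1, 2], [0, 1]]
```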

(3) Load the elementwise representation from a file:

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatFromFile="transitions.csv")
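
A file for this option could be produced with the standard csv module, as sketched below. The layout is an assumption, not something stated above: one comma-separated from_state,action,to_state,probability row per line, with no header. Check the documentation for the exact format mdpsolver expects. The file is written to a temporary directory here so the sketch has no side effects.

```python
import csv
import os
import tempfile


def write_elementwise_csv(rows, path):
    """Write [from_state, action, to_state, probability] rows, one per line."""
    # Assumed layout: comma-separated, no header row.
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


# First few rows of the elementwise example, for illustration.
rows = [[0, 0, 0, 0.9],
        [0, 0, 1, 0.1],
        [0, 1, 0, 0.1],
        [0, 1, 1, 0.9]]

path = os.path.join(tempfile.mkdtemp(), "transitions.csv")
write_elementwise_csv(rows, path)

with open(path, newline="") as handle:
    print(handle.readline().strip())
# 0,0,0,0.9
```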

Documentation

The documentation can be found in the wiki for MDPSolver.

How to cite

DOI
