Skip to main content

A fast solver for Markov Decision Processes

Project description

MDPSolver

Logo

MDPSolver is a Python package for large Markov Decision Processes (MDPs) with infinite-horizons.

Features

  • Fast solver: Our C++-based solver is substantially faster than other MDP packages available for Python. See details in the documentation.
  • Two optimality criteria: Discounted and Average reward.
  • Three optimization algorithms: Value iteration, Policy iteration, and Modified policy iteration.
  • Three value-update methods: Standard, Gauss–Seidel, and Successive over-relaxation.
  • Supports sparse matrices.
  • Employs parallel computing.

Installation

Linux

Install directly from PyPI with:

pip install mdpsolver

MDPSolver works out of the box on Ubuntu 22 and newer.

GLIBC not found

Some users will encounter the version 'GLIBC_2.32' not found error when attempting to import MDPSolver in Python. In this case, it might help to manually compile and replace the SO-file for the optimization module in the MDPSolver package. See the steps on how to solve the issue in the documentation.

Windows

Requires Visual Studio 2022 (17.9) with MSVC C++ compiler and libraries installed.

After installing Visual Studio (incl. MSVC C++ compiler and libraries), install directly from PyPI with:

pip install mdpsolver

Quick start guide

The following shows how to get quickly started with mdpsolver.

Usage

Start by specifying the reward function and transition probabilities as lists. The following is an example of a simple MDP containing three states and two actions in each state.

#Import packages
import mdpsolver

#Rewards (3 states x 2 actions)
#e.g. choosing second action in first state gives reward=-1
rewards = [[5,-1],
           [1,-2],
           [50,0]]

#Transition probabilities (3 from_states x 2 actions x 3 to_states)
#e.g. choosing first action in third state gives a probability of 0.6 of staying in third state
tranMatWithZeros = [[[0.9,0.1,0.0],[0.1,0.9,0.0]],
                    [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                    [[0.2,0.2,0.6],[0.5,0.5,0.0]]]

Now, create the model object and insert the problem parameters.

#Create model object
mdl = mdpsolver.model()

#Insert the problem parameters
mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatWithZeros=tranMatWithZeros)

We can now optimize the policy.

mdl.solve()

The optimized policy can be returned in a variety of ways. Here, we return the policy as a list and print directly in the terminal.

print(mdl.getPolicy())
#[1, 1, 0]

Sparse transition matrix?

mdpsolver has three alternative formats for large and highly sparse transition probability matrices.

(1) Elementwise representation (excluding elements containing zeros):

#[from_state,action,to_state,probability]
tranMatElementwise = [[0,0,0,0.9],
                      [0,0,1,0.1],
                      [0,1,0,0.1],
                      [0,1,1,0.9],
                      [1,0,0,0.4],
                      [1,0,1,0.5],
                      [1,0,2,0.1],
                      [1,1,0,0.3],
                      [1,1,1,0.5],
                      [1,1,2,0.2],
                      [2,0,0,0.2],
                      [2,0,1,0.2],
                      [2,0,2,0.6],
                      [2,1,0,0.5],
                      [2,1,1,0.5]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatElementwise=tranMatElementwise)

(2) Probabilities and column (to_state) indices in separate lists:

tranMatProbs = [[[0.9,0.1],[0.1,0.9]],
                [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                [[0.2,0.2,0.6],[0.5,0.5]]]

tranMatColumns = [[[0,1],[0,1]],
                [[0,1,2],[0,1,2]],
                [[0,1,2],[0,1]]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatProbs=tranMatProbs,
        tranMatColumns=tranMatColumns)

(3) Load the elementwise representation from a file:

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatFromFile="transitions.csv")

Documentation

The documentation can be found in the wiki for MDPSolver.

How to cite

DOI

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mdpsolver-0.9.8.tar.gz (1.2 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mdpsolver-0.9.8-py3-none-any.whl (1.2 MB view details)

Uploaded Python 3

File details

Details for the file mdpsolver-0.9.8.tar.gz.

File metadata

  • Download URL: mdpsolver-0.9.8.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for mdpsolver-0.9.8.tar.gz
Algorithm Hash digest
SHA256 cd859ec7a4fc41cc30c47cda55fd3850b36985761c7f122143d6b16d65488001
MD5 549dd122f4528fe2e29122b6b8c78bbc
BLAKE2b-256 3e31911a39706706e32c9d368bdd2a3d6b392aeb927f0ae1d09159aaedfc70eb

See more details on using hashes here.

File details

Details for the file mdpsolver-0.9.8-py3-none-any.whl.

File metadata

  • Download URL: mdpsolver-0.9.8-py3-none-any.whl
  • Upload date:
  • Size: 1.2 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for mdpsolver-0.9.8-py3-none-any.whl
Algorithm Hash digest
SHA256 a0553a1b9d41908ba886d4666fb350d272015070ba9919b0afa2b446650e9ae3
MD5 93257a6d509917021020541ee71c4689
BLAKE2b-256 c8f501891a20dd89193e5697b1475691b2bb3aac86ddf3fb700e4e3cae3814c1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page