A fast solver for Markov Decision Processes

Development Status
- 3 - Alpha
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

MDPSolver

MDPSolver is a Python package for solving large Markov Decision Processes (MDPs) with infinite horizons.

Features

- Fast solver: Our C++-based solver is substantially faster than other MDP packages available for Python. See details in the documentation.
- Two optimality criteria: Discounted and Average reward.
- Three optimization algorithms: Value iteration, Policy iteration, and Modified policy iteration.
- Three value-update methods: Standard, Gauss–Seidel, and Successive over-relaxation.
- Supports sparse matrices.
- Employs parallel computing.
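
To illustrate how two of the value-update methods listed above differ, the following is a minimal pure-Python sketch of a standard (synchronous) sweep and a Gauss–Seidel (in-place) sweep. It is not MDPSolver's C++ implementation; the function names and the two-state MDP are hypothetical and chosen only for illustration.

```python
def standard_sweep(values, rewards, trans, discount):
    """One synchronous sweep: every state is updated from the old value vector."""
    return [
        max(
            r + discount * sum(p * v for p, v in zip(row, values))
            for r, row in zip(rewards[s], trans[s])
        )
        for s in range(len(values))
    ]


def gauss_seidel_sweep(values, rewards, trans, discount):
    """One in-place sweep: later states see values already updated in this sweep."""
    values = list(values)
    for s in range(len(values)):
        values[s] = max(
            r + discount * sum(p * v for p, v in zip(row, values))
            for r, row in zip(rewards[s], trans[s])
        )
    return values


# Hypothetical two-state, two-action MDP (illustration only)
demo_rewards = [[1.0, 0.0], [0.0, 2.0]]
demo_trans = [[[0.5, 0.5], [0.0, 1.0]],
              [[1.0, 0.0], [0.5, 0.5]]]


def solve(sweep, discount=0.9, tol=1e-10):
    """Repeat sweeps until the largest change in any state value drops below tol."""
    values, sweeps = [0.0, 0.0], 0
    while True:
        new_values = sweep(values, demo_rewards, demo_trans, discount)
        sweeps += 1
        if max(abs(n - o) for n, o in zip(new_values, values)) < tol:
            return new_values, sweeps
        values = new_values


std_values, std_sweeps = solve(standard_sweep)
gs_values, gs_sweeps = solve(gauss_seidel_sweep)
print(std_values, std_sweeps)
print(gs_values, gs_sweeps)
```

Both sweeps converge to the same value function; they differ only in whether a state's update can use values refreshed earlier in the same sweep.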

Installation

Linux

Install directly from PyPI with:

pip install mdpsolver

MDPSolver works out of the box on Ubuntu 22 and newer.

GLIBC not found

Some users encounter the error "version 'GLIBC_2.32' not found" when importing MDPSolver in Python. In this case, it may help to manually compile the SO-file for the optimization module and replace the copy shipped in the MDPSolver package. The documentation describes the steps.
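
To see whether a system is likely to hit this error, the C library version that Python was linked against can be compared with 2.32. The sketch below uses only the standard library; the helper names are hypothetical, and the check returns None when glibc is not detected (for example on Windows or on musl-based distributions).

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.35' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_at_least(required=(2, 32)):
    """Return True/False on glibc systems, or None when glibc is not detected."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return None
    return parse_version(version) >= required


print(platform.libc_ver(), glibc_at_least())
```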

Windows

MDPSolver on Windows requires Visual Studio 2022 (17.9) with the MSVC C++ compiler and libraries installed.

Once Visual Studio is set up, install directly from PyPI with:

pip install mdpsolver

Quick start guide

The following shows how to get started quickly with mdpsolver.

Usage

Start by specifying the reward function and transition probabilities as lists. The following is an example of a simple MDP containing three states and two actions in each state.

#Import packages
import mdpsolver

#Rewards (3 states x 2 actions)
#e.g. choosing second action in first state gives reward=-1
rewards = [[5,-1],
           [1,-2],
           [50,0]]

#Transition probabilities (3 from_states x 2 actions x 3 to_states)
#e.g. choosing first action in third state gives a probability of 0.6 of staying in third state
tranMatWithZeros = [[[0.9,0.1,0.0],[0.1,0.9,0.0]],
                    [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                    [[0.2,0.2,0.6],[0.5,0.5,0.0]]]
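
Before passing these lists to the solver, it can help to confirm that the shapes agree and that every probability row sums to one. The helper below is a hypothetical sanity check written in plain Python; it is not part of the mdpsolver API.

```python
import math

rewards = [[5, -1],
           [1, -2],
           [50, 0]]

tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def find_problems(rewards, tran):
    """Return a list of problems found; an empty list means the input looks consistent."""
    n_states = len(rewards)
    if len(tran) != n_states:
        return ["rewards and transitions disagree on the number of states"]
    problems = []
    for s, actions in enumerate(tran):
        if len(actions) != len(rewards[s]):
            problems.append(f"state {s}: action count differs from rewards")
        for a, row in enumerate(actions):
            if len(row) != n_states:
                problems.append(f"state {s}, action {a}: expected {n_states} to_states")
            if any(p < 0 for p in row):
                problems.append(f"state {s}, action {a}: negative probability")
            if not math.isclose(sum(row), 1.0, abs_tol=1e-9):
                problems.append(f"state {s}, action {a}: row sums to {sum(row)}")
    return problems


print(find_problems(rewards, tranMatWithZeros))
```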

Now, create the model object and insert the problem parameters.

#Create model object
mdl = mdpsolver.model()

#Insert the problem parameters
mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatWithZeros=tranMatWithZeros)

We can now optimize the policy.

mdl.solve()

The optimized policy can be returned in a variety of ways. Here, we return the policy as a list and print it directly to the terminal.

print(mdl.getPolicy())
#[1, 1, 0]
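
As a cross-check on the result above, the same three-state MDP can be solved with a few lines of textbook value iteration in plain Python. This is only a sketch for verifying small examples by hand; the function name is hypothetical, and it says nothing about how MDPSolver's C++ solver is implemented.

```python
rewards = [[5, -1],
           [1, -2],
           [50, 0]]

tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def value_iteration(rewards, tran, discount, tol=1e-9):
    """Return (policy, values) for a discounted MDP given as nested lists."""
    n_states = len(rewards)
    values = [0.0] * n_states
    while True:
        # Action values for every state under the current value estimate
        q = [
            [
                rewards[s][a] + discount * sum(p * values[t] for t, p in enumerate(tran[s][a]))
                for a in range(len(rewards[s]))
            ]
            for s in range(n_states)
        ]
        new_values = [max(row) for row in q]
        if max(abs(n - o) for n, o in zip(new_values, values)) < tol:
            # Greedy policy: index of the best action in each state
            return [row.index(max(row)) for row in q], new_values
        values = new_values


policy, values = value_iteration(rewards, tranMatWithZeros, discount=0.8)
print(policy)
```

The greedy policy returned by this sketch matches the [1, 1, 0] printed by mdl.getPolicy() above.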

Sparse transition matrix?

mdpsolver has three alternative formats for large and highly sparse transition probability matrices.

(1) Elementwise representation (excluding elements containing zeros):

#[from_state,action,to_state,probability]
tranMatElementwise = [[0,0,0,0.9],
                      [0,0,1,0.1],
                      [0,1,0,0.1],
                      [0,1,1,0.9],
                      [1,0,0,0.4],
                      [1,0,1,0.5],
                      [1,0,2,0.1],
                      [1,1,0,0.3],
                      [1,1,1,0.5],
                      [1,1,2,0.2],
                      [2,0,0,0.2],
                      [2,0,1,0.2],
                      [2,0,2,0.6],
                      [2,1,0,0.5],
                      [2,1,1,0.5]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatElementwise=tranMatElementwise)
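
If the transition probabilities already exist as a dense nested list, the elementwise representation can be derived rather than typed by hand. The helper below is a hypothetical plain-Python sketch, not an mdpsolver function.

```python
tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def dense_to_elementwise(tran):
    """List [from_state, action, to_state, probability] for every nonzero entry."""
    return [
        [s, a, t, p]
        for s, actions in enumerate(tran)
        for a, row in enumerate(actions)
        for t, p in enumerate(row)
        if p != 0
    ]


tranMatElementwise = dense_to_elementwise(tranMatWithZeros)
print(len(tranMatElementwise))
```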

(2) Probabilities and column (to_state) indices in separate lists:

tranMatProbs = [[[0.9,0.1],[0.1,0.9]],
                [[0.4,0.5,0.1],[0.3,0.5,0.2]],
                [[0.2,0.2,0.6],[0.5,0.5]]]

tranMatColumns = [[[0,1],[0,1]],
                  [[0,1,2],[0,1,2]],
                  [[0,1,2],[0,1]]]

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatProbs=tranMatProbs,
        tranMatColumns=tranMatColumns)
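
The two lists in this format line up position by position: each probability in tranMatProbs belongs to the to_state index at the same position in tranMatColumns. A hypothetical plain-Python helper (not part of mdpsolver) that derives both from a dense matrix makes the pairing explicit.

```python
tranMatWithZeros = [[[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],
                    [[0.4, 0.5, 0.1], [0.3, 0.5, 0.2]],
                    [[0.2, 0.2, 0.6], [0.5, 0.5, 0.0]]]


def dense_to_probs_columns(tran):
    """Split a dense matrix into nonzero probabilities and their to_state indices."""
    probs, columns = [], []
    for actions in tran:
        probs.append([[p for p in row if p != 0] for row in actions])
        columns.append([[t for t, p in enumerate(row) if p != 0] for row in actions])
    return probs, columns


tranMatProbs, tranMatColumns = dense_to_probs_columns(tranMatWithZeros)
print(tranMatProbs)
print(tranMatColumns)
```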

(3) Load the elementwise representation from a file:

mdl.mdp(discount=0.8,
        rewards=rewards,
        tranMatFromFile="transitions.csv")
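
A file such as transitions.csv could be produced with Python's csv module. The sketch below assumes one row per nonzero entry in the order from_state, action, to_state, probability, comma-separated and without a header; this layout is an assumption based on the elementwise format above, so the exact file format expected by tranMatFromFile should be checked in the documentation.

```python
import csv

# Elementwise entries: [from_state, action, to_state, probability]
tranMatElementwise = [
    [0, 0, 0, 0.9], [0, 0, 1, 0.1], [0, 1, 0, 0.1], [0, 1, 1, 0.9],
    [1, 0, 0, 0.4], [1, 0, 1, 0.5], [1, 0, 2, 0.1],
    [1, 1, 0, 0.3], [1, 1, 1, 0.5], [1, 1, 2, 0.2],
    [2, 0, 0, 0.2], [2, 0, 1, 0.2], [2, 0, 2, 0.6],
    [2, 1, 0, 0.5], [2, 1, 1, 0.5],
]

# Write one row per entry (assumed layout: no header, comma-separated)
with open("transitions.csv", "w", newline="") as handle:
    csv.writer(handle).writerows(tranMatElementwise)

# Read the file back to confirm that nothing was lost in the round trip
with open("transitions.csv", newline="") as handle:
    loaded = [[int(s), int(a), int(t), float(p)] for s, a, t, p in csv.reader(handle)]

print(loaded == tranMatElementwise)
```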

Documentation

The documentation can be found in the wiki for MDPSolver.

Release history

- 0.9.10: Jan 13, 2026
- 0.9.9: Mar 22, 2025
- 0.9.8: Feb 18, 2025 (this version)
- 0.9.7: Jun 16, 2024
- 0.9.6: May 17, 2024
- 0.9.5: May 12, 2024
- 0.9.4: May 6, 2024
- 0.9.3: May 5, 2024
- 0.9.2: May 1, 2024
- 0.9.1: Apr 28, 2024

Download files

Source Distribution

mdpsolver-0.9.8.tar.gz (1.2 MB), uploaded Feb 18, 2025 (Source)

Built Distribution

mdpsolver-0.9.8-py3-none-any.whl (1.2 MB), uploaded Feb 18, 2025 (Python 3)

File details

Details for the file mdpsolver-0.9.8.tar.gz.

File metadata

Download URL: mdpsolver-0.9.8.tar.gz
Upload date: Feb 18, 2025
Size: 1.2 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for mdpsolver-0.9.8.tar.gz
Algorithm	Hash digest
SHA256	`cd859ec7a4fc41cc30c47cda55fd3850b36985761c7f122143d6b16d65488001`
MD5	`549dd122f4528fe2e29122b6b8c78bbc`
BLAKE2b-256	`3e31911a39706706e32c9d368bdd2a3d6b392aeb927f0ae1d09159aaedfc70eb`

File details

Details for the file mdpsolver-0.9.8-py3-none-any.whl.

File metadata

Download URL: mdpsolver-0.9.8-py3-none-any.whl
Upload date: Feb 18, 2025
Size: 1.2 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for mdpsolver-0.9.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a0553a1b9d41908ba886d4666fb350d272015070ba9919b0afa2b446650e9ae3`
MD5	`93257a6d509917021020541ee71c4689`
BLAKE2b-256	`c8f501891a20dd89193e5697b1475691b2bb3aac86ddf3fb700e4e3cae3814c1`
