A reinforcement learning environment for the Multi Port Stowage Planning problem
Project description
This package implements the mathematical model of the "Multi Port Stowage Planning" (MPSP) problem, introduced by Avriel et al. (1998), as a reinforcement learning environment using the Gymnasium API. See the Gymnasium documentation for the functions available on an environment.
The Env class has the following input parameters:
- R (int): number of rows
- C (int): number of columns
- N (int): number of ports
- skip_last_port (bool): if True, the environment terminates at the second-to-last port. This speeds up training, since the last port is trivial to solve.
Note that since the mathematical model disregards stability constraints, the env automatically reorders the columns lexicographically, based on the containers in each column. This reduces the state space by a factor of almost $C!$. Furthermore, the transportation matrix is shifted up and to the left at each port, which reduces the state space even further.
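The two reductions described above can be illustrated with a small, pure-Python sketch. This is not the package's implementation: the function names `sort_columns` and `shift_up_left` are hypothetical, the bay is assumed to be a list of rows holding destination ports, and the shift is one plausible reading of "up and to the left" (rows and columns of ports already visited are dropped and the matrix is padded with zeros).

```python
def sort_columns(bay):
    """Return the bay with its columns in lexicographic order.

    `bay` is a list of rows. Each column is read top to bottom as a tuple,
    and Python compares tuples lexicographically, so two bays that differ
    only by a column permutation map to the same result.
    """
    columns = sorted(zip(*bay))                  # transpose, then sort the columns
    return [list(row) for row in zip(*columns)]  # transpose back to rows


def shift_up_left(T, port):
    """Shift transportation matrix `T` so that `port` becomes index 0.

    Assumed convention: T[i][j] is the number of containers going from
    port i to port j. Entries for ports before `port` are discarded and
    the freed rows and columns are filled with zeros.
    """
    n = len(T)
    shifted = [[0] * n for _ in range(n)]
    for i in range(port, n):
        for j in range(port, n):
            shifted[i - port][j - port] = T[i][j]
    return shifted


bay = [[0, 2, 0],
       [3, 1, 1]]
canonical = sort_columns(bay)  # columns (0,1), (0,3), (2,1)
```

The factor is "almost" $C!$ rather than exactly $C!$ because a bay with identical columns has fewer than $C!$ distinct column permutations.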
When using the environment, call env.close() before exiting your program to avoid memory leaks.
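One way to make sure close() is always called, even when an exception interrupts the program, is the standard library's `contextlib.closing`. The sketch below uses a hypothetical `StandInEnv` class so that it runs without the package installed; it only assumes that the real Env exposes reset() and close(), as Gymnasium environments do. The helper name `run_episode` is likewise illustrative.

```python
from contextlib import closing


class StandInEnv:
    """Hypothetical stand-in for MPSPEnv.Env; models only reset() and close()."""

    def __init__(self):
        self.closed = False

    def reset(self):
        return None, {}

    def close(self):
        self.closed = True


def run_episode(env, steps):
    """Reset `env`, run `steps(env)`, and close `env` even if `steps` raises."""
    with closing(env):  # calls env.close() when the block exits
        env.reset()
        return steps(env)


env = StandInEnv()
result = run_episode(env, lambda e: "finished")
```

A plain try/finally block around the training or interaction loop achieves the same effect.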
Minimal usage with Stable Baselines3, using MaskablePPO from sb3-contrib:
from MPSPEnv import Env
from sb3_contrib.ppo_mask import MaskablePPO

# Small instance: 8 rows, 4 columns, 5 ports
train_env = Env(R=8, C=4, N=5, skip_last_port=True)

model = MaskablePPO(
    "MultiInputPolicy",
    train_env,
)
model.learn(
    total_timesteps=1_000_000,  # learn() expects an int
    progress_bar=True,
)

# Release the environment's resources
train_env.close()
If you wish to try the env yourself, placing containers using keys 0-3 and removing containers using keys 4-7, you can use the following script:
from MPSPEnv import Env
import numpy as np
import pygame


def get_action_from_key(key):
    # Map the number keys 0-7 to the action indices 0-7
    if key in [
        pygame.K_0,
        pygame.K_1,
        pygame.K_2,
        pygame.K_3,
        pygame.K_4,
        pygame.K_5,
        pygame.K_6,
        pygame.K_7,
    ]:
        return key - pygame.K_0
    return None


if __name__ == "__main__":
    env = Env(3, 4, 6)
    env.reset()
    running = True
    while running:
        env.render()
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False
                break
            if event.type == pygame.KEYDOWN:
                action = get_action_from_key(event.key)
                if action is not None:
                    _, reward, done, _, _ = env.step(action)
                    if done:
                        running = False
        pygame.time.wait(10)
    # Keep the final state visible for a moment before closing
    pygame.time.wait(1000)
    env.close()
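The key layout in the script suggests how action indices might be interpreted. The sketch below assumes, based only on the "keys 0-3 place, keys 4-7 remove" description for a 4-column bay, that actions 0 to C-1 add a container to a column and actions C to 2C-1 remove one from it. The function `decode_action` is hypothetical and not part of the package.

```python
def decode_action(action, C):
    """Split an action index into (kind, column) for a bay with C columns.

    Assumed layout: the first C actions add to column `action`, the next
    C actions remove from column `action - C`.
    """
    if not 0 <= action < 2 * C:
        raise ValueError(f"action must be in [0, {2 * C}), got {action}")
    kind = "add" if action < C else "remove"
    return kind, action % C


add_example = decode_action(2, C=4)     # key "2": add to column 2
remove_example = decode_action(6, C=4)  # key "6": remove from column 2
```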
Download files
Hashes for MPSPEnv-0.0.1-cp310-cp310-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 6355bfb2019886b7485d6566a057ee8002cbf28a9fb5a41a781f3d8f3ae8475a
MD5 | 6d9b4c9136c8405f6ddaf49f56c8eb2f
BLAKE2b-256 | 5f38d766cc09c7c17323b71c41387fb2382fa9da68dc0f489a9929b428138ce4