Gymnasium environments for operations research reinforcement learning problems from or-gym.
or-gymnasium
or-gymnasium packages a collection of Gymnasium environments for operations research reinforcement learning problems adapted from the original or-gym project. The main changes are to the environment interfaces so they work with the latest Gymnasium API.
Installation
Install the package with pip to use the bundled environment registrations:
pip install or-gymnasium
You can also copy the environment files you need into your own project, register them with Gymnasium, and use them without installing this package.
Quickstart
Importing the or_gymnasium module registers the environments with Gymnasium.
import gymnasium as gym
import or_gymnasium
env = gym.make("Newsvendor-v0")
observation, info = env.reset(seed=123)
observation, reward, terminated, truncated, info = env.step(env.action_space.sample())
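The single step above extends naturally to a full episode. The following is a minimal sketch of such a loop, assuming only the reset/step signatures shown in the quickstart. ToyEnv, run_episode, and the constant policy are hypothetical names introduced here so the snippet runs without any package installed; they are not part of or-gymnasium.

```python
import random


class ToyEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step signatures."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0
        self.rng = random.Random()

    def reset(self, seed=None):
        if seed is not None:
            self.rng = random.Random(seed)  # reseed for reproducible episodes
        self.t = 0
        return 0.0, {}

    def step(self, action):
        self.t += 1
        reward = -abs(action - self.rng.random())  # penalty for missing a random target
        terminated = False                  # this toy has no natural terminal state
        truncated = self.t >= self.horizon  # time limit reached
        return float(self.t), reward, terminated, truncated, {}


def run_episode(env, policy, seed=None):
    """Roll out one episode and return (total_reward, steps)."""
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        done = terminated or truncated  # either flag ends the episode
    return total_reward, steps


total, steps = run_episode(ToyEnv(horizon=5), policy=lambda obs: 0.5, seed=123)
print(steps, total)
```

To try the same loop on a packaged environment, pass gym.make("Newsvendor-v0") as env and lambda obs: env.action_space.sample() as policy.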
Using Copied Environment Files
After copying an environment file, register the environment with Gymnasium using the copied module path and class name.
import gymnasium as gym
gym.register(
id="Newsvendor-v0",
entry_point="my_project.envs.newsvendor:NewsvendorEnv",
)
env = gym.make("Newsvendor-v0")
observation, info = env.reset(seed=123)
observation, reward, terminated, truncated, info = env.step(env.action_space.sample())
Replace my_project.envs.newsvendor:NewsvendorEnv with the module path and class name for the environment file in your project.
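When a copied environment fails to load, the usual cause is a wrong module path or class name in the entry point string. The sketch below mimics the lookup a "module.path:ClassName" string implies: import the part before the colon, then fetch the attribute after it. resolve_entry_point is a hypothetical helper written for illustration, not a Gymnasium function, and it is demonstrated on a standard-library class so it runs anywhere.

```python
import importlib


def resolve_entry_point(entry_point):
    """Split 'package.module:ClassName' and return the named attribute."""
    module_path, _, attr_name = entry_point.partition(":")
    if not module_path or not attr_name:
        raise ValueError(f"expected 'module.path:ClassName', got {entry_point!r}")
    module = importlib.import_module(module_path)  # ModuleNotFoundError on a bad path
    return getattr(module, attr_name)              # AttributeError on a bad class name


# Standard-library example; substitute your own entry point string to check it.
cls = resolve_entry_point("collections:OrderedDict")
print(cls.__name__)  # OrderedDict
```

Running the helper on your own string, for example "my_project.envs.newsvendor:NewsvendorEnv", shows whether the failure is in the module path or in the class name before Gymnasium is involved.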
Examples
We include PPO examples for continuous and discrete action environments, built on CleanRL scripts slightly adapted for recent versions of Gymnasium. Install the CleanRL dependencies separately before running them.
python ./examples/cleanrl_ppo_continous.py
python ./examples/cleanrl_ppo_discrete.py
Use TensorBoard to view the results.
tensorboard --logdir ./runs
Environments
See the src files and original or-gym repository for detailed descriptions of the environments and their operations research background.
- Newsvendor-v0: Multi-period newsvendor problem with stochastic demand, lead times, holding costs, and lost-sales penalties.
- TSP-v0: Sparse, bidirectional traveling-salesperson graph with uniform movement costs and optional action masking.
- TSP-v1: Fully connected traveling-salesperson graph with Euclidean distance costs and penalties for revisiting nodes.
- Knapsack-v0: Unbounded knapsack problem where items can be selected repeatedly until capacity is reached or exceeded.
- Knapsack-v1: Binary knapsack problem where each item can be selected at most once.
- Knapsack-v2: Bounded knapsack problem where each item has a limited quantity available.
- Knapsack-v3: Online knapsack problem where randomly presented items must be accepted or rejected one at a time.
- BinPacking-v0: Small online bin packing instance with bounded waste, stochastic item arrivals, and capacity-based placement actions.
- BinPacking-v1: Large bounded-waste bin packing instance with higher bin capacity, more item sizes, and a longer horizon.
- BinPacking-v2: Small perfectly packable bin packing instance with linear waste rewards.
- BinPacking-v3: Large perfectly packable bin packing instance with linear waste rewards.
- BinPacking-v4: Small perfectly packable bin packing instance with bounded waste rewards.
- BinPacking-v5: Large perfectly packable bin packing instance with bounded waste rewards.
- VMPacking-v0: Online virtual-machine packing problem that assigns CPU and memory demands to physical machines without overloading them.
- VMPacking-v1: Temporary virtual-machine packing problem where assigned processes expire and release physical-machine capacity.
- InvManagement-v0: Multi-period, multi-echelon inventory management system with production capacities, lead times, and backlogged unmet demand.
- InvManagement-v1: Multi-period, multi-echelon inventory management system where unmet demand is treated as lost sales.
- NetworkManagement-v0: Multi-period supply-network inventory management over a directed graph with production, distribution, raw-material, and market nodes.
- NetworkManagement-v1: Supply-network inventory management variant where unmet market demand and replenishment orders are lost instead of backlogged.
- PortfolioOpt-v0: Multi-period portfolio optimization problem for buying and selling three risky assets with transaction costs.
- VehicleRouting-v0: Dynamic food-delivery vehicle routing problem with stochastic orders, pickup and delivery actions, capacity limits, and time penalties.
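The unbounded, binary, and bounded knapsack variants differ only in how many copies of each item may be taken. The sketch below makes that concrete with a textbook dynamic program that computes the offline optimum for each case, which can serve as a reference value when judging a trained agent. It is an illustration under stated assumptions, not the environments' implementation: the knapsack function, its limits parameter, and the item data are made up for this example, and weights are assumed to be positive integers.

```python
def knapsack(values, weights, capacity, limits=None):
    """Best total value for positive integer weights.

    limits[i] is the maximum number of copies of item i;
    limits=None means every item is available without limit.
    """
    if limits is None:
        limits = [None] * len(values)
    best = [0] * (capacity + 1)  # best[c] = best value using capacity at most c
    for value, weight, limit in zip(values, weights, limits):
        if limit is None:
            # Unbounded: ascending capacities let the same item be reused.
            for c in range(weight, capacity + 1):
                best[c] = max(best[c], best[c - weight] + value)
        else:
            # Bounded: repeat the 0/1 update (descending capacities) once per copy.
            for _ in range(limit):
                for c in range(capacity, weight - 1, -1):
                    best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


values, weights, capacity = [60, 100, 120], [10, 20, 30], 50

print(knapsack(values, weights, capacity))                    # unbounded: 300
print(knapsack(values, weights, capacity, limits=[1, 1, 1]))  # binary: 220
print(knapsack(values, weights, capacity, limits=[2, 1, 1]))  # bounded: 240
```

The online variant is not covered by this program, because items there arrive one at a time and must be accepted or rejected before later items are known.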
Reference
@misc{HubbsOR-Gym,
author={Christian D. Hubbs and Hector D. Perez and Owais Sarwar and Nikolaos V. Sahinidis and Ignacio E. Grossmann and John M. Wassick},
title={OR-Gym: A Reinforcement Learning Library for Operations Research Problems},
year={2020},
Eprint={arXiv:2008.06319}
}
@misc{towers2024gymnasium,
author={Towers, Mark and Kwiatkowski, Ariel and Terry, Jordan and Balis, John U and De Cola, Gianluca and Deleu, Tristan and Goul{\~a}o, Manuel and Kallinteris, Andreas and Krimmel, Markus and KG, Arjun and others},
title={Gymnasium: A Standard Interface for Reinforcement Learning Environments},
year={2024},
Eprint={arXiv:2407.17032}
}
@article{huang2022cleanrl,
author = {Shengyi Huang and Rousslan Fernand Julien Dossa and Chang Ye and Jeff Braga and Dipam Chakraborty and Kinal Mehta and João G.M. Araújo},
title = {CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms},
journal = {Journal of Machine Learning Research},
year = {2022},
volume = {23},
number = {274},
pages = {1--18},
url = {http://jmlr.org/papers/v23/21-1342.html}
}
To cite this repository
@misc{gao2026actorpda,
author={Gao, Ji and Ju, Caleb and Lan, Guanghui and Tong, Zhaohui},
title={Actor-Accelerated Policy Dual Averaging for Reinforcement Learning in Continuous Action Spaces},
year={2026},
Eprint={arXiv:2603.10199}
}
File details
Details for the file or_gymnasium-0.1.0.tar.gz.
File metadata
- Download URL: or_gymnasium-0.1.0.tar.gz
- Upload date:
- Size: 37.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bb3b8959edd1a45c1f8d8691b2098c4e97dffdec8076c42f2b9d6122aa2069eb |
| MD5 | 62f97abea422acef9937ca170f2cff4c |
| BLAKE2b-256 | 3b4e351019a7f43d6da49d375a873083e93e0499110b311f95a1e6ebe172a598 |
Provenance
The following attestation bundles were made for or_gymnasium-0.1.0.tar.gz:
Publisher: workflow.yml on JGIoA/or-gymnasium
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: or_gymnasium-0.1.0.tar.gz
- Subject digest: bb3b8959edd1a45c1f8d8691b2098c4e97dffdec8076c42f2b9d6122aa2069eb
- Sigstore transparency entry: 1520459111
- Sigstore integration time:
- Permalink: JGIoA/or-gymnasium@2f0d78c9da4c68d9d4e4e4fb816baf83ad55c444
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/JGIoA
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@2f0d78c9da4c68d9d4e4e4fb816baf83ad55c444
- Trigger Event: push
File details
Details for the file or_gymnasium-0.1.0-py3-none-any.whl.
File metadata
- Download URL: or_gymnasium-0.1.0-py3-none-any.whl
- Upload date:
- Size: 40.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c8e64862fbbdb5bfbf28a141fef9f241e1b40c50903e8f6d375648f5620c0b0 |
| MD5 | ed7e6382199148d968c2c40996786eba |
| BLAKE2b-256 | fd1af13dc64b41be6019c9cce48321b593d34da62fc2dbd9714f49ec206d776e |
Provenance
The following attestation bundles were made for or_gymnasium-0.1.0-py3-none-any.whl:
Publisher: workflow.yml on JGIoA/or-gymnasium
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: or_gymnasium-0.1.0-py3-none-any.whl
- Subject digest: 8c8e64862fbbdb5bfbf28a141fef9f241e1b40c50903e8f6d375648f5620c0b0
- Sigstore transparency entry: 1520459118
- Sigstore integration time:
- Permalink: JGIoA/or-gymnasium@2f0d78c9da4c68d9d4e4e4fb816baf83ad55c444
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/JGIoA
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@2f0d78c9da4c68d9d4e4e4fb816baf83ad55c444
- Trigger Event: push