
Gymnasium environments for operations research reinforcement learning problems from or-gym.


or-gymnasium

or-gymnasium packages a collection of Gymnasium environments for operations research reinforcement learning problems adapted from the original or-gym project. The main changes are to the environment interfaces so they work with the latest Gymnasium API.

Installation

Install the package with pip to use the bundled environment registrations:

pip install or-gymnasium

You can also copy the environment files you need into your own project, register them with Gymnasium, and use them without installing this package.

Quickstart

Importing the or_gymnasium module registers the environments with Gymnasium.

import gymnasium as gym
import or_gymnasium

env = gym.make("Newsvendor-v0")
observation, info = env.reset(seed=123)
observation, reward, terminated, truncated, info = env.step(env.action_space.sample())
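The single step call above extends naturally to a full episode. The following is a minimal sketch of a rollout loop that relies only on the Gymnasium reset/step return signature. The names run_episode and CountdownEnv are hypothetical and introduced here for illustration; CountdownEnv is a toy stand-in so the snippet runs without or-gymnasium installed.

```python
class CountdownEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step signature."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None):
        # seed is accepted for signature compatibility; this toy env is deterministic.
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        reward = -abs(action)  # toy cost: larger actions are penalized
        terminated = False  # this toy env has no terminal state
        truncated = self.t >= self.horizon  # time limit reached
        return self.t, reward, terminated, truncated, {}


def run_episode(env, policy, seed=None, max_steps=1_000):
    """Roll out one episode and return (total_reward, steps_taken)."""
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        # An episode ends when the environment terminates or is truncated.
        if terminated or truncated:
            break
    return total_reward, steps


total, steps = run_episode(CountdownEnv(horizon=5), policy=lambda obs: 1, seed=123)
print(total, steps)
```

With the package installed, the same function should accept an environment created by gym.make("Newsvendor-v0") together with a policy such as lambda obs: env.action_space.sample().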

Using Copied Environment Files

After copying an environment file, register the environment with Gymnasium using the copied module path and class name.

import gymnasium as gym

gym.register(
    id="Newsvendor-v0",
    entry_point="my_project.envs.newsvendor:NewsvendorEnv",
)

env = gym.make("Newsvendor-v0")
observation, info = env.reset(seed=123)
observation, reward, terminated, truncated, info = env.step(env.action_space.sample())

Replace my_project.envs.newsvendor:NewsvendorEnv with the module path and class name for the environment file in your project.
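The entry_point string follows a module.path:ClassName convention. The sketch below shows roughly how such a string can be resolved to an object with the standard library, which can help confirm that a copied module path imports cleanly before registering it. The helper resolve_entry_point is a hypothetical name and not part of Gymnasium's public API.

```python
import importlib


def resolve_entry_point(entry_point: str):
    """Import 'package.module:Attribute' and return the named attribute."""
    module_path, _, attr_name = entry_point.partition(":")
    if not module_path or not attr_name:
        raise ValueError(
            f"expected 'module.path:ClassName', got {entry_point!r}"
        )
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


# Demonstrated on a standard-library class so the snippet runs anywhere;
# in a real project the string would point at the copied environment file.
cls = resolve_entry_point("collections:OrderedDict")
print(cls.__name__)
```

If the import fails here with ModuleNotFoundError or AttributeError, the same entry_point string is unlikely to work in gym.register.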

Examples

We include PPO examples for continuous and discrete action environments, based on CleanRL scripts slightly adapted for recent versions of Gymnasium. CleanRL's dependencies are not installed with this package and must be installed separately before running the examples.

python ./examples/cleanrl_ppo_continous.py
python ./examples/cleanrl_ppo_discrete.py

Use TensorBoard to view the results.

tensorboard --logdir ./runs

Environments

See the source files and the original or-gym repository for detailed descriptions of the environments and their operations research background.

  • Newsvendor-v0: Multi-period newsvendor problem with stochastic demand, lead times, holding costs, and lost-sales penalties.
  • TSP-v0: Sparse, bidirectional traveling-salesperson graph with uniform movement costs and optional action masking.
  • TSP-v1: Fully connected traveling-salesperson graph with Euclidean distance costs and penalties for revisiting nodes.
  • Knapsack-v0: Unbounded knapsack problem where items can be selected repeatedly until capacity is reached or exceeded.
  • Knapsack-v1: Binary knapsack problem where each item can be selected at most once.
  • Knapsack-v2: Bounded knapsack problem where each item has a limited quantity available.
  • Knapsack-v3: Online knapsack problem where randomly presented items must be accepted or rejected one at a time.
  • BinPacking-v0: Small online bin packing instance with bounded waste, stochastic item arrivals, and capacity-based placement actions.
  • BinPacking-v1: Large bounded-waste bin packing instance with higher bin capacity, more item sizes, and a longer horizon.
  • BinPacking-v2: Small perfectly packable bin packing instance with linear waste rewards.
  • BinPacking-v3: Large perfectly packable bin packing instance with linear waste rewards.
  • BinPacking-v4: Small perfectly packable bin packing instance with bounded waste rewards.
  • BinPacking-v5: Large perfectly packable bin packing instance with bounded waste rewards.
  • VMPacking-v0: Online virtual-machine packing problem that assigns CPU and memory demands to physical machines without overloading them.
  • VMPacking-v1: Temporary virtual-machine packing problem where assigned processes expire and release physical-machine capacity.
  • InvManagement-v0: Multi-period, multi-echelon inventory management system with production capacities, lead times, and backlogged unmet demand.
  • InvManagement-v1: Multi-period, multi-echelon inventory management system where unmet demand is treated as lost sales.
  • NetworkManagement-v0: Multi-period supply-network inventory management over a directed graph with production, distribution, raw-material, and market nodes.
  • NetworkManagement-v1: Supply-network inventory management variant where unmet market demand and replenishment orders are lost instead of backlogged.
  • PortfolioOpt-v0: Multi-period portfolio optimization problem for buying and selling three risky assets with transaction costs.
  • VehicleRouting-v0: Dynamic food-delivery vehicle routing problem with stochastic orders, pickup and delivery actions, capacity limits, and time penalties.
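One way to check which of the IDs above are usable in a given installation is a smoke test that builds each environment, resets it, and takes one random step. The sketch below is an illustration rather than part of the package: smoke_test and FAMILIES are hypothetical names, the environment factory is passed in by the caller, and it assumes every environment can be constructed with default arguments.

```python
# Number of registered versions per environment family, taken from the list above.
FAMILIES = {
    "Newsvendor": 1,
    "TSP": 2,
    "Knapsack": 4,
    "BinPacking": 6,
    "VMPacking": 2,
    "InvManagement": 2,
    "NetworkManagement": 2,
    "PortfolioOpt": 1,
    "VehicleRouting": 1,
}
ENV_IDS = [f"{name}-v{i}" for name, count in FAMILIES.items() for i in range(count)]


def smoke_test(make, env_ids=ENV_IDS):
    """Map each env ID to None on success or to an error message on failure."""
    results = {}
    for env_id in env_ids:
        try:
            env = make(env_id)
            env.reset(seed=0)
            env.step(env.action_space.sample())
            env.close()
            results[env_id] = None
        except Exception as exc:  # record the failure and keep going
            results[env_id] = f"{type(exc).__name__}: {exc}"
    return results


# Usage with the package installed (assumed):
#   import gymnasium as gym
#   import or_gymnasium
#   failures = {k: v for k, v in smoke_test(gym.make).items() if v}
```

Passing the factory as an argument keeps the helper independent of how the environments were registered, so it works equally for the bundled registrations and for copied environment files.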

Reference

@misc{HubbsOR-Gym,
    author={Christian D. Hubbs and Hector D. Perez and Owais Sarwar and Nikolaos V. Sahinidis and Ignacio E. Grossmann and John M. Wassick},
    title={OR-Gym: A Reinforcement Learning Library for Operations Research Problems},
    year={2020},
    Eprint={arXiv:2008.06319}
}
@misc{towers2024gymnasium,
  author={Towers, Mark and Kwiatkowski, Ariel and Terry, Jordan and Balis, John U and De Cola, Gianluca and Deleu, Tristan and Goul{\~a}o, Manuel and Kallinteris, Andreas and Krimmel, Markus and KG, Arjun and others},
  title={Gymnasium: A Standard Interface for Reinforcement Learning Environments},
  year={2024},
  Eprint={arXiv:2407.17032}
}
@article{huang2022cleanrl,
  author  = {Shengyi Huang and Rousslan Fernand Julien Dossa and Chang Ye and Jeff Braga and Dipam Chakraborty and Kinal Mehta and João G.M. Araújo},
  title   = {CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms},
  journal = {Journal of Machine Learning Research},
  year    = {2022},
  volume  = {23},
  number  = {274},
  pages   = {1--18},
  url     = {http://jmlr.org/papers/v23/21-1342.html}
}

To cite this repository

@misc{gao2026actorpda,
  author={Gao, Ji and Ju, Caleb and Lan, Guanghui and Tong, Zhaohui},
  title={Actor-Accelerated Policy Dual Averaging for Reinforcement Learning in Continuous Action Spaces},
  year={2026},
  Eprint={arXiv:2603.10199}
}
