Simulated groundwater flow environments for reinforcement learning.

FloPyArcade provides simple MODFLOW-powered groundwater arcade-type simulation environments. It builds on the functionality of FloPy, empowering pre- and postprocessing of MODFLOW and its related software. The idea is to provide benchmarking environments and examples to the groundwater community that allow experimenting with algorithms in search of optimal control.

Installation

Install in Python 3.7+ using pip:

python -m pip install flopyarcade

See in action

See an optimized policy model in control of aquifer management.

python -m flopyarcade.train_rllib_apexdqn --playbenchmark True --envtype 3s-d

The environment (here 3s-d; switchable via the envtype argument) will be machine-controlled across different environment initializations until canceled (Alt+F4). Find benchmarks comparing this performance to human control below.

To control an environment yourself, for instance the 3r-d environment, use the arrow keys:

python -m flopyarcade.play --manualcontrol True --envtype 3r-d

Rationale

These are example simulations from benchmarking in environment 3s-d - comparing different control agents:

[Animation: benchmark control example]

Why this matters, in a nutshell: what is encapsulated here as a game can be envisioned as the real-world operation of an arbitrary groundwater system, given a model (ensemble). You can similarly optimize and test policy models, e.g. for real-time operation of your sites.

Too late, you might think, given that arcade games peaked a few decades ago? Obviously. But they received renewed interest with the advent of OpenAI Gym, which enabled reinforcement learning to surpass human performance on many of them. FloPyArcade offers a set of simple simulated groundwater flow environments that follow the style of Gym environments. They allow experimenting with existing or new reinforcement learning algorithms to find, for example, neural networks that yield optimal control policies. Two common learning algorithms are readily provided, and many more are or become available throughout the reinforcement learning community. Try and train for yourself. Adding your own simulation environment of arbitrary complexity, with your own controls, or your own optimization algorithm is possible.

Getting started

Easily simulate an environment, for example with random actions:

from flopyarcade import FloPyEnv
from numpy.random import choice

env = FloPyEnv(ENVTYPE='3s-d')
reward_total = 0.
while not env.done:
  action = choice(env.actionSpace)
  observations, reward, done, info = env.step(action)
  reward_total += reward

Add the following if intending to render on screen:

from matplotlib.pyplot import switch_backend
switch_backend('TkAgg')

env = FloPyEnv(ENVTYPE='3s-d', flagRender=True)

Change to the following if intending to simulate an environment with continuous-valued control:

from numpy.random import uniform

env = FloPyEnv(ENVTYPE='6r-c')
while not env.done:
  action = uniform(low=0., high=1., size=env.actionSpaceSize)
  # advance the simulation with the sampled continuous action
  observations, reward, done, info = env.step(action)

Benchmarked environments

Multiple environment variants are currently included, three of which can be user-controlled in a game. The objective is to safely transport a virtual particle as it follows advection while travelling from a random location at the western boundary to the eastern boundary. Wells have to be protected from capturing the particle. Furthermore, the particle must not flow into cells of specified head in the north and south. The controls you have depend on the environment. The highest score is achieved if the particle stays on the indicated shortest route, or as close as possible to it.

However, groundwater environments of arbitrary complexity can be implemented similarly, as long as the desired optimization target(s) can be obtained from the simulation. Feel free to modify. Change the ENVTYPE variable to switch between environments. The examples below list the available environments.

3s-d 2s-d 1s-d
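
As an illustration, the following minimal sketch cycles through a few of the discrete-action variants listed above with random actions, reusing the pattern from the getting-started example (the selection of variants here is arbitrary):

from flopyarcade import FloPyEnv
from numpy.random import choice

# evaluate a few discrete-action variants with random actions
for envtype in ['1s-d', '2s-d', '3s-d']:
  env = FloPyEnv(ENVTYPE=envtype)
  reward_total = 0.
  while not env.done:
    # sample a random action and advance the simulation
    observations, reward, done, info = env.step(choice(env.actionSpace))
    reward_total += reward
  print(envtype, 'total reward:', reward_total)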

Benchmarks

Below is a list of benchmarks on the simpler 1s-d, 2s-d and 3s-d environments, for 4 different types of operation:

(1) from random actions,

(2) from control through an inexperienced human,

(3) from control through an experienced human and

(4) from control through a trained deep neural network as a policy model.

In these benchmarks, the optimized policy model significantly outperforms human control.

[Plots: average score evolutions and operator scores]

The optimization workflows for the policy models behind these benchmarks can be reproduced using RLLib as follows:

python -m flopyarcade.train_rllib_apexdqn --envtype 3s-d --cpus 16

Be sure to specify the number of CPUs you wish to dedicate to this process, but no more than the number of logical processors available. Note that RLLib generally allows distributed optimization through Ray on a compute cluster to speed things up massively. This needs manual editing of the configuration, yet is relatively straightforward. Find out more in the Ray documentation. Achieving human-level operation performance here might take around 1-2 days on a state-of-the-art machine with 16 cores, as of 2021.
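
As a rough, hypothetical sketch of what such a setup involves (not taken from this repository; how the training script picks up the cluster is an assumption): start Ray on the participating machines, then attach to the running cluster instead of starting a local Ray instance.

# hypothetical sketch; exact integration with flopyarcade's training script is an assumption
# on the head node:     ray start --head --port=6379
# on each worker node:  ray start --address=<head-node-ip>:6379
import ray

ray.init(address='auto')  # attach to the already-running Ray cluster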

Note that the envtype argument can be switched to any provided discrete-action environment. Work to optimize continuous-valued environments using RLLib is currently in progress. Similarly, any of the many available reinforcement learning libraries can be used instead. The human operation benchmark data will soon be made available for completeness.

Use TensorFlow's TensorBoard to monitor the optimization progress, if desired, by starting it with the logdir path (here /log/dir/path) reported by RLLib during operation:

tensorboard --logdir /log/dir/path

More environments

More environments are available, but currently remain without benchmarks. Note: 0s-d is an experimental environment based on MODFLOW's BMI and is not yet displayed.

6s-c 6r-c 5s-c 5s-c-cost 5r-c 4s-c 4r-c 3r-d 3s-c 3r-c 2r-d 2s-c 2r-c 1r-d 1s-c 1r-c

Optimization

Two algorithms are currently provided along with the environments for training deep neural networks as policy models. These are implementations of (1) double Q-learning and (2) a weights-evolving genetic algorithm, optionally combined with a simple implementation of novelty search to help avoid convergence towards local minima. They reside in the FloPyAgent class.
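
For intuition, novelty search typically scores an agent by how far its behavior characterization lies from its nearest neighbors in an archive of previously evaluated behaviors. Below is a minimal sketch, not the FloPyAgent implementation (function and variable names are illustrative):

import numpy as np

def novelty_score(behavior, archive, k=10):
  # behavior: 1D array characterizing an agent's behavior (e.g. visited particle locations)
  # archive: 2D array of behavior characterizations from previously evaluated agents
  distances = np.linalg.norm(archive - behavior, axis=1)
  nearest = np.sort(distances)[:k]
  # a larger mean distance to the k nearest neighbors indicates more novel behavior
  return nearest.mean()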

The environment formulation allows for models, controls and objectives of arbitrary complexity. Modifications or more complex environments can easily be implemented with small changes to the code.

Examples of machine-controlled actions taken in the same environment by the highest-scoring agent of genetic optimization after various generations: [Animations: genetic optimization, 3D genetic optimization]

Usage

There are three main files, which can be called as follows:

  1. play.py allows simulating an environment with (1) manual control from keystrokes or (2) control from a given policy model located in the models subfolder. In the simplest environments (1s-d, 1r-d, 2s-d, 2r-d, 3s-d and 3r-d), test, for example, with manual control:

     python -m flopyarcade.play --manualcontrol True --envtype 3r-d

  2. train_dqn.py trains a feed-forward multi-layer (i.e. deep) neural network policy model using the double Q-learning algorithm:

     python -m flopyarcade.train_dqn

  3. train_neuroevolution.py runs a search for optimal policy models following a genetic optimization, optionally with novelty search. It allows parallel execution with multiple processes, with the number of processes set by the variable NAGENTSPARALLEL according to the available CPU threads:

     python -m flopyarcade.train_neuroevolution

Modify settings for the environment and hyperparameters for the provided optimization algorithms at the top of the files. The underlying policy model can easily be exchanged with arbitrary Keras-based models by replacing the createNNModel function within the FloPyAgent class in FloPyArcade.py. A complete description of current variables and more documentation is planned.
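
For illustration, a drop-in replacement could look roughly like the sketch below. This is not the repository's actual createNNModel signature; the argument names and the one-output-per-action layout (as used by Q-learning) are assumptions:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def createNNModel(observation_size, action_size):
  # simple feed-forward network mapping observations to one value per discrete action
  model = Sequential([
    Dense(64, activation='relu', input_shape=(observation_size,)),
    Dense(64, activation='relu'),
    Dense(action_size, activation='linear')
  ])
  model.compile(optimizer='adam', loss='mse')
  return model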

Compiled game (for Windows)

Easily test yourself: Steer the existing environments on Windows. Skip installation by downloading these versions:

[Download links: compiled Windows versions of the environments]

Unittesting

Run the test suite with coverage measurement via:

python -m coverage run --source=flopyarcade -m unittest discover tests
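
Afterwards, print a coverage summary with coverage.py's standard report command:

python -m coverage report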

Citing

To cite this repository in publications:

@misc{FloPyArcade,
  author = {Hoehn, Philipp},
  title = {FloPyArcade: Simulated groundwater environments for reinforcement learning},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {https://github.com/philipphoehn/flopyarcade},
}

Notes

This project is meant to demonstrate a new avenue of applying FloPy. It is experimental and is developed only in spare time. The code is envisioned to ultimately be PEP 8-compliant, but this has lower priority than improving and optimizing functionality.

The plumbing for FloPy is currently not ideal, as files constantly need to be written to disk; this is currently the only way to inject information into the process models. With the recent BMI compliance of MODFLOW 6, exchanging information with MODFLOW through memory while it is running will soon simplify that.

Contributions

Pull requests and constructive discussions are absolutely welcome. For major changes, please open an issue first to discuss what you would like to change.

This project is heavily based on FloPy, TensorFlow, Keras, NumPy and others, and I would therefore like to acknowledge all the valuable work of the developers of these outstanding libraries. Furthermore, Harrison from pythonprogramming.net indirectly contributed by making inspiring programming tutorials freely accessible to enthusiasts on his website and via the sentdex YouTube channel, as did many posts on towardsdatascience.com.

Contact: philipp.hoehn@yahoo.com

License

GNU GPLv3
