Benchmark dataset for learning dynamical systems from data
DynaBench
The DynaBench package started as a benchmark dataset for learning dynamical systems from low-resolution data. It has since evolved into a full-fledged package for generating synthetic data, training models, and evaluating them on various tasks concerning partial differential equations. The package is designed to be easy to use and flexible, so that the existing models, data generation algorithms, and physical equations can be extended and modified.
Take a look at the documentation for more details on the package and how to use it.
⚡️ Getting Started
To get started with the package, you can install it via pip:
pip install dynabench
Downloading data
The DynaBench package contains dozens of different equations that can be used to generate synthetic data. The easiest way to get started, however, is to use one of the original benchmark equations. These can be downloaded using the following command:
from dynabench.dataset import download_equation
download_equation(equation='advection', structure='cloud', resolution='low')
The original benchmark dataset consists of simulations of the following equations:
| Equation | Components | Time Order | Spatial Order |
|---|---|---|---|
| Advection | 1 | 1 | 1 |
| Burgers' | 2 | 1 | 2 |
| Gas Dynamics | 4 | 1 | 2 |
| Kuramoto-Sivashinsky | 1 | 1 | 4 |
| Reaction-Diffusion | 2 | 1 | 2 |
| Wave | 1 | 2 | 2 |
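The Time Order and Spatial Order columns presumably refer to the highest time and space derivatives that appear in each equation. As an illustration, here are two of the equations in generic textbook form. The coefficients $\mathbf{c}$ and $\omega$ are placeholders, and the exact formulations used in DynaBench may differ:

```latex
% Advection: first order in time, first order in space
\frac{\partial u}{\partial t} = -\,\mathbf{c} \cdot \nabla u

% Wave: second order in time, second order in space
\frac{\partial^2 u}{\partial t^2} = \omega^2 \, \nabla^2 u
```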
Loading data
To load the data easily, the dynabench package provides the DynabenchIterator iterator:
from dynabench.dataset import DynabenchIterator

advection_iterator = DynabenchIterator(equation='advection',
                                       structure='cloud',
                                       resolution='low',
                                       lookback=4,
                                       rollout=16)
This iterates through all downloaded simulations of the advection dataset with scattered observation points (cloud) at low resolution. Each sample is a tuple containing the simulation snapshots at the past 4 time steps as input, the next 16 time steps as target, and the coordinates of the observation points:
for sample in advection_iterator:
    x, y, points = sample
    # x is the input data with shape (lookback, n_points, n_features)
    # y is the target data with shape (rollout, n_points, n_features)
    # points are the observation points with shape (n_points, dim)
    # for the advection equation n_features=1 and dim=2
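The lookback/rollout split can be pictured as a sliding window over a simulated trajectory. The following is a minimal sketch of that idea, not DynaBench's implementation: `sliding_windows` is a hypothetical helper, and the assumption that consecutive windows advance by one time step is made only for illustration.

```python
def sliding_windows(trajectory, lookback, rollout):
    """Yield (x, y) pairs cut from a sequence of snapshots.

    x holds `lookback` consecutive snapshots, y the `rollout` snapshots
    that immediately follow. Hypothetical helper for illustration only.
    """
    n_steps = len(trajectory)
    # The last valid start index leaves room for lookback + rollout snapshots.
    for start in range(n_steps - lookback - rollout + 1):
        split = start + lookback
        yield trajectory[start:split], trajectory[split:split + rollout]


# Toy trajectory: 25 snapshots, each labelled by its time index.
trajectory = list(range(25))
windows = list(sliding_windows(trajectory, lookback=4, rollout=16))
first_x, first_y = windows[0]
```

Here each snapshot is just an integer time index; in practice it would be an array of shape (n_points, n_features), so that stacking a window gives the (lookback, n_points, n_features) and (rollout, n_points, n_features) shapes described above.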
⚙️ Usage
More advanced use cases include generating data for different equations, training models, and evaluating them. The package provides a simple interface for all these tasks.
Example: Generating data for the Cahn-Hilliard equation
The DynaBench package provides a simple interface for generating synthetic data for various physical systems. For example, data for the Cahn-Hilliard equation can be generated by running:
from dynabench.equation import CahnHilliardEquation
from dynabench.initial import RandomUniform
from dynabench.grid import Grid
from dynabench.solver import PyPDESolver
# Create an instance of the CahnHilliardEquation class with default parameters
pde_equation = CahnHilliardEquation()
# Create a 64 x 64 grid covering the domain [0, 64] x [0, 64]
grid = Grid(grid_limits=((0, 64), (0, 64)), grid_size=(64, 64))
# Generate a random uniform initial condition
initial = RandomUniform()
# Solve the Cahn-Hilliard equation with the initial condition
solver = PyPDESolver(equation=pde_equation, grid=grid, initial_generator=initial, parameters={'method': "RK23"})
solver.solve(t_span=[0, 100], dt_eval=1)
Example: Training NeuralPDE
The DynaBench package provides several models that can be used to forecast the physical system. For example, to train the NeuralPDE model on the Burgers' equation, you can use the following code snippet:
from dynabench.dataset import DynabenchIterator
from torch.utils.data import DataLoader
from dynabench.model import NeuralPDE
import torch.optim as optim
import torch.nn as nn

# Training split of the Burgers' dataset on a low-resolution grid
train_iterator = DynabenchIterator(split="train",
                                   equation='burgers',
                                   structure='grid',
                                   resolution='low',
                                   lookback=1,
                                   rollout=4)
train_loader = DataLoader(train_iterator, batch_size=32, shuffle=True)

# input_dim=2 matches the two components of the Burgers' equation
model = NeuralPDE(input_dim=2, hidden_channels=64, hidden_layers=3,
                  solver={'method': 'euler', 'options': {'step_size': 0.1}},
                  use_adjoint=False)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

for epoch in range(10):
    model.train()
    for i, (x, y, p) in enumerate(train_loader):
        # take the single lookback step (lookback=1) and convert to float32
        x, y = x[:, 0].float(), y.float()
        optimizer.zero_grad()
        y_pred = model(x, range(0, 5))
        # swap the first two axes of the prediction before comparing with y
        y_pred = y_pred.transpose(0, 1)
        loss = criterion(y_pred, y)
        loss.backward()
        optimizer.step()
        print(f"Epoch: {epoch}, Batch: {i}, Loss: {loss.item()}")
📈 Benchmark Results
The original six equations have been used to evaluate how well various models forecast the physical system. All results below were obtained with 900 spatial points:
1-step MSE:

| Model | Advection | Burgers | Gas Dynamics | Kuramoto-Sivashinsky | Reaction-Diffusion | Wave |
|---|---|---|---|---|---|---|
| CNN | 5.30848e-05 | 0.0110988 | 0.00420368 | 0.000669837 | 0.00036918 | 0.00143387 |
| FeaSt | 0.000130351 | 0.0116155 | 0.0162 | 0.0117867 | 0.000488848 | 0.00523298 |
| GAT | 0.00960113 | 0.0439986 | 0.037483 | 0.0667057 | 0.00915208 | 0.0151498 |
| GCN | 0.026397 | 0.13899 | 0.0842611 | 0.436563 | 0.164678 | 0.0382004 |
| GraphPDE | 0.000137098 | 0.0107391 | 0.0194755 | 0.00719822 | 0.000142114 | 0.00207144 |
| KernelNN | 6.31157e-05 | 0.0106146 | 0.013354 | 0.00668698 | 0.000187019 | 0.00542925 |
| NeuralPDE | 8.24453e-07 | 0.0112373 | 0.00373416 | 0.000536958 | 0.000303176 | 0.00169871 |
| Persistence | 0.0812081 | 0.0367688 | 0.186985 | 0.142243 | 0.147124 | 0.113805 |
| Point Transformer | 4.41633e-05 | 0.0103098 | 0.00724899 | 0.00489711 | 0.000141248 | 0.00238447 |
| PointGNN | 2.82496e-05 | 0.00882528 | 0.00901649 | 0.00673036 | 0.000136059 | 0.00138772 |
| ResNet | 2.15721e-06 | 0.0148052 | 0.00321235 | 0.000490104 | 0.000156752 | 0.00145884 |
16-step rollout MSE:

| Model | Advection | Burgers | Gas Dynamics | Kuramoto-Sivashinsky | Reaction-Diffusion | Wave |
|---|---|---|---|---|---|---|
| CNN | 0.00161331 | 0.554554 | 0.995382 | 1.26011 | 0.0183483 | 0.561433 |
| FeaSt | 1.48288 | 0.561197 | 0.819594 | 3.74448 | 0.130149 | 1.61066 |
| GAT | 41364.1 | 0.833353 | 1.21436 | 5.68925 | 3.85506 | 2.38418 |
| GCN | 3.51453e+13 | 13.0876 | 7.20633 | 1.70612e+24 | 1.75955e+07 | 7.89253 |
| GraphPDE | 1.07953 | 0.729879 | 0.969208 | 2.1044 | 0.0800235 | 1.02586 |
| KernelNN | 0.897431 | 0.72716 | 0.854015 | 2.00334 | 0.0635278 | 1.57885 |
| NeuralPDE | 0.000270308 | 0.659789 | 0.443498 | 1.05564 | 0.0224155 | 0.247704 |
| Persistence | 2.39393 | 0.679261 | 1.457 | 1.89752 | 0.275678 | 2.61281 |
| Point Transformer | 0.617025 | 0.503865 | 0.642879 | 2.09746 | 0.0564399 | 1.27343 |
| PointGNN | 0.660665 | 1.04342 | 0.759257 | 2.82063 | 0.0582293 | 1.30743 |
| ResNet | 8.64621e-05 | 1.86352 | 0.480284 | 1.0697 | 0.00704612 | 0.299457 |
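For orientation, the Persistence row corresponds to the usual no-change baseline, which predicts that the last observed state simply repeats at every future step. The following is a minimal sketch of that baseline on a toy 1D travelling wave, not the benchmark's evaluation code: `persistence_forecast`, `mse`, and `travelling_wave` are hypothetical helpers, and averaging the error over all rollout steps is an assumption about how the rollout MSE might be aggregated.

```python
import math


def persistence_forecast(last_state, n_steps):
    """Repeat the last observed state for every future step."""
    return [list(last_state) for _ in range(n_steps)]


def mse(prediction, target):
    """Mean squared error over all steps and points."""
    errors = [(p - t) ** 2
              for pred_step, target_step in zip(prediction, target)
              for p, t in zip(pred_step, target_step)]
    return sum(errors) / len(errors)


def travelling_wave(step, n_points=32, speed=0.05):
    """Toy periodic 1D signal shifted by `speed` per time step."""
    return [math.sin(2 * math.pi * (i / n_points - speed * step))
            for i in range(n_points)]


last_observed = travelling_wave(0)
future = [travelling_wave(step) for step in range(1, 17)]

# Error of the no-change forecast one step ahead and over 16 steps.
one_step_mse = mse(persistence_forecast(last_observed, 1), future[:1])
rollout_mse = mse(persistence_forecast(last_observed, 16), future)
```

On this toy signal the rollout error is much larger than the one-step error, because the true state keeps moving away from the frozen prediction. A learned model is only useful to the extent that it beats this baseline at the same horizon.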
📃 Citing
If you use DynaBench for your research, please cite:
@inproceedings{dulny2023dynabench,
    author = {Dulny, Andrzej and Hotho, Andreas and Krause, Anna},
    title = {DynaBench: A Benchmark Dataset for Learning Dynamical Systems from Low-Resolution Data},
    year = {2023},
    isbn = {978-3-031-43411-2},
    publisher = {Springer-Verlag},
    address = {Berlin, Heidelberg},
    doi = {10.1007/978-3-031-43412-9_26},
    booktitle = {Machine Learning and Knowledge Discovery in Databases: Research Track: European Conference, ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Proceedings, Part I},
    pages = {438–455},
    numpages = {18},
    keywords = {neuralPDE, dynamical systems, benchmark, dataset},
    location = {Turin, Italy}
}
📚 More resources
 The documentation for the package can be found here.
 The original benchmark paper can be found here.
License
The content of this project itself, including the data and pretrained models, is licensed under the Creative Commons Attribution-ShareAlike 4.0 International Public License (CC BY-SA 4.0). The underlying source code used to generate the data and train the models is licensed under the MIT license.