
FluidFlow: models and training utilities for flow matching and diffusion

Project description

FluidFlow: a flow-matching generative model for fluid dynamics surrogates on unstructured meshes

David Ramos¹, Lucas Lacasa², Fermín Gutiérrez¹, Eusebio Valero¹·³, Gonzalo Rubio¹·³

¹ ETSIAE-UPM · School of Aeronautics, Universidad Politécnica de Madrid
² Institute for Cross-Disciplinary Physics and Complex Systems (IFISC, CSIC-UIB)
³ Center for Computational Simulation, Universidad Politécnica de Madrid

Paper (arXiv) · Project Page


Usage

See the scripts/ directory for more examples.

from data.generate_synthetic_data import AnalyticalFunctionDataset
from fluidFlow.dit import DiT
from fluidFlow.trainer import Trainer
from fluidFlow.flow_matching import create_flow_matching

import numpy as np
import torch
from torch.utils.data import TensorDataset

# 1. Generate a synthetic dataset (or load your own dataset here)
data_resolution = (32, 32)
generator = AnalyticalFunctionDataset(nx=data_resolution[0], ny=data_resolution[1], x_range=(0, 2*np.pi), y_range=(0, 2*np.pi))
solutions_random, parameters_random = generator.generate_dataset(
    n_samples=1000,
    alpha1_range=(-2.0, 2.0),
    alpha2_range=(-2.0, 2.0)
)
# add channel dimension to solutions
solutions_random = solutions_random[:, None, :, :]
n_train = int(0.8 * len(solutions_random))
train_data = TensorDataset(torch.from_numpy(solutions_random[:n_train]).float(), torch.from_numpy(parameters_random[:n_train]).float())
test_data = TensorDataset(torch.from_numpy(solutions_random[n_train:]).float(), torch.from_numpy(parameters_random[n_train:]).float())

# 2. Define the DiT model and the flow-matching training procedure
model = DiT(
    depth=6,
    hidden_size=128,
    patch_size=1,
    num_heads=4,
    input_size=data_resolution, # dataset grid size
    cond_dim=2, # number of parameters (alpha1, alpha2)
    class_dropout_prob=0.2,
    in_channels=1,
    learn_sigma=False,
    use_swiglu=True,
    use_rope=True,
    # qk_norm=True,  # recommended when training in bf16
    attn_type="vanilla",  # window, linear, vanilla
    mlp_ratio=2.5,
)

flow_matching = create_flow_matching(
    neural_net=model,
    input_size=data_resolution,
    cond_scale=2.0,
    sampling_method="euler",
    num_sampling_steps=400,
)

# 3. Define trainer and training configuration
results_folder = './results'
train_steps = 100000
trainer = Trainer(
    flow_matching,
    dataset=train_data,
    dataset_test=test_data,
    train_batch_size=64,
    train_lr=2e-4,
    train_num_steps=train_steps,  # total training steps
    gradient_accumulate_every=1,  # gradient accumulation steps
    ema_decay=0.995,  # exponential moving average decay
    # amp=True,     # turn on mixed precision for faster training and reduced memory usage
    # mixed_precision_type='bf16',
    results_folder=results_folder,  # folder to save results to
    save_and_sample_every=20000,
    eta_min_scheduler=1e-6,
    max_grad_norm=1.0,
    use_cpu=True, # JUST FOR TESTING, SET TO FALSE FOR ACTUAL TRAINING
    compile_model=True,
    split_batches=True
)

# 4. Train the model
trainer.train()

Samples and model checkpoints will be logged to ./results periodically.
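Under the hood, the flow-matching objective trains the network to regress the velocity of a straight-line probability path between noise and data. The following is our own minimal NumPy sketch of how one training pair is built, for illustration only, not FluidFlow's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_pair(x1, rng):
    """Build one (input, target) pair for flow-matching regression.

    x0 is drawn from a standard Gaussian; the straight-line path
    x_t = (1 - t) * x0 + t * x1 has constant velocity x1 - x0,
    which is what the conditioned network is trained to predict.
    """
    x0 = rng.standard_normal(x1.shape)   # noise endpoint of the path
    t = rng.uniform()                    # random time in [0, 1]
    x_t = (1.0 - t) * x0 + t * x1        # point on the path at time t
    v_target = x1 - x0                   # regression target (velocity)
    return t, x_t, v_target

# a toy data sample shaped like one synthetic solution: (channels, ny, nx)
x1 = rng.standard_normal((1, 32, 32))
t, x_t, v_target = flow_matching_pair(x1, rng)
# training minimizes || net(x_t, t, cond) - v_target ||^2 over the dataset
print(x_t.shape, v_target.shape)
```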

Multi-GPU Training

The Trainer class is built on 🤗 Accelerate. You can run multi-GPU training in two steps using its accelerate CLI.

At the project root directory, run

$ accelerate config

Then, in the same directory

$ accelerate launch train.py

Flash Attention 4

The DiT architecture can be trained with Flash Attention 4 for improved speed and memory efficiency. To enable it, install the package; FluidFlow will automatically use it if available. Note that Flash Attention 4 does not work with head_dim values smaller than 128.

Abstract

Computational fluid dynamics (CFD) provides high-fidelity simulations of fluid flows but remains computationally expensive for many-query applications. In recent years deep supervised learning (DL) has been used to construct data-driven fluid-dynamic surrogate models. In this work we consider a different learning paradigm and embrace generative modelling as a framework for constructing scalable fluid-dynamics surrogate models.

We introduce FluidFlow, a generative model based on conditional flow-matching — a recent alternative to diffusion models that learns deterministic transport maps between noise and data distributions. FluidFlow is designed to operate directly on CFD data defined on both structured and unstructured meshes, without any mesh-interpolation pre-processing, thereby preserving geometric fidelity.

We assess the capabilities of FluidFlow using two different core neural network architectures — a U-Net and a Diffusion Transformer (DiT) — and condition their learning on physically meaningful parameters such as Mach number, angle of attack, or stagnation pressure (a proxy for Reynolds number). The methodology is validated on two benchmark problems of increasing complexity: prediction of pressure coefficients along an airfoil boundary across different operating conditions, and prediction of pressure and friction coefficients over a full three-dimensional aircraft geometry discretized on a large unstructured mesh.

In both cases, FluidFlow outperforms strong multilayer perceptron baselines, achieving significantly lower error metrics and improved generalisation across operating conditions. Notably, the transformer-based architecture enables scalable learning on large unstructured datasets while maintaining high predictive accuracy. These results demonstrate that flow-matching generative models provide an effective and flexible framework for surrogate modelling in fluid dynamics, with potential for realistic engineering and scientific applications.


Method

We trained FluidFlow on two different CFD datasets: airfoil Cp distributions and aircraft Cp and Cf distributions. The airfoil case is simpler, since it can be treated as 1D structured data. Here we tested two neural network architectures, U-Net and DiT; both perform similarly and handle this kind of data without significant modification.

However, some problems arise when we switch to 3D. Here, the data comes from unstructured meshes and spatial information is more difficult to capture. This makes the U-Net unsuitable for this task, since it relies on convolutional layers. To address this issue, we treated the data as a sequence of points. With this approach, the DiT could be used with only minor modifications to the patching block to accommodate this sequential data. However, the DiT presents its own challenges since it relies on the attention mechanism, which scales quadratically with the number of points — too expensive given that each aircraft has more than 260,000 points.
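The point-sequence idea can be sketched as a minimal 1D patcher that groups consecutive mesh points into tokens and linearly embeds them. This is our own illustration of the concept; the function name, shapes, and channel layout are assumptions, not FluidFlow's API:

```python
import numpy as np

def patchify_points(points, patch_size, w, b):
    """Group consecutive mesh points into patches and embed them.

    points : (n_points, n_channels) surface fields sampled at mesh points
    patch_size : number of consecutive points per token
    w, b : weights of a linear embedding, shapes
           (patch_size * n_channels, hidden) and (hidden,)
    Returns (n_tokens, hidden) token embeddings for the transformer.
    """
    n_points, n_channels = points.shape
    n_tokens = n_points // patch_size        # drop any trailing remainder
    flat = points[: n_tokens * patch_size].reshape(
        n_tokens, patch_size * n_channels)
    return flat @ w + b

rng = np.random.default_rng(0)
points = rng.standard_normal((260_000, 4))   # e.g. Cp, Cf_x, Cf_y, Cf_z
w = rng.standard_normal((4 * 8, 128)) * 0.02
b = np.zeros(128)
tokens = patchify_points(points, patch_size=8, w=w, b=b)
print(tokens.shape)  # (32500, 128)
```

Grouping points into patches of 8 shortens the token sequence by 8x before attention ever runs, which matters at aircraft scale.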

We propose replacing self-attention with linear attention, a different approach that does not scale quadratically and incurs only a slight loss in accuracy. The diagram below illustrates how the patching and attention components of the blocks are modified.
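A minimal NumPy sketch of kernelized linear attention (in the style of Katharopoulos et al.; our own illustration, not FluidFlow's exact implementation) shows why the cost drops from quadratic to linear in sequence length: the small (d x d) summary K^T V is computed once and reused for every query.

```python
import numpy as np

def linear_attention(q, k, v):
    """Kernelized linear attention for one head.

    Softmax attention materializes the (n, n) matrix softmax(QK^T);
    here similarity is phi(q) . phi(k) with phi = elu + 1 (positive),
    so K^T V is summed once and reused: O(n) time and memory in n.
    q, k, v : (n, d) arrays.
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1
    q, k = phi(q), phi(k)
    kv = k.T @ v                   # (d, d) summary, independent of n
    z = q @ k.sum(axis=0)          # (n,) per-query normalizers
    return (q @ kv) / z[:, None]   # convex combinations of rows of v

rng = np.random.default_rng(0)
n, d = 1024, 64
q, k, v = (rng.standard_normal((n, d)) * 0.1 for _ in range(3))
out = linear_attention(q, k, v)
print(out.shape)  # (1024, 64)
```

Because the attention weights are positive and normalized, each output row is a convex combination of value rows, just as in softmax attention; only the similarity kernel changes.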

Figure 1. Overview of the FluidFlow DiT modifications: 1D patcher and linear attention replacement.


3D Aircraft Results

FluidFlow faithfully reconstructs high-fidelity pressure and velocity fields across a wide range of Reynolds numbers and geometries directly on the native unstructured mesh, without any remeshing step.

Comparison between ground-truth CFD pressure/friction coefficient fields (top panels) and the prediction generated by the DiT flow-matching model (bottom panels) for one particular operating condition with parameters π = 1×10⁵, M = 0.3 and AoA = −6.

We evaluate FluidFlow on the ONERA 468 CRM challenge, a public benchmark for aerodynamic surrogate modeling on the Common Research Model geometry. The task consists of predicting the pressure coefficient Cp and the friction coefficients Cf,x, Cf,y, Cf,z over the aircraft surface across varying flight conditions, using the official train/test split provided by the challenge. We compare against the baseline MLP model supplied by the organizers — FluidFlow (DiT) outperforms it on every metric.

| Model | Mean R² | R²_Cp | R²_Cf,x | R²_Cf,y | R²_Cf,z |
|---|---|---|---|---|---|
| MLP | 0.956 | 0.972 | 0.944 | 0.951 | 0.957 |
| FluidFlow (DiT) | 0.965 | 0.974 | 0.959 | 0.960 | 0.965 |

To reproduce these results, download the data and run scripts/train_onera_crm.py.

Airfoil Results

FluidFlow outperforms a standard multilayer perceptron (MLP). The following animations demonstrate how FluidFlow carries out the denoising process for the airfoil Cp case — starting from Gaussian noise and travelling to the data distribution for unseen test cases.

[Animations: FluidFlow denoising trajectories for five unseen airfoil test cases.]

Note: if the animations take too long to load, please visit the project page to watch them.
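The denoising shown in the animations amounts to integrating the learned velocity field from t = 0 (noise) to t = 1 (data). A minimal Euler integrator illustrates the sampling loop; here a toy closed-form velocity field stands in for the trained network, so the true endpoint is known:

```python
import numpy as np

def euler_sample(v_field, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 (data)."""
    x, h = x0.copy(), 1.0 / num_steps
    for k in range(num_steps):
        t = k * h
        x = x + h * v_field(x, t)   # one explicit Euler step
    return x

# Toy velocity field with a known flow: v(x, t) = (mu - x) / (1 - t)
# transports any starting point x0 exactly onto mu at t = 1.
mu = np.array([1.0, -2.0, 0.5])
v_field = lambda x, t: (mu - x) / (1.0 - t)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(3)                  # start from Gaussian noise
sample = euler_sample(v_field, x0, num_steps=400)
print(sample)  # approximately [ 1.  -2.   0.5]
```

The num_steps=400 mirrors the num_sampling_steps used in the usage example above; in FluidFlow the velocity comes from the conditioned DiT rather than a closed-form expression.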

In the following table we compare test-set metrics for a well-optimized MLP (tuned via Optuna) against the two versions of FluidFlow (U-Net and DiT):

| Model | MSE | RMSE | MAE | MRE (%) | AE₉₅ | AE₉₉ | R² | Relative L² |
|---|---|---|---|---|---|---|---|---|
| Vanilla MLP | 0.00129 | 0.03598 | 0.01763 | 16.85219 | 0.05716 | 0.14176 | 0.99730 | 0.04911 |
| FluidFlow (U-Net) | 0.00009 | 0.00961 | 0.00240 | 4.48810 | 0.00761 | 0.03175 | 0.99981 | 0.01325 |
| FluidFlow (DiT) | 0.00009 | 0.00953 | 0.00249 | 3.43723 | 0.00764 | 0.03246 | 0.99981 | 0.01314 |
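For reference, these metrics can be re-implemented in a few lines of NumPy. The exact definitions below, for example AE₉₅/AE₉₉ as 95th/99th-percentile absolute errors and the small epsilon in MRE, are our assumptions and may differ from the paper's:

```python
import numpy as np

def surrogate_metrics(pred, true):
    """Common surrogate-model error metrics (illustrative definitions)."""
    err = np.abs(pred - true)
    mse = np.mean(err ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(err),
        "MRE (%)": 100.0 * np.mean(err / (np.abs(true) + 1e-8)),
        "AE95": np.percentile(err, 95),   # 95th-percentile absolute error
        "AE99": np.percentile(err, 99),   # 99th-percentile absolute error
        "R2": 1.0 - np.sum((pred - true) ** 2)
                    / np.sum((true - true.mean()) ** 2),
        "Relative L2": np.linalg.norm(pred - true) / np.linalg.norm(true),
    }

rng = np.random.default_rng(0)
true = 2.0 + np.sin(np.linspace(0, 2 * np.pi, 200))  # toy smooth signal
pred = true + 0.01 * rng.standard_normal(200)        # near-perfect surrogate
m = surrogate_metrics(pred, true)
print(f"RMSE={m['RMSE']:.4f}  R2={m['R2']:.4f}")
```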

BibTeX

If you find this work useful, please consider citing:

@article{ramos2025fluidflow,
  title     = {FluidFlow: a flow-matching generative model for
               fluid dynamics surrogates on unstructured meshes},
  author    = {Ramos, David and Lacasa, Lucas and
               Guti{\'e}rrez, Ferm{\'i}n and
               Valero, Eusebio and Rubio, Gonzalo},
  journal   = {arXiv preprint arXiv:2501.XXXXX},
  year      = {2025},
}

© 2026 FluidFlow Authors · Project Page · NuMath Lab

Project details


Download files

Download the file for your platform.

Source Distribution

fluidflow-0.1.0.tar.gz (41.1 kB)

Uploaded Source

Built Distribution


fluidflow-0.1.0-py3-none-any.whl (39.6 kB)

Uploaded Python 3

File details

Details for the file fluidflow-0.1.0.tar.gz.

File metadata

  • Download URL: fluidflow-0.1.0.tar.gz
  • Upload date:
  • Size: 41.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for fluidflow-0.1.0.tar.gz
Algorithm Hash digest
SHA256 e50722ec6221dc90f92c20c2278ab8524e162ced1e4f4f1e91a65ae4b20613d9
MD5 33ffc2e0b3a7e8450591a628dfeed107
BLAKE2b-256 c9441aed05611b96d109962ca9087b63b7ed6a5316a164e2441d49a58895dbb0


File details

Details for the file fluidflow-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: fluidflow-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 39.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for fluidflow-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 60ae3eb920adc05015f4d762572d6c871d000ef0c062fc10944385708d43bdb2
MD5 407551e139c46bae32c8f5b1bda47619
BLAKE2b-256 22bd4532c0392b79e2d65e1af96fbda3f4e33fb5b0215a10eee846a3f5e6bb20

