Symmetric Positive Definite (SPD) enforcement layers for PyTorch
Project description
spdlayers
Symmetric Positive Definite (SPD) enforcement layers for PyTorch.
Regardless of the input, the output of these layers will always be a SPD tensor!
Installation
Install with pip
python -m pip install spdlayers
About
The Cholesky
layer uses a cholesky factorization to enforce SPD, and the Eigen
layer uses an eigendecomposition to enforce SPD or symmetric semi-definite predictions. Use the n_zero_eigvals
option with the Eigen
layer to enforce the exact number of zero eigenvalues that you should expect.
Both layers take in some tensor of shape [batch_size, input_shape]
and output a SPD tensor of shape [batch_size, output_shape, output_shape]
. The relationship between input and output is defined by the following.
input_shape = sum([i for i in range(output_shape + 1)])
The layers have no learnable parameters, and merely serve to transform a vector space to a SPD matrix space.
The initialization options for each layer are:
Args:
output_shape (int): The dimension of square tensor to produce,
default output_shape=6 results in a 6x6 tensor
symmetry (str): 'anisotropic' or 'orthotropic'. Anisotropic can be
used to predict for any shape tensor, while 'orthotropic' is a
special case of symmetry for a 6x6 tensor.
positive (str): The function to perform the positive
transformation of the diagonal of the lower triangle tensor.
Choices are 'Abs' (default), 'Square', 'Softplus', 'ReLU',
'ReLU6', '4', and 'Exp'.
min_value (float): The minimum allowable value for a diagonal
component. Default is 1e-8.
Examples
This is the simplest neural network using 1 hidden layer of size 100. There are 2 input features to the model (n_features = 2
), and model outputs a 6 x 6
spd tensor.
Using the Cholesky factorization as the SPD layer:
import torch.nn as nn
import spdlayers
hidden_size = 100
n_features = 2
out_shape = 6
in_shape = spdlayers.in_shape_from(out_shape)
model = nn.Sequential(
nn.Linear(n_features, hidden_size),
nn.Linear(hidden_size, in_shape),
spdlayers.Cholesky(output_shape=out_shape)
)
Or with the eigendecomposition as the SPD layer:
import torch.nn as nn
import spdlayers
hidden_size = 100
n_features = 2
out_shape = 6
in_shape = spdlayers.in_shape_from(out_shape)
model = nn.Sequential(
nn.Linear(n_features, hidden_size),
nn.Linear(hidden_size, in_shape),
spdlayers.Eigen(output_shape=out_shape)
)
examples/train_sequential_model.py trains this model on the orthotropic stiffness trensor from the 2D Isotruss.
API
The API has the following import structure.
spdlayers
├── Cholesky
├── Eigen
├── in_shape_from
├── layers
│ ├── Cholesky
│ ├── Eigen
├── tools.py
│ ├── in_shape_from
Documentation
You can use pdoc to build API documentation, or view the online documentation.
pdoc3 --html spdlayers
Requirements
For basic usage:
python>=3.6
torch>=1.9.0
Additional dependencies for testing:
pytest
pytest-cov
numpy
Changelog
Changes are documented in CHANGELOG.md
Citations
Cholesky
If you use the Cholesky method, you should cite the following paper.
@article{XU2021110072,
title = {Learning constitutive relations using symmetric positive definite neural networks},
journal = {Journal of Computational Physics},
volume = {428},
pages = {110072},
year = {2021},
issn = {0021-9991},
doi = {https://doi.org/10.1016/j.jcp.2020.110072},
url = {https://www.sciencedirect.com/science/article/pii/S0021999120308469},
author = {Kailai Xu and Daniel Z. Huang and Eric Darve},
keywords = {Neural networks, Plasticity, Hyperelasticity, Finite element method, Multiscale homogenization}
}
What this paper proposed is that you do a Cholesky decomposition on each data point in your dataset, and then train a NN to learn the lower triangular form. This is a great paper because the method is so simple that it is very easy for every researcher to use!
What we proposed you do instead with this method is not to perform a Cholesky decomposition on your data, but rather include the LL^T
operation as a transformation within your NN. We believe this subtle difference provides for the following advantages:
- no transformation of your dataset is needed
- evaluate the training performance on the real data
- eases production use of the model (i.e. a single pytorch model does the black-box mapping of
x
→C
)
Eigendecomposition
If you use the Eigendecomposition method, you should cite the following paper.
@article{https://doi.org/10.1002/nme.2681,
author = {Amsallem, David and Cortial, Julien and Carlberg, Kevin and Farhat, Charbel},
title = {A method for interpolating on manifolds structural dynamics reduced-order models},
journal = {International Journal for Numerical Methods in Engineering},
volume = {80},
number = {9},
pages = {1241-1258},
keywords = {reduced-order modeling, matrix manifolds, real-time prediction, surrogate modeling, linear structural dynamics},
doi = {https://doi.org/10.1002/nme.2681},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/nme.2681},
eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1002/nme.2681}
}
What this paper proposed is that you perform an Eigendecomposition on each data point in your dataset. You could then fit a linear model in a tangent space that will always be SPD (and it is pure mathematical beauty).
We proposed you should abstract this method into a NN layer. This subtle change allows for the following advantages:
- no transformation of your dataset is needed
- evaluate the training performance in the real space
- easily fit any non-linear regression model
- allows for many more transitional maps other than just the exponential
- eases production use of the model (i.e. a single pytorch model does the black-box mapping of
x
→C
)
spdlayers
If you find our abstractions and presentation useful, please cite our paper. Our paper demonstrated that the inclusion of these spdlayers
increased model accuracy.
@article{jekel2022neural,
title={Neural Network Layers for Prediction of Positive Definite Elastic Stiffness Tensors},
author={Jekel, Charles F and Swartz, Kenneth E and White, Daniel A and Tortorelli, Daniel A and Watts, Seth E},
journal={arXiv preprint arXiv:2203.13938},
year={2022}
}
License
SPDX-License-Identifier: MIT
LLNL-CODE-829369
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file spdlayers-0.3.0.tar.gz
.
File metadata
- Download URL: spdlayers-0.3.0.tar.gz
- Upload date:
- Size: 7.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.6.0 importlib_metadata/4.11.3 pkginfo/1.8.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.7.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0fc3819b0e04331ab010d0b71596c9dce748682c0ff6fb930c5dbf270f15f5ba |
|
MD5 | 00505c07015ab2251ee6d0cdee9059ef |
|
BLAKE2b-256 | 0594bd4fa4050ae0f333a999ac28942f6e2d6601f250415717c395aee98e8ea8 |