Math on (Hyper-Dual) Tensors with Trailing Axes
tensortrax
Math on (Hyper-Dual) Tensors with Trailing Axes.
Features
- Designed to operate on input arrays with trailing axes
- Essential vector/tensor Hyper-Dual number math, including limited support for einsum (restricted to max. two operands; see the sketch after this list)
- Forward Mode Automatic Differentiation (AD) using Hyper-Dual Tensors, up to second-order derivatives
- Create functions in terms of Hyper-Dual Tensors
- Evaluate the function, the gradient (jacobian) and the hessian on given input arrays
- Straightforward definition of custom functions in variational-calculus notation
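As a hedged sketch of the einsum feature (assuming the helper is exposed as tensortrax.math.einsum and accepts a NumPy-like subscripts string with at most two operands), the double-dot product F : F could be written as:

import numpy as np
import tensortrax as tr
import tensortrax.math as tm

# hypothetical example: double-dot product F : F, written with two operands
def double_dot(F):
    return tm.einsum("ij,ij", F, F)

x = np.eye(3)
W = tr.function(double_dot, ntrax=0)(x)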
Not Features
- Not imitating NumPy (like Autograd)
- No arbitrary-order gradients
Usage
Let's define a scalar-valued function which operates on a tensor.
import tensortrax as tr
import tensortrax.math as tm
def fun(F):
    C = F.T() @ F
    I1 = tm.trace(C)
    J = tm.det(F)
    return J ** (-2 / 3) * I1 - 3
The hessian of the scalar-valued function w.r.t. the function argument is evaluated by variational calculus (Forward Mode AD implemented as Hyper-Dual Tensors). The function is called once for each component of the hessian (symmetry is taken care of). The function and the gradient are evaluated with no additional computational cost.
import numpy as np
# some random input data
np.random.seed(125161)
F = np.random.rand(3, 3, 8, 50) / 10
for a in range(3):
    F[a, a] += 1
# W = tr.function(fun, ntrax=2)(F)
# dWdF, W = tr.gradient(fun, ntrax=2)(F)
d2WdF2, dWdF, W = tr.hessian(fun, ntrax=2)(F)
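For this input, the returned arrays should keep the tensor axes first, followed by the two trailing axes (a hedged sanity check; the exact shape of the scalar result W may differ):

# expected shapes (assuming tensor axes first, trailing axes last)
print(d2WdF2.shape)  # (3, 3, 3, 3, 8, 50)
print(dWdF.shape)    # (3, 3, 8, 50)
print(W.shape)       # (8, 50) - one scalar value per trailing-axes point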
Theory
The calculus of variations deals with variations, i.e. small changes in functions and functionals. A small change in a function is evaluated by applying small changes to the tensor components.
\psi = \psi(\boldsymbol{E})
\delta \psi = \delta \psi(\boldsymbol{E}, \delta \boldsymbol{E})
Let's take the trace of a tensor product as an example. The variation is evaluated as follows:
\psi = tr(\boldsymbol{F}^T \boldsymbol{F}) = \boldsymbol{F} : \boldsymbol{F}
\delta \psi = \delta \boldsymbol{F} : \boldsymbol{F} + \boldsymbol{F} : \delta \boldsymbol{F} = 2 \ \boldsymbol{F} : \delta \boldsymbol{F}
The $P_{ij}$ - component of the jacobian $\boldsymbol{P}$ is now numerically evaluated by setting the respective variational component $\delta F_{ij}$ of the tensor to one and all other components to zero. In total, $i \cdot j$ function calls are necessary to assemble the full jacobian. For example, the $12$ - component is evaluated as follows:
\delta \boldsymbol{F}_{(12)} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\delta_{(12)} \psi = \frac{\partial \psi}{\partial F_{12}} = 2 \ \boldsymbol{F} : \delta \boldsymbol{F}_{(12)} = 2 \ \boldsymbol{F} : \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
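For this quadratic example, the result can be verified with plain NumPy and central differences (a hypothetical test tensor is used here; this check does not rely on tensortrax):

import numpy as np

def psi(F):
    # ψ = tr(FᵀF) = F : F
    return np.trace(F.T @ F)

# a hypothetical test tensor and the seed (variational) matrix δF_(12)
F = np.eye(3) + np.arange(9).reshape(3, 3) / 10
δF_12 = np.zeros((3, 3))
δF_12[0, 1] = 1.0

# directional derivative by central differences
h = 1e-6
δψ_12 = (psi(F + h * δF_12) - psi(F - h * δF_12)) / (2 * h)

# matches 2 F : δF_(12) = 2 F_12
assert np.isclose(δψ_12, 2 * F[0, 1])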
The second-order derivative, i.e. the partial derivative of another partial derivative, is evaluated by a further small change (for a linear map, this is equal to the linearization).
\Delta \delta \psi = 2 \ \delta \boldsymbol{F} : \Delta \boldsymbol{F} + 2 \ \boldsymbol{F} : \Delta \delta \boldsymbol{F}
Once again, each component $A_{ijkl}$ of the fourth-order hessian is numerically evaluated. In total, $i \cdot j \cdot k \cdot l$ function calls are necessary to assemble the full hessian (without considering symmetry). For example, the $1223$ - component is evaluated as follows:
\delta \boldsymbol{F}_{(12)} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\Delta \boldsymbol{F}_{(23)} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\Delta \delta \boldsymbol{F} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\Delta_{(23)} \delta_{(12)} \psi = \Delta_{(12)} \delta_{(23)} \psi = \frac{\partial^2 \psi}{\partial F_{12}\ \partial F_{23}} = 2 \ \delta \boldsymbol{F}_{(12)} : \Delta \boldsymbol{F}_{(23)} + 2 \ \boldsymbol{F} : \Delta \delta \boldsymbol{F}
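Continuing the NumPy sketch from above, the mixed second derivative can also be verified by central differences; for ψ = F : F the (12, 23)-component vanishes, while the (12, 12)-component equals 2:

# mixed second derivative by central differences
def d2psi(F, δF, ΔF, h=1e-2):
    return (
        psi(F + h * δF + h * ΔF)
        - psi(F + h * δF - h * ΔF)
        - psi(F - h * δF + h * ΔF)
        + psi(F - h * δF - h * ΔF)
    ) / (4 * h**2)

ΔF_23 = np.zeros((3, 3))
ΔF_23[1, 2] = 1.0

assert np.isclose(d2psi(F, δF_12, ΔF_23), 0.0)  # 2 δF_(12) : ΔF_(23) = 0
assert np.isclose(d2psi(F, δF_12, δF_12), 2.0)  # the (12, 12)-component equals 2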
Numeric calculus of variation in tensortrax
Each Tensor has four attributes: the (real) tensor array and the (hyper-dual) variational arrays. To obtain the above-mentioned $12$ - component of the gradient and the $1223$ - component of the hessian, a tensor has to be created with the appropriate small changes of the tensor components (dual arrays).
from tensortrax import Tensor, f, δ, Δ, Δδ
from tensortrax.math import trace
δF_12 = np.array([
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
], dtype=float)

ΔF_23 = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
], dtype=float)
x = np.eye(3) + np.arange(9).reshape(3, 3) / 10
F = Tensor(x=x, δx=δF_12, Δx=ΔF_23, Δδx=None)
I1_C = trace(F.T() @ F)
The function as well as the gradient and hessian components are accessible as:
ψ = f(I1_C)
P_12 = δ(I1_C) # (= Δ(I1_C))
A_1223 = Δδ(I1_C)
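Since ψ = F : F, the extracted components can be checked against their closed-form values (assuming the extracted components are plain scalars or arrays here; np refers to NumPy as imported above):

# sanity check against the closed-form values of ψ = F : F
assert np.allclose(ψ, np.trace(x.T @ x))
assert np.allclose(P_12, 2 * x[0, 1])  # = 0.2
assert np.allclose(A_1223, 0.0)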
To obtain full gradients and hessians in one function call, tensortrax
provides helpers which handle the multiple function calls.
from tensortrax import gradient, hessian

# the scalar-valued function in terms of the tensor (see above)
def I1(F):
    return trace(F.T() @ F)

# input data with 0 trailing axes
dWdF, W = gradient(I1, ntrax=0)(x)
d2WdF2, dWdF, W = hessian(I1, ntrax=0)(x)