# tf2-bfgs

Use the BFGS optimizer in TensorFlow 2 almost as it was with TensorFlow 1 and `tf.contrib`.
tf2-bfgs is a wrapper to train a `tf.Module` or `tf.keras.Model` using the (L-)BFGS optimizer from TensorFlow Probability. This work uses the code snippet from Pi-Yueh Chuang available here.
## Install
You can run the following command in a dedicated virtual environment:

```
pip install tf2-bfgs
```
or the following one to get the latest version:

```
pip install git+https://github.com/mBarreau/tf2-bfgs
```
You might also need to install extra dependencies such as:

```
pip install tensorflow-probability tf_keras
```
## Examples
We will train a regression model to fit a cosine function. Here are the training data:

```python
import tensorflow as tf
import numpy as np

from tf2_bfgs import LBFGS

t = np.linspace(0, 1, 10).reshape((-1, 1)).astype(np.float32)
x = np.cos(t)
```
### Using Keras
We define the model first, a multilayer perceptron:

```python
omega = tf.keras.Sequential(
    [tf.keras.Input(shape=[1,]),
     tf.keras.layers.Dense(10, "tanh"),
     tf.keras.layers.Dense(10, "tanh"),
     tf.keras.layers.Dense(10, "tanh"),
     tf.keras.layers.Dense(1, None)])
```
The loss function is the traditional mean squared error:

```python
def get_cost(model, t, x):
    x_hat = model(t)
    return tf.keras.losses.MeanSquaredError()(x_hat, x)
```
NOTE: for differentiability reasons, the output of the model `x_hat = model(t)` should be computed inside the cost function.
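For illustration, here is a sketch of the anti-pattern to avoid (assuming the wrapper re-evaluates the cost under its own `tf.GradientTape`): a forward pass computed outside the cost function is a constant with respect to the trainable variables, so no gradient can flow back to them.

```python
# Anti-pattern (sketch): the forward pass happens outside the cost function,
# so the cost below is constant w.r.t. the model weights and yields no gradients.
x_hat = omega(t)  # computed once, outside the cost

def broken_cost(model, t, x):
    return tf.keras.losses.MeanSquaredError()(x_hat, x)  # ignores the live model
```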
The training is performed by the following snippet:

```python
optimizer_BFGS = LBFGS(get_cost, omega.trainable_variables)
results = optimizer_BFGS.minimize(omega, t, x)
```
The arguments passed to the `minimize` function are the same as those of `get_cost`.
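As a quick sanity check (a minimal sketch, assuming `minimize` updates the variables of `omega` in place), the trained model can then be evaluated like any Keras model:

```python
# Hypothetical check: evaluate the trained model on the training inputs.
x_hat = omega(t)
print(float(tf.reduce_mean(tf.square(x_hat - x))))  # final MSE, should be small
```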
### Using TensorFlow `tf.Module`
We will define our own multilayer perceptron as follows:
```python
def init(layers):
    Ws, bs = [], []
    for i in range(len(layers) - 1):
        W = xavier_init(size=[layers[i], layers[i + 1]])
        b = tf.zeros([1, layers[i + 1]])
        Ws.append(tf.Variable(W, dtype=tf.float32, name=f"W_{i}"))
        bs.append(tf.Variable(b, dtype=tf.float32, name=f"b_{i}"))
    return Ws, bs


def xavier_init(size):
    in_dim = size[0]
    out_dim = size[1]
    xavier_stddev = np.sqrt(2 / (in_dim + out_dim))
    return np.random.normal(size=[in_dim, out_dim], scale=xavier_stddev)
```
```python
class NeuralNetwork(tf.Module):

    def __init__(self, hidden_layers, **kwargs):
        super().__init__(**kwargs)
        self.layers = [1] + hidden_layers + [1]
        self.Ws, self.bs = init(layers=self.layers)

    @tf.function
    def __call__(self, input):
        num_layers = len(self.Ws)
        H = tf.cast(input, tf.float32)
        for layer in range(0, num_layers - 1):
            W = self.Ws[layer]
            b = self.bs[layer]
            H = tf.tanh(tf.add(tf.matmul(H, W), b))
        W = self.Ws[-1]
        b = self.bs[-1]
        return tf.add(tf.matmul(H, W), b)


omega = NeuralNetwork([10] * 3)
```
NOTE: it is essential that the class inherits from `tf.Module`.
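Inheriting from `tf.Module` is what makes the variables stored in `self.Ws` and `self.bs` discoverable through `trainable_variables`, which the optimizer relies on. A quick check (a sketch):

```python
# tf.Module automatically tracks tf.Variable attributes, including those
# nested in Python lists, and exposes them via trainable_variables.
print(len(omega.trainable_variables))  # 8: 4 weight matrices + 4 bias vectors
```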
Define the cost function as follows:
```python
@tf.function
def get_cost(model, t, x):
    return tf.reduce_mean(tf.square(model(t) - x))
```
NOTE: for differentiability reasons, the output of the model `x_hat = model(t)` should be computed inside the cost function. It is also recommended to use the decorator `@tf.function` for performance reasons.
The training is performed by the following snippet:

```python
optimizer_BFGS = LBFGS(get_cost, omega.trainable_variables)
results = optimizer_BFGS.minimize(omega, t, x)
```
The arguments passed to the `minimize` function are the same as those of `get_cost`.
### Using physics-informed neural networks
This optimizer might be useful when dealing with physics-informed neural networks. To this end, we must change the cost from the previous subsection to include the minimization of a physics residual: since `x(t) = cos(t)` satisfies `dx/dt = -sin(t)`, the residual `dmodel_dt + sin(t)` should vanish at the collocation points `t_phys`.
```python
@tf.function
def get_pinn_cost(model, t, x, t_phys):
    # Data cost
    data_cost = tf.reduce_mean(tf.square(model(t) - x))
    # Physics cost: the residual dmodel/dt + sin(t) at the collocation points
    t_phys = tf.convert_to_tensor(t_phys)
    with tf.GradientTape(watch_accessed_variables=False) as tape:
        tape.watch(t_phys)
        model_tf = model(t_phys)
    dmodel_dt = tape.gradient(model_tf, t_phys)
    physics_cost = tf.reduce_mean(tf.square(dmodel_dt + tf.sin(t_phys)))
    return data_cost + 0.1 * physics_cost
```
Training is done in a similar manner:

```python
t_phys = np.linspace(0, 1, 100).reshape((-1, 1)).astype(np.float32)

optimizer_BFGS = LBFGS(get_pinn_cost, omega.trainable_variables)
results = optimizer_BFGS.minimize(omega, t, x, t_phys)
```
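As a rough check (a sketch, assuming training has converged), the trained network can be compared with the exact solution on a finer grid:

```python
# Hypothetical check: compare the trained model with cos(t) on a test grid.
t_test = np.linspace(0, 1, 200).reshape((-1, 1)).astype(np.float32)
print(float(tf.reduce_mean(tf.abs(omega(t_test) - np.cos(t_test)))))  # mean absolute error
```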
## Options
The optimizer has the same options as the function `tfp.optimizer.bfgs_minimize` in the case of BFGS, or `tfp.optimizer.lbfgs_minimize` in the case of LBFGS.
For instance, to set the maximum number of iterations to 100, we can use the following code when defining the optimizer:
```python
options = {'max_iterations': 100}
optimizer_BFGS = LBFGS(get_cost, omega.trainable_variables, options)
```
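Any other keyword argument of the underlying TensorFlow Probability routine can be forwarded the same way; for instance (a sketch, `tolerance` and `num_correction_pairs` being standard `tfp.optimizer.lbfgs_minimize` arguments):

```python
# Sketch: forwarding additional tfp.optimizer.lbfgs_minimize options.
options = {'max_iterations': 100,       # at most 100 iterations
           'tolerance': 1e-8,           # gradient-norm stopping tolerance
           'num_correction_pairs': 10}  # memory size of the L-BFGS approximation
optimizer_BFGS = LBFGS(get_cost, omega.trainable_variables, options)
results = optimizer_BFGS.minimize(omega, t, x)
```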
## Download files
Download the file for your platform.
## File details

Details for the file `tf2_bfgs-0.1.0.tar.gz`.

### File metadata

- Download URL: tf2_bfgs-0.1.0.tar.gz
- Upload date:
- Size: 5.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `4d68c2f0bd26f3e39a99595da45ae91cb4f146d53926e5d85d292fa67103cdf1` |
| MD5 | `a7cbdcff9a9a1c64ccdf2e3fce4d0714` |
| BLAKE2b-256 | `0a4c61e227e6e24bd16f66ca96c9785d3dcb8d92eeee4fa8121f74f4aa73baf5` |
## File details

Details for the file `tf2_bfgs-0.1.0-py3-none-any.whl`.

### File metadata

- Download URL: tf2_bfgs-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `7db59b36df10e0f7dae10772010a23e06ae747a56085ad51bd4596e1dd5628de` |
| MD5 | `39a4265344e3aa6988260aa1f6efd4eb` |
| BLAKE2b-256 | `0a3a776cb93c410babfab4a4c30aedc00805f41464f3ce296d7970ddf1139b64` |