Wrapper to train a tf.Module or tf.keras.Model using the (L-)BFGS optimizer from TensorFlow Probability

Project description

tf2-bfgs

Use the (L-)BFGS optimizer in TensorFlow 2 almost as easily as with TensorFlow 1 and tf.contrib

License: MIT

tf2-bfgs is a wrapper to train a tf.Module or tf.keras.Model using the (L-)BFGS optimizer from TensorFlow Probability. This work builds on a code snippet by Pi-Yueh Chuang, available here.

Install

You can run the following command in a dedicated virtual environment:

pip install tf2-bfgs

or the following one to get the latest version:

pip install git+https://github.com/mBarreau/tf2-bfgs

You might also need to install extra dependencies such as

pip install tensorflow-probability tf_keras

Examples

We will train a regression model to fit a cosine. Here is the training data:

import tensorflow as tf
import numpy as np
from tf2_bfgs import LBFGS

t = np.linspace(0, 1, 10).reshape((-1, 1)).astype(np.float32)
x = np.cos(t)

Using Keras

We define the model first, a multilayer perceptron.

omega = tf.keras.Sequential(
    [tf.keras.Input(shape=[1,]),
     tf.keras.layers.Dense(10, "tanh"),
     tf.keras.layers.Dense(10, "tanh"),
     tf.keras.layers.Dense(10, "tanh"),
     tf.keras.layers.Dense(1, None)])

The loss function is the traditional mean squared error:

def get_cost(model, t, x):
    x_hat = model(t)  # the forward pass must happen inside the cost function
    return tf.keras.losses.MeanSquaredError()(x, x_hat)

NOTE: for differentiability reasons, the output of the model x_hat = model(t) must be computed inside the cost function.

optimizer_BFGS = LBFGS(get_cost, omega.trainable_variables)
results = optimizer_BFGS.minimize(omega, t, x)

The arguments of the minimize function are the same as those of get_cost.
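
For illustration, here is a minimal sketch of a cost taking an extra argument, a hypothetical regularization weight reg_weight (not part of the library); since the arguments of minimize mirror those of the cost function, it is simply forwarded:

def get_weighted_cost(model, t, x, reg_weight):
    x_hat = model(t)
    mse = tf.reduce_mean(tf.square(x_hat - x))
    # Hypothetical L2 penalty on all trainable weights
    l2 = tf.add_n([tf.reduce_sum(tf.square(w)) for w in model.trainable_variables])
    return mse + reg_weight * l2

optimizer_BFGS = LBFGS(get_weighted_cost, omega.trainable_variables)
results = optimizer_BFGS.minimize(omega, t, x, 1e-4)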

Using TensorFlow tf.Module

We will define our own multilayer perceptron as follows:

def init(layers):
    # One weight matrix and one bias vector per transition between layers
    Ws, bs = [], []
    for i in range(len(layers) - 1):
        W = xavier_init(size=[layers[i], layers[i + 1]])
        b = tf.zeros([1, layers[i + 1]])
        Ws.append(tf.Variable(W, dtype=tf.float32, name=f"W_{i}"))
        bs.append(tf.Variable(b, dtype=tf.float32, name=f"b_{i}"))
    return Ws, bs


def xavier_init(size):
    # Xavier/Glorot initialization: std = sqrt(2 / (fan_in + fan_out))
    in_dim = size[0]
    out_dim = size[1]
    xavier_stddev = np.sqrt(2 / (in_dim + out_dim))
    return np.random.normal(size=[in_dim, out_dim], scale=xavier_stddev)


class NeuralNetwork(tf.Module):
    def __init__(self, hidden_layers, **kwargs):
        super().__init__(**kwargs)
        self.layers = [1] + hidden_layers + [1]
        self.Ws, self.bs = init(layers=self.layers)

    @tf.function
    def __call__(self, input):
        num_layers = len(self.Ws)
        H = tf.cast(input, tf.float32)
        for layer in range(0, num_layers - 1):
            W = self.Ws[layer]
            b = self.bs[layer]
            H = tf.tanh(tf.add(tf.matmul(H, W), b))
        W = self.Ws[-1]
        b = self.bs[-1]
        return tf.add(tf.matmul(H, W), b)

omega = NeuralNetwork([10]*3)

NOTE: It is essential that the class inherits from tf.Module.
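
Because tf.Module automatically tracks tf.Variable attributes (including variables stored in Python lists), omega.trainable_variables already collects every weight and bias. A quick sanity check:

# Layers [1, 10, 10, 10, 1] give 4 weight matrices and 4 bias vectors
print(len(omega.trainable_variables))  # 8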

Define the cost function as follows:

@tf.function
def get_cost(model, t, x):
    return tf.reduce_mean(tf.square(model(t) - x))

NOTE: for differentiability reasons, the output of the model x_hat = model(t) must be computed inside the cost function. It is also recommended to use the @tf.function decorator for performance reasons.

The training is performed by the following snippet:

optimizer_BFGS = LBFGS(get_cost, omega.trainable_variables)
results = optimizer_BFGS.minimize(omega, t, x)

As before, the arguments of the minimize function are the same as those of get_cost.
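
The returned results object comes from TensorFlow Probability; assuming minimize returns the results namedtuple of tfp.optimizer.lbfgs_minimize, convergence can be inspected as follows:

print("Converged:", results.converged.numpy())
print("Iterations:", results.num_iterations.numpy())
print("Final loss:", results.objective_value.numpy())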

Using physics-informed neural networks

This optimizer can be useful when training physics-informed neural networks (PINNs). To this end, we extend the cost from the previous subsection with a physics residual: since x(t) = cos(t) satisfies the ODE dx/dt = -sin(t), the quantity dmodel_dt + sin(t) should vanish at the collocation points t_phys.

@tf.function
def get_pinn_cost(model, t, x, t_phys):
    # Data cost
    data_cost = tf.reduce_mean(tf.square(model(t) - x))
    # Physics cost: residual of dx/dt = -sin(t) at the collocation points
    t_phys = tf.convert_to_tensor(t_phys)  # tape.watch requires a tensor
    with tf.GradientTape(watch_accessed_variables=False) as tape:
        tape.watch(t_phys)
        x_phys = model(t_phys)
    dmodel_dt = tape.gradient(x_phys, t_phys)
    physics_cost = tf.reduce_mean(tf.square(dmodel_dt + tf.sin(t_phys)))
    return data_cost + 0.1 * physics_cost

Training is done in a similar manner:

t_phys = np.linspace(0, 1, 100).reshape((-1, 1)).astype(np.float32)
optimizer_BFGS = LBFGS(get_pinn_cost, omega.trainable_variables)
results = optimizer_BFGS.minimize(omega, t, x, t_phys)
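
To check the fit, we can compare the trained network with the true cosine on a dense grid (matplotlib is used here for illustration only; it is not a dependency of tf2-bfgs):

import matplotlib.pyplot as plt

t_test = np.linspace(0, 1, 200).reshape((-1, 1)).astype(np.float32)
x_pred = omega(t_test).numpy()

plt.plot(t_test, np.cos(t_test), label="cos(t)")
plt.plot(t_test, x_pred, "--", label="model")
plt.scatter(t, x, label="training data")
plt.legend()
plt.show()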

Options

The optimizer accepts the same options as the function tfp.optimizer.bfgs_minimize in the case of BFGS, or tfp.optimizer.lbfgs_minimize in the case of L-BFGS.

For instance, to set the maximum number of iterations to 100, we can use the following code when defining the optimizer:

options = {'max_iterations': 100}
optimizer_BFGS = LBFGS(get_cost, omega.trainable_variables, options)
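
Other options of tfp.optimizer.lbfgs_minimize can be passed the same way, for instance the stopping tolerance and the L-BFGS memory size:

options = {
    'max_iterations': 500,
    'tolerance': 1e-8,           # stop when the gradient norm drops below this
    'num_correction_pairs': 50,  # number of (position, gradient) pairs kept by L-BFGS
}
optimizer_BFGS = LBFGS(get_cost, omega.trainable_variables, options)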

Download files


Source Distribution

tf2_bfgs-0.1.0.tar.gz (5.2 kB)


Built Distribution


tf2_bfgs-0.1.0-py3-none-any.whl (5.4 kB)


File details

Details for the file tf2_bfgs-0.1.0.tar.gz.

File metadata

  • Download URL: tf2_bfgs-0.1.0.tar.gz
  • Upload date:
  • Size: 5.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for tf2_bfgs-0.1.0.tar.gz

  • SHA256: 4d68c2f0bd26f3e39a99595da45ae91cb4f146d53926e5d85d292fa67103cdf1
  • MD5: a7cbdcff9a9a1c64ccdf2e3fce4d0714
  • BLAKE2b-256: 0a4c61e227e6e24bd16f66ca96c9785d3dcb8d92eeee4fa8121f74f4aa73baf5


File details

Details for the file tf2_bfgs-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: tf2_bfgs-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 5.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for tf2_bfgs-0.1.0-py3-none-any.whl

  • SHA256: 7db59b36df10e0f7dae10772010a23e06ae747a56085ad51bd4596e1dd5628de
  • MD5: 39a4265344e3aa6988260aa1f6efd4eb
  • BLAKE2b-256: 0a3a776cb93c410babfab4a4c30aedc00805f41464f3ce296d7970ddf1139b64

