Painless variables in PyTorch and TensorFlow
Varz
Painless optimisation of constrained variables in AutoGrad, TensorFlow, PyTorch, and JAX
Requirements and Installation
See the instructions here. Then simply
pip install varz
Manual
Basics
from varz import Vars
To begin with, create a variable container of the right data type.
For use with AutoGrad, use a np.* data type;
for use with PyTorch, use a torch.* data type;
for use with TensorFlow, use a tf.* data type;
and for use with JAX, use a jnp.* data type.
In this example we'll use AutoGrad.
>>> vs = Vars(np.float64)
Now a variable can be created by requesting it, giving it an initial value and a name.
>>> vs.get(np.random.randn(2, 2), name="x")
array([[ 1.04404354, -1.98478763],
[ 1.14176728, -3.2915562 ]])
If the same variable is requested again, the existing variable is returned, because a variable with the name x already exists.
>>> vs.get(name="x")
array([[ 1.04404354, -1.98478763],
[ 1.14176728, -3.2915562 ]])
Alternatively, indexing syntax may be used to get the existing variable x.
This asserts that a variable with the name x already exists and will throw a KeyError otherwise.
>>> vs["x"]
array([[ 1.04404354, -1.98478763],
[ 1.14176728, -3.2915562 ]])
>>> vs["y"]
KeyError: 'y'
The value of x can be changed by assigning it a different value.
>>> vs.assign("x", np.random.randn(2, 2))
array([[ 1.43477728, 0.51006941],
[-0.74686452, -1.05285767]])
By default, assignment is non-differentiable and overwrites data.
For differentiable assignment, which replaces data, set the keyword argument differentiable=True.
>>> vs.assign("x", np.random.randn(2, 2), differentiable=True)
array([[ 0.12500578, -0.21510423],
[-0.61336039, 1.23074066]])
The variable container can be copied with vs.copy().
Note that the copy shares its variables with the original.
This means that non-differentiable assignment will also mutate the original;
differentiable assignment, however, will not.
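The difference between overwriting and replacing can be pictured with plain NumPy arrays. The sketch below is only an analogy: a dict stands in for the variable container, and none of it is Varz code.

```python
import numpy as np

# Toy stand-in for a variable container: a plain dict of NumPy arrays.
original = {"x": np.zeros(2)}
# A shallow copy: a new dict whose entries are the very same array objects.
copy = dict(original)

# "Overwrite" style: write into the existing array, so every container
# that holds this array sees the change.
copy["x"][...] = np.array([1.0, 2.0])
seen_by_original = original["x"].copy()

# "Replace" style: bind a new array in the copy only. The original keeps
# pointing at the old array and is left untouched.
copy["x"] = np.array([3.0, 4.0])
```

After the overwrite, both containers see the new values; after the replacement, only the copy does.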
Naming
Variables may be organised by naming them hierarchically using /s.
For example, group1/bar, group1/foo, and group2/bar.
This is helpful for extracting collections of variables, where wildcards may be used to match names.
For example, */bar would match group1/bar and group2/bar, and group1/* would match group1/bar and group1/foo.
The names of all variables can be obtained with Vars.names, and variables can be printed with Vars.print.
Example:
>>> vs = Vars(np.float64)
>>> vs.get(1, name="x1")
array(1.)
>>> vs.get(2, name="x2")
array(2.)
>>> vs.get(3, name="y")
array(3.)
>>> vs.names
['x1', 'x2', 'y']
>>> vs.print()
x1: 1.0
x2: 2.0
y: 3.0
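The wildcard matching described above behaves like shell-style globbing on names. The following sketch uses the standard library's fnmatch to mimic it; that Varz follows exactly these rules is an assumption, and the helper `match` is hypothetical.

```python
from fnmatch import fnmatchcase

names = ["group1/bar", "group1/foo", "group2/bar"]


def match(pattern, names):
    """Return the names matching a shell-style wildcard pattern, in order."""
    return [name for name in names if fnmatchcase(name, pattern)]


bar_vars = match("*/bar", names)        # Every variable called "bar".
group1_vars = match("group1/*", names)  # Every variable in "group1".
```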
Constrained Variables
- Positive variables: A variable that is constrained to be positive can be created using Vars.positive or Vars.pos.
>>> vs.pos(name="positive_variable")
0.016925610008314832
- Bounded variables: A variable that is constrained to be bounded can be created using Vars.bounded or Vars.bnd.
>>> vs.bnd(name="bounded_variable", lower=1, upper=2)
1.646772663807718
- Lower-triangular matrix: A matrix variable that is constrained to be lower triangular can be created using Vars.lower_triangular or Vars.tril. Either an initialisation or the shape of a square matrix must be given.
>>> vs.tril(shape=(2, 2), name="lower_triangular")
array([[ 2.64204459, 0. ],
[-0.14055559, -1.91298679]])
- Positive-definite matrix: A matrix variable that is constrained to be positive definite can be created using Vars.positive_definite or Vars.pd. Either an initialisation or the shape of a square matrix must be given.
>>> vs.pd(shape=(2, 2), name="positive_definite")
array([[ 1.64097496, -0.52302151],
[-0.52302151, 0.32628302]])
- Orthogonal matrix: A matrix variable that is constrained to be orthogonal can be created using Vars.orthogonal or Vars.orth. Either an initialisation or the shape of a square matrix must be given.
>>> vs.orth(shape=(2, 2), name="orthogonal")
array([[ 0.31290403, -0.94978475],
[ 0.94978475, 0.31290403]])
These constrained variables are created by transforming some latent
unconstrained representation to the desired constrained space.
The latent variables can be obtained using Vars.get_vars.
>>> vs.get_vars("positive_variable", "bounded_variable")
[array(-4.07892742), array(-0.604883)]
To illustrate the use of wildcards, the following is equivalent:
>>> vs.get_vars("*_variable")
[array(-4.07892742), array(-0.604883)]
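The numbers printed above are consistent with an exponential map for positive variables and a logistic map for bounded ones. The sketch below reproduces them with NumPy; the function names are hypothetical, and the exact transforms used inside Varz are an assumption inferred from these values.

```python
import numpy as np


def to_positive(latent):
    """Map an unconstrained value to a positive one (assumed: exponential)."""
    return np.exp(latent)


def to_bounded(latent, lower, upper):
    """Map an unconstrained value into (lower, upper) (assumed: logistic)."""
    return lower + (upper - lower) / (1 + np.exp(latent))


# Latent values copied from the output of vs.get_vars above.
positive = to_positive(-4.07892742)
bounded = to_bounded(-0.604883, lower=1, upper=2)
```

Up to the rounding of the printed latent values, `positive` and `bounded` agree with the values returned by vs.pos and vs.bnd earlier.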
Automatic Naming of Variables
To parametrise functions, a common pattern is the following:
def objective(vs):
    x = vs.get(5, name="x")
    y = vs.get(10, name="y")
    return (x * y - 5) ** 2 + x ** 2
The names for x and y are necessary, because otherwise new variables will be created and initialised every time objective is run.
Varz offers two ways to avoid having to specify a name for every variable: sequential and parametrised specification.
Sequential Specification
Sequential specification can be used if, upon execution of objective, variables are always obtained in the same order.
This means that variables can be identified with their position in this order and hence be named accordingly.
To use sequential specification, decorate the function with sequential.
Example:
from varz import sequential
@sequential
def objective(vs):
    x = vs.get(5)  # Initialise to 5.
    y = vs.get()  # Initialise randomly.
    return (x * y - 5) ** 2 + x ** 2
>>> vs = Vars(np.float64)
>>> objective(vs)
68.65047879833773
>>> objective(vs) # Running the objective again reuses the same variables.
68.65047879833773
>>> vs.names
['0', '1']
>>> vs.print()
0: 5.0 # This is `x`.
1: -0.3214 # This is `y`.
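To make the idea of naming by position concrete, here is a toy re-implementation that does not use Varz at all. `ToyVars`, `_Numbered`, and `toy_sequential` are hypothetical names, and the real decorator may work quite differently.

```python
import functools


class ToyVars:
    """Hypothetical stand-in for a variable container; not the Varz class."""

    def __init__(self):
        self.values = {}

    def get(self, init=0.0, name=None):
        # Create the variable on first request; afterwards return the stored value.
        if name not in self.values:
            self.values[name] = float(init)
        return self.values[name]


class _Numbered:
    """Wraps a container and names unnamed requests "0", "1", ... in call order."""

    def __init__(self, vs):
        self.vs = vs
        self.count = 0

    def get(self, init=0.0, name=None):
        if name is None:
            name = str(self.count)
            self.count += 1
        return self.vs.get(init, name=name)


def toy_sequential(f):
    @functools.wraps(f)
    def wrapped(vs, *args, **kwargs):
        # A fresh counter per call, so position i always maps to the name str(i).
        return f(_Numbered(vs), *args, **kwargs)

    return wrapped


@toy_sequential
def objective(vs):
    x = vs.get(5)
    y = vs.get(2)
    return (x * y - 5) ** 2 + x ** 2


vs = ToyVars()
first = objective(vs)
second = objective(vs)  # Reuses the variables "0" and "1".
```

Because the counter restarts on every call, the scheme only works when the variables are requested in the same order each time, which is the condition stated above.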
Parametrised Specification
Sequential specification still suffers from boilerplate code like x = vs.get(5) and y = vs.get().
This is the problem that parametrised specification addresses, which allows you to specify variables as arguments to your function.
To indicate that an argument of the function is a variable, as opposed to a regular argument, the argument's type hint must be set accordingly, as follows:
- Unbounded variables:
@parametrised
def f(vs, x: Unbounded):
    ...
- Positive variables:
@parametrised
def f(vs, x: Positive):
    ...
- Bounded variables: The following two specifications are possible. The former uses the default bounds and the latter uses specified bounds.
@parametrised
def f(vs, x: Bounded):
    ...
@parametrised
def f(vs, x: Bounded(lower=1, upper=10)):
    ...
- Lower-triangular variables:
@parametrised
def f(vs, x: LowerTriangular(shape=(2, 2))):
    ...
- Positive-definite variables:
@parametrised
def f(vs, x: PositiveDefinite(shape=(2, 2))):
    ...
- Orthogonal variables:
@parametrised
def f(vs, x: Orthogonal(shape=(2, 2))):
    ...
As can be seen from the above, the variable container must also be an argument of the function, because that is where the variables will be obtained from. A variable can be given an initial value in the way you would expect:
@parametrised
def f(vs, x: Unbounded = 5):
    ...
Variable arguments and regular arguments can be mixed.
If f is called, variable arguments must not be specified, because they will be obtained automatically.
Regular arguments, however, must be specified.
To use parametrised specification, decorate the function with parametrised.
Example:
from varz import parametrised, Unbounded, Bounded
@parametrised
def objective(vs, x: Unbounded, y: Bounded(lower=1, upper=3) = 2, option=None):
    print("Option:", option)
    return (x * y - 5) ** 2 + x ** 2
>>> vs = Vars(np.float64)
>>> objective(vs)
Option: None
9.757481795615316
>>> objective(vs, "other")
Option: other
9.757481795615316
>>> objective(vs, option="other")
Option: other
9.757481795615316
>>> objective(vs, x=5) # This is not valid, because `x` will be obtained automatically from `vs`.
ValueError: 1 keyword argument(s) not parsed: x.
>>> vs.print()
x: 1.025
y: 2.0
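The mechanism of filling annotated arguments from a container can be sketched with the standard library's inspect module. Everything below is a toy under stated assumptions: `ToyUnbounded` and `toy_parametrised` are hypothetical, a plain dict plays the container, and Varz's own decorator may be implemented differently.

```python
import functools
import inspect


class ToyUnbounded:
    """Hypothetical marker type for variable arguments; not Varz's Unbounded."""


def toy_parametrised(f):
    params = list(inspect.signature(f).parameters.values())[1:]  # Skip `vs`.
    var_params = [p for p in params if p.annotation is ToyUnbounded]
    reg_names = [p.name for p in params if p.annotation is not ToyUnbounded]

    @functools.wraps(f)
    def wrapped(vs, *args, **kwargs):
        # Positional arguments fill the regular (non-variable) parameters in order.
        call = dict(zip(reg_names, args))
        call.update(kwargs)
        for p in var_params:
            if p.name in call:
                raise ValueError(f"{p.name} is a variable and cannot be passed.")
            init = 0.0 if p.default is inspect.Parameter.empty else p.default
            # Create the variable on first use, then reuse the stored value.
            call[p.name] = vs.setdefault(p.name, float(init))
        return f(vs, **call)

    return wrapped


@toy_parametrised
def objective(vs, x: ToyUnbounded = 3, y: ToyUnbounded = 2, option=None):
    return (x * y - 5) ** 2 + x ** 2, option


vs = {}  # A plain dict plays the role of the variable container here.
value, option = objective(vs, "other")
```

As in the example above, the first positional argument after the container lands on `option`, because the variable arguments are skipped when matching positions.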
Optimisers
The following optimisers are available:
varz.{autograd,tensorflow,torch,jax}.minimise_l_bfgs_b (L-BFGS-B)
varz.{autograd,tensorflow,torch,jax}.minimise_adam (ADAM)
The L-BFGS-B algorithm is recommended for deterministic objectives and ADAM is recommended for stochastic objectives.
See the examples for an illustration of how these optimisers can be used.
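Optimising a constrained variable through an unconstrained latent value can be sketched without any framework. The example below minimises the same objective as the examples in this document, but uses plain gradient descent in place of L-BFGS-B or ADAM and assumes an exponential map for the positive variable; it is an illustration, not Varz code.

```python
import math

target = 5.0
z = math.log(10.0)  # Latent value; the constrained variable is x = exp(z).

for _ in range(200):
    root = math.exp(z / 2)  # sqrt(x), written in terms of the latent value.
    # Derivative of (sqrt(exp(z)) - target) ** 2 with respect to z.
    grad = (root - target) * root
    z -= 0.05 * grad  # Unconstrained update: z may take any real value.

x = math.exp(z)  # Positive by construction; converges towards target ** 2.
```

The optimiser never needs to know about the constraint, because every real value of `z` maps to a valid positive `x`.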
PyTorch Specifics
All the variables held by a container can be detached from the current computation graph with Vars.detach.
To make a copy of the container with detached versions of the variables, use Vars.copy with detach=True instead.
Whether variables require gradients can be configured with Vars.requires_grad.
By default, no variable requires a gradient.
Getting and Setting Variables as a Vector
It may be desirable to get the latent representations of a collection of
variables as a single vector, e.g. when feeding them to an optimiser.
This can be achieved with Vars.get_vector.
>>> vs.get_vector("x", "*_variable")
array([ 0.12500578, -0.21510423, -0.61336039, 1.23074066, -4.07892742,
-0.604883 ])
Similarly, to update the latent representation of a collection of variables, Vars.set_vector can be used.
>>> vs.set_vector(np.ones(6), "x", "*_variable")
[array([[1., 1.],
[1., 1.]]), array(1.), array(1.)]
>>> vs.get_vector("x", "*_variable")
array([1., 1., 1., 1., 1., 1.])
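Packing several arrays into one vector and unpacking them again amounts to flattening and reshaping. The NumPy sketch below shows the idea; `pack` and `unpack` are hypothetical helpers and the latent values are made up for illustration.

```python
import numpy as np


def pack(arrays):
    """Flatten each array and join the pieces into a single 1-D vector."""
    return np.concatenate([np.ravel(a) for a in arrays])


def unpack(vector, shapes):
    """Cut a 1-D vector into consecutive pieces with the given shapes."""
    out, i = [], 0
    for shape in shapes:
        size = int(np.prod(shape))  # The empty shape () has size 1.
        out.append(np.reshape(vector[i:i + size], shape))
        i += size
    return out


# A 2x2 matrix and two scalars, mirroring "x" and the two "*_variable"s.
latents = [np.array([[0.1, 0.2], [0.3, 0.4]]), np.array(-4.0), np.array(-0.6)]
vector = pack(latents)
restored = unpack(np.ones(6), [a.shape for a in latents])
```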
Get Variables from a Source
The keyword argument source can be set to a tensor from which the latent variables will be obtained.
Example:
>>> vs = Vars(np.float32, source=np.array([1, 2, 3, 4, 5]))
>>> vs.get()
array(1., dtype=float32)
>>> vs.get(shape=(3,))
array([2., 3., 4.], dtype=float32)
>>> vs.pos()
148.41316
>>> np.exp(5).astype(np.float32)
148.41316
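The example above suggests that latent values are consumed from the source in order, one chunk per requested variable. A toy reader that reproduces those numbers might look as follows; `ToySource` is hypothetical, and the exponential map for the positive variable is an assumption taken from the np.exp(5) comparison above.

```python
import numpy as np


class ToySource:
    """Hypothetical reader that hands out consecutive chunks of a source array."""

    def __init__(self, source):
        self.source = np.asarray(source, dtype=np.float32)
        self.index = 0

    def take(self, shape=()):
        size = int(np.prod(shape))  # The empty shape () means a single scalar.
        chunk = self.source[self.index:self.index + size]
        self.index += size
        return np.reshape(chunk, shape)


src = ToySource([1, 2, 3, 4, 5])
first = src.take()               # Consumes the value 1.
second = src.take(shape=(3,))    # Consumes the values 2, 3, 4.
positive = np.exp(src.take())    # Consumes 5 as the latent of a positive variable.
```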
Examples
The following examples show how a function can be minimised using the L-BFGS-B algorithm.
AutoGrad
import autograd.numpy as np
from varz.autograd import Vars, minimise_l_bfgs_b
target = 5.0
def objective(vs):
    # Get a variable named "x", which must be positive, initialised to 10.
    x = vs.pos(10.0, name="x")
    return (x ** 0.5 - target) ** 2
>>> vs = Vars(np.float64)
>>> minimise_l_bfgs_b(objective, vs)
3.17785950743424e-19 # Final objective function value.
>>> vs['x'] - target ** 2
-5.637250666268301e-09
TensorFlow
import tensorflow as tf
from varz.tensorflow import Vars, minimise_l_bfgs_b
target = 5.0
def objective(vs):
    # Get a variable named "x", which must be positive, initialised to 10.
    x = vs.pos(10.0, name="x")
    return (x ** 0.5 - target) ** 2
>>> vs = Vars(tf.float64)
>>> minimise_l_bfgs_b(objective, vs)
3.17785950743424e-19 # Final objective function value.
>>> vs['x'] - target ** 2
<tf.Tensor: id=562, shape=(), dtype=float64, numpy=-5.637250666268301e-09>
>>> vs = Vars(tf.float64)
>>> minimise_l_bfgs_b(objective, vs, jit=True) # Speed up optimisation with TF's JIT!
3.17785950743424e-19
PyTorch
import torch
from varz.torch import Vars, minimise_l_bfgs_b
target = torch.tensor(5.0, dtype=torch.float64)
def objective(vs):
    # Get a variable named "x", which must be positive, initialised to 10.
    x = vs.pos(10.0, name="x")
    return (x ** 0.5 - target) ** 2
>>> vs = Vars(torch.float64)
>>> minimise_l_bfgs_b(objective, vs)
array(3.17785951e-19) # Final objective function value.
>>> vs["x"] - target ** 2
tensor(-5.6373e-09, dtype=torch.float64)
>>> vs = Vars(torch.float64)
>>> minimise_l_bfgs_b(objective, vs, jit=True) # Speed up optimisation with PyTorch's JIT!
array(3.17785951e-19)
JAX
import jax.numpy as jnp
from varz.jax import Vars, minimise_l_bfgs_b
target = 5.0
def objective(vs):
    # Get a variable named "x", which must be positive, initialised to 10.
    x = vs.pos(10.0, name="x")
    return (x ** 0.5 - target) ** 2
>>> vs = Vars(jnp.float64)
>>> minimise_l_bfgs_b(objective, vs)
array(3.17785951e-19) # Final objective function value.
>>> vs["x"] - target ** 2
-5.637250666268301e-09
>>> vs = Vars(jnp.float64)
>>> minimise_l_bfgs_b(objective, vs, jit=True) # Speed up optimisation with JAX's JIT!
array(3.17785951e-19)