iflow

Monte-Carlo Integration using Neural Networks

These details have not been verified by PyPI

Project links

Homepage

Project description

This project implements normalizing flows using TensorFlow for the purpose of multi-dimensional integration and sampling.

Citation

If you use i-flow for your research, please cite:

"i-flow: High-dimensional Integration and Sampling with Normalizing Flows",
By Christina Gao, Joshua Isaacson, Claudius Krause.
Mach. Learn.: Sci. Technol. DOI
arXiv:2001.05486.

[[TOC]]

Importance Sampling

In high dimensional integration, Monte-Carlo (MC) techniques are required in order to obtain convergence of the integral in a efficient manner (See the discussion of curse of dimensionality for details). The naïve MC approach would be to uniformly sample points over the integration domain ($\Omega$) encompassing a volume $V$. The integral $I$ of a function $f(x)$ can then be estimated from evaluating the function at $N$ random points and taking the mean, i.e.

I = \int_\Omega f(x)\ dx \approx \frac{V}{N}\sum_{i=1}^{N} f(x_i) \equiv V \langle f \rangle_x,

and the uncertainty is determined by the standard deviation of the mean, i.e.

\sigma_f \approx V \sqrt{\frac{\langle f^2\rangle_x - \langle f\rangle_x^2}{N-1}}.

To improve upon the uncertainty of the estimated integral, a technique known as importance sampling was developed. The goal of importance sampling is the minimize the standard deviation by sampling points from a distribution that is closer to the function $f(x)$ than the uniform distribution. Given a probability distribution function $g(x)$ with cumulative distribution function $G(x)$, it is possible to transform the integral using the change of variable formula, giving:

I = \int_\Omega \frac{f(x)}{g(x)}\ dG(x) \approx \frac{V}{N}\sum_{i=1}^N
\frac{f(\tilde{x}_i)}{g(\tilde{x}_i)} \equiv V \langle f/g\rangle_G,

where $\tilde{x}_i$ is distributed according to $g(x)$. The standard deviation can be calculated in a similar method. From this, it can easily be seen that if $g(x) \rightarrow f(x)/I$, then the uncertainty from this method would produce an integral estimate with vanishing uncertainty. However, this would require to know the analytic solution to the integral!

Therefore, the goal of importance sampling is to find a distribution $g(x)$ as close to $f(x)$ as possible that is also easy to sample from.

Normalizing Flows

A Normalizing Flow is a bijective transformation from a (usually simple) base distribution to a (usually more complicated) target distribution, i.e. a change of variables transformation. It is therefore able to describe arbitrary, high-dimensional probability disctributions of continuous variables. The transformation is invertible and can therefore be used in two directions: In the one direction, we can infer the probability of a given sample point. In the other direction, we can sample random numbers according to the target distribution. Usually, Normalizing Flows are implemented usitilizing Neural Networks, see [1] for a review. Different architectures exist that are either fast in inference and slow at sampling or the other way around. Since our application for numerical integration with importance sampling requires both directions to be fast, we use the architecture based on Coupling Layers that was proposed in [2].

In a Coupling Layer, the $N$-dimensional is split in to two sets $\vec{x}_a=\{ x_1, \ldots, x_n\}$ and $\vec{x}_b=\{x_{n+1}, \ldots, x_N\}$, where $1 \leq n < N$. One set remains untransformed, but is used as input into a neural network. The other set is transformed based on a set of parameters determined from the output of the neural network. In other words,

x'_a = x_a, \quad\quad\quad\quad\quad\quad\ \  A\in[1,n]\quad\quad\  \\
x'_b = C(x_b; m(\vec{x}_a)),\quad\quad B\in[n+1,N]

with the inverse map given by:

x_a = x'_a,\quad\quad
x_b = C^{-1}(x'_b; m(\vec{x'}_a)) = C^{-1}(x'_b; m(\vec{x}_a)).

This leads to an analytic Jacobian independent on the gradient of the neural network $m(\vec{x}_a)$, and can be calculated as:

\left\lvert\frac{\partial c(\vec{x})}{\partial\vec{x}}\right\rvert^{-1}
= \left\lvert
\begin{pmatrix}
\vec{1} & 0 \\
\frac{\partial C}{\partial m}\frac{\partial m}{\partial \vec{x}_a} & \frac{\partial C}{\partial \vec{x}_b}
\end{pmatrix}
\right\rvert^{-1}
= \left\lvert \frac{\partial C(\vec{x}_b; m(\vec{x}_a))}{\partial \vec{x}_b} \right\rvert^{-1}.

Below is a graphical representation of single coupling layer, plus a permutation on the indices before being passed into a subsequent layer.

Coupling Layers

Currently, there are three well tested and validated coupling layers implemented in the i-flow package. These are:

Linear Spline
Quadratic Spline
Rational Quadratic Spline

Linear Spline

The linear spline is based on a piecewise linear function defining the CDF [3]. Given $N$ dimensions and $K$ bins with a width of $w$ the PDF is defined as $q_i(t) = \{Q_{ij} | i \in [1,N], (j-1)w \leq t < jw, j \in [1, K]\}$. The CDF can be calculated by integrating the PDF, giving:

C(x_i^B; Q) = \alpha Q_{ib} + \sum_{k=1}^{b-1} Q_{ik},

where $b$ is the bin in which $x_i^B$ occurs, and $\alpha=\frac{x_i^B-(b-1)w}{w}$. The Jacobian for this calculation is straight forward to calculate and is:

\left\lvert \frac{\partial C}{\partial x_B} \right\rvert = \prod_i Q_{ib}

Quadratic Spline

The quadratic spline is also based on a piecewise quadratic function defining the CDF [2]. Given $N$ dimensions, $K$ bins with widths $W_{ik}$, and $K+1$ vertices with heights $V_{ik}$, the PDF is defined as:

q_i(t) = \begin{cases}
      \frac{V_{i2}-V_{i1}}{W_{i1}}t+V_{i1} & t < W_{i1} \\
      \frac{V_{i3}-V_{i2}}{W_{i2}}\left(t-W_{i1}\right)+V_{i2} & W_{i1} \leq t < W_{i1}+W_{i2} \\
      \quad\vdots & \\
      \frac{V_{i(K+1)}-V_{iK}}{W_{iK}}\left(t-\sum_{k=1}^{K-1}W_{ik}\right)+V_{iK} & \sum_{k=1}^{K-1} W_{ik} \leq t < 1 \\
      \end{cases}

Integrating the above to obtain the CDF gives:

C(x_i^B; W, V) = \frac{\alpha^2}{2}\left(V_{ib+1}-V_{ib}\right) W_{ib} + V_{ib}W_{ib}\alpha + \sum_{k=1}^{b-1}\frac{V_{ik+1} + V_{ik}}{2} W_{ik},

with $b$ the solution to $\sum_{k=1}^{b-1} W_{ik} \leq x_i^B < \sum_{k=1}^b W_{ik}$, and $\alpha = \frac{x_i^B-\sum_{k=1}^{b-1}W_{ik}}{W_{ib}}$ is the relative position of $x_i^B$ in bin $b$.

Rational Quadratic Spline

The rational quadratic spline was originally defined in [4]. Given $N$ dimensions, $K+1$ knot points $\{(x^{(k)}, y^{(k)}) \}_{k=0}^K$ that are monotonically increasing, with $( x^{(0)}, y^{(0)} ) = (0, 0)$ and $( x^{(K)}, y^{(K)} ) = (1, 1)$, and $K+1$ non-negative derivatives $\{ d^{(k)} \}_{k=0}^K$, the CDF can be calculated using the algorithm from [5], and is reproduced below.

First define bin widths $w^{(k)} = x^{(k+1)} - x^{(k)}$ and slopes $s^{k}=\frac{y^{(k+1)}-y^{(k)}}{w^{(k)}}$. Given the point $x$, the bin in which $x$ falls is $k$, and the fractional distance $x$ is between the two corresponding knots is $\xi=\frac{x-x^{(k)}}{w^{(k)}}$. From this, the CDF can be calculated from:

g(x) = y^{(k)} + \frac{\left(y^{(k+1)} - y^{(k)}\right)\left[s^{(k)}\xi^2+d^{(k)}\xi\left(1-\xi\right)\right]}{s^{(k)}+\left[d^{(k+1)}+d^{(k)}-2s^{(k)}\right]\xi\left(1-\xi\right)}.

Futher documentation

Additional documentation can be found at: doc

Installation

The i-flow program can be installed using pip.

pip install iflow

Requirements

In order to run the i-flow program, there are a few additional packages that are required. The packages are:

TensorFlow (version 2.0 or newer)
TensorFlow-Probability (version 0.8 or newer)
Numpy

In order to run the tests provided in the paper and the iflow_test.py file, a few additional packages are required. These are:

Vegas (to compare to the vegas algorithm)
matplotlib (for plotting the results)
absl-py (for command line parsing)

Usage

Running the script iflow_test.py will produce the results of our paper. It also illustrates how to use i-flow.

Line	Explanation
`0 -- 20`	Import required packages, see Requirements.
`21 -- 24`	Define shortcuts and set precision to float64.
`25 -- 44`	Define command line flags to control program.
`45 -- 325`	The class `TestFunctions` contains the functions to be integrated. This should be replaced by the desired function for integration.
`326 -- 353`	`build()` defines the Neural Network that is used in each Coupling Layer. This can be adapted to the problem at hand, for example by adding more layers or making the layers wider.
`354 -- 379`	`binary_masks()` creates the masking (i.e. which dimensions are transformed and which are passed through a coupling layer) based on the discussion of section III A of our paper.
`380 -- 412`	`build_iflow()` builds the i-flow Integrator object. The most important parts are explained below.
`391`	Defines which masking is used. Can be replaced by a custom list.
`394 -- 397`	Defines the main bijector. `PiecewiseRationalQuadratic` can be replaced by any other coupling function defined in couplings.py.
`408`	Defines which optimizer is used.
`410`	Defines which loss function is used. `'exponential'` can be replaced by any other loss function defined in divergences.py.
`414 -- 442`	`train_iflow()` trains i-flow for a given number of epochs.
`443 -- 472`	`train_iflow_target()` trains i-flow until a target precision is reached.
`473 -- 497`	`sample_iflow()` samples random numbers according to the trained distribution. Not used in the current setup.
`498 -- 528`	Defines helper functions used in the presentation of the results.
`529 -- 925`	Runs the setup defined with the command line flags.

Running

python iflow_test.py -f=Gauss -d=2 -p=5000 -tp=1e-4 -t

gives the result of our first test case, apart from small fluctuations due to a different random seed.

References

George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, and Balaji Lakshminarayanan.
Normalizing Flows for Probabilistic Modeling and Inference. arXiv:1912.02762 link
Laurent Dinh, David Krueger, and Yoshua Bengio.
NICE: non-linear independent components estimation. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Workshop Track Proceedings, 2015. arXiv:1410.8516 link
Thomas Müller, Brian Mcwilliams, Fabrice Rousselle, Markus Gross, and Jan Novák.
Neural Importance Sampling. ACM Trans. Graph., 38(5), October 2019. link
Conor Durkan, Artur Bekasov, Iain Murray, and George Papamakarios.
Neural Spline Flows. Advances in Neural Information Processing Systems, 7511--7522, 2019. link
J. A. Gregory and R. Delbourgo.
Piecewise Rational Quadratic Interpolation to Monotonic Data.
IMA Journal of Numerical Analysis, 2(2):123-130, 04 1982. link

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

This version

1.0.1

Aug 17, 2020

1.0

Aug 12, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

iflow-1.0.1-py3-none-any.whl (44.5 kB view details)

Uploaded Aug 17, 2020 Python 3

File details

Details for the file iflow-1.0.1-py3-none-any.whl.

File metadata

Download URL: iflow-1.0.1-py3-none-any.whl
Upload date: Aug 17, 2020
Size: 44.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/49.3.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.6

File hashes

Hashes for iflow-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`204283f95329a7ccf73ad731158f67d2ba8f8e4533bc9d09c2e505aa9e93b2af`
MD5	`4586a2de09ffb40c3417623c3363e805`
BLAKE2b-256	`ed7905653a5a8ed60865a8e1e20b65d30964f5fe1454969fe9d54e4f94de9680`

See more details on using hashes here.

iflow 1.0.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Citation

Table of Contents

Importance Sampling

Normalizing Flows

Coupling Layers

Linear Spline

Quadratic Spline

Rational Quadratic Spline

Futher documentation

Installation

Requirements

Usage

References

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distributions

Built Distribution

File details

File metadata

File hashes