Probabilistic modeling of tabular data with normalizing flows.

pzflow

If your data consists of continuous variables that can be put into a Pandas DataFrame, pzflow can model the joint probability distribution of your data set.

The Flow class makes building and training a normalizing flow simple. It also allows you to easily sample from the normalizing flow (e.g. for forward modeling or data augmentation), and to calculate posteriors over any of your variables.
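
To make the idea of a normalizing flow concrete, here is a minimal sketch in plain Python. It is not pzflow's implementation or API: the `AffineFlow1D` class and its methods are hypothetical names chosen for illustration, and the bijector is a single 1-D affine map rather than the neural spline couplings pzflow uses. It shows the three operations a flow provides: fitting by maximum likelihood, evaluating a density via the change-of-variables formula, and sampling by pushing latent draws through the bijector.

```python
import math
import random


class AffineFlow1D:
    """Toy 1-D flow (hypothetical, for illustration): x = shift + scale * z, z ~ Normal(0, 1)."""

    def __init__(self, shift=0.0, scale=1.0):
        self.shift = shift
        self.scale = scale

    def inverse(self, x):
        # Map a data point back to the latent space.
        return (x - self.shift) / self.scale

    def log_prob(self, x):
        # Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|,
        # where dz/dx = 1 / scale for an affine map.
        z = self.inverse(x)
        log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
        return log_base - math.log(abs(self.scale))

    def fit(self, data):
        # For an affine map, the maximum-likelihood solution is closed form:
        # the sample mean and the (biased) sample standard deviation.
        n = len(data)
        self.shift = sum(data) / n
        self.scale = math.sqrt(sum((x - self.shift) ** 2 for x in data) / n)
        return self

    def sample(self, n, seed=0):
        # Draw latent samples and push them forward through the bijector.
        rng = random.Random(seed)
        return [self.shift + self.scale * rng.gauss(0.0, 1.0) for _ in range(n)]


data = [1.0, 2.0, 3.0, 4.0, 5.0]
flow = AffineFlow1D().fit(data)
samples = flow.sample(1000, seed=0)
```

A real flow replaces the affine map with a learned, more expressive bijector and fits it by gradient descent on the same log-likelihood objective.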

See this Jupyter notebook for an introduction. See this notebook for a more complicated redshift example.

Installation

You can install pzflow from PyPI with pip:

pip install pzflow

If you want to run pzflow on a GPU with CUDA, you need to follow the GPU-enabled installation instructions for jaxlib here. You may also need to add the following to your .bashrc:

# cuda setup
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PATH=$PATH:/usr/local/cuda/bin

If you have the GPU enabled version of jax installed, but would like to run on a CPU, add the following to the top of your scripts/notebooks:

import jax
# Global flag to set a specific platform, must be used at startup.
jax.config.update('jax_platform_name', 'cpu')

Citation

We are preparing a paper on pzflow. If you are using this package in your research, please check back here for a citation before publication.

Development

To work on pzflow, after forking and cloning the repo:

  1. Create a virtual environment with Python.
    E.g., with conda: conda create -n pzflow
  2. Activate the environment.
    E.g., conda activate pzflow
  3. Install pzflow in editable mode with the dev extras.
    I.e., in the root directory: pip install -e .[dev]

Sources

pzflow was originally designed for forward modeling of photometric redshifts as a part of the Creation Module of the DESC RAIL project. The idea to use normalizing flows for photometric redshifts originated with Bryce Kalmbach. The earliest version of the normalizing flow in RAIL was based on a notebook by Francois Lanusse and included contributions from Alex Malz.

The jax structure of pzflow is largely based on jax-flows by Chris Waites. The implementation of the Neural Spline Coupling is largely based on the TensorFlow implementation, with some inspiration from nflows.

Neural Spline Flows are based on the following papers:

NICE: Non-linear Independent Components Estimation
Laurent Dinh, David Krueger, Yoshua Bengio
arXiv:1410.8516

Density estimation using Real NVP
Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio
arXiv:1605.08803

Neural Spline Flows
Conor Durkan, Artur Bekasov, Iain Murray, George Papamakarios
arXiv:1906.04032
