pzflow
Probabilistic modeling of tabular data with normalizing flows.
If your data consists of continuous variables that can be put into a Pandas DataFrame, pzflow can model the joint probability distribution of your data set.
The Flow class makes building and training a normalizing flow simple.
It also allows you to easily sample from the normalizing flow (e.g. for forward modeling or data augmentation), and to calculate posteriors over any of your variables.
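As a rough illustration of the idea behind a normalizing flow (not pzflow's implementation or API), the sketch below uses only the Python standard library. The class name AffineFlow and its methods are hypothetical, and a single fixed affine map stands in for the learned bijectors a real flow would train. It shows the two operations described above: evaluating a density through the change-of-variables formula, and sampling by pushing latent draws through the inverse map.

```python
import math
import random


def standard_normal_logpdf(z):
    """Log density of a standard normal latent variable."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


class AffineFlow:
    """Toy 1-D flow x = scale * z + shift (hypothetical; not pzflow's Flow)."""

    def __init__(self, shift, scale):
        if scale <= 0:
            raise ValueError("scale must be positive so the map is invertible")
        self.shift = shift
        self.scale = scale

    def forward(self, x):
        """Map a data value to the latent space."""
        return (x - self.shift) / self.scale

    def inverse(self, z):
        """Map a latent value back to the data space."""
        return z * self.scale + self.shift

    def log_prob(self, x):
        """Change of variables: log p(x) = log p_z(f(x)) + log |df/dx|."""
        log_det = -math.log(self.scale)  # df/dx = 1 / scale
        return standard_normal_logpdf(self.forward(x)) + log_det

    def sample(self, n, seed=0):
        """Draw latent samples and push them through the inverse map."""
        rng = random.Random(seed)
        return [self.inverse(rng.gauss(0.0, 1.0)) for _ in range(n)]


flow = AffineFlow(shift=2.0, scale=3.0)
print(flow.log_prob(2.0))       # log density at the mode of the toy model
print(flow.sample(3, seed=42))  # three reproducible draws
```

In a trained flow the affine map is replaced by a chain of learned bijectors, but the density and sampling logic follow the same pattern.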
There are several example notebooks demonstrating how to use pzflow:
- An introduction to a basic flow with the two moons data set
- A more complex example with galaxy redshifts
- An example of building a conditional flow on redshift data
- An example of using a more complicated joint latent distribution to model data with periodic topology
If you notice any bugs or have any questions, feel free to reach out!
Citation
We are preparing a paper on pzflow. If you use this package in your research, please check back here for a citation before publication. In the meantime, please cite the Zenodo release.
Installation
You can install pzflow from PyPI with pip:
pip install pzflow
If you want to run pzflow on a GPU with CUDA, you need to follow the GPU-enabled installation instructions for jaxlib.
You may also need to add the following to your .bashrc:
# cuda setup
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PATH=$PATH:/usr/local/cuda/bin
If you have the GPU-enabled version of jax installed but would like to run on a CPU, add the following to the top of your scripts/notebooks:
import jax
# Global flag to set a specific platform, must be used at startup.
jax.config.update('jax_platform_name', 'cpu')
Note that if you run jax on GPU in multiple Jupyter notebooks simultaneously, you may get RuntimeError: cuSolver internal error.
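One commonly suggested mitigation when several processes share a GPU is to stop jax from preallocating most of the GPU memory at startup. The sketch below assumes the error comes from that preallocation, which may not hold in every setup. XLA_PYTHON_CLIENT_PREALLOCATE and XLA_PYTHON_CLIENT_MEM_FRACTION are environment variables described in the jax GPU memory allocation documentation; the helper name configure_jax_gpu_memory is hypothetical.

```python
import os


def configure_jax_gpu_memory(preallocate=False, mem_fraction=None):
    """Hypothetical helper: set jax GPU memory environment variables.

    Call this before the first `import jax` in the process, because
    the variables are read when the GPU backend initializes.
    """
    os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "true" if preallocate else "false"
    if mem_fraction is not None:
        if not 0.0 < mem_fraction <= 1.0:
            raise ValueError("mem_fraction must be in the interval (0, 1]")
        os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = f"{mem_fraction:.2f}"


# Disable preallocation so each notebook only takes memory as it needs it.
configure_jax_gpu_memory(preallocate=False)

# import jax  # import jax only after the variables above are set
```

Setting the variables in each notebook's first cell, before anything imports jax, keeps the configuration local to that notebook.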
Development
To work on pzflow, after forking and cloning the repo:
- Create a virtual environment with Python.
  E.g., with conda: conda create -n pzflow
- Activate the environment.
  E.g., conda activate pzflow
- Install pzflow in edit mode with the dev flag.
  I.e., in the root directory: pip install -e .[dev]
Sources
pzflow was originally designed for forward modeling of photometric redshifts as a part of the Creation Module of the DESC RAIL project. The idea to use normalizing flows for photometric redshifts originated with Bryce Kalmbach. The earliest version of the normalizing flow in RAIL was based on a notebook by Francois Lanusse and included contributions from Alex Malz.
The functional jax structure of the bijectors was originally based on jax-flows by Chris Waites. The implementation of the Neural Spline Coupling is largely based on the TensorFlow implementation, with some inspiration from nflows.
Neural Spline Flows are based on the following papers:
NICE: Non-linear Independent Components Estimation
Laurent Dinh, David Krueger, Yoshua Bengio
arXiv:1410.8516
Density estimation using Real NVP
Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio
arXiv:1605.08803
Neural Spline Flows
Conor Durkan, Artur Bekasov, Iain Murray, George Papamakarios
arXiv:1906.04032