A Copula-GP (Gaussian Process) package
Parametric Copula-GP framework
This is a GPyTorch-based package that infers copula parameters using a latent Gaussian Process model. The package provides four copula families (Gaussian, Frank, Clayton, Gumbel) as well as linear combinations of copulas from the same or different families. Models are constructed with either a greedy or a heuristic algorithm, and the best model is selected based on WAIC. Both algorithms perform well on synthetic data (see tests/integration). The bivariate models can then be organised into a C-Vine. Several methods for computing information measures (e.g. vine.entropy, vine.inputMI) are implemented. For a complete description of our method, see our paper (cited below).
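To make the WAIC-based selection step more concrete, here is a minimal sketch of how a WAIC score could be computed from pointwise log-likelihoods evaluated under several posterior samples. The function name `waic_per_point`, the sign convention (lower is better) and the per-point normalisation are assumptions made for illustration; the package's own implementation may differ.

```python
import math
import statistics


def waic_per_point(log_lik):
    """Hypothetical helper: WAIC from log_lik[s][i], the log-likelihood
    of data point i under posterior sample s (lower is better here)."""
    n_samples = len(log_lik)
    n_points = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters
    for i in range(n_points):
        column = [log_lik[s][i] for s in range(n_samples)]
        m = max(column)  # log-sum-exp trick for numerical stability
        lppd += m + math.log(sum(math.exp(v - m) for v in column) / n_samples)
        p_waic += statistics.variance(column)
    # Assumed convention: negate and normalise by the number of points.
    return -(lppd - p_waic) / n_points


# Toy log-likelihoods for two candidate models (2 samples x 3 points each).
model_a = [[0.10, 0.20, 0.15], [0.12, 0.18, 0.16]]
model_b = [[0.30, 0.25, 0.35], [0.28, 0.27, 0.33]]
candidates = [("A", model_a), ("B", model_b)]
best_name, _ = min(candidates, key=lambda kv: waic_per_point(kv[1]))
print(best_name)
```

Under this convention, the candidate with the smallest score is kept, which is how the toy example picks between the two models.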
Installing the package from PyPI
pip install copulagp
Installing the package from the GitHub repo
In a virtual environment (e.g. virtualenv), install all the dependencies and the package using the following commands:
pip install -r requirements.txt
pip install .
Getting started
Let us start by importing PyTorch and loading some data (e.g. the synthetic neuronal data generated with a GLM model, Fig. 3 in our pre-print):
import torch
import pickle as pkl
with open("./notebooks/started/GLM_generated_data.pkl", "rb") as f:
    data = pkl.load(f)
Next, we use fastKDE to transform the marginals:
import copulagp.marginal as mg
y = torch.zeros(data['Y'].shape)
for i in range(2):
    # transform each marginal to a uniform distribution on [0, 1]
    y[i] = torch.tensor(mg.fast_signal2uniform(data['Y'][i], data['X']))
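For intuition about what transforming the marginals means, the sketch below applies a plain empirical-CDF (rank) transform to a one-dimensional signal. This is a simplified stand-in, not the fastKDE-based method used by `fast_signal2uniform`, which also takes `data['X']` as an argument; the helper name `empirical_uniform` is hypothetical.

```python
def empirical_uniform(samples):
    """Hypothetical helper: map samples to (0, 1) by their ranks.

    Uses rank / (n + 1) so that no value lands exactly on 0 or 1,
    where copula densities can diverge.
    """
    n = len(samples)
    order = sorted(range(n), key=lambda i: samples[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


signal = [2.3, -0.7, 5.1, 0.4, 1.9]
u = empirical_uniform(signal)
print(u)
```

The transform keeps the ordering of the data while discarding the shape of the marginal distribution, so that only the dependence structure is left for the copula to model.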
Next, let us try a Clayton copula model on this data (optionally on a GPU; this should take around 30 seconds):
from copulagp import bvcopula
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'  # fall back to the CPU when no GPU is available
train_x = torch.tensor(data['X']).float().to(device=device)
train_y = y.T.float().to(device=device)
likelihoods = [bvcopula.ClaytonCopula_Likelihood(rotation='90°')]
(waic, model) = bvcopula.infer(likelihoods,train_x,train_y,device=device, prior_rbf_length=2.0)
print(f"WAIC: {waic}") # waic = -0.119
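To illustrate what a rotated Clayton copula represents, the following sketch draws samples from a Clayton copula with a fixed parameter by conditional inversion and then applies a 90° rotation. The rotation convention used here, (u, v) → (1 − u, v), is an assumption and may not match the one in `bvcopula`; all function names are hypothetical, and the parameter is held constant rather than driven by a GP.

```python
import random


def sample_clayton(theta, n, rng):
    """Draw n pairs (u, v) from a Clayton copula via conditional inversion."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
        w = 1.0 - rng.random()  # uniform draw for the conditional CDF
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs


def rotate_90(pairs):
    """Assumed 90-degree rotation: flip the first coordinate."""
    return [(1.0 - u, v) for u, v in pairs]


def kendall_tau(pairs):
    """Kendall's tau by direct pair counting (O(n^2))."""
    n = len(pairs)
    s = 0
    for i in range(n):
        ui, vi = pairs[i]
        for j in range(i + 1, n):
            uj, vj = pairs[j]
            prod = (ui - uj) * (vi - vj)
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


rng = random.Random(0)
samples = sample_clayton(theta=2.0, n=1000, rng=rng)
tau = kendall_tau(samples)                 # theory: theta / (theta + 2) = 0.5
tau_rot = kendall_tau(rotate_90(samples))  # rotation flips the sign
print(tau, tau_rot)
```

The unrotated Clayton copula can only describe positive dependence, so a rotated variant is the natural choice when two variables are negatively related.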
Let us plot the results using the Plot_Fit helper:
from copulagp.utils import Plot_Fit
Plot_Fit(model, data['X'], y.numpy().T,'Excitatory', 'Inhibitory', device);
We can then sample from the GP model and calculate the conditional entropy of the copula model. Up to sign, this copula entropy equals the mutual information between the two variables (the copula entropy is the negative of the mutual information). By sampling from the GP, we obtain confidence intervals for this mutual information:
import matplotlib.pyplot as plt
test_x = torch.linspace(0,1,200).float().to(device=device)
entropies = torch.zeros(10,200)
for i in range(10):
    f = model.gp_model(test_x).rsample(torch.Size([1])) # sample from a GP
    copula = model.likelihood.get_copula(f.squeeze()) # initialize a copula, parameterized by that GP sample
    entropies[i] = copula.entropy(sem_tol=0.01, mc_size=1000).cpu() # calculate entropy
entropies = entropies.numpy()
plt.plot(test_x.cpu().numpy(),entropies.mean(0))
plt.fill_between(test_x.cpu().numpy(),entropies.mean(0)-entropies.std(0),entropies.mean(0)+entropies.std(0),alpha=0.2)
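The relation between copula entropy and mutual information can be checked on a case with a known answer. The sketch below estimates the entropy of a Gaussian copula with fixed correlation by Monte Carlo and compares it with the analytic mutual information, −½ log(1 − ρ²). It is a standalone illustration with hypothetical function names and does not reproduce the package's `entropy` method or its `sem_tol` stopping rule.

```python
import math
import random


def gaussian_copula_log_density(z1, z2, rho):
    """Log-density of the Gaussian copula, written in normal scores z = Phi^{-1}(u)."""
    one_minus = 1.0 - rho * rho
    quad = (rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2) / (2.0 * one_minus)
    return -0.5 * math.log(one_minus) - quad


def mc_copula_entropy(rho, n, rng):
    """Monte Carlo estimate of H(c) = -E[log c(U, V)]."""
    total = 0.0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        total += gaussian_copula_log_density(z1, z2, rho)
    return -total / n


rng = random.Random(0)
rho = 0.7
entropy = mc_copula_entropy(rho, 20000, rng)
analytic_mi = -0.5 * math.log(1.0 - rho * rho)
print(f"copula entropy: {entropy:.3f}, analytic MI: {analytic_mi:.3f}")
```

The estimated entropy is negative and its magnitude approximates the mutual information, so stronger dependence corresponds to a more negative copula entropy.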
Note that the Clayton copula is not the best-fitting model for this example. We can find the best one by using one of the model selection algorithms (e.g. heuristic):
from copulagp import select_copula
(store, waic) = select_copula.select_with_heuristics(data['X'], y.numpy().T, device, 'cond',
    './', 'Excitatory', 'Inhibitory', train_x=train_x, train_y=train_y)
print(f"Best model: {store.name_string}, WAIC: {waic}") # best_waic = -0.139
The best copula found by the heuristic algorithm is a mixture of Frank and Clayton. We can visualize this model and calculate its entropy using the same code as for the Clayton copula (see the results in notebooks/Getting_started.ipynb).
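As a rough illustration of what a linear combination of copulas means, the sketch below evaluates the density of a Frank and Clayton mixture with constant parameters and constant mixing weights, and checks it against a finite-difference mixed derivative of the mixture CDF. The parameter values, the weights and all function names are assumptions chosen for the example; in the package these quantities are inferred rather than fixed.

```python
import math


def clayton_cdf(u, v, theta):
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def clayton_density(u, v, theta):
    base = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * base ** (-1.0 / theta - 2.0)


def frank_cdf(u, v, theta):
    num = (math.exp(-theta * u) - 1.0) * (math.exp(-theta * v) - 1.0)
    return -math.log(1.0 + num / (math.exp(-theta) - 1.0)) / theta


def frank_density(u, v, theta):
    a = 1.0 - math.exp(-theta)
    b = (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))
    return theta * a * math.exp(-theta * (u + v)) / (a - b) ** 2


def mixture(u, v, weights, components):
    """Weighted sum of component functions; weights are assumed to sum to 1."""
    return sum(w * c(u, v) for w, c in zip(weights, components))


weights = [0.4, 0.6]  # illustrative mixing weights
densities = [lambda u, v: frank_density(u, v, 5.0),
             lambda u, v: clayton_density(u, v, 2.0)]
cdfs = [lambda u, v: frank_cdf(u, v, 5.0),
        lambda u, v: clayton_cdf(u, v, 2.0)]

u0, v0, h = 0.3, 0.6, 1e-3
density_value = mixture(u0, v0, weights, densities)
# Central finite difference of d^2 C / (du dv) as a consistency check.
fd_value = (mixture(u0 + h, v0 + h, weights, cdfs)
            - mixture(u0 + h, v0 - h, weights, cdfs)
            - mixture(u0 - h, v0 + h, weights, cdfs)
            + mixture(u0 - h, v0 - h, weights, cdfs)) / (4.0 * h * h)
print(density_value, fd_value)
```

Because a convex combination of copulas is itself a copula, the mixture can blend the symmetric dependence of Frank with the tail dependence of Clayton in a single model.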
More notebooks with examples and the code that generated the figures for our paper can be found in notebooks/.
Citation
If you find our Copula-GP package useful, please consider citing our work:
@article{kudryashova2022parametric,
title={Parametric Copula-GP model for analyzing multidimensional neuronal and behavioral relationships},
author={Kudryashova, Nina and Amvrosiadis, Theoklitos and Dupuy, Nathalie and Rochefort, Nathalie and Onken, Arno},
journal={PLoS computational biology},
volume={18},
number={1},
pages={e1009799},
year={2022},
publisher={Public Library of Science San Francisco, CA USA}
}