A collection of tools for analysing cmdstanpy output, written in Python

License
- Public Domain
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

A Python library for analysing cmdstanpy output

This is a collection of functions for analysing the output of the cmdstanpy library. The main idea is to do a quick data analysis by calling a single function that produces:

traceplots of samples,
text and plots of the summaries of model parameters,
histograms and pair plots of posterior distributions of parameters.

The only known illustration of a tarpan made from life, depicting a five month old colt (Borisov, 1841). Source: Wikimedia Commons.

Setup

First, run:

pip install tarpan

Then install cmdstan by running:

install_cmdstan

Complete analysis: `save_analysis`

This is the main function of the library. It saves summaries and trace, pair and tree plots in the model_info directory. The function is useful when you want to generate all types of summaries and plots at once.

from cmdstanpy import CmdStanModel
from tarpan.cmdstanpy.analyse import save_analysis

model = CmdStanModel(stan_file="your_model.stan")
fit = model.sample(data=your_data)
save_analysis(fit, param_names=['mu', 'sigma'])

If you don't need everything, you can call individual functions described below to make just one type of plot or a summary.

Summary: `save_summary`

Creates a summary of parameter distributions and saves it in text and CSV files.

from cmdstanpy import CmdStanModel
from tarpan.cmdstanpy.summary import save_summary

model = CmdStanModel(stan_file="your_model.stan")
fit = model.sample(data=your_data)
save_summary(fit, param_names=['mu', 'tau', 'eta.1'])

The text summary is formatted so that it can be pasted into a Markdown file on GitHub, GitLab or Bitbucket, like this:

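| Name  | Mean | Std  | Mode | +    | -    | 68CI- | 68CI+ | 95CI- | 95CI+ | N_Eff | R_hat |
|:------|-----:|-----:|-----:|-----:|-----:|------:|------:|------:|------:|------:|------:|
| mu    | 8.05 | 5.12 | 7.53 | 4.63 | 4.59 | 2.93  | 12.16 | -1.84 | 18.74 | 1540  | 1.00  |
| tau   | 6.41 | 5.72 | 2.36 | 5.41 | 2.35 | 0.00  | 7.76  | 0.00  | 17.07 | 1175  | 1.00  |
| eta.1 | 0.39 | 0.92 | 0.60 | 0.71 | 1.13 | -0.53 | 1.31  | -1.48 | 2.19  | 3505  | 1.00  |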

Summary columns

Name, Mean, Std are the name of the parameter, its mean and its standard deviation.
68CI-, 68CI+, 95CI-, 95CI+ are the bounds of the 68% and 95% HPDIs (highest posterior density intervals). These values are configurable.
Mode, +, - are the mode of the distribution and its upper and lower uncertainties, which are calculated as the distances from the mode to the bounds of the 68% HPDI.
N_Eff is Stan's effective sample size, the higher the better.
R_hat is Stan's convergence diagnostic, which compares the variation between chains with the variation within each chain. It should be close to 1.00; values noticeably above that (a common rule of thumb is roughly 1.05 or more) suggest that the chains have not mixed well. After fitting a model I usually look at this column first to see if the sampling was good.
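
To make the relation between the Mode, + and - columns and the 68% interval concrete, here is a minimal sketch in plain Python. It is not tarpan's implementation: `hpdi` and `mode_uncertainties` are hypothetical helpers written for this illustration, and the interval is found with a simple narrowest-window search over the sorted samples.

```python
def hpdi(samples, probability):
    """Return the narrowest interval that contains `probability` of the samples.

    Hypothetical helper for illustration, not part of tarpan.
    """
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples that must fall inside the interval
    window = min(n, max(1, int(round(probability * n))))
    # Slide a window of that many samples and keep the narrowest one
    start = min(range(n - window + 1),
                key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]


def mode_uncertainties(mode, interval):
    """Return (+, -): distances from the mode to the interval bounds."""
    lower, upper = interval
    return upper - mode, mode - lower


# A small, skewed set of made-up samples
samples = [0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.9, 1.8, 3.0, 7.5]
print(hpdi(samples, 0.68))

# Check against the rounded `mu` row of the summary table:
# Mode = 7.53, 68CI- = 2.93, 68CI+ = 12.16
plus, minus = mode_uncertainties(7.53, (2.93, 12.16))
print(f"+{plus:.2f} -{minus:.2f}")
```

For the `mu` row the rounded table values give +4.63 and -4.60. The table lists 4.59 for the lower uncertainty, presumably because it was computed from unrounded values.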

Tree plot: `save_tree_plot`

This function shows exactly the same information as save_summary, but in the form of a plot. The markers are the modes of the distributions, and the two error bars indicate the 68% and 95% HPDIs (highest posterior density intervals).

from cmdstanpy import CmdStanModel
from tarpan.cmdstanpy.tree_plot import save_tree_plot

model = CmdStanModel(stan_file="your_model.stan")
fit = model.sample(data=your_data)
save_tree_plot([fit], param_names=['mu', 'sigma'])

Comparing multiple models on a tree plot

Supply multiple fits in order to compare parameters from multiple models.

from cmdstanpy import CmdStanModel
from tarpan.cmdstanpy.tree_plot import save_tree_plot
from tarpan.shared.tree_plot import TreePlotParams

# Sample from two models
model1 = CmdStanModel(stan_file="your_model1.stan")
fit1 = model1.sample(data=your_data)
model2 = CmdStanModel(stan_file="your_model2.stan")
fit2 = model2.sample(data=your_data)

# Supply legend labels (optional)
tree_params = TreePlotParams()
tree_params.labels = ["Model 1", "Model 2", "Exact"]
data = [{ "mu": 2.2, "tau": 1.3 }]  # Add extra markers (optional)

save_tree_plot([fit1, fit2], extra_values=data, param_names=['mu', 'tau'],
               tree_params=tree_params)

Trace plot: `save_traceplot`

The plot shows the sampled values of the parameters. Different colors correspond to samples from different chains. Ideally, the lines of different colors on the left plots are well mixed, and the right plot is fairly uniform.

from cmdstanpy import CmdStanModel
from tarpan.cmdstanpy.traceplot import save_traceplot

model = CmdStanModel(stan_file="your_model.stan")
fit = model.sample(data=your_data)
save_traceplot(fit, param_names=['mu', 'tau', 'eta.1'])

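
How well the chains are mixed can also be expressed as a number. The sketch below computes a basic Gelman-Rubin style R-hat for one parameter from per-chain draws, using only the standard library. It is an illustration under simplifying assumptions, not the calculation behind the R_hat column (Stan uses a more refined, split-chain variant), and `basic_r_hat` is a hypothetical name.

```python
import statistics


def basic_r_hat(chains):
    """Basic (non-split) R-hat for one parameter.

    `chains` is a list of equal-length lists of draws, one list per chain.
    Hypothetical helper for illustration, not part of tarpan or Stan.
    """
    n = len(chains[0])  # Draws per chain
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average of the within-chain variances
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: between-chain variance, scaled by the number of draws
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Made-up draws: chains that agree, and chains stuck in different places
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 2.0]]
stuck = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

print(basic_r_hat(agreeing))  # Close to 1
print(basic_r_hat(stuck))     # Far above 1
```

Chains that explore the same region give a value near 1, while chains stuck in different regions give a much larger value. On the trace plot this corresponds to lines of different colors that do not overlap.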

Pair plot: `save_pair_plot`

The plot helps to see correlations between parameters and to spot funnel-shaped distributions that can result in sampling problems.

from cmdstanpy import CmdStanModel
from tarpan.cmdstanpy.pair_plot import save_pair_plot

model = CmdStanModel(stan_file="your_model.stan")
fit = model.sample(data=your_data)
save_pair_plot(fit, param_names=['mu', 'tau', 'eta.1'])

Histogram: `save_histogram`

Saves histograms of the parameter distributions.

from cmdstanpy import CmdStanModel
from tarpan.cmdstanpy.histogram import save_histogram

model = CmdStanModel(stan_file="your_model.stan")
fit = model.sample(data=your_data)
save_histogram(fit, param_names=['mu', 'tau', 'eta.1', 'theta.1'])

Saving cmdstan samples to disk

Sampling can take a long time, so it helps to sample the model once and save the results to disk. On the next run the saved results are read from disk instead of sampling again. This can be done with the run function:

from cmdstanpy import CmdStanModel
from tarpan.cmdstanpy.cache import run

# Your function that creates CmdStanModel, runs its `sample` method
# and returns the result.
#
# This function must take `output_dir` input parameter and pass it to `sample`.
#
# It may also have any other parameters you wish to pass from `run`.
def run_stan(output_dir, other_param):
    model = CmdStanModel(stan_file="my_model.stan")

    fit = model.sample(
        data=data,  # Your model's input data, defined elsewhere
        output_dir=output_dir  # Pass to make CSVs in correct location
    )

    return fit  # Return the fit

# Will run `run_stan` once, save model to disk and read it on next calls
fit = run(func=run_stan, other_param="some data")

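
The idea behind this kind of caching can be sketched in a few lines: run the expensive function once, store its result on disk, and load the stored result on later calls. The example below is a generic illustration using `pickle`, not the code behind tarpan's `run`. The names `cached_call` and `slow_computation` and the cache file name are made up for the example.

```python
import pickle
import tempfile
from pathlib import Path


def cached_call(func, cache_path, **kwargs):
    """Return the cached result if it exists, otherwise call `func` and cache it.

    Hypothetical helper for illustration, not part of tarpan.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # A previous run already stored the result: read it back
        with cache_path.open("rb") as cache_file:
            return pickle.load(cache_file)

    result = func(**kwargs)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as cache_file:
        pickle.dump(result, cache_file)
    return result


calls = []  # Records how many times the slow function really ran


def slow_computation(scale):
    calls.append(scale)
    return [scale * i for i in range(3)]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cache" / "result.pkl"
    first = cached_call(slow_computation, path, scale=2)   # Runs the function
    second = cached_call(slow_computation, path, scale=2)  # Reads from disk

print(first, second, len(calls))
```

This sketch ignores its arguments when deciding whether to reuse the cache, so the cache file has to be deleted by hand after the inputs change.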

Scatter and KDE plot

The save_scatter_and_kde function saves a scatter plot and the corresponding KDE (kernel density estimate) plot. The KDE plot takes into account the uncertainties of the individual values:

from tarpan.plot.kde import save_scatter_and_kde

save_scatter_and_kde(values=[1, 1.3, 1.5, 7, 4.9],
                     uncertainties=[0.1, 0.6, 0.35, 0.41, 0.03])

The gaussian_kde function is also available. It returns the values for a KDE plot:

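from tarpan.plot.kde import gaussian_kde
import numpy as np
import matplotlib.pyplot as plt

# Example measurements and their uncertainties (illustrative values)
values = np.array([0.2, 0.5, 0.55])
uncert = np.array([0.05, 0.1, 0.02])

x = np.linspace(0, 1, 100)  # Points at which the KDE is evaluated
y = gaussian_kde(x, values, uncert)
plt.fill_between(x, y)
plt.show()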
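
One plausible way to account for individual uncertainties in a KDE is to let every value contribute a Gaussian whose standard deviation is that value's uncertainty, and then average the contributions. The sketch below does this in plain Python. It is an assumption about the general technique, not a description of how tarpan's `gaussian_kde` is implemented, and `kde_with_uncertainties` is a hypothetical name.

```python
import math


def kde_with_uncertainties(x_grid, values, uncertainties):
    """Average of Gaussians centred on `values` with widths `uncertainties`.

    Hypothetical helper for illustration, not part of tarpan.
    """
    density = []
    for x in x_grid:
        total = 0.0
        for value, sigma in zip(values, uncertainties):
            z = (x - value) / sigma
            # Normal probability density of this value evaluated at x
            total += math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))
        density.append(total / len(values))
    return density


# Same numbers as in the save_scatter_and_kde example
values = [1, 1.3, 1.5, 7, 4.9]
uncertainties = [0.1, 0.6, 0.35, 0.41, 0.03]

step = 0.01
x_grid = [i * step for i in range(-300, 1201)]  # From -3 to 12
density = kde_with_uncertainties(x_grid, values, uncertainties)

# The curve should integrate to approximately 1
area = sum(density) * step
print(round(area, 3))
```

With this construction a precisely measured value, such as 4.9 with an uncertainty of 0.03, produces a tall narrow peak, while an uncertain value, such as 1.3 with an uncertainty of 0.6, is spread over a wide range.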

Common questions

Run unit tests

pytest

The unlicense

This work is in the public domain.

🐴🐴🐴

This work is dedicated to Tarpan, an extinct subspecies of wild horse.

Release history

- 0.3.11 (Feb 10, 2021)
- 0.3.10 (Feb 10, 2021)
- 0.3.9 (Mar 28, 2020)
- 0.3.8 (Mar 12, 2020)
- 0.3.7 (Mar 1, 2020)
- 0.3.6 (Feb 19, 2020)
- 0.3.5 (Feb 15, 2020)
- 0.3.4 (Feb 12, 2020)
- 0.3.3 (Feb 12, 2020)
- 0.3.2 (Feb 11, 2020)
- 0.3.1 (Feb 1, 2020)
- 0.3.0 (Jan 31, 2020)
- 0.2.9 (Jan 31, 2020)
- 0.2.8 (Jan 31, 2020), this version
- 0.2.7 (Jan 31, 2020)
- 0.2.6 (Jan 30, 2020)
- 0.2.5 (Jan 28, 2020)
- 0.2.4 (Jan 27, 2020)
- 0.2.2 (Jan 26, 2020)
- 0.2.1 (Jan 26, 2020)
- 0.2.0 (Jan 26, 2020)
- 0.1.9 (Jan 25, 2020)
- 0.1.8 (Jan 25, 2020)
- 0.1.7 (Jan 25, 2020)
- 0.1.5 (Jan 23, 2020)
- 0.1.1 (Jan 22, 2020)

Download files

Source Distribution

tarpan-0.2.8.tar.gz (20.4 kB)

Uploaded Jan 31, 2020 Source

Built Distribution

tarpan-0.2.8-py3-none-any.whl (27.1 kB)

Uploaded Jan 31, 2020 Python 3

File details

Details for the file tarpan-0.2.8.tar.gz.

File metadata

Download URL: tarpan-0.2.8.tar.gz
Upload date: Jan 31, 2020
Size: 20.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.4

File hashes

Hashes for tarpan-0.2.8.tar.gz
Algorithm	Hash digest
SHA256	`bc737aafe1f36e74fbe0e33a20e106ca92a540aaac8924b4beee36bdb196174f`
MD5	`443f139d0f51226f3ecbf1a10ad89abe`
BLAKE2b-256	`64b46da0b0fa64a93df94b7b0d69a852375d02cc5f7c7fee4f6d171bf0e352c8`

File details

Details for the file tarpan-0.2.8-py3-none-any.whl.

File metadata

Download URL: tarpan-0.2.8-py3-none-any.whl
Upload date: Jan 31, 2020
Size: 27.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.4

File hashes

Hashes for tarpan-0.2.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cbd79f41d10eab0162cb94bd0ba889fe1be7f4bf5251625da02795c86c97a58c`
MD5	`3d4471d71ff57dfd97d04c0b83993eb2`
BLAKE2b-256	`ec31fefe474af3c211602fe11f0e493eb294e465de9c51171e9e5b2e58864493`

tarpan 0.2.8
