__doc__
Bayesian average
Version:
0.2.1
Authors
Martino Trassinelli
CNRS, Institute of NanoSciences of Paris
emails: trassinelli AT cnrs.fr, m.trassinelli AT gmail.com
Marleen Maxton
Max Planck Institute for Nuclear Physics, Heidelberg
Homepage
https://github.com/martinit18/bayesian_average
License
Type: X11, see LICENCE.txt
Short description
This package calculates a robust weighted average and its uncertainty from a set of data points and their uncertainties based on Bayesian statistical methods. The proposed weighted average is particularly adapted for inconsistent data sets and the presence of outliers, both of which can distort the results of standard methods.
Basic principles
Given the arrays `data` and `sigma`, representing a set of data points $x_i$ and their associated uncertainties $\sigma_i$, this package calculates the corresponding weighted average particularly adapted for inconsistent data sets (with a spread larger than the associated error bars) and the presence of outliers.
This robust weighted average is based on Bayesian statistics, assuming a normal distribution for each $x_i$ and considering $\sigma_i$ as a lower bound of the possibly larger real uncertainty $\sigma'$. Two different priors are proposed for $\sigma'$: the non-informative Jeffreys' prior $p(\sigma') \propto 1/ \sigma'$ (more precisely its limit, see Ref. [1]), and a modified version of it $p(\sigma') \propto 1/ (\sigma')^2$ proposed in Ref. [2]. The probability distribution is obtained by marginalizing over $\sigma'$, resulting in a modified Gaussian distribution for each $x_i$ that still depends on $\sigma_i$ and is characterized by smoothly decreasing wings.
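As a worked sketch of the marginalization described above, the integrals over $\sigma'$ can be carried out in closed form. The symbol $\mu$ for the quantity being averaged, the shorthand $R_i$, and the normalisation of each prior on its interval are notational assumptions made here, not taken from the package. For the modified prior $p(\sigma') = \sigma_i/(\sigma')^2$ on $[\sigma_i, \infty)$:

$$
p(x_i \mid \mu, \sigma_i) = \int_{\sigma_i}^{\infty} \frac{\sigma_i}{(\sigma')^2} \, \frac{1}{\sqrt{2\pi}\,\sigma'} \exp\left[-\frac{(x_i-\mu)^2}{2(\sigma')^2}\right] d\sigma' = \frac{1}{\sqrt{2\pi}\,\sigma_i} \, \frac{1 - e^{-R_i^2/2}}{R_i^2}, \qquad R_i = \frac{x_i - \mu}{\sigma_i}.
$$

For Jeffreys' prior $p(\sigma') = 1/[\sigma' \ln(\sigma_\mathrm{max}/\sigma_i)]$ on $[\sigma_i, \sigma_\mathrm{max}]$:

$$
p(x_i \mid \mu, \sigma_i, \sigma_\mathrm{max}) = \frac{1}{2 \ln(\sigma_\mathrm{max}/\sigma_i)} \, \frac{\mathrm{erf}\left(\frac{x_i-\mu}{\sqrt{2}\,\sigma_i}\right) - \mathrm{erf}\left(\frac{x_i-\mu}{\sqrt{2}\,\sigma_\mathrm{max}}\right)}{x_i - \mu}.
$$

Far from $\mu$, the first expression decreases as $1/(x_i-\mu)^2$, while in the limit $\sigma_\mathrm{max} \to \infty$ the second decreases as $1/|x_i-\mu|$. These power-law wings are much heavier than those of a Gaussian, which is what limits the pull of outliers.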
For both priors, the weighted average and its associated uncertainty are obtained numerically using the `basinhopping` minimisation algorithm.
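The numerical step can be illustrated with a minimal sketch, which is not the package's own implementation. It assumes the $1/(\sigma')^2$ prior, for which the marginalized likelihood of each point is proportional to $(1 - e^{-R_i^2/2})/(R_i^2 \sigma_i)$ with $R_i = (x_i - \mu)/\sigma_i$, and it minimises the negative log-likelihood with `scipy.optimize.basinhopping`. The function names, the starting point, and the step size are hypothetical choices, and only the average (not its uncertainty) is computed.

```python
import numpy as np
from scipy.optimize import basinhopping


def neg_log_like_cons(mu, data, sigma):
    """Negative log-likelihood for the 1/sigma'^2 prior, up to a constant."""
    mu = np.atleast_1d(mu)[0]
    r2 = ((data - mu) / sigma) ** 2
    # (1 - exp(-r2/2)) / r2 tends to 1/2 as r2 -> 0, so guard the 0/0 case.
    small = r2 < 1e-12
    safe_r2 = np.where(small, 1.0, r2)
    ratio = np.where(small, 0.5, -np.expm1(-0.5 * safe_r2) / safe_r2)
    return -np.sum(np.log(ratio / sigma))


def conservative_average(data, sigma, niter=50):
    """Sketch of a robust average: global minimum of the negative log-likelihood."""
    data = np.asarray(data, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Assumption: start from the standard inverse-variance weighted average.
    x0 = [np.average(data, weights=1.0 / sigma**2)]
    # Assumption: random hops of the order of the data spread.
    stepsize = max(data.max() - data.min(), sigma.min())
    result = basinhopping(
        neg_log_like_cons,
        x0,
        niter=niter,
        stepsize=stepsize,
        minimizer_kwargs={"args": (data, sigma), "method": "L-BFGS-B"},
    )
    return float(result.x[0])


# Three consistent points and one outlier at 5.0
print(conservative_average([1.0, 1.1, 0.9, 5.0], [0.1, 0.1, 0.1, 0.1]))
```

A global method is used here because the heavy-tailed likelihood can have several local maxima, typically one per cluster of mutually consistent points, and a purely local minimiser may stop at the wrong one.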
For comparison, both the standard (inverse-variance) weighted average and its value corrected by the Birge ratio are included.
How to install it
In your terminal, run:
pip install bayesian_average
How to use it
For the calculation of the weighted average, simply type in your Python shell:
import bayesian_average as ba
ba.average(data, sigma)
`data` and `sigma` are two arrays of the same dimension containing the data points and the associated uncertainties, respectively. The average mode can be specified using the keyword `mode`; the default is Jeffreys' prior (`jeffreys`). The other available modes are `cons`, `standard`, and `birge`.
ba.average(data, sigma, mode='cons')
Details on the different methods are presented below.
The typical output is:
(6.6742395674538315, 9.74833292573106e-05)
where the first number represents the weighted average and the second represents its estimated uncertainty.
To plot the resulting probability distribution, the weighted average, and the input data, use the following command:
ba.plot_average(data, sigma)
The default mode presents the Jeffreys' weighted average and its associated probability distribution in log-scale. For plotting, additional options are provided, like:
ba.plot_average(data, sigma, jeffreys_val=True, jeffreys_like=True, plot_data=True)
- `xxx_val=True` displays the value of the weighted average of the `xxx` method.
- `xxx_like=True` plots the likelihood function of the `xxx` method (in log-scale by default).
- `plot_data=True` shows the input data with their corresponding error bars.
- `legendon=True` plots the legend.
- `linear=True` plots the likelihood function with a linear scale.
- `normalize=True` normalises the likelihood function.
- `showon=True` can be used in case the plot is not shown.
Details on the available weighted average modes
- `jeffreys`: Jeffreys' weighted average (default average, recommended, see Ref. [1])
  The prior on the real uncertainty $\sigma'$ is the non-informative Jeffreys' prior, proportional to $1/\sigma'$. Because this prior cannot be normalised, the value of the weighted average corresponds to the limit case with prior bounds $[\sigma_i, \sigma_\mathrm{max}]$ and $\sigma_\mathrm{max} \to \infty$, where $\sigma_i$ is the uncertainty of the data point. The final probability distribution is, however, not a proper probability distribution.
- `cons`: Conservative weighted average (adapted for proper probability distributions, see Ref. [2])
  The prior on the real uncertainty $\sigma'$ is proportional to $\sigma_i/(\sigma')^2$, where $\sigma_i$ is the uncertainty of the data point. The bounds of the prior are $[\sigma_i, \sigma_\mathrm{max}]$ with $\sigma_\mathrm{max} \to \infty$. This is a modified and normalisable version of the non-informative Jeffreys' prior.
- `standard`: Standard weighted average
  The standard inverse-variance weighted average, useful for comparisons.
- `birge`: Standard weighted average corrected with the Birge ratio
  The uncertainty of the weighted average is enhanced by a factor proportional to the $\chi^2$ of the data and the weighted average if $\chi^2 > 1$, following Ref. [3].
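The two comparison modes can be written out in a few lines. The following is a minimal sketch, not the package's code: the function names are hypothetical, and the Birge correction assumes the common convention $R_B = \sqrt{\chi^2/(n-1)}$, applied to the uncertainty only when $R_B > 1$. The package may use a different normalisation.

```python
import math


def standard_average(data, sigma):
    """Inverse-variance weighted average and its uncertainty."""
    weights = [1.0 / s**2 for s in sigma]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, data)) / total
    return mean, 1.0 / math.sqrt(total)


def birge_average(data, sigma):
    """Standard average with the uncertainty scaled by the Birge ratio.

    Assumed convention: ratio = sqrt(chi2 / (n - 1)), used only if > 1.
    Requires at least two data points.
    """
    mean, unc = standard_average(data, sigma)
    chi2 = sum(((x - mean) / s) ** 2 for x, s in zip(data, sigma))
    birge_ratio = math.sqrt(chi2 / (len(data) - 1))
    return mean, unc * max(1.0, birge_ratio)


# Two mutually inconsistent points: same average, larger uncertainty
print(standard_average([0.0, 4.0], [1.0, 1.0]))
print(birge_average([0.0, 4.0], [1.0, 1.0]))
```

The Birge correction leaves the central value unchanged and only widens the error bar, so a single strong outlier still shifts the average. This is the situation the `jeffreys` and `cons` modes are meant to handle.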
Reference articles:
[1] M. Trassinelli and M. Maxton, A minimalistic and general weighted average for inconsistent data, in preparation for Metrologia
[2] D. S. Sivia and J. Skilling, Data analysis: a Bayesian tutorial, 2nd ed 2006, Oxford Univ. Press
[3] R. T. Birge, The Calculation of Errors by the Method of Least Squares, Phys. Rev. 40, 207 (1932)
Version history
- 0.2: rearrangement of the average function(s), Birge ratio added.
- 0.1.5: First version available on GitHub with documentation.
- 0.0.1: First version published in PyPI with conservative, Jeffreys' and standard weighted averages.