__doc__

Bayesian average

Version:

0.2.4

Authors

Martino Trassinelli
CNRS, Institute of NanoSciences of Paris
emails: trassinelli AT cnrs.fr, m.trassinelli AT gmail.com

Marleen Maxton
Max Planck Institute for Nuclear Physics, Heidelberg

Homepage

https://github.com/martinit18/bayesian_average

License

Type: X11, see LICENCE.txt

Short description

This package calculates a robust weighted average and its uncertainty from a set of data points and their uncertainties based on Bayesian statistical methods. The proposed weighted average is particularly adapted for inconsistent data sets and the presence of outliers, both of which can distort the results of standard methods.

Basic principles

Given the arrays data and sigma, representing a set of data points $x_i$ and their associated uncertainties $\sigma_i$, this package calculates a weighted average that is particularly suited to inconsistent data sets (those with a spread larger than the associated error bars) and to the presence of outliers.

This robust weighted average is based on Bayesian statistics, assuming a normal distribution for each $x_i$ and considering $\sigma_i$ as a lower bound of the possibly larger real uncertainty $\sigma'$. Two different priors are proposed for $\sigma'$: the non-informative Jeffreys' prior $p(\sigma') \propto 1/ \sigma'$ (more precisely its limit, see Ref. [1]), and a modified version of it $p(\sigma') \propto 1/ (\sigma')^2$ proposed in Ref. [2]. The probability distribution is obtained by marginalizing over $\sigma'$, resulting in a modified Gaussian distribution for each $x_i$ that still depends on $\sigma_i$ and is characterized by smoothly decreasing wings.
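
Carrying out the marginalisation explicitly may help to see where the wings come from. What follows is a sketch derived from the priors stated above, not a quotation from Refs. [1] or [2]; the symbol $\Delta_i = x_i - \mu$ (the distance between a data point and the true value $\mu$) is introduced here for convenience, and the normalisation conventions are assumptions. For the modified prior, normalised on $[\sigma_i, \infty)$ as $p(\sigma') = \sigma_i/(\sigma')^2$:

$$
p(x_i \mid \mu, \sigma_i) = \int_{\sigma_i}^{\infty} \frac{\sigma_i}{(\sigma')^2} \, \frac{1}{\sqrt{2\pi}\,\sigma'} \exp\left[-\frac{\Delta_i^2}{2(\sigma')^2}\right] d\sigma' = \frac{\sigma_i}{\sqrt{2\pi}\,\Delta_i^2} \left[1 - \exp\left(-\frac{\Delta_i^2}{2\sigma_i^2}\right)\right]
$$

For Jeffreys' prior bounded on $[\sigma_i, \sigma_\mathrm{max}]$, that is $p(\sigma') = 1/[\sigma' \ln(\sigma_\mathrm{max}/\sigma_i)]$:

$$
p(x_i \mid \mu, \sigma_i, \sigma_\mathrm{max}) = \frac{\mathrm{erf}\left(\frac{|\Delta_i|}{\sqrt{2}\,\sigma_i}\right) - \mathrm{erf}\left(\frac{|\Delta_i|}{\sqrt{2}\,\sigma_\mathrm{max}}\right)}{2\,|\Delta_i| \ln(\sigma_\mathrm{max}/\sigma_i)}
$$

In the limit $\sigma_\mathrm{max} \to \infty$ the normalisation factor vanishes and only the shape $\mathrm{erf}\left(|\Delta_i|/(\sqrt{2}\,\sigma_i)\right)/|\Delta_i|$ survives. Under these assumptions the wings fall off as $1/\Delta_i^2$ for the modified prior and as $1/|\Delta_i|$ for Jeffreys' prior, instead of the exponential decay of a Gaussian.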

For both priors, the weighted average and its associated uncertainty are obtained numerically using the basinhopping minimisation algorithm.
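
To illustrate the idea behind a basin-hopping search, here is a minimal, standard-library-only sketch. It is not the package's implementation: all function names are hypothetical, the local minimiser is a simple golden-section search, and the per-point likelihood $\left[1 - \exp(-r^2/2)\right]/(r^2 \sigma_i)$ with $r = (x_i - \mu)/\sigma_i$ is the form obtained by marginalising the $\sigma_i/(\sigma')^2$ prior (constant factors dropped). Only the location of the average is estimated here, not its uncertainty.

```python
import math
import random


def neg_log_like_cons(mu, data, sigma):
    """Negative log-likelihood of mu for the sigma_i/sigma'^2 prior (up to a constant)."""
    total = 0.0
    for x, s in zip(data, sigma):
        r2 = ((x - mu) / s) ** 2
        # (1 - exp(-r2/2)) / r2 tends to 1/2 when r2 -> 0
        core = 0.5 if r2 < 1e-12 else -math.expm1(-0.5 * r2) / r2
        total -= math.log(core / s)
    return total


def golden_section(f, a, b, tol=1e-9):
    """Local minimiser: golden-section search for a minimum of f on [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:            # a minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                  # a minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def basin_hopping(f, x0, step, n_iter=50, temperature=1.0, seed=0):
    """Random hop, local minimisation, Metropolis acceptance; returns the best point seen."""
    rng = random.Random(seed)
    x = golden_section(f, x0 - step, x0 + step)
    fx = f(x)
    best_x, best_f = x, fx
    for _ in range(n_iter):
        trial = x + rng.uniform(-step, step)                 # random hop
        x_new = golden_section(f, trial - step, trial + step)  # local search
        f_new = f(x_new)
        # always accept improvements, sometimes accept worse minima
        if f_new < fx or rng.random() < math.exp(-(f_new - fx) / temperature):
            x, fx = x_new, f_new
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f


# Made-up data: four consistent points and one outlier.
data = [10.0, 10.1, 9.9, 10.05, 15.0]
sigma = [0.1, 0.1, 0.1, 0.1, 0.1]
start = sum(data) / len(data)
step = (max(data) - min(data)) / 4.0
mu_hat, nll = basin_hopping(lambda m: neg_log_like_cons(m, data, sigma), start, step)
print(mu_hat)
```

With this made-up data set the likelihood has a second, shallower minimum near the outlier at 15, which is the kind of situation where a purely local minimiser started at the wrong place could get stuck.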

For comparison, both the standard (inverse-variance) weighted average and its value corrected by the Birge ratio are included.
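
For reference, the two comparison methods can be sketched in a few lines of plain Python. This uses the common textbook definitions, with weights $1/\sigma_i^2$ and the Birge ratio taken as $\sqrt{\chi^2/(n-1)}$; the function names are hypothetical and the exact convention used inside the package is an assumption and may differ.

```python
import math


def standard_average(data, sigma):
    """Inverse-variance weighted average and its uncertainty."""
    weights = [1.0 / s ** 2 for s in sigma]
    wsum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, data)) / wsum
    return mean, 1.0 / math.sqrt(wsum)


def birge_average(data, sigma):
    """Same average; uncertainty inflated by the Birge ratio when it exceeds 1."""
    mean, err = standard_average(data, sigma)
    chi2 = sum(((x - mean) / s) ** 2 for x, s in zip(data, sigma))
    birge_ratio = math.sqrt(chi2 / (len(data) - 1))  # assumes len(data) > 1
    return mean, err * max(1.0, birge_ratio)


# Two strongly inconsistent made-up points.
print(standard_average([0.0, 10.0], [1.0, 1.0]))
print(birge_average([0.0, 10.0], [1.0, 1.0]))
```

In this sketch the Birge correction changes only the uncertainty, never the central value, so a single outlier still shifts the average itself.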

How to install it

In your terminal, run:

pip install bayesian_average

How to use it

For the calculation of the weighted average, simply type in your Python shell:

import bayesian_average as ba
ba.average(data, sigma)

data and sigma are two arrays of the same dimension containing the data points and their associated uncertainties, respectively. The averaging method is selected with the keyword mode; the default is Jeffreys' prior (jeffreys). The other available modes are cons, standard, and birge.

ba.average(data, sigma, mode='cons')

Details on the different methods are presented below.

The typical output is:

(6.6742395674538315, 9.74833292573106e-05)

where the first number represents the weighted average and the second represents its estimated uncertainty.

To plot the resulting probability distribution, the weighted average, and the input data, use the following command:

ba.plot_average(data, sigma)

By default, the plot shows Jeffreys' weighted average and its associated probability distribution on a logarithmic scale. Additional plotting options are available, for example:

ba.plot_average(data, sigma, jeffreys_val=True, jeffreys_like=True, plot_data=True)

xxx_val=True displays the value of the weighted average of the xxx method.
xxx_like=True plots the likelihood function of the xxx method (on a logarithmic scale by default).
plot_data=True shows the input data with their corresponding error bars.
legendon=True displays the legend.
linear=True plots the likelihood function on a linear scale.
normalize=True normalises the likelihood function.
showon=True forces the plot to be displayed if it does not appear otherwise.

Details on the available weighted average modes

  • jeffreys: Jeffreys' weighted average (default average, recommended, see Ref. [1])
    The prior on the real uncertainty is the non-informative Jeffreys' prior, proportional to $1/\sigma'$. Because this prior cannot be normalised, the weighted average is defined as the limit of the case with prior bounds $[\sigma_i, \sigma_\mathrm{max}]$ for $\sigma_\mathrm{max} \to \infty$, where $\sigma_i$ is the uncertainty of the data point. The resulting distribution is, however, not a proper (normalisable) probability distribution.
  • cons: Conservative weighted average (adapted for proper probability distributions, see Ref. [2])
    The prior on the real uncertainty is proportional to $\sigma_i/(\sigma')^2$, where $\sigma_i$ is the uncertainty of the data point. The bounds of the prior are $[\sigma_i, \sigma_\mathrm{max}]$ with $\sigma_\mathrm{max} \to \infty$. This is a modified and normalisable version of the non-informative Jeffreys' prior.
  • standard: Standard weighted average
    The standard inverse-variance weighted average useful for comparisons.
  • birge: Standard weighted average corrected with the Birge ratio
    The uncertainty of the weighted average is enhanced by a factor proportional to the $\chi^2$ of the data and the weighted average if $\chi^2 > 1$, following Ref. [3].
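
To get a feeling for how differently these modes treat a distant data point, the single-point distributions can be evaluated numerically. The sketch below is only illustrative: the function names are hypothetical, and the closed forms are those obtained by integrating a Gaussian of width $\sigma'$ over the two priors described above (the Jeffreys' case is given as an unnormalised shape, since its limit is not a proper distribution).

```python
import math


def gaussian_density(x, mu, s):
    """Normal density, the building block of the standard weighted average."""
    return math.exp(-0.5 * ((x - mu) / s) ** 2) / (math.sqrt(2.0 * math.pi) * s)


def cons_density(x, mu, s):
    """Gaussian marginalised over the sigma_i/sigma'^2 prior on [s, infinity)."""
    r2 = ((x - mu) / s) ** 2
    # (1 - exp(-r2/2)) / r2 tends to 1/2 when r2 -> 0
    core = 0.5 if r2 < 1e-12 else -math.expm1(-0.5 * r2) / r2
    return core / (math.sqrt(2.0 * math.pi) * s)


def jeffreys_shape(x, mu, s):
    """Unnormalised shape erf(|d| / (sqrt(2) s)) / |d| of the Jeffreys' limit case."""
    d = abs(x - mu)
    if d < 1e-12 * s:
        return math.sqrt(2.0 / math.pi) / s  # limit for d -> 0
    return math.erf(d / (math.sqrt(2.0) * s)) / d


# Compare the three curves at 1, 3, 5 and 10 standard deviations from the centre.
for n_sigma in (1, 3, 5, 10):
    print(n_sigma,
          gaussian_density(n_sigma, 0.0, 1.0),
          cons_density(n_sigma, 0.0, 1.0),
          jeffreys_shape(n_sigma, 0.0, 1.0))
```

Far from the centre, the Gaussian value collapses exponentially while the other two decrease only as a power of the distance, which is why a distant point is penalised far less severely and pulls on the average far less strongly.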

Reference articles:

[1] M. Trassinelli and M. Maxton, A minimalistic and general weighted average for inconsistent data, arXiv:2406.08293, submitted to Metrologia
[2] D. S. Sivia and J. Skilling, Data Analysis: A Bayesian Tutorial, 2nd ed., Oxford University Press (2006)
[3] R. T. Birge, The Calculation of Errors by the Method of Least Squares, Phys. Rev. 40, 207 (1932)

Version history

  • 0.2: rearrangement of the average function(s), Birge ratio added.
  • 0.1.5: First version available on GitHub with documentation.
  • 0.0.1: First version published in PyPI with conservative, Jeffreys' and standard weighted averages.

