GOFevaluation

Evaluate the Goodness-of-Fit (GoF) for binned or unbinned data.

This GoF suite calculates a range of GoF measures, 1D or nD, binned or two-sample (unbinned), together with the corresponding p-values. The implemented measures are listed below.

Implemented GoF measures

| GoF measure | Class | data input | reference input | dim |
|---|---|---|---|---|
| Kolmogorov-Smirnov | KSTestGOF | sample | binned | 1D |
| Two-Sample Kolmogorov-Smirnov | KSTestTwoSampleGOF | sample | sample | 1D |
| Two-Sample Anderson-Darling | ADTestTwoSampleGOF | sample | sample | 1D |
| Poisson Chi2 | BinnedPoissonChi2GOF | binned / sample | binned | nD |
| Chi2 | BinnedChi2GOF | binned / sample | binned | nD |
| Point-to-point | PointToPointGOF | sample | sample | nD |
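
To give a feeling for what a binned measure computes, here is a minimal sketch of a Poisson chi2 between binned data and a binned reference. It assumes the common likelihood-ratio (Baker-Cousins) definition; whether BinnedPoissonChi2GOF uses exactly this form is an assumption, and the helper name `poisson_chi2` is hypothetical and not part of the package.

```python
import numpy as np


def poisson_chi2(observed, expected):
    """Poisson likelihood-ratio chi2 (Baker-Cousins form), summed over all bins.

    Hypothetical helper for illustration; not taken from GOFevaluation.
    """
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    # Bins with zero observed counts contribute only the (E - O) term,
    # because O * ln(O / E) goes to 0 as O goes to 0.
    log_term = np.zeros_like(observed)
    nonzero = observed > 0
    log_term[nonzero] = observed[nonzero] * np.log(
        observed[nonzero] / expected[nonzero])
    return 2.0 * np.sum(expected - observed + log_term)


# The sum runs over every bin, so nD histograms work unchanged.
binned_data = np.array([[12, 9], [7, 11]])
binned_reference = np.array([[10.0, 10.0], [10.0, 10.0]])
chi2 = poisson_chi2(binned_data, binned_reference)
```

The statistic is zero when data and reference agree bin by bin and grows as they diverge.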

Installation and Set-Up

Clone the repository:

git clone https://github.com/XENONnT/GOFevaluation
cd GOFevaluation

Install the requirements in your environment:

pip install -r requirements.txt

Then install the package:

python setup.py install --user

You are now good to go!

Usage

Individual GoF Measures

Depending on your data and reference input you can initialise a gof_object in one of the following ways:

import GOFevaluation as ge

# Data Sample + Binned PDF
gof_object = ge.BinnedPoissonChi2GOF(data_sample, pdf, bin_edges, nevents_expected)

# Binned Data + Binned PDF
gof_object = ge.BinnedPoissonChi2GOF.from_binned(binned_data, binned_reference)

# Data Sample + Reference Sample
gof_object = ge.PointToPointGOF(data_sample, reference_sample)
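
The inputs named above are not defined in the snippet, so here is one way they could be prepared with NumPy and SciPy for a 1D toy case. This is a sketch under assumptions: that `pdf` holds the probability content of each bin normalised to 1, and that `bin_edges` is a plain array of edges. Check the package documentation for the conventions it actually expects.

```python
import numpy as np
import scipy.stats as sps

# Toy 1D data set; in practice this is your measured sample.
data_sample = sps.uniform.rvs(size=100, random_state=200)

# Ten equal-width bins on [0, 1].
bin_edges = np.linspace(0, 1, 11)

# Binned PDF: probability content of each bin under the model
# (assumed here to be normalised to 1).
pdf = np.diff(sps.uniform.cdf(bin_edges))

# Expected total number of events under the model.
nevents_expected = 100

# Binned equivalents: histogram the data and scale the PDF to expected counts.
binned_data, _ = np.histogram(data_sample, bins=bin_edges)
binned_reference = pdf * nevents_expected
```

With these in place, `data_sample`, `pdf`, `bin_edges` and `nevents_expected` correspond to the first initialisation shown above, and `binned_data` and `binned_reference` to the `from_binned` variant.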

With any gof_object you can calculate the GoF and the corresponding p-value as follows:

gof = gof_object.get_gof()
p_value = gof_object.get_pvalue()
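
To illustrate how a p-value can be derived from a GoF value by random re-sampling, the sketch below runs a permutation test on the two-sample Kolmogorov-Smirnov statistic using only NumPy. It is not the package's implementation: the function names, the number of permutations and the seed are all assumptions made for this example.

```python
import numpy as np


def ks_statistic(a, b):
    """Maximum distance between the empirical CDFs of two 1D samples."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side='right') / len(a)
    cdf_b = np.searchsorted(b, grid, side='right') / len(b)
    return np.max(np.abs(cdf_a - cdf_b))


def permutation_pvalue(data_sample, reference_sample, n_perm=1000, seed=0):
    """Fraction of random re-splits of the pooled sample whose statistic
    is at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    observed = ks_statistic(data_sample, reference_sample)
    pooled = np.concatenate([data_sample, reference_sample])
    n_data = len(data_sample)
    n_exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        if ks_statistic(shuffled[:n_data], shuffled[n_data:]) >= observed:
            n_exceed += 1
    return observed, n_exceed / n_perm


rng = np.random.default_rng(1)
gof, p_value = permutation_pvalue(rng.uniform(size=100), rng.uniform(size=300))
```

Because the p-value is estimated from a finite number of random re-splits, it fluctuates slightly between runs unless the seed is fixed.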

Multiple GoF Measures at once

You can compute GoF and p-values for multiple measures at once with the GOFTest class.

Example:

import GOFevaluation as ge
import scipy.stats as sps

# random_state makes sure the gof values are reproducible.
# For the p-values, a slight variation is expected due to
# the random re-sampling method that is used.
data_sample = sps.uniform.rvs(size=100, random_state=200)
reference_sample = sps.uniform.rvs(size=300, random_state=201)

# Initialise all two-sample GoF measures:
gof_object = ge.GOFTest(data_sample=data_sample, 
                        reference_sample=reference_sample,
                        gof_list=['ADTestTwoSampleGOF', 
                                  'KSTestTwoSampleGOF', 
                                  'PointToPointGOF'])
# Calculate GoFs and p-values:
d_min = 0.01
gof_object.get_gofs(d_min=d_min)
# OUTPUT:
# OrderedDict([('ADTestTwoSampleGOF', 1.6301454042304904),
#              ('KSTestTwoSampleGOF', 0.14),
#              ('PointToPointGOF', 0.00048491049630050576)])

gof_object.get_pvalues(d_min=d_min)
# OUTPUT:
# OrderedDict([('ADTestTwoSampleGOF', 0.08699999999999997),
#              ('KSTestTwoSampleGOF', 0.10699999999999998),
#              ('PointToPointGOF', 0.14300000000000002)])

# Re-calculate p-value only for one measure:
gof_object.get_pvalues(d_min=.3, gof_list=['PointToPointGOF'])
# OUTPUT:
# OrderedDict([('ADTestTwoSampleGOF', 0.08699999999999997),
#              ('KSTestTwoSampleGOF', 0.10699999999999998),
#              ('PointToPointGOF', 0.03400000000000003)])

print(gof_object)
# OUTPUT:
# GOFevaluation.gof_test
# GoF measures: ADTestTwoSampleGOF, KSTestTwoSampleGOF, PointToPointGOF
# gofs = 1.6301454042304904, 0.14, 0.00048491049630050576
# p-values = 0.08699999999999997, 0.10699999999999998, 0.03400000000000003
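
The returned dictionary can be post-processed like any other mapping. As a small sketch, the p-values from the output above are compared against a significance level; the threshold `alpha = 0.05` is an assumption chosen for illustration and is not prescribed by the package.

```python
from collections import OrderedDict

# p-values copied from the example output above (after the d_min=.3 re-run).
p_values = OrderedDict([('ADTestTwoSampleGOF', 0.08699999999999997),
                        ('KSTestTwoSampleGOF', 0.10699999999999998),
                        ('PointToPointGOF', 0.03400000000000003)])

alpha = 0.05  # hypothetical significance level

# Measures whose p-value falls below the threshold.
rejected = [name for name, p in p_values.items() if p < alpha]
print(rejected)
# ['PointToPointGOF']
```

Note that the PointToPointGOF p-value was 0.143 with d_min = 0.01 and 0.034 with d_min = 0.3, so the conclusion drawn for this measure depends on the chosen d_min.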

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

v0.1.0

  • Release as a Python package
