Python package for probability density function fitting and hypothesis testing.

Project description

distfit

  • Python package for probability density fitting and hypothesis testing.
  • Probability density fitting is the fitting of a probability distribution to a series of data from repeated measurements of a variable phenomenon. distfit scores each of 89 different distributions on its fit to the empirical distribution and returns the best-scoring distribution.
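
The scoring idea above can be sketched with SciPy alone. This is a minimal illustration and not distfit's own implementation: it assumes the score is the sum of squared errors (SSE) between a density histogram of the data and each candidate's fitted PDF, and it tries only three candidates. The helper name `score_distributions` and the bin count are hypothetical.

```python
import numpy as np
from scipy import stats


def score_distributions(data, candidates=("norm", "expon", "uniform"), bins=50):
    """Fit each candidate with SciPy and return {name: SSE}; lower is better.

    Assumption: the score is the SSE between the density histogram of the
    data and the fitted PDF evaluated at the bin centres.
    """
    density, edges = np.histogram(data, bins=bins, density=True)
    centres = (edges[:-1] + edges[1:]) / 2.0
    scores = {}
    for name in candidates:
        dist = getattr(stats, name)
        params = dist.fit(data)            # maximum-likelihood estimate
        pdf = dist.pdf(centres, *params)   # fitted density at the bin centres
        scores[name] = float(np.sum((density - pdf) ** 2))
    return scores


rng = np.random.default_rng(0)
X = rng.normal(5, 8, 1000)                 # same mean and spread as the example data
scores = score_distributions(X)
best = min(scores, key=scores.get)
print(best, scores)
```

Taking the minimum SSE is one simple selection rule; other goodness-of-fit measures would rank close candidates differently.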

The following functions are available:

# To make the distribution fit with the input data
.fit()
# Compute probabilities using the fitted distribution
.proba_parametric()
# Compute probabilities in an empirical manner
.proba_emperical()
# Plot results
.plot()

See below for the exact working of the functions.

Installation

  • Install distfit from PyPI (recommended). distfit is compatible with Python 3.6+ and runs on Linux, macOS, and Windows.
  • It is distributed under the MIT license.

Requirements

pip install numpy pandas matplotlib

Quick Start

pip install distfit
  • Alternatively, install distfit from the GitHub source:
git clone https://github.com/erdogant/distfit.git
cd distfit
python setup.py install

Import distfit package

import distfit as dist

Generate some random data:

import numpy as np
# Draw 1000 samples from a normal distribution with mean 5 and standard deviation 8
X = np.random.normal(5, 8, [1000])

# Print to screen
print(X)
array([[-12.65284521,  -3.81514715,  -4.53613236],
       [ 11.5865475 ,   2.42547023,   6.6395518 ],
       [  3.82076163,   6.65765319,   9.95795751],
       ...,
       [  3.65728268,   7.298237  ,  -4.25641318],
       [  7.51820943,  16.26147929,  -0.60033084],
       [  2.49165326,   3.97880574,   7.98986818]])

Example: fit the best-scoring distribution to the input data:

model = dist.fit(X)
dist.plot(model)

Output looks like this:

[DISTFIT] Checking for [norm] [SSE:0.000152]
[DISTFIT] Checking for [expon] [SSE:0.021767] 
[DISTFIT] Checking for [pareto] [SSE:0.054325] 
[DISTFIT] Checking for [dweibull] [SSE:0.000721]
[DISTFIT] Checking for [t] [SSE:0.000139]
[DISTFIT] Checking for [genextreme] [SSE:0.050649]
[DISTFIT] Checking for [gamma] [SSE:0.000152]
[DISTFIT] Checking for [lognorm] [SSE:0.000156]
[DISTFIT] Checking for [beta] [SSE:0.000152]
[DISTFIT] Checking for [uniform] [SSE:0.015671] 
[DISTFIT] Estimated distribution: t [loc:5.239912, scale:7.871518]

Note that the best fit should be [norm], because the input data were drawn from a normal distribution.
However, many other distributions can look nearly identical for specific loc/scale parameters.
In this case, the t-distribution (SSE 0.000139) scored slightly better than the normal (SSE 0.000152).
The normal distribution scored the same as gamma and beta, which is not surprising, because both can closely approximate a normal shape.
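
Why a t-distribution can edge out the normal on normal data becomes visible when the two densities are compared directly. The sketch below uses SciPy only (not distfit). The `loc` and `scale` values are the ones reported in the output above; the 30 degrees of freedom are an assumption for illustration, since the log does not report them.

```python
import numpy as np
from scipy import stats

loc, scale = 5.239912, 7.871518   # values reported in the log above
df = 30                           # assumed for illustration; not in the log

# Evaluate both densities on a grid spanning four scale units either side.
x = np.linspace(loc - 4 * scale, loc + 4 * scale, 401)
pdf_norm = stats.norm.pdf(x, loc=loc, scale=scale)
pdf_t = stats.t.pdf(x, df, loc=loc, scale=scale)

max_gap = float(np.max(np.abs(pdf_norm - pdf_t)))
peak = float(np.max(pdf_norm))
print(f"largest PDF gap: {max_gap:.6f} (normal peak: {peak:.6f})")
```

The gap is small relative to the peak density, so sampling noise in a histogram of 1000 points can easily tip the SSE in favour of either distribution.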

Example: compute the probability that values are of interest relative to the 95% confidence interval (CII) of the data distribution:

expdata=[-20,-12,-8,0,1,2,3,5,10,20,30,35]
# Use fitted model
model_P = dist.proba_parametric(expdata, X, model=model)
# Make plot
dist.plot(model)

# It is also possible to do the distribution fit inside the proba_ function. Note that this is not practical in a loop with a fixed background, because the fit would be repeated on every call.
model_P = dist.proba_parametric(expdata, X)
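
To make the idea behind the parametric probabilities concrete, here is a minimal sketch that uses SciPy directly rather than distfit. It assumes a normal model, a two-sided 95% interval taken from the fitted distribution's 2.5% and 97.5% quantiles, and a tail probability defined as the smaller of the CDF and the survival function. distfit's own definition and return format may differ.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = rng.normal(5, 8, 1000)                      # background data
expdata = np.array([-20, -12, -8, 0, 1, 2, 3, 5, 10, 20, 30, 35])

# Fit a normal distribution to the background (assumed model).
loc, scale = stats.norm.fit(X)
fitted = stats.norm(loc=loc, scale=scale)

# Bounds of the central 95% interval of the fitted distribution.
lower, upper = fitted.ppf([0.025, 0.975])

# One-sided tail probability: how extreme is each value under the fit?
p_tail = np.minimum(fitted.cdf(expdata), fitted.sf(expdata))
outside = (expdata < lower) | (expdata > upper)

for value, p, flag in zip(expdata, p_tail, outside):
    print(f"{int(value):>4}  p={float(p):.4f}  outside 95% interval: {bool(flag)}")
```

Fitting once and reusing `fitted` for every query is the same pattern as passing `model=model` in the example above: the background fit is not repeated for each new set of values.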

Citation

Please cite distfit in your publications if this is useful for your research. Here is an example BibTeX entry:

@misc{erdogant2019distfit,
  title={distfit},
  author={Erdogan Taskesen},
  year={2019},
  howpublished={\url{https://github.com/erdogant/distfit}},
}

Maintainers

Contribute

  • Contributions are welcome.

Licence

See LICENSE for details.

Donation

  • This package is created and maintained in my free time. If this package is useful, you can show your gratitude :) Thanks!
