Python package for probability density function fitting and hypothesis testing.
distfit
- Python package for probability density fitting and hypothesis testing.
- Probability density fitting is the fitting of a probability distribution to a series of data concerning the repeated measurement of a variable phenomenon. distfit scores each of 89 different distributions for its fit with the empirical distribution and returns the best-scoring distribution.
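The scoring idea described above can be sketched with plain scipy. This is a minimal illustration, not distfit's actual implementation: it assumes scipy is installed, that the score is the sum of squared errors (SSE) between a normalized histogram and the fitted density, and it uses a hypothetical helper named `score_distributions` with only three candidate distributions.

```python
import numpy as np
from scipy import stats


def score_distributions(data, names=("norm", "expon", "uniform"), bins=50):
    """Fit each named scipy.stats distribution and score it by SSE.

    Hypothetical helper for illustration; not part of distfit's API.
    """
    # Empirical density: normalized histogram evaluated at the bin centers.
    density, edges = np.histogram(data, bins=bins, density=True)
    centers = (edges[:-1] + edges[1:]) / 2.0

    scores = {}
    for name in names:
        dist = getattr(stats, name)
        params = dist.fit(data)           # maximum-likelihood parameter estimate
        pdf = dist.pdf(centers, *params)  # fitted density at the bin centers
        scores[name] = float(np.sum((density - pdf) ** 2))
    return scores


rng = np.random.default_rng(0)
sample = rng.normal(5, 8, 1000)

scores = score_distributions(sample)
best = min(scores, key=scores.get)  # lowest SSE wins
print(best, scores)
```

A lower SSE means the fitted density follows the histogram more closely, so the distribution with the smallest score is reported as the best fit.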
Four functions are available:
# To make the distribution fit with the input data
.fit()
# Compute probabilities using the fitted distribution
.proba_parametric()
# Compute probabilities in an empirical manner
.proba_emperical()
# Plot results
.plot()
See below for how each function works.
Installation
- Install distfit from PyPI (recommended). distfit is compatible with Python 3.6+ and runs on Linux, MacOS X and Windows.
- It is distributed under the MIT license.
Requirements
pip install numpy pandas matplotlib
Quick Start
pip install distfit
- Alternatively, install distfit from the GitHub source:
git clone https://github.com/erdogant/distfit.git
cd distfit
python setup.py install
Import the distfit package:
import distfit as dist
Generate some random data:
import numpy as np
data=np.random.normal(5, 8, [1000])
data looks like this:
array([[-12.65284521, -3.81514715, -4.53613236],
[ 11.5865475 , 2.42547023, 6.6395518 ],
[ 3.82076163, 6.65765319, 9.95795751],
...,
[ 3.65728268, 7.298237 , -4.25641318],
[ 7.51820943, 16.26147929, -0.60033084],
[ 2.49165326, 3.97880574, 7.98986818]])
Example: fit the best-scoring distribution to the input data:
model = dist.fit(data)
dist.plot(model)
Output looks like this:
[DISTFIT] Checking for [norm] [SSE:0.000152]
[DISTFIT] Checking for [expon] [SSE:0.021767]
[DISTFIT] Checking for [pareto] [SSE:0.054325]
[DISTFIT] Checking for [dweibull] [SSE:0.000721]
[DISTFIT] Checking for [t] [SSE:0.000139]
[DISTFIT] Checking for [genextreme] [SSE:0.050649]
[DISTFIT] Checking for [gamma] [SSE:0.000152]
[DISTFIT] Checking for [lognorm] [SSE:0.000156]
[DISTFIT] Checking for [beta] [SSE:0.000152]
[DISTFIT] Checking for [uniform] [SSE:0.015671]
[DISTFIT] Estimated distribution: t [loc:5.239912, scale:7.871518]
Note that the best fit should be [norm], as the input data was drawn from a normal distribution.
However, many other distributions can look very similar with specific loc/scale parameters.
In this case, the t-distribution scored slightly better than the normal distribution. The normal distribution
scored similarly to gamma and beta, which is not surprising: with suitable parameters, both can closely approximate a normal shape.
If you don't understand why, do some homework first ;)
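To make the note above concrete, the following sketch compares the density of a t-distribution with that of a normal distribution at the same loc and scale. It assumes scipy is available; the loc/scale values (5 and 8) mirror the example data, and the degrees-of-freedom values are chosen only for illustration.

```python
import numpy as np
from scipy import stats

x = np.linspace(-25, 35, 601)
normal_pdf = stats.norm.pdf(x, loc=5, scale=8)

# Maximum absolute gap between the t and normal densities for growing df.
gaps = {
    df: float(np.max(np.abs(stats.t.pdf(x, df, loc=5, scale=8) - normal_pdf)))
    for df in (2, 10, 100, 1000)
}

for df, gap in gaps.items():
    print(f"df={df:>4}: max |t - normal| = {gap:.6f}")
```

The gap shrinks as the degrees of freedom grow, which is why a fitted t-distribution can score almost the same as, or marginally better than, the normal distribution on normally distributed data.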
Example: compute the probability that values are of interest compared with the 95% confidence interval (CII) of the data distribution:
expdata=[-20,-12,-8,0,1,2,3,5,10,20,30,35]
# Use fitted model
model_P = dist.proba_parametric(expdata, data, model=model)
# Make plot
dist.plot(model)
# It's also possible to do the distribution fit in the proba_ function:
model_P = dist.proba_parametric(expdata, data)
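As a rough picture of what a parametric probability computation involves, the sketch below fits a normal distribution with scipy and checks each value in expdata against the central 95% interval of the fit. It is an assumption-laden illustration rather than the behavior of proba_parametric itself: the normal model, the two-sided tail probability, and the variable names are all choices made for this example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(5, 8, 1000)
expdata = np.array([-20, -12, -8, 0, 1, 2, 3, 5, 10, 20, 30, 35])

# Fit a normal distribution to the data (assumed model for this sketch).
loc, scale = stats.norm.fit(data)
fitted = stats.norm(loc=loc, scale=scale)

# Bounds of the central 95% interval of the fitted distribution.
lower, upper = fitted.ppf([0.025, 0.975])

# Two-sided tail probability: how extreme each value is under the fit.
cdf = fitted.cdf(expdata)
p_values = 2 * np.minimum(cdf, 1 - cdf)
outside = (expdata < lower) | (expdata > upper)

for value, p, flag in zip(expdata, p_values, outside):
    label = "outside" if flag else "inside"
    print(f"{value:>4}: p={p:.4f} ({label} the 95% interval)")
```

Values with a two-sided probability below 0.05 are exactly those that fall outside the interval, so either quantity can be used to flag values of interest.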
Citation
Please cite distfit in your publications if this is useful for your research. Here is an example BibTeX entry:
@misc{erdogant2019distfit,
title={distfit},
author={Erdogan Taskesen},
year={2019},
howpublished={\url{https://github.com/erdogant/distfit}},
}
Maintainers
- Erdogan Taskesen, github: erdogant
Contribute
- Contributions are welcome.
© Copyright
See LICENSE for details.