Python package for probability density function fitting and hypothesis testing.
distfit
- Python package for probability density fitting and hypothesis testing.
- Probability density fitting is the fitting of a probability distribution to a series of data concerning the repeated measurement of a variable phenomenon. distfit scores each of the 89 different distributions for its fit with the empirical distribution and returns the best-scoring distribution.
The following functions are available:
import distfit as dist
# Fit distributions to the input data
dist.fit()
# Compute probabilities using the fitted distribution
dist.proba_parametric()
# Compute probabilities in an empirical manner
dist.proba_emperical()
# Plot results
dist.plot()
# Plot summary
dist.plot_summary()
See below for details on how the functions work.
Installation
- Install distfit from PyPI (recommended). distfit is compatible with Python 3.6+ and runs on Linux, MacOS X and Windows.
- It is distributed under the MIT license.
Requirements
pip install numpy pandas matplotlib
Quick Start
pip install distfit
- Alternatively, install distfit from the GitHub source:
git clone https://github.com/erdogant/distfit.git
cd distfit
python setup.py install
Import distfit package
import distfit as dist
Generate some random data:
import numpy as np
X=np.random.normal(5, 8, [1000])
# Print to screen
print(X)
array([[-12.65284521, -3.81514715, -4.53613236],
[ 11.5865475 , 2.42547023, 6.6395518 ],
[ 3.82076163, 6.65765319, 9.95795751],
...,
[ 3.65728268, 7.298237 , -4.25641318],
[ 7.51820943, 16.26147929, -0.60033084],
[ 2.49165326, 3.97880574, 7.98986818]])
Example: fit the best-scoring distribution to the input data:
model = dist.fit(X)
dist.plot(model)
Output looks like this:
[DISTFIT] Checking for [norm] [SSE:0.000152]
[DISTFIT] Checking for [expon] [SSE:0.021767]
[DISTFIT] Checking for [pareto] [SSE:0.054325]
[DISTFIT] Checking for [dweibull] [SSE:0.000721]
[DISTFIT] Checking for [t] [SSE:0.000139]
[DISTFIT] Checking for [genextreme] [SSE:0.050649]
[DISTFIT] Checking for [gamma] [SSE:0.000152]
[DISTFIT] Checking for [lognorm] [SSE:0.000156]
[DISTFIT] Checking for [beta] [SSE:0.000152]
[DISTFIT] Checking for [uniform] [SSE:0.015671]
[DISTFIT] Estimated distribution: t [loc:5.239912, scale:7.871518]
Note that the best fit should be the normal distribution [norm], since the input data was drawn from it.
However, many other distributions can look very similar with specific loc/scale parameters.
In this case, the t-distribution scored slightly better than the normal. The normal distribution
scored similarly to gamma and beta, which is not surprising.
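The SSE values in the log above can be read as a distance between the histogram of the data and a candidate density. The sketch below illustrates that idea using only the Python standard library. It is not distfit's implementation: the helper names `histogram_density` and `sse`, the bin count, and the two candidate distributions are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist


def histogram_density(data, bins=50):
    """Return bin centers and normalized bin heights (an empirical density)."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    density = [c / (len(data) * width) for c in counts]
    return centers, density


def sse(density, pdf_values):
    """Sum of squared errors between empirical and candidate densities."""
    return sum((d - p) ** 2 for d, p in zip(density, pdf_values))


random.seed(0)
data = [random.gauss(5, 8) for _ in range(1000)]
centers, density = histogram_density(data)

# Candidate 1: normal distribution with mean and standard deviation
# estimated from the data.
normal = NormalDist.from_samples(data)
normal_pdf = [normal.pdf(c) for c in centers]

# Candidate 2: uniform distribution over the observed range.
lo, hi = min(data), max(data)
uniform_pdf = [1.0 / (hi - lo)] * len(centers)

scores = {
    "norm": sse(density, normal_pdf),
    "uniform": sse(density, uniform_pdf),
}
best = min(scores, key=scores.get)  # lowest SSE wins

for name, score in scores.items():
    print(f"{name}: SSE={score:.6f}")
print("best:", best)
```

With data drawn from a normal distribution, the normal candidate should score a clearly lower SSE than the uniform one, which mirrors the ordering seen in the log.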
Example: compute the probability that values are of interest compared with the 95% CII of the data distribution:
expdata=[-20,-12,-8,0,1,2,3,5,10,20,30,35]
# Use fitted model
model_P = dist.proba_parametric(expdata, X, model=model)
# Make plot
dist.plot(model)
# It's also possible to do the distribution fit inside the proba_ function. Note that this is not practical in a loop with a fixed background.
model_P = dist.proba_parametric(expdata, X)
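To make the idea behind a parametric probability concrete, here is a minimal sketch that fits a normal distribution to background data and flags query values falling outside its central 95% interval. It uses only the standard library and is an assumption about the general technique, not a reproduction of what `dist.proba_parametric` does internally. The helper `tail_probability` and the threshold `alpha` are hypothetical names introduced for this example.

```python
import random
from statistics import NormalDist

random.seed(0)
# Background data, analogous to X in the example above.
background = [random.gauss(5, 8) for _ in range(1000)]
fitted = NormalDist.from_samples(background)

expdata = [-20, -12, -8, 0, 1, 2, 3, 5, 10, 20, 30, 35]
alpha = 0.05  # two-sided level, i.e. a central 95% interval


def tail_probability(dist, value):
    """Probability mass in the tail of `dist` on the side where `value` lies."""
    cdf = dist.cdf(value)
    return min(cdf, 1.0 - cdf)


results = [(v, tail_probability(fitted, v)) for v in expdata]

# A value lies outside the central 95% interval when its tail
# probability is below alpha / 2.
outside = [v for v, p in results if p < alpha / 2]

for v, p in results:
    flag = "outside" if p < alpha / 2 else "inside"
    print(f"{v:>4}: P={p:.4f} ({flag})")
```

Fitting once and reusing `fitted` for every query value matches the advice in the comment above: refitting on each call is wasteful when the background data does not change.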
Citation
Please cite distfit in your publications if this is useful for your research. Here is an example BibTeX entry:
@misc{erdogant2019distfit,
title={distfit},
author={Erdogan Taskesen},
year={2019},
howpublished={\url{https://github.com/erdogant/distfit}},
}
Maintainers
- Erdogan Taskesen, github: erdogant
Contribute
- Contributions are welcome.
Licence
See LICENSE for details.
Donation
This package is created and maintained in my free time. If this package is useful, feel free to use more of my packages. Sponsor here.