
Python package for probability density function fitting and hypothesis testing.

Project description

distfit - Probability density fitting


distfit is a Python package for probability density fitting of 89 univariate distributions to non-censored data by residual sum of squares (RSS), and for hypothesis testing. Probability density fitting is the fitting of a probability distribution to a series of data from repeated measurements of a variable phenomenon. distfit scores each of the 89 distributions on its fit with the empirical distribution and returns the best-scoring distribution.
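
To make the RSS idea concrete, the following is a minimal sketch that uses only the Python standard library. It is not distfit's implementation: the helper names (`histogram_density`, `rss`), the three candidate distributions, the bin count, and the simple moment-based parameter estimates are all assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def histogram_density(data, bins=20):
    """Return bin centers and density-normalized bin heights (total area 1)."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        idx = min(int((x - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    heights = [c / (len(data) * width) for c in counts]
    return centers, heights


def rss(pdf, centers, heights):
    """Residual sum of squares between a candidate PDF and the histogram."""
    return sum((h - pdf(c)) ** 2 for c, h in zip(centers, heights))


random.seed(0)
X = [random.gauss(0, 2) for _ in range(5000)]  # illustrative data (assumption)
centers, heights = histogram_density(X)
lo, hi = min(X), max(X)

# Candidate 1: normal with the sample mean and standard deviation
normal = NormalDist(fmean(X), pstdev(X))


# Candidate 2: uniform over the observed range
def uniform_pdf(x):
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0


# Candidate 3: exponential shifted to start at the sample minimum
rate = 1.0 / (fmean(X) - lo)


def expon_pdf(x):
    return rate * math.exp(-rate * (x - lo)) if x >= lo else 0.0


scores = {
    "norm": rss(normal.pdf, centers, heights),
    "uniform": rss(uniform_pdf, centers, heights),
    "expon": rss(expon_pdf, centers, heights),
}
best = min(scores, key=scores.get)  # lowest RSS wins
print(best, scores)
```

The candidate with the lowest RSS is reported as the best fit, which mirrors the ranking shown in the quick-start output further down.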

⭐️ Star this repo if you like it ⭐️

Documentation pages

On the documentation pages you can find detailed information about the distfit library with many examples.

Installation

Install distfit from PyPI
pip install distfit
Install from github source (beta version)
pip install git+https://github.com/erdogant/distfit#egg=master
Check version
import distfit
print(distfit.__version__)
The following functions are available after installation:
# Import library
from distfit import distfit

dist = distfit()        # Initialize 
dist.fit_transform(X)   # Fit distributions on empirical data X
dist.predict(y)         # Predict the probability of the response variables
dist.plot()             # Plot the best fitted distribution (y is included if prediction is made)

Examples

Example: Quick start to find best fit for your input data
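# Import libraries
import numpy as np
from distfit import distfit

# Illustrative input (an assumption; the data behind the output below are not shown)
X = np.random.normal(0, 2, 1000)

# Fit the candidate distributions
dist = distfit()
dist.fit_transform(X)

# Prints to the screen: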
# [distfit] >fit..
# [distfit] >transform..
# [distfit] >[norm      ] [RSS: 0.0133619] [loc=-0.059 scale=2.031] 
# [distfit] >[expon     ] [RSS: 0.3911576] [loc=-6.213 scale=6.154] 
# [distfit] >[pareto    ] [RSS: 0.6755185] [loc=-7.965 scale=1.752] 
# [distfit] >[dweibull  ] [RSS: 0.0183543] [loc=-0.053 scale=1.726] 
# [distfit] >[t         ] [RSS: 0.0133619] [loc=-0.059 scale=2.031] 
# [distfit] >[genextreme] [RSS: 0.0115116] [loc=-0.830 scale=1.964] 
# [distfit] >[gamma     ] [RSS: 0.0111372] [loc=-19.843 scale=0.209] 
# [distfit] >[lognorm   ] [RSS: 0.0111236] [loc=-29.689 scale=29.561] 
# [distfit] >[beta      ] [RSS: 0.0113012] [loc=-12.340 scale=41.781] 
# [distfit] >[uniform   ] [RSS: 0.2481737] [loc=-6.213 scale=12.281] 

Example: Plot summary of the tested distributions

After we have a fitted model, we can make some predictions using the theoretical distributions. After making some predictions, we can plot again but now the predictions are automatically included.

Example: Make predictions using the fitted distribution
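
The idea behind a prediction can be sketched with the standard library alone: once a distribution has been fitted, a new value is scored by how far into a tail it falls. The sketch below is an assumption-laden illustration, not distfit's `predict`: the function `classify`, the labels, and the one-sided threshold `alpha=0.05` are hypothetical, a normal distribution stands in for the best-fitting model, and no multiple-testing correction is applied.

```python
import random
from statistics import NormalDist, fmean, pstdev

random.seed(1)
X = [random.gauss(0, 2) for _ in range(5000)]  # illustrative data (assumption)
fitted = NormalDist(fmean(X), pstdev(X))       # stand-in for the best-fitting model


def classify(y, dist, alpha=0.05):
    """Label a new value by the tail it falls in, using one-sided tail probabilities."""
    p_low = dist.cdf(y)      # P(Y <= y)
    p_high = 1.0 - p_low     # P(Y > y)
    if p_high <= alpha:
        return "up", p_high
    if p_low <= alpha:
        return "down", p_low
    return "none", min(p_low, p_high)


y_new = [-8, -6, 0, 1, 2, 3, 4, 5, 6]
results = [(y, *classify(y, fitted)) for y in y_new]
for y, label, p in results:
    print(f"y={y:>3}  label={label:<4}  p={p:.4f}")
```

Values deep in the left tail are labelled "down", values deep in the right tail "up", and everything else "none".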

Example: Test for one specific distribution

The full list of distributions is listed here: https://erdogant.github.io/distfit/pages/html/Parametric.html

Example: Test for multiple distributions

The full list of distributions is listed here: https://erdogant.github.io/distfit/pages/html/Parametric.html

Example: Fit discrete distribution
from scipy.stats import binom
from distfit import distfit

# Set parameters for the test-case
n = 8
p = 0.5

# Generate 10000 samples of the distribution of (n, p)
X = binom(n, p).rvs(10000)
print(X)

# [5 1 4 5 5 6 2 4 6 5 4 4 4 7 3 4 4 2 3 3 4 4 5 1 3 2 7 4 5 2 3 4 3 3 2 3 5
#  4 6 7 6 2 4 3 3 5 3 5 3 4 4 4 7 5 4 5 3 4 3 3 4 3 3 6 3 3 5 4 4 2 3 2 5 7
#  5 4 8 3 4 3 5 4 3 5 5 2 5 6 7 4 5 5 5 4 4 3 4 5 6 2...]

# Initialize distfit for discrete data and fit it on X
dist = distfit(method='discrete')
dist.fit_transform(X)

# [distfit] >fit..
# [distfit] >transform..
# [distfit] >Fit using binomial distribution..
# [distfit] >[binomial] [SSE: 7.79] [n: 8] [p: 0.499959] [chi^2: 1.11]
# [distfit] >Compute confidence interval [discrete]
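
As a rough illustration of what recovering (n, p) from the data involves, the sketch below fits a binomial distribution with the standard library only. It is a simplified stand-in rather than distfit's procedure: the helper `fit_binomial`, the scan over candidate `n` up to a hypothetical `n_max`, the moment estimate `p = mean / n`, and the SSE on raw counts are all assumptions, so the SSE scale will not match the log above.

```python
import math
import random
from collections import Counter


def binom_pmf(k, n, p):
    """Binomial probability mass function."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def fit_binomial(data, n_max=30):
    """Scan candidate n, set p = mean / n, and keep the pair with the lowest SSE."""
    size = len(data)
    mean = sum(data) / size
    observed = Counter(data)
    best = None
    for n in range(max(data), n_max + 1):
        p = mean / n
        sse = sum(
            (observed.get(k, 0) - size * binom_pmf(k, n, p)) ** 2
            for k in range(n + 1)
        )
        if best is None or sse < best[0]:
            best = (sse, n, p)
    return best


random.seed(2)
n_true, p_true = 8, 0.5
# Each sample is the number of successes in n_true Bernoulli trials
X = [sum(random.random() < p_true for _ in range(n_true)) for _ in range(10000)]

sse, n_hat, p_hat = fit_binomial(X)
print(f"n={n_hat}  p={p_hat:.4f}")
```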

Example: Make predictions on unseen data for discrete distribution

Example: Generate samples based on the fitted distribution
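
Generating samples amounts to drawing random values from the distribution whose parameters were estimated from the data. The sketch below shows this with the standard library's `NormalDist`, which stands in for a fitted model; the data, the seeds, and the choice of a normal distribution are assumptions for illustration and do not reflect distfit's own sampling interface.

```python
import random
from statistics import NormalDist, fmean, pstdev

random.seed(3)
X = [random.gauss(5, 1.5) for _ in range(2000)]  # illustrative data (assumption)

# Estimate the parameters, then draw new values from the fitted distribution
fitted = NormalDist(fmean(X), pstdev(X))
samples = fitted.samples(1000, seed=42)

print(f"fitted mean={fitted.mean:.3f}  sample mean={fmean(samples):.3f}")
```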

Contribute

  • All kinds of contributions are welcome!

Citation

Please cite distfit in your publications if it is useful for your research. See the right-hand column of the GitHub repository for citation information.

Maintainer

  • Erdogan Taskesen, github: erdogant
  • Contributions are welcome.
  • If you wish to buy me a coffee for this work, it is much appreciated :)

Download files


Source Distribution

distfit-1.4.5.tar.gz (24.0 kB)

Uploaded Source

Built Distribution


distfit-1.4.5-py3-none-any.whl (22.9 kB)

Uploaded Python 3

