
Find the best probability distribution for your dataset



Phitter analyzes datasets and determines the analytical probability distributions that best represent them. The Phitter kernel covers more than 80 probability distributions, both continuous and discrete, and includes 3 goodness-of-fit tests and interactive visualizations. For each selected probability distribution, a standard modeling guide is provided, along with spreadsheets that detail the methodology for using that distribution in data science, operations research, and artificial intelligence.

This repository contains the implementation of the Python library and the kernel of Phitter Web.

Installation

Requirements

python: >=3.9

PyPI

pip install phitter

Usage

General

import phitter

data: list[int | float] = [...]

phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

Full continuous implementation

import phitter

data: list[int | float] = [...]

phitter_cont = phitter.PHITTER(
    data=data,
    fit_type="continuous",
    num_bins=15,
    confidence_level=0.95,
    minimum_sse=1e-2,
    distributions_to_fit=["beta", "normal", "fatigue_life", "triangular"],
)
phitter_cont.fit(n_workers=6)
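
The `num_bins` and `minimum_sse` arguments, together with the `sorted_distributions_sse` property, suggest that candidates are ranked by a sum of squared errors (SSE) between the data histogram and each fitted density. How Phitter computes this internally is not documented here, so the following is only a minimal standard-library sketch of the idea. The helper names `histogram_density` and `sse_against_pdf` are hypothetical and are not part of the Phitter API.

```python
from statistics import NormalDist, fmean, stdev


def histogram_density(data, num_bins):
    """Return (bin_centers, densities) for an equal-width histogram."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # Clamp the index so that x == hi falls into the last bin.
        index = min(int((x - lo) / width), num_bins - 1)
        counts[index] += 1
    centers = [lo + (i + 0.5) * width for i in range(num_bins)]
    # Dividing by n * width makes the bars integrate to 1, like a PDF.
    densities = [c / (len(data) * width) for c in counts]
    return centers, densities


def sse_against_pdf(data, pdf, num_bins):
    """Sum of squared gaps between histogram heights and a candidate PDF."""
    centers, densities = histogram_density(data, num_bins)
    return sum((d - pdf(c)) ** 2 for c, d in zip(centers, densities))


# Synthetic sample: evenly spaced quantiles of a normal(50, 5).
data = [NormalDist(50, 5).inv_cdf((i + 0.5) / 200) for i in range(200)]

fitted = NormalDist(fmean(data), stdev(data))          # plausible candidate
shifted = NormalDist(fmean(data) + 10, stdev(data))    # deliberately poor one

sse_fitted = sse_against_pdf(data, fitted.pdf, num_bins=15)
sse_shifted = sse_against_pdf(data, shifted.pdf, num_bins=15)
print(sse_fitted < sse_shifted)
```

Under this reading, a smaller SSE means the candidate density tracks the histogram more closely, which is why the well-centered candidate scores lower than the shifted one.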

Full discrete implementation

import phitter

data: list[int | float] = [...]

phitter_disc = phitter.PHITTER(
    data=data,
    fit_type="discrete",
    confidence_level=0.95,
    minimum_sse=1e-2,
    distributions_to_fit=["binomial", "geometric"],
)
phitter_disc.fit(n_workers=2)

Phitter: properties and methods

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.best_distribution -> dict
phitter_cont.sorted_distributions_sse -> dict
phitter_cont.not_rejected_distributions -> dict
phitter_cont.df_sorted_distributions_sse -> pandas.DataFrame
phitter_cont.df_not_rejected_distributions -> pandas.DataFrame

Histogram Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_histogram()

Histogram PDF Distributions Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_histogram_distributions()

Histogram PDF Distribution Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_distribution("beta")

ECDF Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_ecdf()

ECDF Distribution Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_ecdf_distribution("beta")
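
For readers unfamiliar with the ECDF plots, the following standard-library sketch shows what such a plot typically compares: the empirical CDF of the sample against the CDF of a candidate distribution. It does not reproduce Phitter's plotting code. The `ecdf` helper and the sample values are hypothetical, and a normal distribution stands in for the candidate.

```python
from statistics import NormalDist


def ecdf(data):
    """Return the sorted values and the ECDF evaluated at each of them."""
    xs = sorted(data)
    n = len(xs)
    # At the i-th smallest value, a fraction (i + 1) / n of the sample is <= it.
    return xs, [(i + 1) / n for i in range(n)]


data = [4.1, 5.3, 4.8, 6.0, 5.1, 4.4, 5.6, 5.0]  # illustrative values only
xs, probs = ecdf(data)

# Candidate model estimated from the sample (assumption: a normal distribution).
model = NormalDist.from_samples(data)

# Largest gap between the ECDF and the model CDF at the sample points.
max_gap = max(abs(p - model.cdf(x)) for x, p in zip(xs, probs))
print(round(max_gap, 3))
```

A small gap indicates that the candidate CDF stays close to the empirical steps. This is the visual impression an ECDF distribution plot is meant to convey.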

QQ Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.qq_plot("beta")

QQ - Regression Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.qq_plot_regression("beta")
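
As background for the two QQ plots, the sketch below builds the points of a quantile-quantile plot and fits a least-squares line through them, using only the standard library. It illustrates the general technique and is not Phitter's implementation. The helpers `qq_points` and `least_squares_line` and the `(i + 0.5) / n` plotting positions are assumptions made for this example.

```python
from statistics import NormalDist


def qq_points(data, inv_cdf):
    """Pair theoretical quantiles with the sorted sample values."""
    xs = sorted(data)
    n = len(xs)
    # Plotting positions (i + 0.5) / n are one common convention.
    theoretical = [inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, xs


def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic sample: exact quantiles of a normal with mean 10 and sd 2.
data = [NormalDist(10, 2).inv_cdf((i + 0.5) / 50) for i in range(50)]

# Compare against a standard normal, so the line should recover the
# location (intercept) and scale (slope) of the sample.
theoretical, observed = qq_points(data, NormalDist(0, 1).inv_cdf)
slope, intercept = least_squares_line(theoretical, observed)
print(round(slope, 6), round(intercept, 6))
```

When the sample follows the reference family, the points fall on a straight line. Here the slope is close to 2 and the intercept close to 10, matching the scale and location used to generate the data.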

Distributions: Methods and properties

import phitter

distribution = phitter.continuous.BETA(parameters={"alpha": 5, "beta": 3, "A": 200, "B": 1000})

## CDF, PDF, PPF and PMF accept a float or a numpy.ndarray. Discrete distributions provide PMF instead of PDF. The parameter notation is given in the description of each distribution.
distribution.cdf(752) # -> 0.6242831129533498
distribution.pdf(388) # -> 0.0002342575686629883
distribution.ppf(0.623) # -> 751.5512889417921
distribution.sample(2) # -> [550.800114   514.85410326]

## STATS
distribution.mean # -> 700.0
distribution.variance # -> 16666.666666666668
distribution.standard_deviation # -> 129.09944487358058
distribution.skewness # -> -0.3098386676965934
distribution.kurtosis # -> 2.5854545454545454
distribution.median # -> 708.707130841534
distribution.mode # -> 733.3333333333333
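
The values above can be cross-checked by hand. The sketch below assumes that `alpha` and `beta` are the usual beta shape parameters and that `A` and `B` are the lower and upper bounds of the support, which is consistent with the outputs shown. It uses the textbook closed-form expressions for a four-parameter beta distribution and only the standard library. It is a verification aid, not Phitter's source code.

```python
import math

alpha, beta, A, B = 5, 3, 200, 1000
total = alpha + beta   # alpha + beta appears in every formula
span = B - A           # width of the support [A, B]

mean = A + span * alpha / total
variance = span**2 * alpha * beta / (total**2 * (total + 1))
standard_deviation = math.sqrt(variance)
skewness = (
    2 * (beta - alpha) * math.sqrt(total + 1)
    / ((total + 2) * math.sqrt(alpha * beta))
)
# Non-excess kurtosis: 3 plus the excess kurtosis of the beta distribution.
kurtosis = 3 + 6 * (
    (alpha - beta) ** 2 * (total + 1) - alpha * beta * (total + 2)
) / (alpha * beta * (total + 2) * (total + 3))
mode = A + span * (alpha - 1) / (total - 2)


def pdf(x):
    """Density of the beta distribution rescaled to [A, B]."""
    z = (x - A) / span
    beta_function = math.gamma(alpha) * math.gamma(beta) / math.gamma(total)
    return z ** (alpha - 1) * (1 - z) ** (beta - 1) / (beta_function * span)


print(mean, variance, mode)  # compare with the STATS values listed above
print(pdf(388))              # compare with distribution.pdf(388)
```

The reported kurtosis of about 2.585 matches the non-excess form, so it is directly comparable to the value 3 of a normal distribution.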

Continuous Distributions

• ALPHA • ARCSINE • ARGUS • BETA • BETA PRIME • BETA PRIME 4P • BRADFORD • BURR • BURR 4P • CAUCHY • CHI SQUARE • CHI SQUARE 3P • DAGUM • DAGUM 4P • ERLANG • ERLANG 3P • ERROR FUNCTION • EXPONENTIAL • EXPONENTIAL 2P • F • FATIGUE LIFE • FOLDED NORMAL • FRECHET • F 4P • GAMMA • GAMMA 3P • GENERALIZED EXTREME VALUE • GENERALIZED GAMMA • GENERALIZED GAMMA 4P • GENERALIZED LOGISTIC • GENERALIZED NORMAL • GENERALIZED PARETO • GIBRAT • GUMBEL LEFT • GUMBEL RIGHT • HALF NORMAL • HYPERBOLIC SECANT • INVERSE GAMMA • INVERSE GAMMA 3P • INVERSE GAUSSIAN • INVERSE GAUSSIAN 3P • JOHNSON SB • JOHNSON SU • KUMARASWAMY • LAPLACE • LEVY • LOGGAMMA • LOGISTIC • LOGLOGISTIC • LOGLOGISTIC 3P • LOGNORMAL • MAXWELL • MOYAL • NAKAGAMI • NON CENTRAL CHI SQUARE • NON CENTRAL F • NON CENTRAL T STUDENT • NORMAL • PARETO FIRST KIND • PARETO SECOND KIND • PERT • POWER FUNCTION • RAYLEIGH • RECIPROCAL • RICE • SEMICIRCULAR • TRAPEZOIDAL • TRIANGULAR • T STUDENT • T STUDENT 3P • UNIFORM • WEIBULL • WEIBULL 3P

Discrete Distributions
