Find the best probability distribution for your dataset

Phitter analyzes datasets and determines the analytical probability distributions that best represent them. The Phitter kernel covers more than 80 probability distributions, both continuous and discrete, and provides 3 goodness-of-fit tests and interactive visualizations. For each selected probability distribution, a standard modeling guide is provided along with spreadsheets that detail the methodology for using the chosen distribution in data science, operations research, and artificial intelligence.

This repository contains the Python implementation and the kernel for Phitter Web.

Installation

Requirements

python: >=3.9

PyPI

pip install phitter

Usage

General

import phitter

data: list[int | float] = [...]

phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

Full continuous implementation

import phitter

data: list[int | float] = [...]

phitter_cont = phitter.PHITTER(
    data=data,
    fit_type="continuous",
    num_bins=15,
    confidence_level=0.95,
    minimum_sse=1e-2,
    distributions_to_fit=["beta", "normal", "fatigue_life", "triangular"],
)
phitter_cont.fit(n_jobs=6)
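
The `num_bins` and `minimum_sse` arguments suggest that candidate distributions are compared against a histogram of the data through a sum of squared errors (SSE). The sketch below illustrates that general idea with the standard library only. It is not Phitter's implementation: the helper names `histogram_density` and `sse_against_pdf`, the use of bin midpoints, and the synthetic data are all assumptions made for illustration.

```python
import random
import statistics


def histogram_density(data, num_bins):
    """Return (bin midpoints, density heights) for equal-width bins."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # Clamp the index so the maximum value falls into the last bin.
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    midpoints = [lo + (i + 0.5) * width for i in range(num_bins)]
    # Dividing by n * width makes the bar areas sum to 1.
    densities = [c / (len(data) * width) for c in counts]
    return midpoints, densities


def sse_against_pdf(data, pdf, num_bins=15):
    """Sum of squared gaps between histogram heights and a candidate PDF."""
    midpoints, densities = histogram_density(data, num_bins)
    return sum((d - pdf(m)) ** 2 for m, d in zip(midpoints, densities))


# Synthetic data (hypothetical): 5000 draws from a normal distribution.
random.seed(0)
data = [random.gauss(10.0, 2.0) for _ in range(5000)]

# Candidate 1: a normal distribution fitted by sample mean and deviation.
normal = statistics.NormalDist.from_samples(data)
# Candidate 2: a flat density over the observed range.
uniform_height = 1.0 / (max(data) - min(data))

sse_normal = sse_against_pdf(data, normal.pdf)
sse_uniform = sse_against_pdf(data, lambda x: uniform_height)
print(sse_normal < sse_uniform)
```

A lower SSE means the candidate's density tracks the histogram more closely, which is the sense in which candidates can be sorted by SSE.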

Full discrete implementation

import phitter

data: list[int | float] = [...]

phitter_disc = phitter.PHITTER(
    data=data,
    fit_type="discrete",
    confidence_level=0.95,
    minimum_sse=1e-2,
    distributions_to_fit=["binomial", "geometric"],
)
phitter_disc.fit(n_jobs=2)
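
For discrete data no binning is needed, because observed relative frequencies can be compared with a candidate PMF directly. The following is a minimal sketch of that comparison for a geometric candidate, using only the standard library. It assumes the geometric form that counts trials up to and including the first success (support 1, 2, ...), a method-of-moments estimate, and synthetic data; whether Phitter uses the same parametrization or estimator is not stated here.

```python
import random
from collections import Counter


def empirical_pmf(data):
    """Relative frequency of each distinct observed value."""
    n = len(data)
    return {k: c / n for k, c in Counter(data).items()}


def geometric_pmf(k, p):
    """P(X = k) when X counts trials up to and including the first success."""
    return p * (1.0 - p) ** (k - 1)


def sample_geometric(p, rng):
    """Draw one geometric variate by repeating Bernoulli trials."""
    k = 1
    while rng.random() >= p:  # failure with probability 1 - p
        k += 1
    return k


# Synthetic data (hypothetical): 5000 draws with success probability 0.3.
rng = random.Random(0)
data = [sample_geometric(0.3, rng) for _ in range(5000)]

# Method of moments: the mean of this geometric form is 1 / p.
p_hat = len(data) / sum(data)

observed = empirical_pmf(data)
sse = sum((freq - geometric_pmf(k, p_hat)) ** 2 for k, freq in observed.items())
print(round(p_hat, 2), sse < 1e-2)
```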

Phitter: properties and methods

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.best_distribution  # -> dict
phitter_cont.sorted_distributions_sse  # -> dict
phitter_cont.not_rejected_distributions  # -> dict
phitter_cont.df_sorted_distributions_sse  # -> pandas.DataFrame
phitter_cont.df_not_rejected_distributions  # -> pandas.DataFrame

Histogram Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_histogram()

Histogram PDF Distributions Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_histogram_distributions()

Histogram PDF Distribution Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_distribution("beta")

ECDF Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_ecdf()
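
For readers unfamiliar with the quantity being plotted, the empirical CDF at x is the fraction of observations less than or equal to x. A minimal standard-library sketch follows; `make_ecdf` is a hypothetical helper and is not part of Phitter.

```python
import bisect


def make_ecdf(data):
    """Return a function F where F(x) is the fraction of observations <= x."""
    ordered = sorted(data)
    n = len(ordered)

    def ecdf(x):
        # bisect_right returns how many sorted values are <= x.
        return bisect.bisect_right(ordered, x) / n

    return ecdf


sample = [2.0, 3.5, 3.5, 5.0, 8.0]
ecdf = make_ecdf(sample)
print(ecdf(1.0), ecdf(3.5), ecdf(8.0))  # prints 0.0 0.6 1.0
```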

ECDF Distribution Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.plot_ecdf_distribution("beta")

QQ Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.qq_plot("beta")

QQ - Regression Plot

import phitter
data: list[int | float] = [...]
phitter_cont = phitter.PHITTER(data)
phitter_cont.fit()

phitter_cont.qq_plot_regression("beta")
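
The idea behind a QQ plot and its regression variant can be sketched without any plotting: pair each sorted observation with the fitted distribution's quantile at a matching probability, then fit a straight line through the pairs. A slope near 1 and an intercept near 0 indicate that the fitted distribution reproduces the data's quantiles. The code below is an illustration under stated assumptions, not Phitter's method: it uses a normal candidate from the standard library instead of "beta", the plotting positions (i + 0.5) / n, and hypothetical helper names.

```python
import random
import statistics


def qq_points(data, inv_cdf):
    """Pair theoretical quantiles with the sorted observations."""
    ordered = sorted(data)
    n = len(ordered)
    # (i + 0.5) / n keeps every probability strictly inside (0, 1).
    theoretical = [inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, ordered


def least_squares_line(xs, ys):
    """Ordinary least-squares slope and intercept of y on x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data (hypothetical): 2000 draws from a normal distribution.
random.seed(1)
data = [random.gauss(50.0, 5.0) for _ in range(2000)]
fitted = statistics.NormalDist.from_samples(data)

theoretical, observed = qq_points(data, fitted.inv_cdf)
slope, intercept = least_squares_line(theoretical, observed)
print(round(slope, 2), round(intercept, 2))
```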

Distributions: Methods and properties

import phitter

distribution = phitter.continuous.BETA({"alpha": 5, "beta": 3, "A": 200, "B": 1000})

## CDF, PDF, PPF, and PMF accept a float or numpy.ndarray. Discrete distributions provide PMF instead of PDF. Parameter notation is given in each distribution's description.
distribution.cdf(752) # -> 0.6242831129533498
distribution.pdf(388) # -> 0.0002342575686629883
distribution.ppf(0.623) # -> 751.5512889417921
distribution.sample(2) # -> [550.800114   514.85410326]
distribution.sample(2) # -> [622.94263263 827.21838464]

## STATS
distribution.mean # -> 700.0
distribution.variance # -> 16666.666666666668
distribution.standard_deviation # -> 129.09944487358058
distribution.skewness # -> -0.3098386676965934
distribution.kurtosis # -> 2.5854545454545454
distribution.median # -> 708.707130841534
distribution.mode # -> 733.3333333333333
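
The values above can be cross-checked by hand. The sketch below assumes that `BETA` with parameters `alpha`, `beta`, `A`, `B` is the standard beta distribution rescaled from [0, 1] to [A, B]; under that assumption the textbook formulas reproduce the listed numbers. The CDF uses a closed form that is only valid for integer shape parameters, as in this example, and the median is omitted because it has no closed form.

```python
import math

alpha, beta, A, B = 5, 3, 200, 1000
span = B - A
total = alpha + beta


def pdf(x):
    """Density of a beta distribution rescaled from [0, 1] to [A, B]."""
    norm = math.gamma(alpha) * math.gamma(beta) / math.gamma(total)
    numerator = (x - A) ** (alpha - 1) * (B - x) ** (beta - 1)
    return numerator / (norm * span ** (total - 1))


def cdf(x):
    """Binomial tail sum; valid only because alpha and beta are integers."""
    z = (x - A) / span
    n = total - 1
    return sum(
        math.comb(n, j) * z**j * (1 - z) ** (n - j) for j in range(alpha, n + 1)
    )


mean = A + span * alpha / total
variance = span**2 * alpha * beta / (total**2 * (total + 1))
mode = A + span * (alpha - 1) / (total - 2)
skewness = (
    2 * (beta - alpha) * math.sqrt(total + 1)
    / ((total + 2) * math.sqrt(alpha * beta))
)
# Non-excess kurtosis: 3 plus the excess kurtosis of the beta distribution.
kurtosis = 3 + 6 * (
    (alpha - beta) ** 2 * (total + 1) - alpha * beta * (total + 2)
) / (alpha * beta * (total + 2) * (total + 3))

print(mean, variance, mode, skewness, kurtosis, pdf(388), cdf(752))
```

The kurtosis matches the listed 2.585... only in its non-excess form, which suggests `distribution.kurtosis` reports kurtosis rather than excess kurtosis.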

Continuous Distributions

• ALPHA • ARCSINE • ARGUS • BETA • BETA PRIME • BETA PRIME 4P • BRADFORD • BURR • BURR 4P • CAUCHY • CHI SQUARE • CHI SQUARE 3P • DAGUM • DAGUM 4P • ERLANG • ERLANG 3P • ERROR FUNCTION • EXPONENTIAL • EXPONENTIAL 2P • F • FATIGUE LIFE • FOLDED NORMAL • FRECHET • F 4P • GAMMA • GAMMA 3P • GENERALIZED EXTREME VALUE • GENERALIZED GAMMA • GENERALIZED GAMMA 4P • GENERALIZED LOGISTIC • GENERALIZED NORMAL • GENERALIZED PARETO • GIBRAT • GUMBEL LEFT • GUMBEL RIGHT • HALF NORMAL • HYPERBOLIC SECANT • INVERSE GAMMA • INVERSE GAMMA 3P • INVERSE GAUSSIAN • INVERSE GAUSSIAN 3P • JOHNSON SB • JOHNSON SU • KUMARASWAMY • LAPLACE • LEVY • LOGGAMMA • LOGISTIC • LOGLOGISTIC • LOGLOGISTIC 3P • LOGNORMAL • MAXWELL • MOYAL • NAKAGAMI • NON CENTRAL CHI SQUARE • NON CENTRAL F • NON CENTRAL T STUDENT • NORMAL • PARETO FIRST KIND • PARETO SECOND KIND • PERT • POWER FUNCTION • RAYLEIGH • RECIPROCAL • RICE • SEMICIRCULAR • TRAPEZOIDAL • TRIANGULAR • T STUDENT • T STUDENT 3P • UNIFORM • WEIBULL • WEIBULL 3P

Discrete Distributions
