Skip to main content

the package fits data to metalog distribution and generates samples, quantiles, densities and probabilities based on the fitted distribution.

Project description

Sergey Kim, Reidar Brumer Bratvold

Metalog Distribution

The metalog distributions constitute a new system of continuous univariate probability distributions designed for flexibility, simplicity, and ease/speed of use in practice. The system is comprised of unbounded, semi-bounded, and bounded distributions, each of which offers nearly unlimited shape flexibility compared to Pearson, Johnson, and other traditional systems of distributions.

The package requires the following packages: numpy, pandas, matplotlib and scipy (ver 1.3.1).

The following paper and website provide a full background of the metalog distribution.

Using the Package

This Python package was transfered from RMetalog package by Isaac J. Faber and therefore shares the same R-based structure.

The data used for demonstration are body length of salmon and were collected in 2008-2010:

import numpy as np
import pandas as pd

salmon = pd.read_csv("Chinook and forage fish lengths.csv")

# Filtered data for eelgrass vegetation and chinook salmon
salmon = salmon[(salmon['Vegetation'] == 'Eelgrass') & (salmon['Species'] == 'Chinook_salmon')]
salmon = np.array(salmon['Length'])

To import package with metalog distribution run the code:

from metalog import metalog

To fit the data to metalog distribution one should use function metalog.fit(). It has the following arguments:

  • x: data.
  • bounds: bounds of metalog distribution. Depending on boundedness argument can take zero, one or two values.
  • boundedness: boundedness of metalog distribution. Can take values 'u' for unbounded, 'sl' for semi-bounded lower, 'su' for semi-bounded upper and 'b' for bounded on both sides.
  • term_limit: maximum number of terms to specify the metalog distribution. Can take values from 3 to 30.
  • term_lower_bound: the lowest number of terms to specify the metalog distribution. Must be greater or equal to 2 and less than term_limit. The argument is optional. Default value is 2.
  • step_len: size of steps to summarize the distribution. The argument is optional. Default value is 0.01.
  • probs: probabilities corresponding to data. The argument is optional. Default value is numpy.nan.
  • fit_method: fit method 'OLS', 'LP' or 'any'. The argument is optional. Default value is 'any'.
  • save_data: if True then data will be saved for future update. The argument is optional. Default values is False.

Fit metalog distribution to data and store the result to variable metalog_salmon. The distribution is bounded on both sides: from 0 to 200. Term limit is set to 10:

metalog_salmon = metalog.fit(x=salmon, boundedness='b', bounds=[0, 200], term_limit=10)

To get summary of distribution call the following function with only one argument m - the variable that stores fitted metalog distribution:

metalog.summary(m=metalog_salmon)

Output:

 -----------------------------------------------
 SUMMARY OF METALOG DISTRIBUTION OBJECT
 -----------------------------------------------

PARAMETERS

 Term Limit:  10
 Term Lower Bound:  2
 Boundedness:  b
 Bounds (only used based on boundedness):  [0, 200]
 Step Length for Distribution Summary:  0.01
 Method Use for Fitting:  any
 Number of Data Points Used:  138
 Original Data Saved:  False


VALIDATION AND FIT METHOD

     term valid method
2      2   yes    OLS
3      3   yes    OLS
4      4   yes    OLS
5      5   yes    OLS
6      6   yes    OLS
7      7   yes    OLS
8      8   yes    OLS
9      9   yes    OLS
10    10   yes    OLS

It’s possible to plot corresponding PDF and CDF of metalog distribution:

metalog.plot(m=metalog_salmon)

Output:

pdf\_cdf

To draw samples from distribution use metalog.r() function where n is number of samples and term specifies the terms of distribution to sample from:

metalog.r(m=metalog_salmon, n=5, term=10)

Output:

array([73.81897286, 86.74055734, 84.22509619, 83.80426247, 97.79800677])

To get densities based on quantiles type metalog.d() function where q is vector of quantiles:

metalog.d(m=metalog_salmon, q=[50, 110, 150], term=10)

Output:

array([0.00038265, 0.00712032, 0.00373991])

To calculate probabilities based on quantiles use metalog.p() function:

metalog.p(m=metalog_salmon, q=[50, 110, 150], term=10)

Output:

array([0.00275336, 0.82349578, 0.98686581])

Finally, to get quantiles from probabilites input metalog.q():

metalog.q(m=metalog_salmon, y=[0.00275336, 0.82349578, 0.98686581], term=10)

Output:

array([ 50.02583336, 109.99861143, 149.99737059])

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for metalog, version 0.2.2
Filename, size File type Python version Upload date Hashes
Filename, size metalog-0.2.2-py3-none-any.whl (12.9 kB) File type Wheel Python version py3 Upload date Hashes View
Filename, size metalog-0.2.2.tar.gz (13.4 kB) File type Source Python version None Upload date Hashes View

Supported by

Pingdom Pingdom Monitoring Google Google Object Storage and Download Analytics Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page