SeqMetrics: Various errors for sequential data

The purpose of this repository is to collect, in one place, various classification and regression performance metrics (errors) that can be calculated for time-series, sequential, or tabular data. Currently, only 1-dimensional data is supported.

How to Install

You can install SeqMetrics using pip:

pip install SeqMetrics

or install the latest code directly from GitHub:

python -m pip install git+https://github.com/AtrCheema/SeqMetrics.git

or, from the folder where the repository was downloaded, using the setup file:

python setup.py install
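
To confirm that one of the install routes above worked, the installed version can be queried from Python. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of SeqMetrics.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    Hypothetical helper for illustration; it is not part of SeqMetrics.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if SeqMetrics is installed, otherwise None.
print(installed_version("SeqMetrics"))
```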

How to Use

SeqMetrics provides a uniform API for calculation of both regression and classification metrics.

RegressionMetrics

import numpy as np
from SeqMetrics import RegressionMetrics

true = np.random.random((20, 1))
pred = np.random.random((20, 1))

er = RegressionMetrics(true, pred)

# print the names of all available methods
for m in er.all_methods:
    print("{:20}".format(m))

er.nse()   # calculate the Nash-Sutcliffe efficiency

er.calculate_all(verbose=True)  # or calculate errors using all available methods
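
For readers unfamiliar with the Nash-Sutcliffe efficiency computed by er.nse() above, the sketch below shows its textbook definition in plain Python. It assumes the standard formula (one minus the ratio of the squared error to the squared deviation of the observations from their mean) and is not taken from the SeqMetrics source, so the library may handle edge cases differently.

```python
def nse(true, pred):
    """Textbook Nash-Sutcliffe efficiency (illustrative, not the library code)."""
    if len(true) != len(pred) or len(true) == 0:
        raise ValueError("true and pred must be non-empty and of equal length")
    mean_true = sum(true) / len(true)
    # sum of squared errors between observations and predictions
    sse = sum((t - p) ** 2 for t, p in zip(true, pred))
    # spread of the observations around their own mean
    # (this is zero for constant observations, which makes NSE undefined)
    spread = sum((t - mean_true) ** 2 for t in true)
    return 1.0 - sse / spread


observed = [1.0, 2.0, 3.0, 4.0]
perfect = nse(observed, observed)            # predictions equal observations
mean_only = nse(observed, [2.5, 2.5, 2.5, 2.5])  # predicting the mean everywhere
```

A value of 1 means a perfect match, while a value of 0 means the predictions are no better than always predicting the mean of the observations.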

We can visualize the calculated performance metrics if the easy_mpl package is installed.

import numpy as np
from SeqMetrics import RegressionMetrics, plot_metrics

np.random.seed(313)
true = np.random.random((20, 1))
pred = np.random.random((20, 1))

er = RegressionMetrics(true, pred)

plot_metrics(er.calculate_all(),  color="Blues")

RegressionMetrics currently calculates the following performance metrics for regression.

Name Name in this repository
Absolute Percent Bias abs_pbias
Agreement Index agreement_index
Aitchison Distance aitchison
Alpha decomposition of the NSE nse_alpha
Anomaly correlation coefficient acc
Bias bias
Beta decomposition of NSE nse_beta
Bounded NSE nse_bound
Bounded KGE kge_bound
Brier Score brier_score
Correlation Coefficient corr_coeff
Coefficient of Determination r2
Centered Root Mean Square Deviation centered_rms_dev
Covariances covariance
Decomposed Mean Square Error decomposed_mse
Explained variance score exp_var_score
Euclid Distance euclid_distance
Geometric Mean Difference gmaen_diff
Geometric Mean Absolute Error gmae
Geometric Mean Relative Absolute Error gmrae
Inertial Root Squared Error irmse
Integral Normalized Root Squared Error inrse
Inter-percentile Normalized Root Mean Squared Error nrmse_ipercentile
Jensen-Shannon divergence JS
Kling-Gupta Efficiency kge
Legate-McCabe Efficiency Index lm_index
Logarithmic Nash-Sutcliffe Efficiency log_nse
Logarithmic probability distribution log_prob
Maximum Error max_error
Mean Absolute Error mae
Mean Absolute Percentage Deviation mapd
Mean Absolute Percentage Error mape
Mean Absolute Relative Error mare
Mean Absolute Scaled Error mase
Mean Arctangent Absolute Percentage Error maape
Mean Bias Error mean_bias_error
Mean Bounded relative Absolute Error mbrae
Mean Errors me
Mean Gamma Deviances mean_gamma_deviance
Mean Log Error mle
Mean Normalized Root Mean Square Error nrmse_mean
Mean Percentage Error mpe
Mean Poisson Deviance mean_poisson_deviance
Mean Relative Absolute Error mrae
Mean Square Error mse
Mean Square Logarithmic Errors mean_square_log_error
Mean Variance mean_var
Median Absolute Error median_abs_error
Median Absolute Percentage Error mdape
Median Dictionary Accuracy
Median Error mde
Median Relative Absolute Error mdrae
Median Squared Error med_seq_error
Mielke-Berry R mb_r
Modified Agreement of Index mod_agreement_index
Modified Kling-Gupta Efficiency kge_mod
Modified Nash-Sutcliffe Efficiency nse_mod
Nash-Sutcliffe Efficiency nse
Non parametric Kling-Gupta Efficiency kge_np
Normalized Absolute Error norm_ae
Normalized Absolute Percentage Error norm_ape
Normalized Euclid Distance norm_euclid_distance
Normalized Root Mean Square Error nrmse
Peak flow bias of the flow duration curve fdc_fhv
Pearson correlation coefficient person_r
Percent Bias pbias
Range Normalized root mean square nrmse_range
Refined Agreement of Index ref_agreement_index
Relative Agreement of Index rel_agreement_index
Relative Absolute Error rae
Relative Root Mean Squared Error relative_rmse
Relative Nash-Sutcliffe Efficiency nse_rel
Root Mean Square Errors rmse
Root Mean Square Log Error rmsle
Root Mean Square Percentage Error rmspe
Root Mean Squared Scaled Error rmsse
Root Median Squared Scaled Error rmsse
Root Relative Squared Error rrse
RSR rsr
Spearman correlation coefficient spearmann_corr
Skill Score of Murphy skill_score_murphy
Spectral Angle sa
Spectral Correlation sc
Spectral Gradient Angle sga
Spectral Information Divergence sid
Symmetric Kullback-Leibler divergence KLsym
Symmetric Mean Absolute Percentage Error smape
Symmetric Median Absolute Percentage Error smdape
Sum of Squared Errors sse
Volume Errors volume_error
Volumetric Efficiency ve
Unscaled Mean Bounded Relative Absolute Error umbrae
Watterson's M watt_m
Weighted Mean Absolute Percent Errors wmape
Weighted Absolute Percentage Error wape
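
As an illustration of one entry in the list above, the Kling-Gupta Efficiency combines correlation, variability ratio, and bias ratio into a single score. The sketch below assumes the commonly cited 2009 formulation and is written in plain Python; it is not the SeqMetrics implementation, whose kge method may differ in details.

```python
import math


def kge(true, pred):
    """Kling-Gupta efficiency, 2009 formulation (illustrative assumption)."""
    n = len(true)
    mean_t = sum(true) / n
    mean_p = sum(pred) / n
    # population standard deviations of observations and predictions
    std_t = math.sqrt(sum((t - mean_t) ** 2 for t in true) / n)
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred) / n)
    # Pearson correlation between observations and predictions
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(true, pred)) / n
    r = cov / (std_t * std_p)
    alpha = std_p / std_t   # variability ratio
    beta = mean_p / mean_t  # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


obs = [1.0, 2.0, 3.0, 4.0]
perfect = kge(obs, obs)                   # identical series
doubled = kge(obs, [2 * v for v in obs])  # perfectly correlated but biased
```

In the second call the correlation is perfect, yet the score drops well below 1 because both the variability ratio and the bias ratio equal 2.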

ClassificationMetrics

The API is the same for classification performance metrics.

import numpy as np
from SeqMetrics import ClassificationMetrics

# boolean array

t = np.array([True, False, False, False])
p = np.array([True, True, True, True])
metrics = ClassificationMetrics(t, p)
print(metrics.calculate_all())

# binary classification with numerical labels

true = np.array([1, 0, 0, 0])
pred = np.array([1, 1, 1, 1])
metrics = ClassificationMetrics(true, pred)
print(metrics.calculate_all())

# multiclass classification with numerical labels

true = np.random.randint(1, 4, 100)
pred = np.random.randint(1, 4, 100)
metrics = ClassificationMetrics(true, pred)
print(metrics.calculate_all())

# You can also provide logits instead of labels.

predictions = np.array([[0.25, 0.25, 0.25, 0.25],
                       [0.01, 0.01, 0.01, 0.96]])
targets = np.array([[0, 0, 0, 1],
                    [0, 0, 0, 1]])
metrics = ClassificationMetrics(targets, predictions, multiclass=True)
print(metrics.calculate_all())

# Working with categorical values is seamless

true = np.array(['a', 'b', 'b', 'b']) 
pred = np.array(['a', 'a', 'a', 'a'])
metrics = ClassificationMetrics(true, pred)
print(metrics.calculate_all())

# same goes for multiclass categorical labels

t = np.array(['car', 'truck', 'truck', 'car', 'bike', 'truck'])
p = np.array(['car', 'car',   'bike',  'car', 'bike', 'truck'])
metrics = ClassificationMetrics(t, p, multiclass=True)
print(metrics.calculate_all())
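
To make the binary example above concrete, the sketch below derives a few of the standard scores by hand from the confusion-matrix counts for true = [1, 0, 0, 0] and pred = [1, 1, 1, 1]. It uses textbook definitions in plain Python; the function names and the choice to return 0.0 on a zero denominator are assumptions for illustration, not SeqMetrics behaviour.

```python
def binary_confusion(true, pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for t, p in zip(true, pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_div(num, den):
    # Assumption: report 0.0 when the denominator is zero.
    return num / den if den else 0.0


def binary_scores(true, pred, positive=1):
    tp, fp, fn, tn = binary_confusion(true, pred, positive)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1_score": safe_div(2 * precision * recall, precision + recall),
        "specificity": safe_div(tn, tn + fp),
    }


scores = binary_scores([1, 0, 0, 0], [1, 1, 1, 1])
print(scores)
```

Because every sample is predicted as positive, recall is perfect while accuracy, precision, and specificity are low.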

The SeqMetrics library currently calculates the following performance metrics for classification.

Name Name in this repository
Accuracy accuracy
Balanced Accuracy balanced_accuracy
Error Rate error_rate
Recall recall
Precision precision
F1 score f1_score
F2 score f2_score
Specificity specificity
Cross Entropy cross_entropy
False Positive Rate false_positive_rate
False Negative Rate false_negative_rate
False Discovery Rate false_discovery_rate
False Omission Rate false_omission_rate
Negative Predictive Value negative_predictive_value
Positive Likelihood Ratio positive_likelihood_ratio
Negative Likelihood Ratio negative_likelihood_ratio
Prevalence Threshold prevalence_threshold
Youden Index youden_index
Confusion Matrix confusion_matrix
Fowlkes Mallows Index fowlkes_mallows_index
Matthews Correlation Coefficient mathews_corr_coeff
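
As a sketch of what the Cross Entropy entry measures, the snippet below computes the mean categorical cross-entropy for the one-hot targets and predicted values from the earlier example, treating the predictions as class probabilities. This is the textbook definition in plain Python, with a hypothetical eps clipping parameter; the cross_entropy method in SeqMetrics may use a different convention.

```python
import math


def cross_entropy(targets, probs, eps=1e-15):
    """Mean categorical cross-entropy over samples (illustrative definition).

    targets: one-hot rows; probs: predicted class probabilities per row.
    eps is a hypothetical clipping value that guards against log(0).
    """
    total = 0.0
    for t_row, p_row in zip(targets, probs):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(targets)


predictions = [[0.25, 0.25, 0.25, 0.25],
               [0.01, 0.01, 0.01, 0.96]]
targets = [[0, 0, 0, 1],
           [0, 0, 0, 1]]
loss = cross_entropy(targets, predictions)
print(round(loss, 4))
```

The confident, correct second prediction contributes very little to the loss, while the uniform first prediction contributes most of it.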

Related

forecasting_metrics

hydroeval

SkillMetrics

HydroErr
