
Statistical post-hoc analysis and outlier detection algorithms



This Python package provides statistical post-hoc tests for pairwise multiple comparisons and outlier detection algorithms.

Features

  • Parametric and nonparametric tests for pairwise multiple comparisons:

    • Conover, Dunn, and Nemenyi tests for use with the Kruskal-Wallis test.

    • Conover, Nemenyi, Siegel, and Miller tests for use with the Friedman test.

    • Quade, van der Waerden, and Durbin tests.

    • Student, Mann-Whitney, Wilcoxon, and TukeyHSD tests.

    • Anderson-Darling test.

    • Mack-Wolfe test.

    • Nashimoto and Wright’s test (NPM test).

    • Scheffe test.

    • Tamhane T2 test.

  • Plotting functionality (e.g. significance plots).

  • Outlier detection algorithms:

    • Simple test based on interquartile range (IQR).

    • Grubbs test.

    • Tietjen-Moore test.

    • Generalized Extreme Studentized Deviate test (ESD test).

    All tests support p-value adjustment for multiple pairwise comparisons.
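To make the simplest of the outlier detection methods listed above concrete, here is a minimal standard-library sketch of an interquartile-range rule. The function name `iqr_outliers`, the `coef` parameter, and the quantile convention are assumptions made for illustration. This is not the package's own implementation, and results can differ slightly under other quartile definitions.

```python
import statistics


def iqr_outliers(values, coef=1.5):
    """Return the values lying outside [Q1 - coef*IQR, Q3 + coef*IQR].

    Hypothetical helper for illustration only; quartiles use the
    "inclusive" method of statistics.quantiles (an assumption).
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - coef * iqr, q3 + coef * iqr
    # Keep the original order of the flagged observations.
    return [v for v in values if v < low or v > high]


sample = [12, 13, 14, 15, 16, 17, 18, 100]
flagged = iqr_outliers(sample)
print(flagged)
```

With this sample the quartiles are 13.75 and 17.25, so the fences are 8.5 and 22.5 and only the value 100 is flagged.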

Dependencies

Compatibility

The package is compatible with Python 2 and Python 3.

Install

You can install the package with: pip install scikit-posthocs

Examples

List or NumPy array

import scikit_posthocs as sp
x = [[1,2,3,5,1], [12,31,54], [10,12,6,74,11]]
sp.posthoc_conover(x, p_adjust = 'holm')
array([[-1.        ,  0.00119517,  0.00278329],
       [ 0.00119517, -1.        ,  0.18672227],
       [ 0.00278329,  0.18672227, -1.        ]])
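The `p_adjust = 'holm'` argument in the call above requests Holm's step-down correction. As a rough illustration of what such an adjustment does to a set of raw p values, here is a standard-library sketch of the textbook procedure. The helper `holm_adjust` is hypothetical and is not how the package computes its result internally.

```python
def holm_adjust(pvalues):
    """Holm step-down adjustment of a list of raw p values.

    Hypothetical helper: the i-th smallest p value (0-based rank i) is
    multiplied by (m - i), capped at 1, and the adjusted values are made
    non-decreasing in the order of the sorted raw p values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print(adjusted)
```

For these three raw values the adjusted values are approximately 0.03, 0.06, and 0.06: the largest raw p value would be multiplied by 1, but it inherits the larger adjusted value of the comparison ranked before it.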

Pandas DataFrame

The DataFrame must be in long (melted) format before making comparisons: the column passed as val_col holds the values, and the column passed as group_col holds the group labels.

import scikit_posthocs as sp
import pandas as pd
x = pd.DataFrame({"a": [1,2,3,5,1], "b": [12,31,54,62,12], "c": [10,12,6,74,11]})
x = x.melt(var_name='groups', value_name='values')
[Image: the melted DataFrame]
sp.posthoc_conover(x, val_col='values', group_col='groups', p_adjust = 'fdr_bh')
[Image: result table of the Conover test]
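For readers unfamiliar with melting, the reshaping step in the example above can be sketched without pandas. The `melt` function below is a hypothetical stand-in that works on a plain dict of columns. It only illustrates the wide-to-long layout that the val_col and group_col arguments expect, not what pandas does internally.

```python
def melt(wide, var_name="groups", value_name="values"):
    """Turn {column_name: [values...]} into one row per observation.

    Hypothetical helper for illustration; each output row records the
    group label (the former column name) and a single value.
    """
    long_rows = []
    for group, column in wide.items():
        for value in column:
            long_rows.append({var_name: group, value_name: value})
    return long_rows


wide = {"a": [1, 2, 3, 5, 1], "b": [12, 31, 54, 62, 12], "c": [10, 12, 6, 74, 11]}
long_rows = melt(wide)

# Show the first observation of each group.
for row in long_rows[::5]:
    print(row)
```

The three columns of five values become fifteen rows, each carrying one group label and one value.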

Significance plots

P values can be plotted using a heatmap:

pc = sp.posthoc_conover(x, val_col='values', group_col='groups')
heatmap_args = {'linewidths': 0.25, 'linecolor': '0.5', 'clip_on': False, 'square': True, 'cbar_ax_bbox': [0.80, 0.35, 0.04, 0.3]}
sp.sign_plot(pc, **heatmap_args)
[Image: significance plot of the Conover test results]

A custom colormap can be applied to the plot:

pc = sp.posthoc_conover(x, val_col='values', group_col='groups')
# Format: diagonal, non-significant, p<0.001, p<0.01, p<0.05
cmap = ['1', '#fb6a4a',  '#08306b',  '#4292c6', '#c6dbef']
heatmap_args = {'cmap': cmap, 'linewidths': 0.25, 'linecolor': '0.5', 'clip_on': False, 'square': True, 'cbar_ax_bbox': [0.80, 0.35, 0.04, 0.3]}
sp.sign_plot(pc, **heatmap_args)
[Image: significance plot with the custom colormap]
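The comment in the example above gives the order of the colormap entries: diagonal, non-significant, p<0.001, p<0.01, p<0.05. The sketch below shows one way a p value could be mapped to an entry of that list. The function `color_index` and its exact thresholding are assumptions made for illustration, not the plotting function's documented internals.

```python
# Order taken from the comment in the example:
# diagonal, non-significant, p<0.001, p<0.01, p<0.05
CMAP = ['1', '#fb6a4a', '#08306b', '#4292c6', '#c6dbef']
LABELS = ['diagonal', 'NS', 'p < 0.001', 'p < 0.01', 'p < 0.05']


def color_index(p, on_diagonal=False):
    """Return the position in CMAP for a p value (hypothetical helper)."""
    if on_diagonal:
        return 0
    if p < 0.001:
        return 2
    if p < 0.01:
        return 3
    if p < 0.05:
        return 4
    return 1  # not significant at the 0.05 level


for p in (0.0004, 0.004, 0.02, 0.2):
    i = color_index(p)
    print(p, LABELS[i], CMAP[i])
```

Under this assumed mapping, the darkest blue marks the strongest evidence (p<0.001) and the red entry marks non-significant pairs.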

Credits

Thorsten Pohlert, PMCMR author and maintainer


