Nonparametric Test Statistics

PyNonpar

Test statistics based on ranks may lead to paradoxical results. Pseudo-ranks are one solution to this problem. This package provides a function to calculate pseudo-ranks as well as nonparametric (pseudo-)rank statistics. For a definition and discussion of pseudo-ranks, see, for example, [1].

To install the package from PyPI, simply type

pip install PyNonpar

Table of Contents

Two-Sample Tests
Paired Two-Sample Tests
Multi-Sample Tests
Repeated-Measures Tests

Pseudo-Ranks

If there are ties (i.e., observations with the same value) in the data, the pseudo-ranks have to be adjusted. The available options are 'minimum', 'maximum' and 'average'. We recommend 'average' because this adjustment uses normalized empirical distribution functions. See the example for details on the usage of the function 'psrank'.

import PyNonpar
from PyNonpar import*

# some artificial data
x = [1, 1, 1, 1, 2, 3, 4, 5, 6]
group = ['C', 'C', 'B', 'B', 'B', 'A', 'C', 'A', 'C']

PyNonpar.pseudorank.psrank(x, group, ties_method = "average")
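
To make the 'average' adjustment more concrete, here is a minimal pure-Python sketch of pseudo-ranks as they are commonly defined (see [1]): the pseudo-rank of an observation is 1/2 + N * G(x), where G is the unweighted mean of the normalized empirical distribution functions of the groups. The names count_fn and pseudo_ranks_average are hypothetical helpers for illustration only; this is not necessarily how 'psrank' is implemented.

```python
from collections import defaultdict


def count_fn(u):
    """Normalized count function: 0 if u < 0, 1/2 if u == 0, 1 if u > 0."""
    if u < 0:
        return 0.0
    if u == 0:
        return 0.5
    return 1.0


def pseudo_ranks_average(x, group):
    """Pseudo-ranks from the unweighted mean of normalized group ECDFs."""
    samples = defaultdict(list)
    for value, label in zip(x, group):
        samples[label].append(value)

    n_total = len(x)
    n_groups = len(samples)

    def g_hat(value):
        # Each group contributes its own normalized ECDF with equal weight,
        # regardless of the group's sample size.
        return sum(
            sum(count_fn(value - other) for other in obs) / len(obs)
            for obs in samples.values()
        ) / n_groups

    return [0.5 + n_total * g_hat(value) for value in x]


# same artificial data as above
x = [1, 1, 1, 1, 2, 3, 4, 5, 6]
group = ['C', 'C', 'B', 'B', 'B', 'A', 'C', 'A', 'C']
print(pseudo_ranks_average(x, group))
```

For equal group sizes, this definition reduces to the usual mid-ranks; the two only differ in unbalanced designs such as the one above.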

Nonparametric Test Statistics

Two-Sample Tests

  1. Wilcoxon-Mann-Whitney test: wilcoxon_mann_whitney_test()
  2. Brunner-Munzel test (Generalized Wilcoxon test): brunner_munzel_test()

The Hodges-Lehmann estimator can be calculated in a location shift model: hodges_lehmann(). The confidence interval for this estimator is only asymptotic and assumes continuous distributions.
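
As an illustration of the idea behind the estimator, the following sketch computes the textbook two-sample Hodges-Lehmann point estimate, i.e., the median of all pairwise differences. The function name hodges_lehmann_shift and the sign convention (y minus x) are assumptions made for this example; the package's hodges_lehmann() may differ in both, and the confidence interval is not covered here.

```python
from statistics import median


def hodges_lehmann_shift(x, y):
    """Median of all pairwise differences y_j - x_i (location shift of y vs. x)."""
    return median(yj - xi for xi in x for yj in y)


x = [8, 4, 10, 4, 9, 1, 3, 3, 4, 8]
y = [10, 5, 11, 6, 11, 2, 4, 5, 5, 10]
print(hodges_lehmann_shift(x, y))
```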

1. Wilcoxon-Mann-Whitney test

For large sample sizes, the asymptotic Wilcoxon test is recommended (method = "asymptotic"). For small sample sizes, we recommend the exact Wilcoxon test (method = "exact"). Note that the Wilcoxon test assumes the null hypothesis of equal distributions H0: F1 = F2.
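
The asymptotic variant can be sketched in a few lines of pure Python: compute mid-ranks over the pooled sample, sum the ranks of the second sample, and standardize with the permutation variance under H0: F1 = F2. The helpers midranks and wmw_asymptotic are hypothetical names for this illustration, only a two-sided p-value is shown, and the package function may use different conventions.

```python
import math


def midranks(values):
    """Ranks 1..N where tied observations share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def wmw_asymptotic(x, y):
    """Rank sum of y, its standardized value, and a two-sided normal p-value."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = midranks(list(x) + list(y))
    rank_sum_y = sum(ranks[n1:])
    expected = n2 * (n + 1) / 2
    # Permutation variance under H0; this form stays valid when ties are present.
    variance = n1 * n2 / (n * (n - 1)) * sum((r - (n + 1) / 2) ** 2 for r in ranks)
    z = (rank_sum_y - expected) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return rank_sum_y, z, p_two_sided


print(wmw_asymptotic([8, 4, 10, 4, 9, 1, 3, 3, 4, 8],
                     [10, 5, 11, 6, 11, 2, 4, 5, 5, 10]))
```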

import PyNonpar
from PyNonpar import*

x = [8,4,10,4,9,1,3,3,4,8]
y = [10,5,11,6,11,2,4,5,5,10]

PyNonpar.twosample.wilcoxon_mann_whitney_test(x, y, alternative="less", method = "asymptotic", alpha = 0.05)
PyNonpar.twosample.wilcoxon_mann_whitney_test(x, y, alternative="less", method = "exact", alpha = 0.05)

Wilcoxon-Mann-Whitney Sample Size Planning

To calculate the sample size needed to detect a specific relative effect p with probability beta (i.e., the power) and type-I error rate alpha, the function 'wilcoxon_mann_whitney_ssp' can be used. Prior information for one group is needed. The artificial data for the second group can be created from an interpretable effect, e.g., a location shift. For more information, see [1] or [3].

import PyNonpar
from PyNonpar import*

# prior information
x_ssp = [315, 375, 356, 374, 412, 418, 445, 403, 431, 410, 391, 475, 379]
# y_ssp = x_ssp - 20
y_ssp = [295, 355, 336, 354, 392, 398, 425, 383, 411, 390, 371, 455, 359]

PyNonpar.twosample.wilcoxon_mann_whitney_ssp(x_ssp, y_ssp, 0.8, 0.05, 1/2)
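
The artificial second group in the example above is simply the prior data shifted by a constant. A small sketch of how such data could be generated, assuming a pure location shift; shifted_group is a hypothetical helper and the shift of -20 is just the value used in the example:

```python
def shifted_group(prior, shift):
    """Create artificial data for the second group by a location shift."""
    return [value + shift for value in prior]


# prior information
x_ssp = [315, 375, 356, 374, 412, 418, 445, 403, 431, 410, 391, 475, 379]
# second group under an assumed location shift of -20
y_ssp = shifted_group(x_ssp, -20)
print(y_ssp)
```

Other interpretable effects (for example, a scale change) could be plugged in the same way, as long as the resulting data reflect the effect that the study should be able to detect.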

2. Brunner-Munzel test

The Brunner-Munzel test extends the Wilcoxon test to the null hypothesis H0: p = 1/2, where p = Probability(X < Y) + 1/2 * Probability(X = Y) is the relative effect of the two samples.

import PyNonpar
from PyNonpar import*

x = [8,4,10,4,9,1,3,3,4,8]
y = [10,5,11,6,11,2,4,5,5,10]

PyNonpar.twosample.brunner_munzel_test(x, y, alternative="less", quantile = "t")
PyNonpar.twosample.brunner_munzel_test(x, y, alternative="less", quantile = "normal")
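
The quantity tested by the Brunner-Munzel test is the relative effect p = Probability(X < Y) + 1/2 * Probability(X = Y). As a rough illustration, it can be estimated by counting over all pairs of observations. The function relative_effect is a hypothetical helper for this sketch and only gives the point estimate; the variance estimation and the test itself are left to brunner_munzel_test().

```python
def relative_effect(x, y):
    """Estimate p = P(X < Y) + 1/2 * P(X = Y) from all pairs (x_i, y_j)."""
    total = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                total += 1.0
            elif xi == yj:
                total += 0.5  # ties count one half
    return total / (len(x) * len(y))


x = [8, 4, 10, 4, 9, 1, 3, 3, 4, 8]
y = [10, 5, 11, 6, 11, 2, 4, 5, 5, 10]
print(relative_effect(x, y))
```

An estimate above 1/2 indicates that values in y tend to be larger than values in x; an estimate below 1/2 indicates the opposite.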

Paired Two-Sample Tests

1. Paired ranks test

The paired ranks test compares the marginal distributions F1 and F2. The null hypothesis is H0: F1 = F2 (var_equal = True) or H0: p = 1/2 (var_equal = False). In both cases, the two-sided alternative is p != 1/2.

p = Probability(X_i < Y_j) + 1/2 * Probability(X_i = Y_j) for i != j where (X_i, Y_i), (X_j, Y_j) are paired observations.

import PyNonpar
from PyNonpar import*

x = [1, 2, 3, 4, 5, 7, 1, 1, 1]
y = [4, 6, 8, 7, 6, 5, 9, 1, 1]

PyNonpar.twosample_paired.paired_ranks_test(x, y, alternative="two.sided", var_equal=False, quantile="normal")

Multi-Sample Tests

  1. The Hettmansperger-Norton Test for Patterned Alternatives: hettmansperger_norton_test()
  2. Kruskal-Wallis test: kruskal_wallis_test()

1. The Hettmansperger-Norton Test for Patterned Alternatives

This package provides a function to calculate the Hettmansperger-Norton test for patterned alternatives using pseudo-ranks. Originally, this test was developed for ranks but this version was adapted to pseudo-ranks.

For the alternative, it is possible to use 'increasing' (i.e., trend = [1, 2, 3, ..., g]), 'decreasing' (i.e., trend = [g, g-1, g-2, ..., 1]) or 'custom', where the trend has to be specified manually. Note that the trend is a list of length g, where g is the number of groups.
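
The three options can be summarized by the trend vector they correspond to. The following sketch builds that vector for a given number of groups g; trend_vector is a hypothetical helper written for this illustration and is not part of the package.

```python
def trend_vector(n_groups, alternative, custom=None):
    """Return the trend (a list of length g) implied by the chosen alternative."""
    if alternative == "increasing":
        return list(range(1, n_groups + 1))
    if alternative == "decreasing":
        return list(range(n_groups, 0, -1))
    if alternative == "custom":
        if custom is None or len(custom) != n_groups:
            raise ValueError("a custom trend must have one entry per group")
        return list(custom)
    raise ValueError("unknown alternative: " + repr(alternative))


print(trend_vector(3, "increasing"))
print(trend_vector(3, "decreasing"))
print(trend_vector(3, "custom", custom=[1, 3, 2]))
```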

import PyNonpar
from PyNonpar import*

# some artificial data
x = [1, 1, 1, 1, 2, 3, 4, 5, 6]
group = ['C', 'C', 'B', 'B', 'B', 'A', 'C', 'A', 'C']

PyNonpar.hettmansperger.hettmansperger_norton_test(x, group, alternative = "custom", trend = [1,3,2])

2. Kruskal-Wallis Test

import PyNonpar
from PyNonpar import*

# some artificial data
x = [1, 1, 1, 1, 2, 3, 4, 5, 6]
group = ['C', 'C', 'B', 'B', 'B', 'A', 'C', 'A', 'C']

# Using pseudo-ranks
PyNonpar.multisample.kruskal_wallis_test(x, group, pseudoranks = True)

# Using ranks
PyNonpar.multisample.kruskal_wallis_test(x, group, pseudoranks = False)
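
For reference, the rank-based Kruskal-Wallis statistic can be sketched in pure Python as follows. This is the textbook form with mid-ranks, written so that it remains valid in the presence of ties; midranks and kruskal_wallis_h are hypothetical helpers, and the package function may differ in details. The pseudo-rank variant is not shown.

```python
from collections import defaultdict


def midranks(values):
    """Ranks 1..N where tied observations share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_h(x, group):
    """Kruskal-Wallis H: between-group rank variation relative to total rank variation."""
    n = len(x)
    ranks = midranks(x)
    by_group = defaultdict(list)
    for rank, label in zip(ranks, group):
        by_group[label].append(rank)
    center = (n + 1) / 2
    between = sum(len(rs) * (sum(rs) / len(rs) - center) ** 2 for rs in by_group.values())
    total = sum((rank - center) ** 2 for rank in ranks)
    return (n - 1) * between / total


x = [1, 1, 1, 1, 2, 3, 4, 5, 6]
group = ['C', 'C', 'B', 'B', 'B', 'A', 'C', 'A', 'C']
print(kruskal_wallis_h(x, group))
```

Under the null hypothesis, H is approximately chi-square distributed with g - 1 degrees of freedom, where g is the number of groups.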

Repeated-Measures Tests

  1. The Paired-Ranks Test: paired_ranks_test()
  2. The Kepner-Robinson Test: kepner_robinson_test()

1. Paired ranks test

See the section "Paired Two-Sample Tests".

2. Kepner-Robinson Test

For the Kepner-Robinson test, we have several dependent observations per subject (subplot factor). Let F_k denote the cdf of the k-th observation. The null hypothesis for this test is H_0: F_1 = ... = F_d, where d is the number of observations per subject. The test assumes compound symmetry for the dependence structure, that is, all variances are equal and all covariances are equal. In other words, the observations on one subject are essentially interchangeable. For more information, we refer to [2].

import PyNonpar
from PyNonpar import*

# some artificial data
data = [1, 0, -2, -1, -2, 1, 0, 0, 0, -2]
time = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
subject = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

PyNonpar.repeated_measures.kepner_robinson_test(data, time, subject, distribution="F")
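
The function takes the data in long format, i.e., one entry per observation together with its time point and subject. To check that every subject has exactly d observations, it can help to rearrange the data into one row per subject. The helper to_subject_matrix below is a hypothetical sketch for this purpose and is not part of the package.

```python
def to_subject_matrix(data, time, subject):
    """Arrange long-format observations as one row per subject, ordered by time."""
    time_points = sorted(set(time))
    cells = {(s, t): value for value, t, s in zip(data, time, subject)}
    # A KeyError here means that some subject lacks an observation for a time point.
    return [[cells[(s, t)] for t in time_points] for s in sorted(set(subject))]


# same artificial data as above
data = [1, 0, -2, -1, -2, 1, 0, 0, 0, -2]
time = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
subject = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

for row in to_subject_matrix(data, time, subject):
    print(row)
```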

References

[1] Brunner, E., Bathke, A. C., & Konietschke, F. Rank- and Pseudo-Rank Procedures in Factorial Designs - Using R and SAS. Springer Verlag, to appear.

[2] Kepner, J. L., & Robinson, D. H. (1988). Nonparametric methods for detecting treatment effects in repeated-measures designs. Journal of the American Statistical Association, 83(402), 456-461.

[3] Happ, M., Bathke, A. C., & Brunner, E. (2019). Optimal sample size planning for the Wilcoxon-Mann-Whitney test. Statistics in Medicine, 38(3), 363-375.
