MultiTest — Global Tests for Multiple Hypothesis Testing

MultiTest provides several methods for combining p-values with a focus on detecting rare and weak effects.

Higher Criticism variants

Each HC variant is exposed as its own method so that the standardization choice is explicit rather than a constructor argument.

Method | Standardization | P-value range considered
MultiTest.hc | Donoho-Jin 2008 [2] – theoretical uniform std (default) | (0, γ]
MultiTest.hc_dj2004 | Donoho-Jin 2004 [1] – observed p-value std | (0, γ]
MultiTest.hc_dj2008 | Donoho-Jin 2008 [2] – theoretical uniform std | (0, γ]
MultiTest.hc_beta | Beta-distribution std | (0, γ]
MultiTest.hc_star | Beta-distribution std | (1/n, γ] (HCdagger [1])

Every HC method accepts:

  • gamma – upper fraction of sorted p-values to consider. Only the p-values ranked in positions 1 through ⌊γ·n⌋ (i.e. the smallest γ fraction) enter the HC statistic. Defaults to 'auto' (see below).
  • return_threshold – if True, returns (hc_score, threshold_pval); otherwise returns just the score (default False).
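
To make the roles of gamma and the returned threshold concrete, here is a minimal standalone sketch of an HC statistic with the theoretical uniform standardization. It is not the package's code: the function name hc_uniform_sketch, the cap at rank n-1, and the fixed gamma=0.2 default are assumptions made for illustration only.

```python
import numpy as np

def hc_uniform_sketch(pvals, gamma=0.2):
    """Illustrative HC score with uniform standardization (hypothetical helper).

    Returns (score, threshold_pval), where threshold_pval is the sorted
    p-value at which the standardized deviation is largest.
    """
    p = np.sort(np.asarray(pvals, dtype=float))
    n = p.size
    # Keep ranks 1..floor(gamma * n); cap at n - 1 so that i/n stays below 1.
    k = min(int(np.floor(gamma * n)), n - 1)
    if k < 1:
        raise ValueError("gamma * n must be at least 1")
    i = np.arange(1, k + 1)
    u = i / n  # uniform reference value for the i-th smallest p-value
    z = np.sqrt(n) * (u - p[:k]) / np.sqrt(u * (1 - u))
    i_star = int(np.argmax(z))
    return float(z[i_star]), float(p[i_star])

score, threshold = hc_uniform_sketch([0.001, 0.2, 0.4, 0.6, 0.8], gamma=0.6)
print(f"score = {score:.4f}, threshold p-value = {threshold}")
```

In this sketch, features whose p-values are at or below the returned threshold are the ones the statistic singles out, which mirrors how the threshold is used in the quick start.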

Default gamma and why it matters

When gamma='auto' the upper limit is set to

γ = log(n) / sqrt(n)

This keeps HC focused on the regime where it has the most power. When signals are present in more than roughly a 1/√n fraction of the features, simpler methods (e.g. the average z-score) already detect them, so extending HC beyond γ = log(n)/√n adds noise without gaining power. The log(n) factor provides a small safety margin above the 1/√n threshold.
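
As a quick check of how this default scales, the sketch below evaluates the formula for a few sample sizes. It assumes log denotes the natural logarithm; auto_gamma is a hypothetical helper name, not part of the package.

```python
import math

def auto_gamma(n):
    """gamma = log(n) / sqrt(n), with the natural logarithm (an assumption)."""
    return math.log(n) / math.sqrt(n)

for n in (100, 10_000, 1_000_000):
    g = auto_gamma(n)
    # floor(gamma * n) is the number of smallest p-values that enter HC
    print(f"n={n:>9,d}  gamma={g:.4f}  p-values considered={math.floor(g * n)}")
```

Under this assumption, n = 100 gives γ ≈ 0.46 (the 46 smallest p-values), while n = 1,000,000 gives γ ≈ 0.014, so the fraction considered shrinks as n grows.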

Other methods

  • MultiTest.berkjones / berkjones_plus – Berk-Jones statistic [3]
  • MultiTest.fdr – False-discovery rate functional
  • MultiTest.fdr_control – Benjamini-Hochberg FDR control
  • MultiTest.bonferroni / neg_log_minp – Bonferroni-style inference
  • MultiTest.fisher – Fisher's method to combine p-values

In all cases, reject the null for large values of the test statistic.
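
For orientation, two of the methods listed above can be sketched in a few lines of NumPy. These are textbook versions written for illustration, not the package's implementations; the names fisher_statistic and bh_reject and the alpha parameter are hypothetical.

```python
import numpy as np

def fisher_statistic(pvals):
    """Fisher's combination statistic: -2 * sum(log p_i)."""
    p = np.asarray(pvals, dtype=float)
    return float(-2.0 * np.sum(np.log(p)))

def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up rule: boolean mask of rejected hypotheses."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Compare the i-th smallest p-value with alpha * i / n.
    below = p[order] <= alpha * np.arange(1, n + 1) / n
    reject = np.zeros(n, dtype=bool)
    if below.any():
        k = int(np.max(np.nonzero(below)[0]))  # largest rank meeting the bound
        reject[order[: k + 1]] = True
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
print(fisher_statistic(pvals))
print(bh_reject(pvals, alpha=0.05))
```

Under the global null with independent p-values, Fisher's statistic follows a chi-squared distribution with 2n degrees of freedom, so large values are evidence against the null, in line with the rule stated above.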

Quick start

import numpy as np
from scipy.stats import norm
from multitest import MultiTest

n = 100
z = np.random.randn(n)
pvals = 2 * norm.cdf(-np.abs(z))

mt = MultiTest(pvals)

# Default HC score
hc = mt.hc()

# HC score + threshold p-value
hc, hct = mt.hc(return_threshold=True)

# Berk-Jones
bj = mt.berkjones()

ii = np.arange(n)
print(f"HC = {hc:.3f}, features below HCT: {ii[pvals <= hct]}")
print(f"Berk-Jones = {bj:.3f}")

Choosing an HC variant

  • hc is the recommended starting point. It is an alias for hc_dj2008 with the default gamma setting.
  • hc_dj2008 uses the expected (theoretical) uniform mean and std for normalization, matching the formulation in [2].
  • hc_beta is nearly identical to hc_dj2008 for large n but uses the exact beta-distribution moments of order statistics.
  • hc_dj2004 uses the observed p-value standard deviation as the denominator. This makes it more sensitive to extreme p-values but also increases its variance under the null.
  • hc_star ignores p-values below 1/n (sample-size adjusted, HCdagger [1]).
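
The differences between these standardizations can be seen by computing the per-rank z-scores side by side. The sketch below is an illustration under stated assumptions, not the package's code: hc_z_scores is a hypothetical helper, the beta variant uses the standard mean i/(n+1) and variance i(n-i+1)/((n+1)²(n+2)) of uniform order statistics, and the observed-std variant is undefined when a p-value equals 0 or 1.

```python
import numpy as np

def hc_z_scores(pvals, gamma=0.2):
    """Per-rank z-scores under three standardizations (hypothetical helper)."""
    p = np.sort(np.asarray(pvals, dtype=float))
    n = p.size
    k = min(int(np.floor(gamma * n)), n - 1)  # ranks 1..k, with i/n < 1
    i = np.arange(1, k + 1)
    pk = p[:k]
    u = i / n
    # Theoretical uniform std (Donoho-Jin 2008 style)
    z_uniform = np.sqrt(n) * (u - pk) / np.sqrt(u * (1 - u))
    # Observed p-value std (Donoho-Jin 2004 style); breaks down if pk is 0 or 1
    z_observed = np.sqrt(n) * (u - pk) / np.sqrt(pk * (1 - pk))
    # Exact beta moments of the i-th uniform order statistic
    mean = i / (n + 1)
    var = i * (n - i + 1) / ((n + 1) ** 2 * (n + 2))
    z_beta = (mean - pk) / np.sqrt(var)
    return z_uniform, z_observed, z_beta

z_u, z_o, z_b = hc_z_scores([0.001, 0.2, 0.4, 0.6, 0.8], gamma=0.6)
print(z_u.max(), z_o.max(), z_b.max())
```

With one very small p-value in the input, the observed-std score is much larger than the other two, which illustrates the extra sensitivity to extreme p-values noted for hc_dj2004.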

Use cases

This package was used to obtain evaluations reported in [4] and [5].

References

[1] Donoho, David L. and Jin, Jiashun. "Higher criticism for detecting sparse heterogeneous mixtures." The Annals of Statistics 32, no. 3 (2004): 962-994.
[2] Donoho, David L. and Jin, Jiashun. "Higher criticism thresholding: Optimal feature selection when useful features are rare and weak." Proceedings of the National Academy of Sciences, 2008.
[3] Moscovich, Amit, Boaz Nadler, and Clifford Spiegelman. "On the exact Berk-Jones statistics and their p-value calculation." Electronic Journal of Statistics 10 (2016): 2329-2354.
[4] Donoho, David L. and Alon Kipnis. "Higher criticism to compare two large frequency tables, with sensitivity to possible rare and weak differences." The Annals of Statistics 50, no. 3 (2022): 1447-1472.
[5] Kipnis, Alon. "Unification of rare/weak detection models using moderate deviations analysis and log-chisquared p-values." Statistica Sinica, 2025.
