AdaFDR

A fast and covariate-adaptive method for multiple hypothesis testing.

Software accompanying the paper "AdaFDR: a Fast, Powerful and Covariate-Adaptive Approach to Multiple Hypothesis Testing", 2018.

Installation

pip install adafdr

Usage

adafdr mainly offers two methods: adafdr_explore for covariate visualization and adafdr_test for multiple hypothesis testing.

Import package and load data

adafdr.method contains the algorithm implementation, while adafdr.data_loader loads the datasets used in the paper. Here we load the airway data. See the vignette for the other datasets that ship with the package.

import adafdr.method as md
import adafdr.data_loader as dl
p,x = dl.data_airway()

The data p,x has the following format:

  • p: (N,) numpy.ndarray, p-values for N hypotheses.
  • x: (N,d) numpy.ndarray, d-dimensional covariate for each hypothesis. When d=1, x is allowed to be (N,) numpy.ndarray or (N,1) numpy.ndarray.
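
For data that does not ship with the package, arrays of the shapes described above can be assembled with NumPy. The following is a minimal sketch using synthetic values: the names n_hypotheses and rng and the uniform draws are illustrative assumptions, not part of adafdr, and the snippet only demonstrates the expected shapes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hypotheses = 1000  # N, chosen arbitrarily for illustration

# p: (N,) array of p-values in [0, 1]
p = rng.uniform(size=n_hypotheses)

# x: (N, d) array of covariates; here d = 2
x = rng.uniform(size=(n_hypotheses, 2))

# For a single covariate (d = 1), either layout matches the format above
x_flat = x[:, 0]                # shape (N,)
x_col = x[:, 0].reshape(-1, 1)  # shape (N, 1)

print(p.shape, x.shape, x_flat.shape, x_col.shape)
```

Checking the shapes before running the test is a cheap way to catch a transposed covariate matrix or a p-value array whose length does not match x.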

Covariate visualization using adafdr_explore

md.adafdr_explore(p, x, output_folder=None)

[Figures: p_scatter (left), ratio (right)]

If output_folder is a folder path, figures will be saved to the folder instead of being plotted in the console.

The left panel is a scatter plot of the hypotheses, with p-values on the y-axis and the covariate on the x-axis. The right panel shows the estimated null distribution (blue) and the estimated alternative distribution (orange) with respect to the covariate. From these plots we can conclude that a hypothesis is more likely to be significant when the covariate (gene expression) value is larger.

Multiple hypothesis testing using adafdr_test

n_rej,t_rej,theta = md.adafdr_test(p, x, fast_mode=True, output_folder=None)
  • If fast_mode is True, AdaFDR-fast is used; otherwise, AdaFDR is used.
  • If output_folder is a folder path, log files will be saved in the folder.
  • n_rej is the number of rejections, t_rej is a (N,) numpy.ndarray of decision thresholds, one per hypothesis, and theta is a list of learned parameters.
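
Because t_rej holds one threshold per hypothesis, the rejected hypotheses can be recovered by an elementwise comparison, the same comparison used in the Quick Test section. The following is a minimal sketch with small stand-in arrays; the values and the names rejected, rejected_idx and n_rejected are illustrative assumptions, and in practice p comes from the data and t_rej from adafdr_test.

```python
import numpy as np

# Stand-in values for illustration only
p = np.array([0.001, 0.20, 0.03, 0.50])
t_rej = np.array([0.01, 0.01, 0.05, 0.05])

# A hypothesis is rejected when its p-value is at or below its own threshold
rejected = p <= t_rej                    # boolean mask of shape (N,)
rejected_idx = np.flatnonzero(rejected)  # indices of rejected hypotheses
n_rejected = int(rejected.sum())

print(rejected_idx, n_rejected)
```

The boolean mask can be used directly to index any other per-hypothesis array, such as gene names or effect sizes.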

The learned threshold is shown in the figure below. The two lines correspond to the two folds produced by hypothesis splitting.

[Figure: p_scatter]

Quick Test

Here is a quick test. First check that the package can be successfully imported:

import adafdr

Next, run a small example which should take a few seconds:

import numpy as np
p,x,h,_,_ = adafdr.data_loader.load_1d_bump_slope()
n_rej,t_rej,theta = adafdr.method.adafdr_test(p, x, fast_mode=True)
D = np.sum(p<=t_rej)
FD = np.sum((p<=t_rej)&(~h))
print('# AdaFDR successfully finished! ')
print('# D=%d, FD=%d, FDP=%0.3f'%(D, FD, FD/D))

This runs AdaFDR-fast on a 1d simulated dataset. If the package is installed correctly, the output should look like:

# AdaFDR successfully finished! 
# D=840, FD=80, FDP=0.095
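
In this output, D is the number of discoveries (rejections), FD is the number of false discoveries, and FDP = FD/D = 80/840 ≈ 0.095. The quick test divides by D directly, which fails when nothing is rejected. The following is a minimal sketch of a helper that handles that case; the function name fdp_summary is hypothetical and not part of adafdr, and it assumes that h marks the true alternatives (as the use of ~h in the quick test suggests) and adopts the common convention FDP = 0 when D = 0.

```python
import numpy as np

def fdp_summary(p, t_rej, h):
    """Return (D, FD, FDP) for p-values p, thresholds t_rej and truth labels h.

    Hypothetical helper: h is assumed to be truthy for true alternatives.
    """
    p = np.asarray(p, dtype=float)
    t_rej = np.asarray(t_rej, dtype=float)
    is_alt = np.asarray(h, dtype=bool)

    rejected = p <= t_rej
    d = int(rejected.sum())               # discoveries
    fd = int((rejected & ~is_alt).sum())  # rejected true nulls
    fdp = fd / d if d > 0 else 0.0        # convention: FDP = 0 when D = 0
    return d, fd, fdp

# Small stand-in example: three rejections, one of them a true null
d, fd, fdp = fdp_summary(
    p=[0.001, 0.02, 0.3, 0.004],
    t_rej=[0.01, 0.05, 0.05, 0.01],
    h=[True, False, False, True],
)
print('# D=%d, FD=%d, FDP=%0.3f' % (d, fd, fdp))
```

Casting h to a boolean array also guards against the case where the labels are stored as integers, for which ~h would be a bitwise rather than a logical negation.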

Citation information

Coming soon.
