Compute IDR from m NarrowPeak files and a merged NarrowPeak file
IDR
This tool computes the Irreproducible Discovery Rate (IDR) from NarrowPeaks files for two or more replicates. It is an implementation of the method described in the following paper, using a Gaussian copula.
LI, Qunhua, BROWN, James B., HUANG, Haiyan, et al. Measuring reproducibility of high-throughput experiments. The annals of applied statistics, 2011, vol. 5, no 3, p. 1752-1779.
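To give a feel for the copula step, the sketch below rank-transforms the peak scores of one replicate into pseudo-observations in (0, 1) and maps them to standard normal quantiles, a common first step when working with a Gaussian copula. It is a minimal illustration under stated assumptions and not midr's actual code: the function name `normal_scores` and the sample values are hypothetical, and ties are simply broken by input order.

```python
from statistics import NormalDist


def normal_scores(scores):
    """Map raw scores to standard normal quantiles via their ranks.

    Hypothetical helper: ranks are divided by (n + 1) so that the
    pseudo-observations stay strictly inside (0, 1).
    """
    n = len(scores)
    # Indices sorted by score; ties keep their input order.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    std_normal = NormalDist()
    return [std_normal.inv_cdf(rank / (n + 1)) for rank in ranks]


# Made-up signal values for the same three peaks in two replicates.
replicate1 = [12.5, 3.1, 48.0]
replicate2 = [10.2, 4.7, 51.3]
z1 = normal_scores(replicate1)
z2 = normal_scores(replicate2)
```

Peaks that are ranked consistently across replicates end up with similar normal scores, and it is this dependence between replicates that a copula model describes.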
Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Prerequisites
To run midr on your computer, you need Python (>= 3) installed. You can check the installed version with:
python3 --version
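The same check can be made from inside Python, for example at the top of a wrapper script. This is only a sketch: the helper name `python_is_supported` is hypothetical and is not part of midr.

```python
import sys


def python_is_supported(version_info=sys.version_info, minimum=(3,)):
    """Return True when the interpreter version meets the minimum."""
    return tuple(version_info[:len(minimum)]) >= tuple(minimum)


if not python_is_supported():
    print("midr requires Python >= 3")
```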
Installing
To install midr on your computer using pip, run the following command:
pip3 install midr
Otherwise you can clone this repository:
git clone https://github.com/LBMC/midr.git
cd midr/src/
python3 setup.py install
Given a list of peak calls in NarrowPeaks format and the corresponding peak calls for the merged replicates, this tool computes the IDR and appends an IDR column to the NarrowPeaks files.
Dependencies
The midr package depends on the following Python 3 libraries (SciPy, NumPy and Matplotlib), cited here by their reference publications:
Travis E. Oliphant. Python for Scientific Computing, Computing in Science & Engineering, 9, 10-20 (2007), DOI:10.1109/MCSE.2007.58
K. Jarrod Millman and Michael Aivazis. Python for Scientists and Engineers, Computing in Science & Engineering, 13, 9-12 (2011), DOI:10.1109/MCSE.2011.36
Travis E. Oliphant. A guide to NumPy, USA: Trelgol Publishing, (2006).
Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. The NumPy Array: A Structure for Efficient Numerical Computation, Computing in Science & Engineering, 13, 22-30 (2011), DOI:10.1109/MCSE.2011.37
J. D. Hunter, "Matplotlib: A 2D Graphics Environment", Computing in Science & Engineering, vol. 9, no. 3, pp. 90-95, 2007.
Usage
midr takes files in the NarrowPeaks format as input and outputs NarrowPeaks files with an additional IDR column.
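To illustrate what an additional column means in practice, the sketch below splits one NarrowPeaks line into its ten standard tab-separated fields and appends an eleventh value. It is not midr's code: the helper names, the sample line and the IDR value 0.05 are made up for the example, and the exact formatting of the column written by midr may differ.

```python
# The ten standard NarrowPeak (BED6+4) columns, in order.
NARROWPEAK_COLUMNS = (
    "chrom", "chromStart", "chromEnd", "name", "score",
    "strand", "signalValue", "pValue", "qValue", "peak",
)


def parse_narrowpeak_line(line):
    """Split one tab-separated NarrowPeak line into a dict of strings."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(NARROWPEAK_COLUMNS):
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    return dict(zip(NARROWPEAK_COLUMNS, fields))


def append_idr(line, idr):
    """Return the line with a hypothetical IDR value as an 11th column."""
    return line.rstrip("\n") + "\t" + str(idr)


# Made-up peak used only for this example.
example = "chr1\t100\t350\tpeak_1\t540\t.\t12.5\t8.2\t6.1\t120"
peak = parse_narrowpeak_line(example)
with_idr = append_idr(example, 0.05)
```

The column names in `NARROWPEAK_COLUMNS` include the four score columns that the `--score` option can select: `score`, `signalValue`, `pValue` and `qValue`.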
Computing IDR between three replicates
$ midr -m merged_peak_calling.NarrowPeaks \
-f replicate1_.NarrowPeaks replicate2.NarrowPeaks replicate3.NarrowPeaks \
-o results
where replicate1_.NarrowPeaks is the output of the peak caller on the alignment file corresponding to the first replicate, and merged_peak_calling.NarrowPeaks is the output of the peak caller on the merge of the replicates' alignment files. Here, results is the directory where the output is written.
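When midr is called from a pipeline script, the same command line can be assembled programmatically. The sketch below only builds and prints the argument list from the `-m`, `-f` and `-o` options shown above; the function name `build_midr_command` is hypothetical, and nothing is executed.

```python
import shlex


def build_midr_command(merged, replicates, output="results"):
    """Assemble the midr argument list from the documented options."""
    if len(replicates) < 2:
        raise ValueError("IDR needs at least two replicate files")
    return ["midr", "-m", merged, "-f", *replicates, "-o", output]


command = build_midr_command(
    "merged_peak_calling.NarrowPeaks",
    ["replicate1_.NarrowPeaks", "replicate2.NarrowPeaks",
     "replicate3.NarrowPeaks"],
)
# Show the shell-quoted command line without running it.
print(shlex.join(command))
```

Passing `command` to `subprocess.run(command, check=True)` would run it, assuming midr is installed and on the PATH.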
Displaying help:
$ midr -h
usage: midr [-h] --merged FILE --files FILES [FILES ...] [--output DIR]
            [--score SCORE_COLUMN] [--threshold THRESHOLD] [--debug]
            [--verbose]

Compute the Irreproducible Discovery Rate (IDR) from NarrowPeaks files
Implementation of the IDR methods for two or more replicates.
LI, Qunhua, BROWN, James B., HUANG, Haiyan, et al. Measuring reproducibility
of high-throughput experiments. The annals of applied statistics, 2011,
vol. 5, no 3, p. 1752-1779.
Given a list of peak calls in NarrowPeaks format and the corresponding peak
call for the merged replicate. This tools compute and append a IDR column to
NarrowPeaks files.

optional arguments:
  -h, --help            show this help message and exit

IDR settings:
  --merged FILE, -m FILE
                        file of the merged NarrowPeaks
  --files FILES [FILES ...], -f FILES [FILES ...]
                        list of NarrowPeaks files
  --output DIR, -o DIR  output directory (default: results)
  --score SCORE_COLUMN, -s SCORE_COLUMN
                        NarrowPeaks score column to compute the IDR on, one of
                        'score', 'signalValue', 'pValue' or 'qValue' (default:
                        signalValue)
  --threshold THRESHOLD, -t THRESHOLD
                        Threshold value for the precision of the estimator
                        (default: 0.01)
  --debug, -d           enable debugging (default: False)
  --verbose             log to console (default: False)
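For readers who want to script around these options, the sketch below shows how an interface with the same flags and defaults could be declared with `argparse`. It is reconstructed from the help text above only and is not midr's own parser; the file names passed to `parse_args` are placeholders.

```python
import argparse


def build_parser():
    """Rebuild the documented options; a sketch, not midr's own parser."""
    parser = argparse.ArgumentParser(prog="midr")
    parser.add_argument("--merged", "-m", metavar="FILE", required=True)
    parser.add_argument("--files", "-f", metavar="FILES", nargs="+",
                        required=True)
    parser.add_argument("--output", "-o", metavar="DIR", default="results")
    parser.add_argument("--score", "-s", metavar="SCORE_COLUMN",
                        default="signalValue",
                        choices=["score", "signalValue", "pValue", "qValue"])
    parser.add_argument("--threshold", "-t", metavar="THRESHOLD",
                        type=float, default=0.01)
    parser.add_argument("--debug", "-d", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    return parser


# Placeholder file names; only the required options are given here,
# so every other option falls back to its documented default.
args = build_parser().parse_args(
    ["-m", "merged.NarrowPeaks", "-f", "rep1.NarrowPeaks", "rep2.NarrowPeaks"]
)
```

With only the required options given, `args.output`, `args.score` and `args.threshold` take the defaults listed in the help text.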
Authors
- Laurent Modolo - Initial work
License
This project is licensed under the CeCILL License; see the LICENSE file for details.