Multisample Consensus Peak Calling via Convex Optimization

ROCCO: [R]obust [O]pen [C]hromatin Detection via [C]onvex [O]ptimization

What

ROCCO is an efficient algorithm for detecting "consensus peaks" in large datasets with multiple high-throughput sequencing (HTS) samples (e.g., ATAC-seq), where an enrichment in read counts/densities is observed in a nontrivial subset of samples.

Input/Output

  • Input: Samples' BAM alignments or BigWig tracks
  • Output: BED file of consensus peak regions (Default format is BED3: chrom,start,end)

Note: if BigWig input is used, preprocessing options that operate at the alignment level cannot be applied.
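Because the default output is plain BED3, it can be inspected without any ROCCO-specific tooling. The following is a minimal sketch, assuming only the three-column layout described above; the helper name `summarize_bed3` is hypothetical and not part of ROCCO.

```python
import io


def summarize_bed3(handle):
    """Return (number of peaks, total bases covered) for BED3 lines.

    `handle` is any iterable of text lines, e.g. an open file.
    """
    n_peaks = 0
    total_bp = 0
    for line in handle:
        line = line.strip()
        # Skip blank lines and the optional header lines some BED files carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        n_peaks += 1
        # BED coordinates are 0-based and half-open, so width is end - start.
        total_bp += int(end) - int(start)
    return n_peaks, total_bp


# Small in-memory example standing in for a real output file.
example = io.StringIO("chr1\t100\t250\nchr1\t400\t900\nchr2\t50\t75\n")
print(summarize_bed3(example))
```

With a real run, the same function could be applied to an open file handle, for example `open("rocco_output_without_noise.bed")`.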

How

ROCCO models consensus peak calling as a constrained optimization problem with an upper bound on the total proportion of the genome selected as open/accessible and a fragmentation penalty that promotes spatial consistency in active regions and sparsity elsewhere.
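To make the two ingredients concrete (a budget on the selected fraction and a fragmentation penalty), here is a toy sketch. It is not ROCCO's formulation or solver: it assumes a binary label per locus and solves that small integer problem exactly by dynamic programming, whereas ROCCO poses a convex problem. The names `select_loci`, `scores`, `budget`, and `gamma` are hypothetical.

```python
def select_loci(scores, budget, gamma):
    """Toy binary selection over consecutive loci.

    Maximizes  sum(scores[i] * x[i]) - gamma * sum(|x[i] - x[i+1]|)
    subject to sum(x) <= floor(budget * n), with x[i] in {0, 1}.
    Returns the list of labels x.
    """
    n = len(scores)
    if n == 0:
        return []
    k_max = int(budget * n)  # maximum number of loci that may be selected
    neg_inf = float("-inf")

    # value[k][b]: best objective so far with k loci selected and last label b
    value = [[neg_inf, neg_inf] for _ in range(k_max + 1)]
    value[0][0] = 0.0
    if k_max >= 1:
        value[1][1] = scores[0]

    history = []  # back-pointers: label of the previous locus for each state
    for i in range(1, n):
        new = [[neg_inf, neg_inf] for _ in range(k_max + 1)]
        back = [[0, 0] for _ in range(k_max + 1)]
        for k in range(k_max + 1):
            for b in (0, 1):
                prev_k = k - b
                if prev_k < 0:
                    continue
                for a in (0, 1):
                    if value[prev_k][a] == neg_inf:
                        continue
                    # Gain from selecting locus i, minus a penalty on label changes.
                    cand = value[prev_k][a] + scores[i] * b - gamma * abs(a - b)
                    if cand > new[k][b]:
                        new[k][b] = cand
                        back[k][b] = a
        history.append(back)
        value = new

    # Best final state, then walk the back-pointers to recover the labels.
    k, b = max(
        ((k, b) for k in range(k_max + 1) for b in (0, 1)),
        key=lambda kb: value[kb[0]][kb[1]],
    )
    labels = [b]
    for back in reversed(history):
        a = back[k][b]
        k -= b
        b = a
        labels.append(b)
    labels.reverse()
    return labels


# Made-up per-locus scores: two strong loci separated by a weak one.
scores = [-1.0, 0.9, -0.1, 0.8, -1.0, -1.0, -1.0, -1.0]
print(select_loci(scores, budget=0.5, gamma=0.0))  # no fragmentation penalty
print(select_loci(scores, budget=0.5, gamma=0.5))  # penalty merges the two loci
```

In this toy setting, raising `gamma` trades a small loss in score for one contiguous region instead of two fragments, which is the qualitative behavior the fragmentation penalty is meant to encourage.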

Why

  1. Consideration of enrichment and spatial characteristics of open chromatin signals
  2. Scaling to large sample sizes (100+) with an asymptotic time complexity independent of sample size
  3. Unsupervised: does not require training data or a heuristically determined set of initial candidate peak regions
  4. No rigid thresholds on the minimum number/width of supporting samples/replicates
  5. Mathematically tractable model permitting worst-case analysis of runtime and performance

Example Behavior

Input

  • ENCODE lymphoblastoid data (BEST5, WORST5): 10 real ATAC-seq alignments of varying TSS enrichment (SNR-like quality measure for ATAC-seq)
  • Synthetic noisy data (NOISY5)

We run ROCCO under two conditions for comparison (blue): with the noisy samples and without.

Ideally, the resulting consensus peaks should be largely unaffected by the NOISY5 input samples. Likewise, both broad and narrow peaks should be captured.

rocco -i *.BEST5.bam *.WORST5.bam -g hg38 -o rocco_output_without_noise.bed
rocco -i *.BEST5.bam *.WORST5.bam *.NOISY5.bam -g hg38 -o rocco_output_with_noise.bed

  • Note: users may run ROCCO with the --narrowPeak flag to generate 10-column output with various statistics for comparing peaks and for supplemental validation independent of ROCCO's optimality criterion.
    • As a byproduct, --narrowPeak also produces a 'raw' peak-by-count matrix (one row per peak, one column per sample) that can be used in downstream analyses such as differential accessibility testing.
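One way to quantify how much the noisy samples changed the result is the base-pair Jaccard index between the two peak sets. The sketch below is an illustration only, not part of ROCCO; the function names are hypothetical, and peaks are assumed to be given as `(chrom, start, end)` tuples with BED-style half-open coordinates.

```python
from collections import defaultdict


def merged(intervals):
    """Merge overlapping or touching intervals; return {chrom: [[start, end], ...]}."""
    by_chrom = defaultdict(list)
    for chrom, start, end in sorted(intervals):
        runs = by_chrom[chrom]
        if runs and start <= runs[-1][1]:
            # Extend the current run when the next interval overlaps or touches it.
            runs[-1][1] = max(runs[-1][1], end)
        else:
            runs.append([start, end])
    return by_chrom


def covered_bp(by_chrom):
    """Total number of bases covered by merged intervals."""
    return sum(end - start for runs in by_chrom.values() for start, end in runs)


def jaccard(peaks_a, peaks_b):
    """Base-pair Jaccard index: |A intersect B| / |A union B|."""
    size_a = covered_bp(merged(peaks_a))
    size_b = covered_bp(merged(peaks_b))
    union = covered_bp(merged(list(peaks_a) + list(peaks_b)))
    if union == 0:
        return 1.0  # convention chosen here: two empty peak sets count as identical
    # Inclusion-exclusion gives the intersection size from the three totals.
    return (size_a + size_b - union) / union


# Made-up intervals standing in for the two output files.
without_noise = [("chr1", 0, 100), ("chr2", 10, 20)]
with_noise = [("chr1", 50, 150), ("chr2", 10, 20)]
print(jaccard(without_noise, with_noise))
```

A value close to 1 would indicate that the two runs selected nearly the same bases; what counts as "largely unaffected" is a judgment left to the analyst.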

Output

[Figure: example output]

See also: Running Consenrich+ROCCO.

Paper/Citation

If using ROCCO in your research, please cite the original paper in Bioinformatics (DOI: 10.1093/bioinformatics/btad725)

 Nolan H Hamilton, Terrence S Furey, ROCCO: a robust method for detection of open chromatin via convex optimization,
 Bioinformatics, Volume 39, Issue 12, December 2023

Documentation

For additional details, usage examples, etc. please see ROCCO's documentation: https://nolan-h-hamilton.github.io/ROCCO/

Installation

PyPI (pip)

python -m pip install rocco --upgrade

If lacking administrative control, you may need to append --user to the above.

Build from Source

If preferred, ROCCO can easily be built from source:

  • Clone or download this repository

    git clone https://github.com/nolan-h-hamilton/ROCCO.git
    cd ROCCO
    python setup.py sdist bdist_wheel
    python -m pip install -e .
    
