Robust ATAC-seq Peak Calling for Many Samples via Convex Optimization

ROCCO: [R]obust [O]pen [C]hromatin Detection via [C]onvex [O]ptimization

What

ROCCO is an efficient algorithm for detecting "consensus peaks" in large datasets comprising multiple high-throughput sequencing (HTS) samples (namely, ATAC-seq), where an enrichment in read counts/densities is observed in a nontrivial subset of samples.

Input/Output

  • Input: Samples' BAM alignments or BigWig tracks

  • Output: BED file of consensus peak regions (default format is BED3: chrom, start, end)

  • Note: if BigWig input is used, preprocessing options cannot be applied at the alignment level, and narrowPeak output cannot be generated.
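
Because the default output is plain BED3, it can be inspected with a few lines of Python. The sketch below is not part of ROCCO; `parse_bed3` is a hypothetical helper and the coordinates are made-up placeholders. It only assumes the standard BED convention of whitespace-delimited columns with 0-based, half-open intervals.

```python
def parse_bed3(lines):
    """Yield (chrom, start, end) from BED lines, skipping blank and header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        # Only the first three columns are needed for BED3.
        chrom, start, end = line.split()[:3]
        yield chrom, int(start), int(end)


# Made-up records standing in for the contents of an output BED file.
example_lines = ["chr1\t100\t250", "chr1\t400\t450", "chr2\t0\t1000"]
peaks = list(parse_bed3(example_lines))

# BED intervals are half-open, so a peak's width is simply end - start.
total_bp = sum(end - start for _, start, end in peaks)
print(len(peaks), "peaks covering", total_bp, "bp")
```

To read a real file, pass an open file handle instead of the list, e.g. `parse_bed3(open("rocco_output_without_noise.bed"))`.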

How

ROCCO models consensus peak calling as a constrained optimization problem with an upper bound on the total proportion of the genome selected as open/accessible and a fragmentation penalty that promotes spatial consistency in active regions and sparsity elsewhere.
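
To make the roles of the budget and the fragmentation penalty concrete, here is a toy sketch. It is not ROCCO's implementation: the per-locus `scores`, the names `budget` and `gamma`, and the exhaustive search are all assumptions made for illustration, and ROCCO itself relies on convex optimization rather than enumeration. The sketch picks a 0/1 label per locus that maximizes the selected score, pays `gamma` for every switch between adjacent labels, and selects at most a `budget` fraction of loci.

```python
import math
from itertools import product


def objective(scores, labels, gamma):
    """Negative selected score plus gamma times the number of label switches."""
    gain = sum(s * l for s, l in zip(scores, labels))
    switches = sum(abs(a - b) for a, b in zip(labels, labels[1:]))
    return -gain + gamma * switches


def best_selection(scores, budget, gamma):
    """Exhaustively find the minimizing 0/1 labeling (feasible only for tiny inputs)."""
    n = len(scores)
    max_selected = math.floor(budget * n)  # upper bound on selected loci
    best_value, best_labels = None, None
    for labels in product((0, 1), repeat=n):
        if sum(labels) > max_selected:
            continue  # violates the budget constraint
        value = objective(scores, labels, gamma)
        if best_value is None or value < best_value:
            best_value, best_labels = value, labels
    return best_labels


# Hypothetical enrichment scores for eight consecutive loci.
scores = [-2, 3, -0.5, 3, -2, -2, 0.5, -2]

no_penalty = best_selection(scores, budget=0.5, gamma=0.0)
with_penalty = best_selection(scores, budget=0.5, gamma=1.0)
print(no_penalty)
print(with_penalty)
```

Without a penalty, the three positive-scoring loci are selected in isolation. With `gamma=1.0`, the two strong loci and the weak locus between them are selected as one contiguous region, and the isolated low-scoring locus is dropped.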

Why

ROCCO offers several attractive features:

  1. Consideration of enrichment and spatial characteristics of open chromatin signals
  2. Scaling to large sample sizes (100+) with an asymptotic time complexity independent of sample size
  3. No need for training data or a heuristically determined set of initial candidate peak regions
  4. No rigid thresholds on the minimum number/width of supporting samples/replicates
  5. Mathematically tractable model permitting worst-case analysis of runtime and performance

Example Behavior

Input

  • ENCODE lymphoblastoid data (BEST5, WORST5): 10 real ATAC-seq alignments of varying TSS enrichment (SNR-like quality measure for ATAC-seq)
  • Synthetic noisy data (NOISY5)

We run ROCCO under two conditions for comparison (blue): once with the noisy samples and once without.

rocco -i *.BEST5.bam *.WORST5.bam -g hg38 -o rocco_output_without_noise.bed
rocco -i *.BEST5.bam *.WORST5.bam *.NOISY5.bam -g hg38 -o rocco_output_with_noise.bed
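
When the runs are launched from a script rather than a shell, the same invocations can be assembled in Python. This is a minimal sketch that uses only the `-i`, `-g`, and `-o` flags shown above; `build_rocco_command` is a hypothetical helper, not part of ROCCO.

```python
import glob
import shlex


def build_rocco_command(bam_files, genome, output_bed):
    """Assemble the argument list for the `rocco` CLI using only -i, -g and -o."""
    if not bam_files:
        raise ValueError("at least one input alignment is required")
    return ["rocco", "-i", *bam_files, "-g", genome, "-o", output_bed]


# Expand the same glob patterns the shell would expand in the commands above.
inputs = sorted(glob.glob("*.BEST5.bam") + glob.glob("*.WORST5.bam"))
if inputs:
    cmd = build_rocco_command(inputs, "hg38", "rocco_output_without_noise.bed")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with unusual file names.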

Output

Comparing each output file:

  • ROCCO is unaffected by the NOISY5 samples and effectively identifies true signal across multiple samples
  • ROCCO simultaneously detects both wide and narrow consensus peaks
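
One way to quantify how similar the two output files are is a base-pair Jaccard index between the two peak sets. The metric is our choice for illustration, not something ROCCO reports, and the helper names and coordinates below are hypothetical. Peaks are `(chrom, start, end)` tuples with half-open intervals, as in BED3.

```python
from collections import defaultdict


def merge(intervals):
    """Merge overlapping or touching half-open intervals on one chromosome."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def by_chrom(peaks):
    """Group peaks by chromosome and merge each group."""
    grouped = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start, end))
    return {chrom: merge(ivs) for chrom, ivs in grouped.items()}


def overlap_bp(a, b):
    """Count bases shared by two sorted, merged interval lists."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        shared += max(0, min(a[i][1], b[j][1]) - max(a[i][0], b[j][0]))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def jaccard(peaks_a, peaks_b):
    """Shared bases divided by bases covered by either peak set."""
    a, b = by_chrom(peaks_a), by_chrom(peaks_b)
    inter = sum(overlap_bp(a[c], b[c]) for c in a.keys() & b.keys())
    covered = sum(e - s for g in (a, b) for ivs in g.values() for s, e in ivs)
    union = covered - inter
    return inter / union if union else 1.0


# Made-up peaks standing in for the two output files.
without_noise = [("chr1", 100, 200), ("chr1", 300, 400)]
with_noise = [("chr1", 100, 200), ("chr1", 350, 450)]
print(jaccard(without_noise, with_noise))
```

A value near 1 means the two runs selected nearly the same bases; identical peak sets give exactly 1.0.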

Paper/Citation

If using ROCCO in your research, please cite the original paper in Bioinformatics (DOI: 10.1093/bioinformatics/btad725)

 Nolan H Hamilton, Terrence S Furey, ROCCO: a robust method for detection of open chromatin via convex optimization,
 Bioinformatics, Volume 39, Issue 12, December 2023

Documentation

For additional details, usage examples, etc. please see ROCCO's documentation: https://nolan-h-hamilton.github.io/ROCCO/

Installation

PyPI (pip)

python -m pip install rocco --upgrade

If you lack administrative privileges, you may need to append --user to the command above.
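
To confirm which version ended up in the active environment, the standard library can query the installed distribution metadata. This is a generic check, not a ROCCO feature; `installed_version` is a hypothetical helper, and `rocco` is the distribution name used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print(installed_version("rocco") or "rocco is not installed in this environment")
```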

Build from Source

If preferred, ROCCO can be built from source:

  • Clone or download this repository, then build and install the package

    git clone https://github.com/nolan-h-hamilton/ROCCO.git
    cd ROCCO
    python setup.py sdist bdist_wheel
    pip install -e .
    
