
A package for discovering motifs in ChIP-seq datasets with knockout controls

PeaKO

What is peaKO?

PeaKO discovers motifs in ChIP-seq datasets with knockout controls. As input, peaKO takes paired wild-type/knockout BAM files along with several reference files. It returns a file of ranked motifs (see our paper for more details).

Quick start

Dependencies

  1. Conda (Miniconda or Anaconda)
  2. MEME Suite version 5.1.0, or MEME Suite version 4.12.0 with our modified CentriMo binary (see "Instructions for our modified CentriMo binary" below)

Please note that Conda can be installed locally without system administrator privileges. We suggest following Miniconda's installation instructions. PeaKO has only been tested on Linux systems.

Installation

  1. Download peaKO's environment file.
  2. Open a terminal and run conda env create -f peako-env.yml in the directory containing the downloaded file (for example, your Downloads directory). This will create a Conda environment called "peako".
  3. Run conda activate peako or source activate peako to activate this environment.
  4. Install peaKO from PyPI by running python3 -m pip install peako.
  5. You can test that this worked by running peako --help.

NOTES: If you see errors about missing bs4 or pyYAML packages, try running pip3 install beautifulsoup4==4.8.2 pyyaml. If step 2 above fails to create the Conda environment, you can build it manually instead:

conda create --name peako
conda activate peako # or source activate peako
conda install python=3.7
conda install -c anaconda beautifulsoup4=4.7 pandas
conda install -c bioconda -c conda-forge -c anaconda snakemake-minimal flake8 pathlib2 ipython twine
conda install -c bioconda pybedtools

Instructions for our modified CentriMo binary

If MEME Suite version 5.1.0 is installed on your system and available on your $PATH, you do not need to install our CentriMo binary separately. Important: MEME Suite 5.1.0 must be installed from source; do not install it from Conda at this time, due to documented implementation issues. Future versions of peaKO will support installing MEME Suite through Conda once these issues are resolved. If you are using an older version of MEME Suite (4.12.0), follow the steps below to replace its CentriMo binary with our modified one.

  1. Download MEME distribution 4.12.0 from the MEME Suite Download page.
  2. Follow the "Quick Install" steps on the MEME Suite Installation page up until make install.
  3. After running make install, replace $HOME/meme/bin/centrimo with our modified CentriMo binary.
  4. Make sure that $HOME/meme/bin is located on your $PATH. You should now be able to call centrimo --help.
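
As a quick check of step 4, the following minimal sketch reports which centrimo executable your shell would pick up. It uses only the Python standard library; the function name find_tool is our own illustration and is not part of peaKO or MEME Suite.

```python
import shutil


def find_tool(name="centrimo"):
    """Return the absolute path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("centrimo not found; check that $HOME/meme/bin is on your $PATH")
    else:
        # If this is not the binary you replaced, an earlier PATH entry wins.
        print(f"centrimo resolves to {path}")
```

If the reported path is not under $HOME/meme/bin, another MEME Suite installation probably appears earlier on your $PATH.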

Usage

PeaKO uses Snakemake, a workflow management system. You can run peaKO either locally or on a compute cluster that uses the Slurm job scheduler. To run on Slurm, create your own cluster.config file (template) and pass it to peaKO via --sm-cluster-config.

Each step of the workflow either inherits from the main activated Conda environment ("peako") or uses its own separate environment. If you are working on a compute cluster, run peaKO first with --sm-build-envs on a node with internet access to create these additional Conda environments. Then, you can run it on the cluster without internet, providing a Slurm configuration file (see above).

After activating peaKO's Conda environment (conda activate peako or source activate peako), you can run peako as follows:

peako <outdir> <wt-bam> <ko-bam> <organism> <chr-sizes> <trf-masked-genome> <motif-database> [options]

There are 7 required arguments. Please provide absolute paths for files and directories.

  • outdir: output directory (please make sure this already exists); all output directories and files will be created here
  • wt-bam: wild-type sample BAM file
  • ko-bam: knockout sample BAM file
  • organism: name of organism (must be either mouse or human)
  • chr-sizes: chromosome sizes file of reference genome (TXT)
  • trf-masked-genome: TRF masked reference genome file (FASTA)
  • motif-database: JASPAR motif database (MEME)
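
Because peaKO expects absolute paths, an existing output directory, and one of two organism names, it can help to check the seven required arguments before starting a long run. The sketch below is one possible pre-flight check, not part of peaKO itself; the function name check_peako_inputs and its messages are hypothetical, and it only tests the constraints listed above (it does not inspect file contents or formats).

```python
import os

# The two organism names accepted according to the argument list above.
VALID_ORGANISMS = {"mouse", "human"}


def check_peako_inputs(outdir, wt_bam, ko_bam, organism,
                       chr_sizes, trf_masked_genome, motif_database):
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []

    # outdir must be an absolute path to a directory that already exists.
    if not os.path.isabs(outdir):
        problems.append(f"outdir is not an absolute path: {outdir}")
    elif not os.path.isdir(outdir):
        problems.append(f"outdir does not exist: {outdir}")

    # Every input file must be given as an absolute path to an existing file.
    files = {
        "wt-bam": wt_bam,
        "ko-bam": ko_bam,
        "chr-sizes": chr_sizes,
        "trf-masked-genome": trf_masked_genome,
        "motif-database": motif_database,
    }
    for label, path in files.items():
        if not os.path.isabs(path):
            problems.append(f"{label} is not an absolute path: {path}")
        elif not os.path.isfile(path):
            problems.append(f"{label} does not exist: {path}")

    if organism not in VALID_ORGANISMS:
        problems.append(f"organism must be mouse or human, got: {organism}")

    return problems
```

Calling the function with the same seven values you intend to pass to peako returns a list of human-readable problems, which is empty when every check passes.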

Here are the optional arguments:

General:

  • -h or --help: access the help message and exit
  • -V or --version: show the program's version and exit

PeaKO submodule:

  • -j <JASPAR_ID>: transcription factor motif JASPAR identifier (e.g. MA0083.3)
  • -m <MOTIF>: transcription factor motif common name (e.g. SRF)
  • --extra: output all intermediate peaKO files for plotting
  • --pickle: use pickled peaKO dictionaries from previous run

Snakemake:

  • --sm-build-envs: build Conda environments for the workflow and exit (requires an internet connection)
  • --sm-cluster-config: Snakemake cluster configuration file (JSON)
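
The two-stage cluster workflow described under Usage (build the environments on a node with internet access, then run with a Slurm configuration file) can be sketched as two command lines. The helper below only assembles the commands from the flags documented above; the function name build_peako_command and every path in the example are placeholders. It also assumes that the seven positional arguments are still supplied when using --sm-build-envs, which this README does not state explicitly.

```python
import shlex


def build_peako_command(required_args, build_envs=False, cluster_config=None):
    """Assemble a peako command line as a list of arguments."""
    if len(required_args) != 7:
        raise ValueError("peako expects exactly 7 required arguments")
    command = ["peako", *required_args]
    if build_envs:
        command.append("--sm-build-envs")
    if cluster_config is not None:
        command.extend(["--sm-cluster-config", cluster_config])
    return command


if __name__ == "__main__":
    # Placeholder paths for illustration only; substitute your own files.
    args = [
        "/path/to/outdir", "/path/to/wt.bam", "/path/to/ko.bam", "mouse",
        "/path/to/chr_sizes.txt", "/path/to/trf_masked_genome.fa",
        "/path/to/jaspar.meme",
    ]
    stages = [
        build_peako_command(args, build_envs=True),  # node with internet
        build_peako_command(args, cluster_config="/path/to/cluster.config"),
    ]
    for command in stages:
        print(" ".join(shlex.quote(part) for part in command))
```

Printing the commands rather than executing them lets you review them first; each list can also be passed directly to subprocess.run once the peako environment is active.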

Output

PeaKO currently writes output directories and files for each workflow step, all under the outdir you provide. PeaKO's main output file is <outdir>/peako_out/peaKO-rankings.txt, which contains a ranked list of motifs.
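
To take a first look at the results, the following sketch builds the path to the main output file from your outdir and returns its first few lines. The column layout of peaKO-rankings.txt is not described here, so the sketch treats each line as plain text; the function names rankings_path and preview_rankings are our own.

```python
from itertools import islice
from pathlib import Path


def rankings_path(outdir):
    """Return the location of peaKO's main output file under outdir."""
    return Path(outdir) / "peako_out" / "peaKO-rankings.txt"


def preview_rankings(outdir, n=10):
    """Return the first n lines of the rankings file, without newlines."""
    with open(rankings_path(outdir)) as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]
```

For example, print("\n".join(preview_rankings("/path/to/outdir"))) shows the top of the ranked list, where /path/to/outdir is a placeholder for your own output directory.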

Additional resources

Source code is available at: https://github.com/hoffmangroup/peako.

We have deposited the current version of the code, example HTML and TXT CentriMo outputs, and a modified CentriMo binary on Zenodo.

Citation

If you found peaKO useful, please cite:

Denisko D, Viner C, Hoffman MM. Motif elucidation in ChIP-seq datasets with a knockout control. bioRxiv 10.1101/721720 [Preprint]. 2019. Available from: https://doi.org/10.1101/721720
