Pipeline for phased methylation calling on raw nanopore data
phased-methylation
This package defines a pipeline for phased methylation calling on raw Nanopore data. It has three steps:
- Mapping reads to a reference genome (minimap2)
- Phasing variants and reads (longshot)
- Calling methylated bases (megalodon)
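The three steps above can be sketched as command lines for the underlying tools. This is a minimal illustration, not the package's actual implementation: the helper names and file names are hypothetical, and the flags shown are common options of each tool that the pipeline may or may not use internally.

```python
from typing import List


def minimap2_command(reference: str, fastq: str) -> List[str]:
    """Step 1 (mapping): align Nanopore reads, writing SAM to stdout."""
    # -a requests SAM output; -x map-ont selects the Nanopore read preset.
    return ["minimap2", "-a", "-x", "map-ont", reference, fastq]


def longshot_command(reference: str, bam: str, vcf: str) -> List[str]:
    """Step 2 (phasing): call and phase variants from a sorted, indexed BAM."""
    # Assumption: the SAM from step 1 has already been sorted and indexed.
    return ["longshot", "--bam", bam, "--ref", reference, "--out", vcf]


def megalodon_command(fast5_dir: str, reference: str, output_dir: str,
                      device: str = "0") -> List[str]:
    """Step 3 (methylation calling): run megalodon on raw fast5 files."""
    return [
        "megalodon", fast5_dir,
        "--reference", reference,
        "--outputs", "mods",
        "--output-directory", output_dir,
        "--devices", device,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; paths are placeholders.
    for command in (
        minimap2_command("reference.fa", "query.fastq"),
        longshot_command("reference.fa", "mapped.sorted.bam", "phased.vcf"),
        megalodon_command("fast5s_dir/", "reference.fa", "megalodon_out/"),
    ):
        print(" ".join(command))
```

In practice the launch subcommand runs all three steps, so these commands do not need to be assembled by hand.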
TODO
- improve documentation
- integrate deepsignal-plant
Environment setup and installation
conda create -n phased-methylation -c bioconda -c conda-forge git cython pytest \
pandas pysam pybedtools minimap2 longshot megalodon==2.3.4 gputil psutil \
tabulate pyfaidx gff2bed
conda activate phased-methylation
pip install ont_pyguppy_client_lib nvsmi tempfifo phased-methylation
Test run
To execute a test run of the pipeline, use the test subcommand:
phased-methylation test <test_dir/>
Usage
usage: phased-methylation [-h]
                          {launch,map,phase,call,test,mean,promoter,gene-body,plot,plot-genes,plot-repeats,export-metilene,export-bedgraph}
                          ...

pipeline for methylation calling

positional arguments:
  {launch,map,phase,call,test,mean,promoter,gene-body,plot,plot-genes,plot-repeats,export-metilene,export-bedgraph}
    launch           launch full pipeline
    map              perform mapping step
    phase            perform phasing step
    call             perform methylation calling step
    test             execute test run
    mean             calculate average methylation across chromosomes
    promoter         quantify promoter methylation
    gene-body        quantify gene body methylation
    plot             plot methylation across chromosomes
    plot-genes       plot methylation profiles over genomic features
    plot-repeats     plot methylation profiles over genomic features
    export-metilene  export methylation data formatted for input into
                     metilene
    export-bedgraph  export methylation data in bedgraph format

optional arguments:
  -h, --help         show this help message and exit
Input files
Output files
Results will be written to the indicated output directory.
Example
Minimal example:
phased-methylation launch reference.fa fast5s_dir/ output_dir/ query.fastq
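The same launch can be scripted from Python, for example when processing several samples in a loop. The sketch below assumes only the positional argument order shown in the minimal example; the function names are hypothetical and are not part of the package.

```python
import shlex
import subprocess
from typing import List


def build_launch_command(reference: str, fast5_dir: str,
                         output_dir: str, fastq: str) -> List[str]:
    """Assemble the argument list for `phased-methylation launch`."""
    # Order follows the minimal example: reference, fast5 dir, output dir, fastq.
    return ["phased-methylation", "launch",
            reference, fast5_dir, output_dir, fastq]


def launch(reference: str, fast5_dir: str,
           output_dir: str, fastq: str) -> None:
    """Run the pipeline and raise CalledProcessError if it exits non-zero."""
    command = build_launch_command(reference, fast5_dir, output_dir, fastq)
    print("Running:", shlex.join(command))
    subprocess.run(command, check=True)
```

Calling launch("reference.fa", "fast5s_dir/", "output_dir/", "query.fastq") would run the command from the minimal example, provided the phased-methylation environment is active.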
GPU resource management
phased-methylation (specifically, the call step using megalodon) requires at least one available GPU to run successfully. When the pipeline is launched, it will check for availability of the devices indicated by the --devices argument, defaulting to device 0. If one or more indicated devices are not available, the user will be prompted to free up resources by terminating processes running on them:
Cannot launch because the following processes are occupying resources on device(s) 0:
===== ===== ======================================== ==================
GPU PID Process Name GPU Memory Usage
===== ===== ======================================== ==================
0 16397 /opt/ont/guppy/bin/guppy_basecall_server 9.80 GB
===== ===== ======================================== ==================
Terminate these processes and continue? [y/N]:
Enter y to terminate the indicated processes and continue with the pipeline, or enter any other input to terminate phased-methylation.
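The check-and-prompt behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the package's code: GpuProcess, blocking_processes, user_confirms and free_devices are hypothetical names, the process list is assumed to have been collected beforehand (for example from nvidia-smi), and SIGTERM is assumed as the termination signal.

```python
import os
import signal
from typing import Callable, Iterable, List, NamedTuple


class GpuProcess(NamedTuple):
    """One process occupying memory on a GPU (hypothetical record type)."""
    device: int
    pid: int
    name: str
    memory_gb: float


def blocking_processes(processes: Iterable[GpuProcess],
                       devices: Iterable[int] = (0,)) -> List[GpuProcess]:
    """Return the processes running on any of the requested devices."""
    wanted = set(devices)
    return [p for p in processes if p.device in wanted]


def user_confirms(answer: str) -> bool:
    """Only 'y' continues (case-insensitivity is an assumption)."""
    return answer.strip().lower() == "y"


def free_devices(processes: Iterable[GpuProcess],
                 devices: Iterable[int] = (0,),
                 ask: Callable[[str], str] = input,
                 kill: Callable[[int, int], None] = os.kill) -> bool:
    """Return True if the devices are free or were freed, False to abort."""
    wanted = sorted(set(devices))
    blockers = blocking_processes(processes, wanted)
    if not blockers:
        return True
    print("Cannot launch because the following processes are occupying "
          "resources on device(s) " + ", ".join(map(str, wanted)) + ":")
    for p in blockers:
        print(f"{p.device}  {p.pid}  {p.name}  {p.memory_gb:.2f} GB")
    if not user_confirms(ask("Terminate these processes and continue? [y/N]: ")):
        return False
    for p in blockers:
        kill(p.pid, signal.SIGTERM)  # assumption: a polite SIGTERM is enough
    return True
```

Passing ask and kill as parameters keeps the sketch testable without a GPU or a terminal; the defaults fall back to input and os.kill.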
Misc
Documentation from original shell script:
IMPORTANT: this can only be run when no sequencing runs are in progress. If any are, they must be paused, and the guppy_basecaller processes must then be killed to clear the GPU memory. In a terminal, run nvidia-smi and find the process ID (PID) of each guppy_basecaller (there may be two running), then kill them all. Run nvidia-smi again and you should see that GPU memory usage is now very low. Start this script. Once it reaches the megalodon step, you can start sequencing on the PromethION again. If you start the sequencer too soon, this script will crash when it reaches the megalodon step.