Extract Methylation calls from ONT or PB long read data
LoReMe pipeline
LoReMe (Long Read Methylation) is a Python package facilitating analysis of DNA methylation signals from Pacific Biosciences or Oxford Nanopore long-read sequencing data.
It consists of an API and CLI for three distinct applications:
- Pacific Biosciences data processing. PB reads in SAM/BAM format are aligned to a reference genome with the special-purpose aligner pbmm2, a modified version of minimap2. Methylation calls are then piled up from the aligned reads with pb-CpG-tools.
- Oxford Nanopore basecalling. ONT reads are optionally converted from FAST5 to POD5 format, then basecalled and aligned to a reference with dorado (dorado alignment also uses minimap2 under the hood), and finally piled up with modkit.
- Postprocessing and QC of methylation calls. Several functions are available to generate diagnostic statistics and plots.
See also the full documentation.
Other tools of interest: methylartist, modbamtools (modbamtools docs), and methplotlib.
Installation
In a Conda environment
The recommended way to install loreme is with a dedicated conda environment:
First create an environment including all dependencies:
conda create -n loreme -c conda-forge -c bioconda samtools pbmm2 \
urllib3 pybedtools gff2bed seaborn pyfaidx psutil gputil tabulate \
cython h5py iso8601 more-itertools tqdm
conda activate loreme
Then install with pip:
pip install loreme
You may also wish to install nvtop to monitor GPU usage:
conda install -c conda-forge nvtop
With pip
pip install loreme
Check installation
Check that the correct version was installed with loreme --version
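Beyond the version check, it can help to confirm that the command-line tools the pipeline relies on are visible on PATH. The sketch below is a hypothetical helper, not part of loreme. The tool list (loreme, samtools, pbmm2) is taken from the installation commands above and may not be exhaustive.

```shell
# Hypothetical helper (not part of loreme): print the name of every
# tool given as an argument that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Report anything missing; prints nothing when all tools are found.
missing=$(missing_tools loreme samtools pbmm2)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose: one output line per tool.
  printf 'not found on PATH: %s\n' $missing >&2
fi
```

If something is reported as missing, the usual cause is that the conda environment has not been activated in the current shell.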
Uninstall
To uninstall loreme:
loreme clean
pip uninstall loreme
Oxford Nanopore reads
Download dorado
Calling methylation from ONT long reads requires the basecaller dorado. Download it by running:
loreme download-dorado <platform>
This will download dorado and several basecalling models. The platform should be one of linux-x64, linux-arm64, osx-arm64, or win64, whichever matches your system. Running loreme download-dorado --help will show a hint as to the correct choice.
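If you are unsure which platform string applies, the output of uname can be used as a guide. The following is a minimal sketch, not part of loreme: the function name dorado_platform is hypothetical, and the mapping from uname values (in particular the MINGW/MSYS/CYGWIN patterns for Windows shells) is an assumption rather than something loreme documents.

```shell
# Hypothetical helper (not part of loreme): map an OS name and CPU
# architecture, as reported by `uname -s` and `uname -m`, to one of the
# platform strings accepted by `loreme download-dorado`.
dorado_platform() {
  case "$1:$2" in
    Linux:x86_64)                echo linux-x64 ;;
    Linux:aarch64|Linux:arm64)   echo linux-arm64 ;;
    Darwin:arm64)                echo osx-arm64 ;;
    # Assumption: Unix-like shells on 64-bit Windows report names like these.
    MINGW*:x86_64|MSYS*:x86_64|CYGWIN*:x86_64) echo win64 ;;
    *) echo "unsupported platform: $1 $2" >&2; return 1 ;;
  esac
}

# Print the suggested command for this machine without running it.
platform=$(dorado_platform "$(uname -s)" "$(uname -m)") &&
  echo "loreme download-dorado $platform"
```

The script only prints the command, so the suggestion can be compared against the hint from loreme download-dorado --help before anything is downloaded.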
Note
For members of Michael Lab at Salk running on seabiscuit, use loreme download-dorado linux-x64.
Modified basecalling
You can carry out modified basecalling (i.e. calling DNA methylation along with the bases) with default parameters by running:
loreme dorado-basecall <pod5s/> <output.sam>
The input argument pod5s/ should be a directory containing one or more POD5 files. For other parameter options, see loreme dorado-basecall --help.
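Since the input must be a directory holding POD5 files, a quick check before starting a long basecalling run can save time. This is a minimal sketch with a hypothetical helper name (count_pod5). It assumes the files sit directly in the directory and end in .pod5; whether loreme also looks in subdirectories is not stated here.

```shell
# Hypothetical pre-flight check (not part of loreme): count the files
# ending in .pod5 that sit directly inside the given directory.
count_pod5() {
  find "$1" -maxdepth 1 -type f -name '*.pod5' | wc -l | tr -d ' '
}

# Use the first script argument as the directory, defaulting to pod5s.
pod5_dir=${1:-pod5s}
if [ -d "$pod5_dir" ]; then
  echo "$pod5_dir: $(count_pod5 "$pod5_dir") POD5 file(s)"
else
  echo "$pod5_dir is not a directory" >&2
fi
```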
Note
Basecalling ONT data is disk-read intensive, so for best performance the input POD5 data should be on a fast SSD (for example, /scratch/<username> for members of Michael Lab at Salk).
To run dorado with only regular basecalling, use the --no-mod option:
loreme dorado-basecall --no-mod <pod5s/> <output.sam>
If you wish to convert the SAM file to a FASTQ file, use:
samtools view -bo output.bam output.sam
samtools fastq -T '*' output.bam > output.fq
Alignment
The SAM file produced by dorado can be aligned to a reference index (FASTA or MMI file) with loreme dorado-align:
loreme dorado-align <index> <reads> <output.bam>
Download modkit
Piling up methylation calls from BAM data requires modkit. Download it by running:
loreme download-modkit
Pileup
The pileup step generates a bedMethyl file from an aligned BAM file.
loreme modkit-pileup <reference.fasta> <input.bam> <output.bed>
Note
See loreme modkit-pileup --help for additional options. On an HPC system you may want to use additional threads with the -t flag.
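The three ONT steps (basecall, align, pileup) can be chained for one sample. The sketch below is a hypothetical wrapper, not part of loreme: the names run, ont_pipeline and the EXECUTE variable are invented for illustration, and the file names are placeholders. By default it only prints the commands, using the argument order shown in the sections above. It makes no assumption about extra options, so check each command's --help before running it for real.

```shell
# Hypothetical wrapper (not part of loreme). Prints each command and,
# only when EXECUTE=1 is set, also runs it.
run() {
  printf '%s\n' "+ $*"
  if [ "${EXECUTE:-0}" = 1 ]; then "$@"; fi
}

# Arguments: <pod5 directory> <reference FASTA> <output prefix>
ont_pipeline() {
  run loreme dorado-basecall "$1" "$3.sam" &&
  run loreme dorado-align "$2" "$3.sam" "$3.bam" &&
  run loreme modkit-pileup "$2" "$3.bam" "$3.bed"
}

# Dry run with placeholder names; the chain stops at the first failure
# when commands are actually executed.
ont_pipeline pods5_placeholder/ reference.fasta sample
```

Printing first makes it easy to review the exact commands, and intermediate files share one prefix so that each step's output is the next step's input.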
Postprocessing
See the Pacific Biosciences reads section for examples of postprocessing analysis that can be applied to bedMethyl files.
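For a quick sanity check of a bedMethyl file without any plotting, a coverage-weighted methylation level can be computed with awk. This is a minimal sketch, not one of loreme's own postprocessing functions. It assumes the column layout used by modkit (column 4 is the modified base code, with m for 5mC; column 10 is the valid coverage; column 12 is the number of modified calls), so verify the layout against your modkit version. The function name bedmethyl_summary is hypothetical.

```shell
# Hypothetical helper (not part of loreme). Assumes modkit's bedMethyl
# layout: $4 = modified base code, $10 = valid coverage,
# $12 = number of calls supporting the modification.
bedmethyl_summary() {
  awk '
    # Keep 5mC rows that have at least one valid call.
    $4 == "m" && $10 > 0 { sites++; cov += $10; mod += $12 }
    END {
      if (cov > 0)
        printf "sites=%d coverage=%d percent_modified=%.2f\n", sites, cov, 100 * mod / cov
      else
        print "no covered 5mC sites found"
    }
  ' "$1"
}

# Example call (placeholder file name):
# bedmethyl_summary output.bed
```

Weighting by coverage means deeply sequenced sites contribute more than sites with a single read, which is usually what is wanted for a genome-wide summary.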