Generate modbed track files for visualization on WashU Epigenome Browser
modbedtools
Requires Python >= 3.6
A Python command-line tool to generate modbed files for visualization on the WashU Epigenome Browser.
This tool uses the pysam package to parse the MM/ML tags in BAM files generated by third-generation sequencing platforms such as Oxford Nanopore and PacBio devices.
modbed format
chr11 5173273 5195306 -110,-266,-1459,-1780,-1840,-1842,-1848,-1865,-1928,-1936,... -396,-1543,-3222,-4195,-4319,-4692,-5352,-5366,-5523,-5838,...
chr11 5174507 5194585 223,605,607,613,630,693,701,936,1761,3369,... 307,544,1280,2017,2859,2994,3116,3249,3790,3935,...
chr11 5174543 5196481 187,271,508,570,576,593,901,1729,2826,3216,... 568,656,664,1985,2961,3083,3703,4115,4286,4882,...
Each row in this BED-based format represents one long read. The columns are:
- chromosome
- start position of this read
- end position of this read
- methylated/modified base positions, relative to start
- unmethylated/unmodified/canonical base positions, relative to start
All positions are 0-based.
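The column layout above can be read with a few lines of Python. This is only a minimal sketch, not code from modbedtools: the names `ModbedRead`, `parse_offsets`, and `parse_modbed_line` are hypothetical, columns are assumed to be whitespace-separated, and both position columns are assumed to be non-empty. Offsets are kept exactly as stored in the file, including the negative values seen in the first example row.

```python
from typing import List, NamedTuple


class ModbedRead(NamedTuple):
    """One row of a modbed file (hypothetical container, not part of modbedtools)."""
    chrom: str
    start: int
    end: int
    modified: List[int]    # offsets relative to start, as stored in the file
    unmodified: List[int]  # offsets relative to start, as stored in the file


def parse_offsets(field: str) -> List[int]:
    """Turn a comma-separated column such as '223,605,607' into a list of ints."""
    return [int(tok) for tok in field.split(",") if tok.strip()]


def parse_modbed_line(line: str) -> ModbedRead:
    """Split one modbed row on whitespace and convert the numeric fields."""
    chrom, start, end, mod, unmod = line.split()
    return ModbedRead(chrom, int(start), int(end),
                      parse_offsets(mod), parse_offsets(unmod))


read = parse_modbed_line("chr11 5174507 5194585 223,605,607 307,544,1280")
print(read.chrom, read.start, read.end)
print(read.modified)
print(read.unmodified)
```

Keeping the offsets untouched avoids guessing at how signed values should be interpreted, which this description does not spell out.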
commands
modbedtools -h
usage: modbedtools [-h] [--version] {bam2mod,addbg} ...
Python command line tool to generate modbed files for visualization on WashU Epigenome Browser.
optional arguments:
-h, --help show this help message and exit
--version, -v show program's version number and exit
subcommands:
valid subcommands
{bam2mod,addbg} additional help
bam2mod convert bam to modbed
addbg add backgroud bases given modified bases and reference sequence
(files for testing can be found in the test folder in this repository)
bam2mod
Convert BAM files with MM/ML tags to modbed format.
modbedtools bam2mod -h
usage: modbedtools bam2mod [-h] [-c CUTOFF] [-o OUTPUT] bamfile
positional arguments:
bamfile bam file with MM/ML tags
optional arguments:
-h, --help show this help message and exit
-c CUTOFF, --cutoff CUTOFF
methylation cutoff, >= cutoff as methylated. default: 0.5
-o OUTPUT, --output OUTPUT
output file name, a suffix .modbed will be added. default: output
examples:
modbedtools bam2mod hifi-test.bam -o hifi
modbedtools bam2mod remora-test.bam -o remora
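The effect of the `-c` cutoff can be illustrated with a small sketch. It assumes the per-base modification calls have already been turned into probabilities between 0 and 1. How modbedtools derives those probabilities from the ML tag is not shown here, and the function name `split_by_cutoff` is hypothetical. Only the rule stated in the help text is applied: a probability greater than or equal to the cutoff counts as methylated.

```python
from typing import Iterable, List, Tuple


def split_by_cutoff(calls: Iterable[Tuple[int, float]],
                    cutoff: float = 0.5) -> Tuple[List[int], List[int]]:
    """Split (offset, probability) pairs into modified and unmodified offsets.

    A call with probability >= cutoff is treated as modified, mirroring the
    '>= cutoff as methylated' rule in the bam2mod help text.
    """
    modified: List[int] = []
    unmodified: List[int] = []
    for offset, prob in calls:
        if prob >= cutoff:
            modified.append(offset)
        else:
            unmodified.append(offset)
    return modified, unmodified


# Illustrative values only, not taken from a real BAM file.
calls = [(223, 0.98), (307, 0.02), (605, 0.50), (544, 0.49)]
mod, unmod = split_by_cutoff(calls)
print(mod)    # [223, 605]
print(unmod)  # [307, 544]
```

Note that a call sitting exactly at the cutoff (0.50 here) lands in the modified list.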
addbg
For data that provide only the methylated bases, addbg takes a reference genome FASTA sequence and adds the unmethylated bases from the genome sequence as background. This assumes that all other occurrences of the specified base in the genome are unmethylated/unmodified.
The input file should be in BED format, with the last column holding the comma-separated relative positions of the modified bases (0-based).
example input:
chr11 5193360 5212743 {middle columns can be anything or none} 21,273,296,307,440,461,475,688,689,694,863...
The data above is adapted from Fiber-seq data from the John Stamatoyannopoulos lab.
modbedtools addbg -b A fiber_seq_HBG_DS182418_2022dec07.bed12 chr11.fa.gz -o fibe-seq-HBG_DS182418
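The background idea can be sketched as follows. This is not the tool's implementation: `add_background` is a hypothetical helper, the reference sequence for the read's span is assumed to be already loaded as a string, and only the specified base on the forward strand is considered. Whether modbedtools also considers the complementary base is not stated here.

```python
from typing import Iterable, List


def add_background(ref_seq: str, base: str, modified: Iterable[int]) -> List[int]:
    """Return offsets of `base` in `ref_seq` that are not listed as modified.

    `ref_seq` is the reference sequence covering the read (offset 0 = read start);
    `modified` holds 0-based offsets relative to the read start.
    """
    base = base.upper()
    modified_set = set(modified)
    return [i for i, nt in enumerate(ref_seq.upper())
            if nt == base and i not in modified_set]


# Toy sequence: 'A' occurs at offsets 0, 4, 5 and 7.
ref = "ACGTAACA"
print(add_background(ref, "A", [0, 5]))  # [4, 7]
```

The returned offsets would fill the unmodified column of a modbed row, with the input offsets in the modified column.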
track formatting
bgzip and tabix are used to compress and index the modbed files generated in the previous steps.
example:
bgzip hifi.modbed
tabix -p bed hifi.modbed.gz
The .gz and .gz.tbi files can then be placed on any web server for hosting, and the URL of the .gz file can be used for visualization in the WashU Epigenome Browser.
Hashes for modbedtools-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d563e18db3c9eaf31e20fcb121926839d952986b951c2559e7fc08ead092d44f
MD5 | 2b941c989437a793569b377835e5fa07
BLAKE2b-256 | a6a612f0dd2ffde9386e06ce7ab316a5ef8b2a9ec86c4e98549b167054e08aa8