Find hypomethylated regions in centromeres
centrodip
Installation
# conda install:
conda install jmmenend::centrodip
# docker run:
docker run -it jmmenend/centrodip:latest
# pip install:
pip install centrodip
Preprocessing:
centrodip requires two inputs: (1) a bedMethyl file from modkit and (2) active-alpha annotations
(1) can be created by aligning the reads from an unaligned BAM to a reference and then calling modkit pileup on the aligned BAM
Example:
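# file names below are placeholders; adjust them to your data
UBAM="HG002.unaligned.bam"
FA="HG002.fa"
FQ="HG002.fastq"
SAM="HG002.sam"
BAM="HG002.bam"
bedMethyl="HG002.bedmethyl.bed"
# convert to FQ (keeping all tags, including MM/ML), then align
samtools fastq -T '*' "$UBAM" > "$FQ"
# the aligner must carry the MM/ML tags through (minimap2 does this with -y)
minimap2 ... "$FQ" > "$SAM"
# sort, convert to BAM, and index (samtools index needs a coordinate-sorted BAM)
samtools sort -o "$BAM" "$SAM"
samtools index "$BAM"
# aggregate methylation with modkit
modkit pileup --cpg --ref "$FA" "$BAM" "$bedMethyl"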
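To make the bedMethyl input more concrete, here is a minimal Python sketch of reading such a file and keeping only confident 5mC sites. It assumes the modkit column layout (modification code in column 4, valid coverage in column 10, percent modified in column 11); `read_bedmethyl` is a hypothetical helper and the example rows are made up. It mirrors the idea behind the `--mod-code` and `--cov-conf` options but is not centrodip's own implementation.

```python
import io

def read_bedmethyl(handle, mod_code="m", min_cov=10):
    """Yield (chrom, start, end, coverage, fraction_modified) for confident sites.

    Assumed layout (modkit bedMethyl): column 4 = modification code,
    column 10 = valid coverage, column 11 = percent modified.
    """
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()  # tolerates tab- or space-separated columns
        if fields[3] != mod_code:
            continue  # different modification code, e.g. "h"
        coverage = int(fields[9])
        if coverage < min_cov:
            continue  # too few reads to trust this site
        yield fields[0], int(fields[1]), int(fields[2]), coverage, float(fields[10]) / 100.0

# Made-up rows: one confident 5mC site, one 5hmC row, one low-coverage site.
example = io.StringIO(
    "chr1\t100\t101\tm\t12\t+\t100\t101\t255,0,0\t12\t25.00\t3\t9\t0\t0\t0\t0\t0\n"
    "chr1\t100\t101\th\t12\t+\t100\t101\t255,0,0\t12\t0.00\t0\t12\t0\t0\t0\t0\t0\n"
    "chr1\t150\t151\tm\t4\t+\t150\t151\t255,0,0\t4\t50.00\t2\t2\t0\t0\t0\t0\t0\n"
)
sites = list(read_bedmethyl(example))
print(sites)
```

Only the first row survives both filters; the fraction is converted from percent to a value between 0 and 1.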
(2) can be created by subsetting the output of the cenSat annotation workflow
See that workflow's documentation for instructions on running it
Example:
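# output file names below are placeholders
CENSAT="HG002.censat.bed"
ACTIVE_ALPHA="HG002.active_alpha.bed"
regions="HG002.active_alpha.merged.bed"
# filter for only active-alpha censat annotations
grep "active_hor" "$CENSAT" > "$ACTIVE_ALPHA"
# it is recommended to perform a bedtools merge on these subset annotations
# (bedtools merge expects sorted input passed with -i)
sort -k1,1 -k2,2n "$ACTIVE_ALPHA" | bedtools merge -d 100000 -i - > "$regions"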
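For readers who want to see what the filter-and-merge step does, the following Python sketch reproduces the same idea without bedtools: keep annotation lines containing `active_hor`, sort them, and join intervals on the same chromosome whose gap is at most 100 kb. `merge_active_alpha` and the example annotation names are hypothetical and only illustrate the logic.

```python
def merge_active_alpha(bed_lines, keyword="active_hor", max_gap=100_000):
    """Filter BED lines by keyword, then merge intervals separated by <= max_gap bp."""
    intervals = []
    for line in bed_lines:
        if keyword not in line:
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        intervals.append((chrom, int(start), int(end)))
    intervals.sort()  # by chromosome name, then start, then end

    merged = []
    for chrom, start, end in intervals:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start - last[2] <= max_gap:
            last[2] = max(last[2], end)  # extend the previous interval
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]

# Made-up annotations for illustration.
censat = [
    "chr1\t1000\t5000\tactive_hor_a",
    "chr1\t50000\t90000\tactive_hor_b",
    "chr1\t400000\t450000\tactive_hor_c",
    "chr1\t6000\t7000\tmon_x",
    "chr2\t100\t200\tactive_hor_d",
]
regions = merge_active_alpha(censat)
print(regions)
```

The first two chr1 intervals are 45 kb apart and are merged; the third is 310 kb away and stays separate, and the non-matching annotation is dropped.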
Running centrodip:
centrodip $bedMethyl $regions $output
Inputs:
- bedMethyl: modkit pileup file (refer to the modkit GitHub repository).
- regions: BED file of regions you want to search for CDRs.
- output: name of the output file.
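When centrodip is called from a Python pipeline, the command line can be assembled as a list and passed to `subprocess.run`. The sketch below only builds the argument list; `centrodip_command` is a hypothetical helper, the file names are placeholders, and the flags used are the ones listed in the help text.

```python
import shlex

def centrodip_command(bedmethyl, regions, output, *, label=None, color=None,
                      threads=None, plot=False, debug=False):
    """Build a centrodip argument list: options first, then the three positionals."""
    cmd = ["centrodip"]
    if label is not None:
        cmd += ["--label", label]
    if color is not None:
        cmd += ["--color", color]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if plot:
        cmd.append("--plot")
    if debug:
        cmd.append("--debug")
    cmd += [bedmethyl, regions, output]
    return cmd

cmd = centrodip_command(
    "HG002.bedmethyl.bed",            # placeholder input names
    "HG002.active_alpha.merged.bed",
    "HG002.cdr.bed",
    threads=8,
    plot=True,
)
print(shlex.join(cmd))
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)`; building a list rather than a single string avoids shell quoting problems with unusual file names.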
Output:
Default output file is a BED file with 9 columns
- Column 4 can be adjusted with the --label flag
- Column 9 can be adjusted with the --color flag
- The --debug flag adds chromosomal summary printouts and additional outputs such as smoothed methylation and unfiltered dip calls
- The --plot flag creates a folder that contains summary PNG files for each chromosome
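A small Python sketch of reading the 9-column output may help when post-processing results. It assumes the standard BED9 layout (chrom, start, end, name, score, strand, thickStart, thickEnd, itemRgb), which is consistent with column 4 being the label and column 9 the color; the exact contents of the other columns are an assumption, and the example line is made up.

```python
from collections import namedtuple

# Field names follow the standard BED9 layout (an assumption about centrodip's output).
Dip = namedtuple(
    "Dip", "chrom start end label score strand thick_start thick_end color"
)

def parse_bed9(line):
    """Parse one tab-separated BED9 line into a Dip record."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 9:
        raise ValueError(f"expected 9 columns, got {len(f)}")
    return Dip(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), f[5],
               int(f[6]), int(f[7]), f[8])

# Made-up record using the default label and color.
dip = parse_bed9("chr1\t1000\t6000\tCDR\t800\t.\t1000\t6000\t50,50,255\n")
print(dip.label, dip.end - dip.start, dip.color)
```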
Help Documentation
usage: centrodip [-h] [--mod-code MOD_CODE] [--bedgraph] [--window-size WINDOW_SIZE] [--cov-conf COV_CONF] [--prominence PROMINENCE] [--height HEIGHT] [--broadness BROADNESS] [--enrichment] [--min-size MIN_SIZE]
[--min-score MIN_SCORE] [--cluster-distance CLUSTER_DISTANCE] [--label LABEL] [--color COLOR] [--plot] [--threads THREADS] [--debug]
bedMethyl regions output
Inspect BED / bedGraph files using BedTable
positional arguments:
bedMethyl Path to the bedMethyl file
regions Path to BED file of regions to search for dips
output Path to the output BED file
options:
-h, --help show this help message and exit
Input Options:
--mod-code MOD_CODE Modification code to filter bedMethyl file. Selects rows with this value in the fourth column. (default: "m")
--bedgraph Input file in a bedGraph format rather than bedMethyl. Requires bedGraph4 with the fourth column being fraction modified (default: False)
Smoothing Options:
--window-size WINDOW_SIZE
Window size (bp) to use in LOWESS smoothing of fraction modified. (default: 10000)
--cov-conf COV_CONF Minimum coverage required to be a confident CpG site. (default: 10)
Detection Options:
--prominence PROMINENCE
Sensitivity of dip detection for scipy.signal.find_peaks. Higher values require more pronounced dips. Must be a float between 0 and 1. (default: 0.5)
--height HEIGHT Minimum depth for dip detection, lower values require deeper dips. Must be a float between 0 and 1. (default: 0.1)
--broadness BROADNESS
Broadness of dips called, higher values make broader entries. Must be a float between 0 and 1. (default: 0.75)
--enrichment Find regions that are enriched (rather than depleted) for methylation. (default: False)
Filtering Options:
--min-size MIN_SIZE Minimum dip size in base pairs. (default: 1000)
--min-score MIN_SCORE
Minimum score that a dip must have to be kept. Must be an int between 0 and 1000. (default: 500)
--cluster-distance CLUSTER_DISTANCE
Cluster distance in base pairs. Attempts to keep the single largest cluster of annotated dips. Negative values turn it off. (default: 500000)
Output Options:
--label LABEL Label to use for regions in BED output. (default: "CDR")
--color COLOR Color of predicted dips. (default: "50,50,255")
Other Options:
--plot Create summary plot of the results. Written to <output_prefix>.summary.png (default: False)
--threads THREADS Number of worker processes. (default: 4)
--debug Dumps smoothed methylation values, their derivatives, methylation peaks, and derivative peaks. Each to separate BED/BEDGraph files. (default: False)
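The filtering options can be pictured with a short Python sketch. It is an illustration under stated assumptions, not centrodip's implementation: dips are `(chrom, start, end, score)` tuples from a single chromosome, thresholds are inclusive, a cluster is a run of dips whose gaps are at most `cluster_distance`, and "largest" means the most total base pairs. `filter_dips` and `largest_cluster` are hypothetical names.

```python
def filter_dips(dips, min_size=1000, min_score=500):
    """Keep dips that are at least min_size bp long and score at least min_score."""
    return [d for d in dips if d[2] - d[1] >= min_size and d[3] >= min_score]

def largest_cluster(dips, cluster_distance=500_000):
    """Group dips by gap <= cluster_distance and return the group with the most bp.

    A negative cluster_distance disables clustering, as in the help text.
    """
    if cluster_distance < 0 or not dips:
        return list(dips)
    ordered = sorted(dips, key=lambda d: d[1])
    clusters = [[ordered[0]]]
    for dip in ordered[1:]:
        previous_end = clusters[-1][-1][2]
        if dip[1] - previous_end <= cluster_distance:
            clusters[-1].append(dip)
        else:
            clusters.append([dip])
    return max(clusters, key=lambda c: sum(d[2] - d[1] for d in c))

# Made-up dip calls on one chromosome.
dips = [
    ("chr1", 1_000_000, 1_004_000, 900),
    ("chr1", 1_010_000, 1_010_500, 950),  # shorter than min_size
    ("chr1", 1_020_000, 1_025_000, 300),  # below min_score
    ("chr1", 1_050_000, 1_056_000, 700),
    ("chr1", 3_000_000, 3_002_000, 800),  # far from the other dips
]
kept = largest_cluster(filter_dips(dips))
print(kept)
```

With the defaults, the short and low-scoring dips are removed first, and the isolated dip at 3 Mb is then dropped because the two nearby dips form the larger cluster.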