A tool to identify genomic peaks based on kernel density estimation.

kdpeak

kdpeak is a Python package for identifying genomic peaks from genomic reads in bed format using kernel density estimation (KDE).

Installation

Install via PyPI with:

pip install kdpeak

Alternatively, install directly from GitHub:

pip install git+https://github.com/settylab/kdpeak.git

Using kdpeak

kdpeak processes a bed file of genomic reads: it applies KDE, identifies peaks from the fragment end density, and writes the results to an output bed file. Parameters such as the KDE bandwidth, the sequence blacklist, and the minimum peak size can be customized. Peak calling targets a specified fraction of reads in peaks (FRiP), which defaults to 0.3.
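
The idea described above can be illustrated with a minimal NumPy sketch. This is not kdpeak's implementation: the function names (cut_density, density_at_cuts, threshold_for_frip, call_peaks) are hypothetical, the Gaussian kernel and the nearest-bin lookup are assumptions, and density is expressed in cuts per base pair rather than cuts per 100 base pairs. It only shows how a density threshold can be chosen so that a target fraction of cuts falls inside the called regions.

```python
import numpy as np


def cut_density(cuts, start, end, bandwidth=200.0, span=10):
    """Gaussian KDE of cut positions on a grid with `span` bp resolution.

    Returns the grid and the density in cuts per base pair
    (multiply by 100 for cuts per 100 bp).
    """
    grid = np.arange(start, end, span, dtype=float)
    cuts = np.asarray(cuts, dtype=float)
    # Each cut contributes one unit-area Gaussian with sigma = bandwidth.
    z = (grid[:, None] - cuts[None, :]) / bandwidth
    density = np.exp(-0.5 * z**2).sum(axis=1) / (bandwidth * np.sqrt(2 * np.pi))
    return grid, density


def density_at_cuts(cuts, grid, density):
    """Density of the grid bin nearest to each cut."""
    span = grid[1] - grid[0]
    offsets = (np.asarray(cuts, dtype=float) - grid[0]) / span
    idx = np.clip(np.round(offsets).astype(int), 0, len(grid) - 1)
    return density[idx]


def threshold_for_frip(cuts, grid, density, frip=0.3):
    """Highest threshold such that at least `frip` of the cuts sit at
    positions with density >= threshold."""
    ranked = np.sort(density_at_cuts(cuts, grid, density))[::-1]
    k = max(1, int(np.ceil(frip * len(ranked))))
    return ranked[k - 1]


def call_peaks(grid, density, threshold, span=10, min_peak_size=100):
    """Contiguous runs of grid bins above the threshold, as (start, end)."""
    above = density >= threshold
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    peaks = []
    for s, e in zip(edges[::2], edges[1::2]):  # e is exclusive
        lo, hi = int(grid[s]), int(grid[e - 1]) + span
        if hi - lo >= min_peak_size:
            peaks.append((lo, hi))
    return peaks


if __name__ == "__main__":
    # Synthetic data: one enriched region near 5000 bp on a uniform background.
    rng = np.random.default_rng(0)
    cuts = np.concatenate([rng.normal(5000, 100, 300), rng.uniform(0, 10000, 700)])
    grid, dens = cut_density(cuts, 0, 10000)
    thr = threshold_for_frip(cuts, grid, dens, frip=0.3)
    print(call_peaks(grid, dens, thr))
```

The pairwise kernel evaluation builds a grid-by-cuts matrix, which is fine for a toy example but would not scale to a full genome.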

Basic usage:

kdpeak reads.bed --frip 0.3 --out peaks.bed

Parameters

usage: kdpeak [-h] [--out OUTPUT_FILE] [-l LEVEL] [--logfile LOGFILE] [--blacklisted-seqs chrN [chrN ...]] [--kde-bw FLOAT] [--min-peak-size INT] [--fraction-in-peaks FLOAT] [--span INT] READS.BED

Positional Argument:

  • reads.bed - Path to the bed file containing the genomic reads.

Options:

  • -h, --help - Show this help message and exit.
  • --out output_file.bed - Path to the output file where the results will be saved. Peaks are saved in bed format with the columns: start, end, peak name, AUC (area under the cut density curve where cut-density is in cuts per 100 base pairs). Defaults to peaks.bed.
  • --summits-out summits_file.bed - Path to the output file where the peak summits will be saved. The file will have columns for start, end (start+1), peak name, and summit height (in cuts per 100 base pairs). If not specified, the summits are not saved.
  • -l LEVEL, --log LEVEL - Set the logging level. Options include: DEBUG, INFO, WARNING, ERROR, CRITICAL. Default is INFO.
  • --logfile LOGFILE - Path to the file to write a detailed log.
  • --blacklisted-seqs chrN [chrN ...] - List of sequences (e.g., chromosomes) to exclude from peak calling. Input as space-separated values.
  • --kde-bw FLOAT - Bandwidth (standard deviation, sigma in base pairs) for the KDE. Increase for larger features to reduce noise. Default is 200.
  • --min-peak-size INT - Minimal size (in base pairs) for a peak to be considered valid. Default is 100.
  • --fraction-in-peaks FLOAT, --frip FLOAT - Expected fraction of total reads to be located in peaks. Default is 0.3.
  • --span INT - Resolution of the analysis in base pairs, determining the granularity of the KDE and peak calling. Default is 10.

Example combining most of the available options:

kdpeak reads.bed --out peaks.bed --log DEBUG --logfile debug.log --blacklisted-seqs chr1 chr2 --kde-bw 500 --min-peak-size 50 --frip 0.5 --span 5
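
When kdpeak is called from a Python pipeline, the same command line can be assembled programmatically. The sketch below is a hypothetical helper (build_kdpeak_command is not part of kdpeak); it uses only the flags listed above and assumes the kdpeak executable is on the PATH when the command is actually run.

```python
import shlex
import subprocess


def build_kdpeak_command(reads, out="peaks.bed", summits_out=None, frip=0.3,
                         kde_bw=200, min_peak_size=100, span=10,
                         blacklisted_seqs=()):
    """Assemble a kdpeak command line as an argument list.

    Hypothetical helper: the defaults mirror the documented defaults,
    and only documented flags are emitted.
    """
    cmd = [
        "kdpeak", str(reads),
        "--out", str(out),
        "--frip", str(frip),
        "--kde-bw", str(kde_bw),
        "--min-peak-size", str(min_peak_size),
        "--span", str(span),
    ]
    if summits_out is not None:
        cmd += ["--summits-out", str(summits_out)]
    if blacklisted_seqs:
        # Space-separated sequence names, placed last on the command line.
        cmd += ["--blacklisted-seqs", *blacklisted_seqs]
    return cmd


if __name__ == "__main__":
    cmd = build_kdpeak_command("reads.bed", summits_out="summits.bed",
                               blacklisted_seqs=["chr1", "chr2"])
    print(shlex.join(cmd))
    # Requires kdpeak to be installed; raises CalledProcessError on failure.
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with file paths that contain spaces.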

Disclaimer

kdpeak is in its alpha stage, so please use it with care. We welcome reports of any issues you encounter; they help us improve kdpeak.
