Fast Window Protection Score calculator for cell-free DNA analysis
optwps
A high-performance Python package for computing the Window Protection Score (WPS) from BAM files, designed for cell-free DNA (cfDNA) analysis. It was built as a direct alternative to a script provided by the Kircher Lab, and has been tested to reproduce that script's numbers exactly.
Overview
optwps is a fast and efficient tool for calculating Window Protection Scores from aligned sequencing reads. WPS is a metric used in cell-free DNA analysis to identify nucleosome positioning and protected regions by analyzing fragment coverage patterns.
Installation
pip install optwps
Dependencies
- Python >= 3.7
- samtools
Usage
Command Line Interface
Basic usage:
optwps -i input.bam -o output.tsv
With custom parameters:
optwps \
-i input.bam \
-o output.tsv \
-w 120 \
--min-insert-size 120 \
--max-insert-size 180 \
--downsample 0.5
Command Line Arguments
- -i, --input: Input BAM file (required)
- -o, --output: Output file path for WPS results. If not provided, results will be printed to stdout. Supports placeholders {chrom} and {target} for creating separate files per chromosome or region (optional)
- -r, --regions: BED file with regions of interest (default: whole genome, optional)
- -w, --protection: Base pair protection window (default: 120)
- --min-insert-size: Minimum read length threshold to consider (optional)
- --max-insert-size: Maximum read length threshold to consider (optional)
- --downsample: Ratio to downsample reads (optional)
- --chunk-size: Chunk size for processing in pieces (default: 1e8)
- --valid-chroms: Comma-separated list of valid chromosomes to include (e.g., '1,2,3,X,Y') or 'canonical' for chromosomes 1-22, X, Y (optional)
- --compute-coverage: If provided, output will include base coverage
- --verbose-output: If provided, output will include separate counts for 'outside' and 'inside' along with WPS
- --add-header: If provided, output file(s) will have headers
Python API
from optwps import WPS

# Initialize WPS calculator
wps_calculator = WPS(
    protection_size=120,
    min_insert_size=120,
    max_insert_size=180,
    valid_chroms=set(map(str, list(range(1, 23)) + ['X', 'Y']))
)

# Run WPS calculation
wps_calculator.run(
    bamfile='input.bam',
    out_filepath='output.tsv',
    downsample_ratio=0.5
)
Output Format
The output is a tab-separated file, written without a header unless --add-header is specified, with the following columns:
- Chromosome name (without 'chr' prefix)
- Start position (0-based)
- End position (start + 1)
- Base read coverage (if `--compute-coverage`)
- 'Outside' count: fragments spanning the protection window (if `--verbose-output`)
- 'Inside' count: fragment endpoints in the protection window (if `--verbose-output`)
- Window Protection Score (outside - inside)
Example output:
1\t1000\t1001\t12
1\t1001\t1002\t14
1\t1002\t1003\t10
With --compute-coverage:
1\t1000\t1001\t20\t12
1\t1001\t1002\t20\t14
1\t1002\t1003\t19\t10
With --verbose-output:
1\t1000\t1001\t15\t3\t12
1\t1001\t1002\t16\t2\t14
1\t1002\t1003\t14\t4\t10
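The column layout above can be read back with a few lines of Python. The following is a minimal sketch, not part of optwps: the function names and the column labels (`chrom`, `start`, `end`, `coverage`, `outside`, `inside`, `wps`) are hypothetical, and it assumes the file contains real tab characters, has no header line, holds integer values in every numeric column, and, when both flags are given, places the coverage column before the verbose counts as in the column list.

```python
import csv
import io
from typing import Dict, Iterator, List, TextIO


def wps_columns(compute_coverage: bool = False, verbose_output: bool = False) -> List[str]:
    """Build the expected column names (hypothetical labels) for a given flag combination."""
    cols = ["chrom", "start", "end"]
    if compute_coverage:
        cols.append("coverage")
    if verbose_output:
        cols += ["outside", "inside"]
    cols.append("wps")
    return cols


def read_wps(handle: TextIO, compute_coverage: bool = False,
             verbose_output: bool = False) -> Iterator[Dict[str, object]]:
    """Yield one dict per output row; assumes no header line and integer values."""
    cols = wps_columns(compute_coverage, verbose_output)
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if len(row) != len(cols):
            raise ValueError(f"expected {len(cols)} columns, got {len(row)}")
        record: Dict[str, object] = dict(zip(cols, row))
        for name in cols[1:]:  # everything except the chromosome name is numeric
            record[name] = int(row[cols.index(name)])
        yield record


# Usage with the --verbose-output example rows shown above
sample = "1\t1000\t1001\t15\t3\t12\n1\t1001\t1002\t16\t2\t14\n"
rows = list(read_wps(io.StringIO(sample), verbose_output=True))
```

For a file on disk, pass `open("output.tsv", newline="")` instead of the `io.StringIO` object.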
Algorithm
The Windowed Protection Score algorithm has the following steps:
- Fragment Collection: For each genomic position, collect all DNA fragments (paired-end reads or single reads) in the region
- Protection Window: Define a protection window of size protection_size (default 120bp, or ±60bp from the center)
- Score Calculation:
  - Outside Score: Count fragments that completely span the protection window
  - Inside Score: Count fragment endpoints that fall within the protection window (exclusive boundaries)
  - WPS: Subtract inside score from outside score: WPS = outside - inside
- Interpretation: Positive WPS values indicate protected regions (likely nucleosome-bound), while negative values suggest accessible regions
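The steps above can be sketched for a single position in plain Python. This is an illustrative, unoptimized sketch and not the optwps implementation: the function name `wps_at` is hypothetical, fragments are assumed to be `(start, end)` pairs with both ends inclusive, and a fragment that does not span the window contributes one count for each of its endpoints that lies strictly inside the window. Whether optwps counts such a fragment once or once per endpoint is not stated here, so treat that detail as an assumption.

```python
from typing import Iterable, Tuple


def wps_at(pos: int, fragments: Iterable[Tuple[int, int]],
           protection_size: int = 120) -> Tuple[int, int, int]:
    """Return (outside, inside, wps) for one genomic position.

    Assumptions: fragments are (start, end) with inclusive ends, and the
    window extends protection_size // 2 bases on each side of `pos`.
    """
    half = protection_size // 2
    win_start, win_end = pos - half, pos + half
    outside = 0
    inside = 0
    for frag_start, frag_end in fragments:
        if frag_start <= win_start and frag_end >= win_end:
            # The fragment completely spans the protection window.
            outside += 1
        else:
            # Count endpoints strictly inside the window (exclusive boundaries).
            inside += sum(win_start < end < win_end for end in (frag_start, frag_end))
    return outside, inside, outside - inside


# Window for pos=1000 and protection_size=120 is 940..1060
fragments = [(900, 1100), (950, 1200), (1010, 1150), (800, 990)]
outside, inside, wps = wps_at(1000, fragments)
```

Here only the first fragment spans the window, while each of the other three has one endpoint inside it, so the position scores negative and would be read as accessible.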
Examples
Example 1: Basic WPS Calculation
optwps -i sample.bam -o sample_wps.tsv
Example 2: Regions BED File with Insert Size Limits, Printing to the Terminal
optwps \
-i sample.bam \
-r regions.tsv \
--min-insert-size 120 \
--max-insert-size 180
Example 3: Downsampling a High-Coverage BAM
optwps \
-i high_coverage.bam \
-o wps.tsv \
--downsample 0.3
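To give a feel for what a ratio of 0.3 means, the sketch below keeps each fragment independently with probability equal to the ratio. This is an assumption about the general technique, not a description of how optwps samples reads internally; the function name `downsample` and the `seed` parameter are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple

Fragment = Tuple[int, int]


def downsample(fragments: Sequence[Fragment], ratio: float,
               seed: Optional[int] = None) -> List[Fragment]:
    """Keep each fragment independently with probability `ratio` (assumed scheme)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return [frag for frag in fragments if rng.random() < ratio]


# 1000 synthetic fragments of length 167, kept at a ratio of 0.3
all_fragments = [(start, start + 166) for start in range(0, 10_000, 10)]
kept = downsample(all_fragments, ratio=0.3, seed=0)
```

With this scheme the number of kept fragments varies around `ratio * len(fragments)` rather than matching it exactly.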
Example 4: Creating Separate Output Files per Chromosome
optwps \
-i sample.bam \
-o "wps_{chrom}.tsv"
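To preview which files a template such as "wps_{chrom}.tsv" would produce, the placeholder can be expanded by hand. This sketch assumes the placeholder is filled with the chromosome name as it appears in the output (without a 'chr' prefix) and uses Python's `str.format` for the substitution; the helper name `expected_outputs` is hypothetical, and optwps may expand the template differently.

```python
from typing import Iterable, List


def expected_outputs(template: str, chroms: Iterable[str]) -> List[str]:
    """Expand the {chrom} placeholder once per chromosome (assumed behavior)."""
    return [template.format(chrom=chrom) for chrom in chroms]


# Canonical chromosomes 1-22, X, Y, as in the --valid-chroms description
canonical = [str(i) for i in range(1, 23)] + ["X", "Y"]
paths = expected_outputs("wps_{chrom}.tsv", canonical)
```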
Example 5: Include coverage
optwps \
-i sample.bam \
--compute-coverage \
-o "wps.tsv"