Fast Window Protection Score calculator for cell-free DNA analysis

optwps

A high-performance Python package for computing the Window Protection Score (WPS) from BAM files, designed for cell-free DNA (cfDNA) analysis. It was built as a direct alternative to a script provided by the Kircher Lab and has been tested to reproduce that script's numbers exactly.

Overview

optwps is a fast and efficient tool for calculating Window Protection Scores from aligned sequencing reads. WPS is a metric used in cell-free DNA analysis to identify nucleosome positioning and protected regions by analyzing fragment coverage patterns.

Installation

From PyPI

pip install optwps

Dependencies

  • Python >= 3.7
  • pysam
  • pandas
  • pigz
  • tqdm
  • bx-python

Usage

Command Line Interface

Basic usage:

optwps -i input.bam -o output.tsv

With custom parameters:

optwps \
    -i input.bam \
    -o output.tsv \
    -w 120 \
    --min-insert-size 120 \
    --max-insert-size 180 \
    --downsample 0.5

Command Line Arguments

  • -i, --input: Input BAM file (required)
  • -o, --output: Output file path for WPS results. If not provided, results will be printed to stdout. Supports placeholders {chrom} and {target} for creating separate files per chromosome or region (optional)
  • -r, --regions: BED file with regions of interest (default: whole genome, optional)
  • -w, --protection: Base pair protection window (default: 120)
  • --min-insert-size: Minimum read length threshold to consider (optional)
  • --max-insert-size: Maximum read length threshold to consider (optional)
  • --downsample: Ratio to downsample reads (optional)
  • --chunk-size: Chunk size for processing in pieces (default: 1e8)
  • --valid-chroms: Comma-separated list of valid chromosomes to include (e.g., '1,2,3,X,Y') or 'canonical' for chromosomes 1-22, X, Y (optional)
  • --compute-coverage: If provided, output will include base coverage
  • --verbose-output: If provided, output will include separate counts for 'outside' and 'inside' along with WPS
  • --add-header: If provided, output file(s) will have headers
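The --valid-chroms string and the valid_chroms set accepted by the Python API describe the same thing. The sketch below shows one way to turn the command-line form into a set; parse_valid_chroms is a hypothetical helper written for illustration, not a function exported by optwps, and the package's own parsing may differ.

```python
def parse_valid_chroms(value):
    """Turn a --valid-chroms style string into a set of chromosome names.

    Hypothetical helper for illustration; not part of the optwps API.
    """
    if value is None:
        return None  # no filtering requested
    if value == "canonical":
        # Chromosomes 1-22 plus X and Y, without a 'chr' prefix.
        return {str(n) for n in range(1, 23)} | {"X", "Y"}
    # Comma-separated list such as '1,2,3,X,Y'; ignore empty entries.
    return {name.strip() for name in value.split(",") if name.strip()}


canonical = parse_valid_chroms("canonical")
custom = parse_valid_chroms("1,2,3,X,Y")
```

The resulting set has the same shape as the valid_chroms argument shown in the Python API example.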

Python API

from optwps import WPS

# Initialize WPS calculator
wps_calculator = WPS(
    protection_size=120,
    min_insert_size=120,
    max_insert_size=180,
    valid_chroms=set(map(str, list(range(1, 23)) + ['X', 'Y']))
)

# Run WPS calculation
wps_calculator.run(
    bamfile='input.bam',
    out_filepath='output.tsv',
    downsample_ratio=0.5
)

Output Format

The output is a tab-separated file, written without a header unless --add-header is given, with the following columns:

  • Chromosome name (without 'chr' prefix)
  • Start position (0-based)
  • End position (start + 1)
  • Base read coverage (if --compute-coverage)
  • Count of fragments spanning the protection window ('outside', if --verbose-output)
  • Count of fragment endpoints in the protection window ('inside', if --verbose-output)
  • Window Protection Score (outside - inside)

Example output:

1\t1000\t1001\t12
1\t1001\t1002\t14
1\t1002\t1003\t10

With --compute-coverage:

1\t1000\t1001\t20\t12
1\t1001\t1002\t20\t14
1\t1002\t1003\t19\t10

With --verbose-output:

1\t1000\t1001\t15\t3\t12
1\t1001\t1002\t16\t2\t14
1\t1002\t1003\t14\t4\t10
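Because the column count depends on the flags used, a reader of the output has to know which flags were passed. The following is a minimal sketch of such a reader using only the standard library. It assumes a header-less file, integer values in every column after the chromosome, and the column order listed above; the function name read_wps and the column labels are illustrative choices, not names defined by optwps.

```python
import csv
import io


def read_wps(handle, compute_coverage=False, verbose_output=False):
    """Parse header-less optwps output into a list of dicts.

    Column labels are illustrative; the order follows the README's list.
    """
    columns = ["chrom", "start", "end"]
    if compute_coverage:
        columns.append("coverage")
    if verbose_output:
        columns += ["outside", "inside"]
    columns.append("wps")

    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        row = dict(zip(columns, fields))
        for name in columns[1:]:
            row[name] = int(row[name])  # assumption: all numeric columns are integers
        rows.append(row)
    return rows


# The --verbose-output example from above, with real tab characters.
sample = "1\t1000\t1001\t15\t3\t12\n1\t1001\t1002\t16\t2\t14\n1\t1002\t1003\t14\t4\t10\n"
rows = read_wps(io.StringIO(sample), verbose_output=True)
```

For a file on disk, pass an open file object instead of the io.StringIO wrapper.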

Algorithm

The Windowed Protection Score algorithm has the following steps:

  1. Fragment Collection: For each genomic position, collect all DNA fragments (paired-end reads or single reads) in the region

  2. Protection Window: Define a protection window of size protection_size (default 120bp, or ±60bp from the center)

  3. Score Calculation:

    • Outside Score: Count fragments that completely span the protection window
    • Inside Score: Count fragment endpoints that fall within the protection window (exclusive boundaries)
    • WPS: Subtract inside score from outside score: WPS = outside - inside
  4. Interpretation: Positive WPS values indicate protected regions (likely nucleosome-bound), while negative values suggest accessible regions
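The steps above can be sketched for a single position as follows. This is a naive illustration of the scoring rule as described here, not the optimized implementation in optwps. It assumes fragments are given as (start, end) pairs with inclusive coordinates and that the window extends protection_size // 2 bases on each side of the position; wps_at is a hypothetical name.

```python
def wps_at(position, fragments, protection_size=120):
    """Return (outside, inside, wps) for one genomic position.

    Naive sketch: fragments are (start, end) tuples, both ends inclusive.
    """
    half = protection_size // 2
    win_start = position - half
    win_end = position + half

    outside = 0
    inside = 0
    for frag_start, frag_end in fragments:
        # Outside: the fragment completely spans the protection window.
        if frag_start <= win_start and frag_end >= win_end:
            outside += 1
        # Inside: count each endpoint strictly within the window
        # (exclusive boundaries, as described in the steps above).
        if win_start < frag_start < win_end:
            inside += 1
        if win_start < frag_end < win_end:
            inside += 1
    return outside, inside, outside - inside


# Window for position 1000 with the default size is 940..1060.
fragments = [(900, 1100), (940, 1060), (850, 1150), (1000, 1170)]
outside, inside, wps = wps_at(1000, fragments)
```

Here three fragments span the window and one fragment starts inside it, so the position is scored as protected. Looping this over every position is quadratic in practice, which is the cost a dedicated tool avoids.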

Examples

Example 1: Basic WPS Calculation

optwps -i sample.bam -o sample_wps.tsv

Example 2: Restricting to Regions and Insert Sizes, Printing to the Terminal

optwps \
    -i sample.bam \
    -r regions.tsv \
    --min-insert-size 120 \
    --max-insert-size 180

Example 3: Downsampling a High-Coverage BAM

optwps \
    -i high_coverage.bam \
    -o wps.tsv \
    --downsample 0.3
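A downsampling ratio is commonly interpreted as the probability of keeping each fragment independently. The sketch below illustrates that interpretation only; whether optwps samples exactly this way, and whether it exposes a seed, is an assumption. The function and its seed parameter are hypothetical.

```python
import random


def downsample(fragments, ratio, seed=None):
    """Keep each fragment independently with probability `ratio`.

    Illustrative only; not the optwps implementation.
    """
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return [frag for frag in fragments if rng.random() < ratio]


# 1000 synthetic fragments of length 168 (inclusive coordinates).
fragments = [(start, start + 167) for start in range(1000)]
kept = downsample(fragments, 0.3, seed=1)
```

With a ratio of 0.3, roughly 30% of fragments survive, so the outside and inside counts, and therefore the WPS magnitude, shrink accordingly.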

Example 4: Creating Separate Output Files per Chromosome

optwps \
    -i sample.bam \
    -o "wps_{chrom}.tsv"

Example 5: Include coverage

optwps \
    -i sample.bam \
    --compute-coverage \
    -o "wps.tsv"
