Evaluate cfDNA fragmentomics features for ctDNA detection

kreview

Advanced cfDNA Fragmentomics Core Evaluation Engine


🧬 Overview

kreview is a production-grade, notebook-first (nbdev) evaluation engine for high-throughput analysis of fragmentomics features in cancer liquid biopsy. Developed at Memorial Sloan Kettering Cancer Center (MSKCC), it processes cohorts of tens of thousands of samples using an embedded DuckDB query engine with chunked I/O and automatic retry logic.

📖 Full Documentation

🚀 Features

  • 5-Tier ctDNA Taxonomy: Uses MSK-IMPACT paired inference to label each sample as True ctDNA+, Possible ctDNA+, Possible ctDNA−, Healthy Normal, or Insufficient Data.
  • DuckDB Dynamic Data Lake: In-memory read_parquet bindings with chunked I/O and exponential backoff retry. Builds a merged SQL-queryable kreview_lake.duckdb on demand.
  • Multi-Model Evaluation: Random Forest, XGBoost, and Logistic Regression with Stratified K-Fold CV, SHAP explainability, and subgroup analysis.
  • Interactive Dashboards: Plotly-native HTML reports with ROC curves, violin plots, SHAP beeswarm/waterfall, and per-cancer-type sensitivity tables.
  • 26 Built-In Evaluators: Modular extractors covering fragment sizes (FSC, FSD, FSR), nucleosome protection (WPS, TFBS), cleavage motifs (EndMotif, BreakPointMotif), chromatin accessibility (ATAC), motif divergence (MDS), and orientation (OCF).
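The chunked I/O and exponential backoff retry mentioned in the feature list can be sketched in plain Python. This is a minimal illustration of the general technique, not kreview's actual implementation; the names `retry_with_backoff` and `chunked`, the default delays, and the choice to retry only on `OSError` are all assumptions made for the example.

```python
import time
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on OSError with delays of base_delay * 2**attempt.

    Hypothetical helper: kreview's real retry policy may differ.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except OSError:
            # Re-raise once the final attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

A loader built this way would wrap each chunk's read in `retry_with_backoff`, so a transient filesystem error costs one short wait instead of failing the whole cohort.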

⚙️ Quick Start

Installation

[!IMPORTANT] Quarto is required for programmatic dashboard generation. Because quarto-cli wrapper packages are unreliable across Python environments, kreview assumes the Quarto executable is installed at the system level, either on your OS or in your container.

Option 1: Docker (Recommended "Batteries-Included" Method)

The easiest way to run kreview without managing external dependencies is to use our pre-built Docker container (hosted on GHCR). It ships with Python 3.12, all ML libraries, and the Quarto Linux binaries preconfigured:

docker pull ghcr.io/msk-access/kreview:latest
docker run -v /your/data:/data ghcr.io/msk-access/kreview:latest \
  kreview run --cancer-samplesheet /data/cancer.csv ...

Option 2: Local Install (Pip)

If you install via pip, you must install Quarto separately using your OS package manager:

  1. Install Quarto: Follow the official Quarto Installation Guide (e.g. brew install quarto on macOS).
  2. Install kreview:
git clone https://github.com/msk-access/kreview.git
cd kreview
pip install -e .
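
After a local install, it can help to confirm that the Quarto executable is actually visible on your PATH before starting a long run. The sketch below uses only the Python standard library; `find_quarto` and `quarto_version` are hypothetical helper names and are not part of kreview.

```python
import shutil
import subprocess
from typing import Optional


def find_quarto() -> Optional[str]:
    """Return the full path of the quarto executable, or None if it is not on PATH."""
    return shutil.which("quarto")


def quarto_version(executable: str) -> str:
    """Return the version string printed by `quarto --version`."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    path = find_quarto()
    if path is None:
        print("Quarto not found on PATH; install it before generating dashboards.")
    else:
        print(f"Found Quarto {quarto_version(path)} at {path}")
```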

Running the Pipeline

PYTHONUNBUFFERED=1 kreview run \
  --cancer-samplesheet "/path/to/cancer/samplesheet.csv" \
  --healthy-xs1-samplesheet "/path/to/healthy/xs1/samplesheet.csv" \
  --healthy-xs2-samplesheet "/path/to/healthy/xs2/samplesheet.csv" \
  --cbioportal-dir "/path/to/cBioPortal_MAF_CNA_SV/" \
  --krewlyzer-dir "/path/to/unified_krewlyzer_results" \
  --output output/ \
  --workers 4 \
  --export-duckdb
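
If you launch runs from a script or scheduler, the same invocation can be assembled in Python and the input paths checked up front. This is a sketch under stated assumptions: it uses only the flags shown above, while `build_kreview_command` and `missing_inputs` are hypothetical helpers that kreview does not provide.

```python
from pathlib import Path
from typing import List


def build_kreview_command(
    cancer: str,
    healthy_xs1: str,
    healthy_xs2: str,
    cbioportal_dir: str,
    krewlyzer_dir: str,
    output: str = "output/",
    workers: int = 4,
    export_duckdb: bool = True,
) -> List[str]:
    """Assemble the `kreview run` argument list from the flags in this README."""
    cmd = [
        "kreview", "run",
        "--cancer-samplesheet", cancer,
        "--healthy-xs1-samplesheet", healthy_xs1,
        "--healthy-xs2-samplesheet", healthy_xs2,
        "--cbioportal-dir", cbioportal_dir,
        "--krewlyzer-dir", krewlyzer_dir,
        "--output", output,
        "--workers", str(workers),
    ]
    if export_duckdb:
        cmd.append("--export-duckdb")
    return cmd


def missing_inputs(*paths: str) -> List[str]:
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not Path(p).exists()]
```

Pass the resulting list to `subprocess.run(cmd, check=True)` with `PYTHONUNBUFFERED=1` set in the environment to reproduce the shell command above.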

Dashboard Access

Once the run finishes, open the generated HTML reports (open is the macOS command; on Linux, use xdg-open instead):

open output/reports/ATAC_dashboard.html
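
For a platform-independent alternative, the reports can be listed and opened from Python with the standard library. The `*_dashboard.html` naming pattern is an assumption inferred from the single example above, and `list_dashboards` and `open_dashboard` are hypothetical helpers.

```python
import webbrowser
from pathlib import Path
from typing import List


def list_dashboards(output_dir: str = "output") -> List[Path]:
    """Return dashboard HTML files under <output_dir>/reports, sorted by name.

    Assumes reports follow the <EVALUATOR>_dashboard.html pattern.
    """
    reports = Path(output_dir) / "reports"
    return sorted(reports.glob("*_dashboard.html"))


def open_dashboard(path: Path) -> bool:
    """Open one report in the default browser; returns True if a browser was launched."""
    return webbrowser.open(path.resolve().as_uri())


if __name__ == "__main__":
    for report in list_dashboards():
        print(report)
```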

🏗️ nbdev Architecture

This project is an nbdev repository: the .py modules in kreview/ are generated from notebooks, so do not edit them by hand. Make your changes in the Jupyter notebooks under nbs/, then export:

nbdev-export

Download files

Source Distribution

kreview-0.0.1.tar.gz (63.9 kB view details)

Uploaded Source

Built Distribution

kreview-0.0.1-py3-none-any.whl (78.5 kB view details)

Uploaded Python 3

File details

Details for the file kreview-0.0.1.tar.gz.

File metadata

  • Download URL: kreview-0.0.1.tar.gz
  • Upload date:
  • Size: 63.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kreview-0.0.1.tar.gz
Algorithm Hash digest
SHA256 8c308ad0fcce0011ca56a82008ac9c3386e0a963104b0542be8a310ab0b4808d
MD5 a62969d6671ccbf6b8a44c8949826d0d
BLAKE2b-256 04b4b8303fb6f0036bf7e061004b872c63f24f65c1c78b4176dc9a3b9d6c97fb
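
To check a downloaded archive against the SHA256 digest listed above, a short standard-library script is enough. This is a minimal sketch: `sha256_of` and `verify` are hypothetical helper names, and the file is assumed to be in the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for kreview-0.0.1.tar.gz.
EXPECTED_SHA256 = "8c308ad0fcce0011ca56a82008ac9c3386e0a963104b0542be8a310ab0b4808d"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large archives are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    archive = "kreview-0.0.1.tar.gz"
    if Path(archive).exists():
        print("digest OK" if verify(archive) else "digest MISMATCH")
```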

Provenance

The following attestation bundles were made for kreview-0.0.1.tar.gz:

Publisher: release.yml on msk-access/kreview

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kreview-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: kreview-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 78.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kreview-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 84c72e63d6f9e6d3f6088a5275af1abe24236b7869a9d0a5a69bd7329d3a7cbf
MD5 2383d72c32f46beca95b3e7d2de37cf0
BLAKE2b-256 d0d7ddd7d7f4cb00c50046d2ca1c7eb82abaac1c4ca3e209fccffeeb252d5b64

Provenance

The following attestation bundles were made for kreview-0.0.1-py3-none-any.whl:

Publisher: release.yml on msk-access/kreview

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
