LIONHEART is a method for detecting cancer from whole genome sequenced plasma cell-free DNA. Check the README for additional installation steps.

These details have not been verified by PyPI

Project links

Project description

LIONHEART Cancer Detector

LIONHEART is a method for detecting cancer from whole genome sequenced plasma cell-free DNA.

This software lets you run feature extraction and predict the cancer status of your samples. Further, you can train a model on your own data.

All you need is a BAM file with whole genome sequenced cell-free DNA.

Developed for hg38. See the remap directory for the applied remapping pipeline.

Article (open access): https://www.nature.com/articles/s41467-025-66503-3

The code was developed and implemented by @ludvigolsen.

If you experience an issue, please report it.

NOTE: Upgrading to version 2.0.0 requires full reinstallation of the environment and package. Remember to download the new resources from Zenodo.

Installation

This section describes the installation of lionheart and the custom version of mosdepth (exp. time: <10m). The code has only been tested on linux but should also work on Mac and Windows.

Install the main package:

# Create and activate conda environment
$ conda config --set channel_priority flexible
$ conda env create -f https://raw.githubusercontent.com/BesenbacherLab/lionheart/refs/heads/main/environment.yml
$ conda activate lionheart
# Try installing mawk (we use awk/gawk as backup if it's unavailable)
$ conda install -c bioconda mawk || echo "mawk not available on this platform — skipping"

# Install package from PyPI
$ pip install lionheart

# OR install from GitHub
$ pip install git+https://github.com/BesenbacherLab/lionheart.git

Custom mosdepth

We use a modified version of mosdepth available at https://github.com/LudvigOlsen/mosdepth/

To install this, it requires an installation of nim so we can use nimble install. Note that we use nim 1.6.14.

# Download nim installer and run
$ curl https://nim-lang.org/choosenim/init.sh -sSf | sh

# Add to PATH
# Change the path to fit with your system
# Tip: Consider adding it to the terminal configuration file (e.g., ~/.bashrc)
$ export PATH=/home/<username>/.nimble/bin:$PATH

# Install and use nim 1.6.4 
# NOTE: This step should be done even when nim is already installed
$ choosenim 1.6.14

Now that nim is installed, we can install the custom mosdepth. To not override an existing mosdepth installation, we install it into a separate directory:

# Make a directory for installing the nim packages into
$ mkdir mosdepth_installation

# Install modified mosdepth
$ NIMBLE_DIR=mosdepth_installation nimble install -y https://github.com/LudvigOlsen/mosdepth --mm:refc

# Get path to mosdepth binary to use in the software
$ find mosdepth_installation/pkgs/ -name "mosdepth*"
>> mosdepth_installation/pkgs/mosdepth-0.x.x/mosdepth

Get Resources

Download and unzip the required resources.

$ wget https://zenodo.org/records/15747531/files/inference_resources_v003.tar.gz
$ tar -xvzf inference_resources_v003.tar.gz

Main commands

This section describes the commands in lionheart and lists their main output files:

Command	Description	Main Output
`lionheart extract_features`	Extract features from a BAM file.	`feature_dataset.npy` and correction profiles
`lionheart predict_sample`	Predict cancer status of a sample.	`prediction.csv`
`lionheart collect`	Collect predictions and/or features across samples.	`predictions.csv`, `feature_dataset.npy`, and correction profiles for all samples
`lionheart customize_thresholds`	Extract ROC curve and more for using custom probability thresholds.	`ROC_curves.json` and `probability_densities.csv`
`lionheart cross_validate`	Cross-validate the model on new data and/or the included features.	`evaluation_summary.csv`, `splits_summary.csv`
`lionheart train_model`	Train a model on your own data and/or the included features.	`model.joblib` and training data results
`lionheart validate`	Validate a model on a validation dataset.	`evaluation_scores.csv` and `predictions.csv`
`lionheart evaluate_univariates`	Evaluate the cancer detection potential of each feature separately.	`univariate_evaluations.csv`

Examples

Run via command-line interface

This example shows how to run lionheart from the command-line.

Note: If you don't have a BAM file at hand, you can download an example BAM file from: https://zenodo.org/records/13909979 It is a downsampled version of a public BAM file from Snyder et al. (2016; 10.1016/j.cell.2015.11.050) that has been remapped to hg38. On our system, the feature extraction for this sample takes ~1h15m using 12 cores (n_jobs).

# Start by skimming the help page
$ lionheart -h

# Extract feature from a given BAM file
# `mosdepth_path` is the path to the customized `mosdepth` installation
# E.g., "/home/<username>/mosdepth/mosdepth"
# `ld_library_path` is the path to the `lib` folder in the conda environment
# E.g., "/home/<username>/anaconda3/envs/lionheart/lib/"
$ lionheart extract_features --bam_file {bam_file} --resources_dir {resources_dir} --out_dir {out_dir} --mosdepth_path {mosdepth_path} --ld_library_path {ld_library_path} --n_jobs {cores}

# `sample_dir` is the `out_dir` of `extract_features`
$ lionheart predict_sample --sample_dir {sample_dir} --resources_dir {resources_dir} --out_dir {out_dir} --thresholds max_j spec_0.95 spec_0.99 sens_0.95 sens_0.99 0.5 --identifier {sample_id}

After running these commands for a set of samples, you can use lionheart collect to collect features and predictions across the samples. You can then use lionheart train_model to train a model on your own data (and optionally the included features).

Via `gwf` workflow

We provide a simple workflow for submitting jobs to slurm via the gwf package. Make a copy of the workflow directory, open workflow.py, change the paths and list the samples to run lionheart on.

The first time running a workflow it's required to first set the gwf backend to slurm or one of the other backends :

# Start by downloading the repository
$ wget -O lionheart-main.zip https://github.com/BesenbacherLab/lionheart/archive/refs/heads/main.zip
$ unzip lionheart-main.zip

# Copy workflow directory to a location
$ cp -r lionheart-main/workflow <location>/workflow

# Navigate to your copy of the the workflow directory
$ cd <location>/workflow

# Activate conda environment
$ conda activate lionheart

# Set `gwf` backend to slurm (or another preferred backend)
$ gwf config set backend slurm

Open the workflow.py file and change the various paths. When you're ready to submit the jobs, run:

$ gwf run

gwf allows seeing a status of the submitted jobs:

$ gwf status
$ gwf status -f summary

Reproduction of results

This section shows how to reproduce the main results (cross-validation and external validation) from the paper. It uses the included features so the reproduction can be run without access to the raw sequencing data.

Note that different compilations of scikit-learn on different operating systems may lead to slightly different results. On linux, the results should match the reported results.

Cross-validation analysis

We start by performing the nested leave-one-dataset-out cross-validation analysis from Figure 3A (not including the benchmarks).

Note that the default settings are the ones used in the paper.

# Perform the cross-validation
# {cv_out_dir} should specify where you want the output files
$ lionheart cross_validate --out_dir {cv_out_dir} --resources_dir {resources_dir} --use_included_features --num_jobs 10

The output directory should now include multiple files. The main results are in evaluation_summary.csv and splits_summary.csv. Note that the results are given for multiple probability thresholds. The threshold reported in the paper is the "Max. J Threshold". You can extract the relevant lines of the summaries with:

$ awk 'NR==1 || /Average/ && /J Threshold/' {cv_out_dir}/evaluation_summary.csv
$ awk 'NR==1 || /Average/ && /J Threshold/' {cv_out_dir}/splits_summary.csv

External validation analysis

To reproduce the external validation, we first train a model on all the included training datasets and then validate it on the included validation dataset:

# Train a model on the included datasets
# {new_model_dir} should specify where you want the model files
$ lionheart train_model --out_dir {new_model_dir} --resources_dir {resources_dir} --use_included_features

# Validate the model on the included validation dataset
# {val_out_dir} should specify where you want the output files
$ lionheart validate --out_dir {val_out_dir} --resources_dir {resources_dir} --model_dir {new_model_dir} --use_included_validation --thresholds 'max_j'

The model training creates the model.joblib file along with predictions and evaluations from the training data (e.g., predictions.csv, evaluation_scores.csv, and ROC_curves.json).

The validation creates evaluation_scores.csv and predictions.csv from applying the model on the validation dataset. You will find the reported AUC score in evaluation_scores.csv:

$ cat {val_out_dir}/evaluation_scores.csv

Univariate analyses

Finally, we reproduce the univariate modeling evaluations in Figure 2D and 2E:

# Evaluate the classification potential of each cell type separately
# {univariates_dir} should specify where you want the evaluation files
$ lionheart evaluate_univariates --out_dir {univariates_dir} --resources_dir {resources_dir} --use_included_features --num_jobs 10

This creates the univariate_evaluations.csv file with evaluation metrics per cell-type. There are coefficients and p-values (bonferroni-corrected) from univariate logistic regression models and evaluation metrics from per-cell-type leave-one-dataset-out cross-validation.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

2.2.0

Apr 29, 2026

2.1.0

Jan 17, 2026

2.0.2

Sep 9, 2025

2.0.0

Jun 26, 2025

1.1.5

Feb 4, 2025

1.1.4

Feb 4, 2025

1.1.3

Feb 4, 2025

1.1.2

Feb 4, 2025

1.1.1

Dec 9, 2024

1.1.0

Dec 7, 2024

1.0.2

Dec 2, 2024

1.0.1

Dec 1, 2024

1.0.0

Nov 26, 2024

0.1.0

Jul 6, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lionheart-2.2.0.tar.gz (110.4 kB view details)

Uploaded Apr 29, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

lionheart-2.2.0-py3-none-any.whl (136.9 kB view details)

Uploaded Apr 29, 2026 Python 3

File details

Details for the file lionheart-2.2.0.tar.gz.

File metadata

Download URL: lionheart-2.2.0.tar.gz
Upload date: Apr 29, 2026
Size: 110.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.0.1 CPython/3.9.6 Darwin/25.4.0

File hashes

Hashes for lionheart-2.2.0.tar.gz
Algorithm	Hash digest
SHA256	`abd22619da730072412a8b540065c0706e1d0aecb89afa09871a34c97dd1aac4`
MD5	`23b12a46ae657dc850ed976e772b2851`
BLAKE2b-256	`f0bf494774724e15afa8f5c0f8114ad5f79fd960955b3de7ae9406187a877133`

See more details on using hashes here.

File details

Details for the file lionheart-2.2.0-py3-none-any.whl.

File metadata

Download URL: lionheart-2.2.0-py3-none-any.whl
Upload date: Apr 29, 2026
Size: 136.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.0.1 CPython/3.9.6 Darwin/25.4.0

File hashes

Hashes for lionheart-2.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`79c8f1475a95c9071eaf3fdf94537e5b4303b171dc97bf812d88ec2946b37194`
MD5	`49ea6fffdb3ed955f6202999c3f8854b`
BLAKE2b-256	`a17bc063fd9152e7fa69f49a47993809921219ad79ca35268bcd40c300960668`

See more details on using hashes here.

lionheart 2.2.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

LIONHEART Cancer Detector

Installation

Custom mosdepth

Get Resources

Main commands

Examples

Run via command-line interface

Via `gwf` workflow

Reproduction of results

Cross-validation analysis

External validation analysis

Univariate analyses

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

lionheart 2.2.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

LIONHEART Cancer Detector

Installation

Custom mosdepth

Get Resources

Main commands

Examples

Run via command-line interface

Via gwf workflow

Reproduction of results

Cross-validation analysis

External validation analysis

Univariate analyses

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

Via `gwf` workflow