Score calibration and false discovery estimation for de novo peptide sequencing.

Requires Python 3.10+.

Winnow

Confidence calibration and FDR control for de novo peptide sequencing

Table of Contents
  1. About the project
  2. Installation
  3. Usage
  4. Contributing

Figure: Winnow workflow for confidence calibration and FDR control in de novo peptide sequencing.

About the project

In bottom-up proteomics workflows, peptide sequencing (matching an MS2 spectrum to a peptide) is just the first step. The resulting peptide-spectrum matches (PSMs) often contain many incorrect identifications, which can degrade downstream tasks such as protein assembly.

To mitigate this, intermediate steps are introduced to:

  1. Assign confidence scores to PSMs that better correlate with correctness.
  2. Estimate and control the false discovery rate (FDR) by filtering identifications based on confidence scores.
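
As a rough illustration of the first step, the sketch below measures how well confidence scores track correctness on annotated PSMs by binning the scores and comparing the mean confidence with the observed accuracy in each bin (an expected calibration error). The function name and the binning scheme are assumptions made for this example, not part of Winnow's API.

```python
from typing import Sequence


def expected_calibration_error(
    scores: Sequence[float], is_correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Weighted mean gap between average confidence and accuracy per bin."""
    if len(scores) != len(is_correct):
        raise ValueError("scores and is_correct must have the same length")
    if not scores:
        raise ValueError("at least one PSM is required")
    if any(s < 0.0 or s > 1.0 for s in scores):
        raise ValueError("scores must lie in [0, 1]")

    # Group (score, correctness) pairs into equal-width bins on [0, 1].
    bins = [[] for _ in range(n_bins)]
    for score, correct in zip(scores, is_correct):
        index = min(int(score * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[index].append((score, correct))

    total = len(scores)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(s for s, _ in members) / len(members)
        accuracy = sum(1 for _, c in members if c) / len(members)
        ece += (len(members) / total) * abs(mean_confidence - accuracy)
    return ece
```

A value near zero suggests the scores can be read as probabilities of correctness; a large value suggests the scores need recalibrating before they are used for FDR estimation.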

For database search-based peptide sequencing, PSM rescoring and target-decoy competition (TDC) are standard approaches, supported by an extensive ecosystem of tools. However, de novo peptide sequencing lacks standardised methods for these tasks.
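
For comparison, a minimal sketch of the usual TDC estimate is shown here: among PSMs scoring at or above a threshold, the number of decoy matches divided by the number of target matches approximates the FDR. The data layout (a list of score and is-decoy pairs) is an assumption for this example, and some tools use a slightly more conservative variant that adds one to the decoy count. The estimate relies on decoy entries in a search database, which a de novo model does not have.

```python
from typing import List, Tuple


def tdc_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR at a score threshold from (score, is_decoy) pairs."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        # No accepted target PSMs, so there is nothing to report an FDR for.
        return 0.0
    # The ratio can exceed 1 when decoys outnumber targets, so cap it.
    return min(1.0, decoys / targets)
```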

Winnow aims to fill this gap by implementing the calibrate-estimate framework for FDR estimation. Unlike TDC, this approach is directly applicable to de novo sequencing models. Additionally, its calibration step naturally incorporates common confidence rescoring workflows as part of FDR estimation.
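
To make the idea concrete, here is a minimal sketch of one common way to estimate FDR from calibrated scores, assuming each score can be read as the probability that the PSM is correct. The expected fraction of errors among accepted PSMs is then the mean of one minus the score. The function names are hypothetical, and this is not necessarily how Winnow implements its estimators.

```python
from typing import List, Optional


def estimated_fdr(calibrated: List[float], cutoff: float) -> float:
    """Expected error fraction among PSMs with calibrated score >= cutoff."""
    accepted = [p for p in calibrated if p >= cutoff]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


def lowest_cutoff(calibrated: List[float], fdr_threshold: float) -> Optional[float]:
    """Lowest score cutoff whose accepted set stays within the FDR threshold."""
    # Try each distinct score as a cutoff, from the most permissive upwards.
    for cutoff in sorted(set(calibrated)):
        if estimated_fdr(calibrated, cutoff) <= fdr_threshold:
            return cutoff
    return None  # even the single best PSM exceeds the threshold
```

No decoys are involved: the estimate depends only on the calibrated scores, which is why it applies directly to de novo sequencing output.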

Winnow provides both a CLI and a Python package, offering flexibility in performing confidence calibration and FDR estimation.

Installation

Winnow is published on PyPI as winnow-fdr. Install with pip or a pip-compatible tool (e.g. uv pip install):

pip install winnow-fdr

or

uv pip install winnow-fdr
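
To confirm that the distribution is visible to the current Python environment, a small check with the standard library may help. The helper name is made up for this example; only the distribution name winnow-fdr comes from the instructions above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "winnow-fdr") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"winnow-fdr version: {found}" if found else "winnow-fdr is not installed")
```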

Quick Start

Get started in minutes using the example data in examples/example_data/.

# Train a calibrator on the example data
make train-sample

# Run prediction with the trained model without filtering on an FDR threshold
make predict-sample

Note: The sample data is minimal (100 spectra) and intended for testing only. The make commands shown above are configured for the sample data with adjusted settings (e.g., relaxed FDR threshold). For your own datasets, use the winnow commands outlined below.

Usage

Winnow supports two usage modes:

  1. A command-line interface (CLI) with sensible defaults and multiple FDR estimation methods.
  2. A configurable and extensible Python package for advanced users.

CLI

Installing Winnow provides the winnow command with three sub-commands:

  1. winnow train – Performs confidence calibration on a dataset of annotated PSMs, outputting the fitted model checkpoint.
  2. winnow compute-features – Computes and outputs the feature set for a dataset of PSMs.
  3. winnow predict – Calibrates confidence scores using a fitted model checkpoint (by default, a pretrained general model from Hugging Face), then estimates and controls FDR using the calibrated scores.

By default, winnow predict uses a pretrained general model (InstaDeepAI/winnow-general-model) hosted on Hugging Face Hub, allowing you to get started immediately without training. You can also specify custom Hugging Face models or use locally trained models.

Winnow uses Hydra for flexible, hierarchical configuration management. All parameters can be configured via YAML files or overridden on the command line:

# Quick start with defaults
winnow predict

# Override specific parameters
winnow predict fdr_control.fdr_threshold=0.01

# Specify different data source and dataset paths
winnow predict data_loader=mztab dataset.spectrum_path_or_directory=data/spectra.parquet dataset.predictions_path=data/preds.mztab

Refer to the CLI Guide and Configuration Guide for details on usage and configuration options.
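
When the CLI is driven from a script, the key=value overrides shown above can be assembled programmatically. The sketch below builds the argument list for one of the three sub-commands. The helper name is hypothetical, and the commented-out call assumes the winnow executable is on your PATH.

```python
import shlex
import subprocess
from typing import Dict, List

SUBCOMMANDS = {"train", "compute-features", "predict"}


def build_winnow_command(subcommand: str, overrides: Dict[str, object]) -> List[str]:
    """Build an argument list of the form: winnow <subcommand> key=value ..."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown sub-command: {subcommand!r}")
    # Each override becomes one dotted key=value argument, in insertion order.
    return ["winnow", subcommand] + [f"{key}={value}" for key, value in overrides.items()]


if __name__ == "__main__":
    command = build_winnow_command("predict", {"fdr_control.fdr_threshold": 0.01})
    print(shlex.join(command))
    # Uncomment to actually run the command (requires Winnow to be installed):
    # subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.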

The example notebook walks through the Python API for the same workflows you can run from the CLI.

Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire and create, and we welcome your support! Any contributions you make are greatly appreciated.

If you have ideas for enhancements, you can:

  • Fork the repository and submit a pull request.
  • Open an issue and tag it with "enhancement".

Contribution process

  1. Fork the repository.
  2. Create a feature branch (git checkout -b feat-amazing-feature).
  3. Commit your changes (git commit -m 'feat: add some amazing feature').
  4. Push to your branch (git push origin feat-amazing-feature).
  5. Open a pull request.

For more details on the contributing process, see the Contributing Guide.

Don't forget to give the project a star! Thanks again!

BibTeX entry and citation information

If you use Winnow in your research, please cite the following preprint:

@article{mabona2025novopeptidesequencingrescoring,
      title={De novo peptide sequencing rescoring and FDR estimation with Winnow},
      author={Amandla Mabona and Jemma Daniel and Henrik Servais Janssen Knudsen and Rachel Catzel and Kevin Michael Eloff and Erwin M. Schoof and Nicolas Lopez Carranza and Timothy P. Jenkins and Jeroen Van Goey and Konstantinos Kalogeropoulos},
      year={2025},
      eprint={2509.24952},
      archivePrefix={arXiv},
      primaryClass={q-bio.QM},
      url={https://arxiv.org/abs/2509.24952},
}
