Flexible normalization methods for proteomics quantitative data

Pronoms: Proteomics Normalization Python Library

Overview

Pronoms is a Python library implementing multiple normalization methods for quantitative proteomics data. Each normalization method is encapsulated within modular, reusable classes. The library includes visualization capabilities that allow users to easily observe the effects of normalization. Some normalization methods, such as VSN normalization, leverage R on the backend for computation.

Documentation

See https://pronoms.readthedocs.io/ for complete documentation.

Installation

You can install Pronoms directly from PyPI using pip:

pip install pronoms

Prerequisites

  • Python 3.9 or higher
  • For R-based normalizers (VSN):
    • R installed on your system
    • Required R packages: vsn

Installing for Development

# Clone the repository
git clone https://github.com/mriffle/pronoms.git
cd pronoms

# Install in development mode with dev dependencies
pip install -e ".[dev]"

Usage

Basic Example

import numpy as np
from pronoms.normalizers import MedianNormalizer

# Create sample data
data = np.random.rand(5, 100)  # 5 samples, 100 proteins/features

# Create normalizer and apply normalization
normalizer = MedianNormalizer()
normalized_data = normalizer.normalize(data)

# Visualize the effect of normalization
normalizer.plot_comparison(data, normalized_data)

Available Normalizers

  • DirectLFQNormalizer: Performs protein quantification directly from peptide/ion intensity data using the DirectLFQ algorithm. Ammar C, Schessner JP, Willems S, Michaelis AC, Mann M. Accurate Label-Free Quantification by directLFQ to Compare Unlimited Numbers of Proteomes. Mol Cell Proteomics. 2023 Jul;22(7):100581. doi:10.1016/j.mcpro.2023.100581. PMID: 37225017
  • L1Normalizer: Scales samples to have a unit L1 norm (sum of absolute values).
  • MADNormalizer: Median Absolute Deviation Normalization. Robustly scales samples by subtracting the median and dividing by the Median Absolute Deviation (MAD). Pass scale_to_sigma=True to multiply MAD by 1.4826 so the output is a robust z-score (matches R's mad()).
  • MedianNormalizer: Scales each sample (row) by its median, then rescales by the mean of medians to preserve overall scale.
  • MedianPolishNormalizer: Tukey's Median Polish. Decomposes data (often log-transformed) into overall, row, column, and residual effects by iterative median removal.
  • QuantileNormalizer: Normalizes samples to have the same distribution using quantile mapping.
  • RankNormalizer: Transforms each sample's values to their ranks (1 to N), with tied values receiving the median rank. Optionally normalizes ranks by dividing by N for cross-dataset comparability.
  • SPLMNormalizer: Stable Protein Log-Mean Normalization. Uses stably expressed proteins (lowest linear-space CV, std/mean) to derive scaling factors for normalization in log-space, then transforms back.
  • VSNNormalizer: Variance Stabilizing Normalization (via R's vsn package). Stabilizes variance across the intensity range. Huber W, von Heydebreck A, Sültmann H, Poustka A, Vingron M. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics. 2002;18 Suppl 1:S96–104. doi:10.1093/bioinformatics/18.suppl_1.s96. PMID: 12169536
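
To make a few of the descriptions above concrete, here is a minimal NumPy sketch of three of the simpler methods (median scaling, MAD scaling, and quantile mapping), written only from the one-line summaries in the list. The function names `median_scale`, `mad_scale`, and `quantile_map` are hypothetical and are not part of the Pronoms API; the sketch assumes no missing values and non-zero row medians and MADs, and the library's own classes may handle edge cases differently.

```python
import numpy as np


def median_scale(x):
    """Divide each row by its median, then multiply by the mean of the row medians."""
    x = np.asarray(x, dtype=float)
    medians = np.median(x, axis=1, keepdims=True)
    return x / medians * medians.mean()


def mad_scale(x, scale_to_sigma=False):
    """Subtract each row's median and divide by that row's MAD."""
    x = np.asarray(x, dtype=float)
    medians = np.median(x, axis=1, keepdims=True)
    mad = np.median(np.abs(x - medians), axis=1, keepdims=True)
    if scale_to_sigma:
        # 1.4826 makes the MAD a consistent estimate of sigma for normal data.
        mad = mad * 1.4826
    return (x - medians) / mad


def quantile_map(x):
    """Give every row the same distribution: the mean of the sorted rows."""
    x = np.asarray(x, dtype=float)
    # Rank of each value within its row (0-based; ties are broken arbitrarily).
    ranks = np.argsort(np.argsort(x, axis=1), axis=1)
    # Reference distribution: average value at each rank position across rows.
    reference = np.sort(x, axis=1).mean(axis=0)
    return reference[ranks]


# Illustrative random data: 5 samples (rows), 100 features (columns).
rng = np.random.default_rng(0)
data = rng.lognormal(mean=10.0, sigma=1.0, size=(5, 100))

scaled = median_scale(data)                      # rows now share one median
robust_z = mad_scale(data, scale_to_sigma=True)  # robust z-scores per row
aligned = quantile_map(data)                     # rows now share one distribution
```

All three functions work row-wise, which matches the (n_samples, n_features) layout the library expects.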

Data Format

All normalizers expect a 2D NumPy array or pandas DataFrame of shape (n_samples, n_features), where:

  • Each row represents a sample
  • Each column represents a protein/feature

This follows the standard convention used in scikit-learn and other Python data science libraries.
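
Quantitative proteomics tables are often exported with proteins as rows and samples as columns, which is the transpose of this layout. The sketch below shows one way to reorient such a table with pandas before normalizing. The sample names, protein labels, and intensity values are made up for illustration.

```python
import pandas as pd

# Hypothetical protein-by-sample table (rows = proteins, columns = samples).
protein_table = pd.DataFrame(
    {
        "sample_A": [1200.0, 340.0, 87.0],
        "sample_B": [1100.0, 410.0, 95.0],
    },
    index=["protein_1", "protein_2", "protein_3"],
)

# Transpose so that each row is a sample and each column is a protein.
samples_by_features = protein_table.T

# Plain 2D float array with the same (n_samples, n_features) orientation.
matrix = samples_by_features.to_numpy(dtype=float)

print(samples_by_features.shape)
```

Either `samples_by_features` or `matrix` is then in the orientation described above.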

R Integration

For normalizers that use R (VSN), ensure R is properly installed and accessible. The library uses rpy2 to interface with R.
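
Before using an R-based normalizer, a quick environment check can save some debugging. The following is a minimal sketch using only the Python standard library. The helper name `r_backend_status` is hypothetical and not part of Pronoms. It only checks that an `R` executable is on the PATH (or that `R_HOME` is set) and that `rpy2` can be imported; it does not verify that the vsn package is installed in R.

```python
import importlib.util
import os
import shutil


def r_backend_status():
    """Report whether R and rpy2 appear to be available in this environment."""
    return {
        # An R executable on PATH, or an explicit R_HOME, suggests R is installed.
        "r_found": shutil.which("R") is not None or bool(os.environ.get("R_HOME")),
        # find_spec returns None when the package cannot be located.
        "rpy2_importable": importlib.util.find_spec("rpy2") is not None,
    }


status = r_backend_status()
for name, ok in status.items():
    print(f"{name}: {'yes' if ok else 'no'}")
```

If either entry reports "no", VSN normalization will most likely fail until R and rpy2 are installed.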

Installing Required R Packages

The VSN package is part of Bioconductor. In R, run the following commands:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("vsn")

Development

Set up a virtual environment and install the dev extras:

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

The pre-flight gate before any commit is:

pytest                          # full test suite (warnings -> errors)
ruff check src tests            # lint
ruff format --check src tests   # formatting (use `ruff format` to apply)
mypy                            # static type check

Coverage:

pytest --cov=src/pronoms --cov-report=term-missing

License

This project is licensed under the Apache License; see the LICENSE file for details.
