A comprehensive toolkit for MALDI-TOF mass spectrometry data preprocessing for antimicrobial resistance (AMR) prediction purposes

MaldiAMRKit

Installation

pip install maldiamrkit

Features

  • 📊 Spectrum Processing: Load, smooth, baseline correct, and normalize MALDI-TOF spectra
  • 📦 Dataset Management: Process multiple spectra with metadata integration
  • 🔍 Peak Detection: Automated peak finding with customizable parameters
  • 📈 Spectral Alignment (Warping): Multiple alignment methods (shift, linear, piecewise, DTW)
  • 🤖 ML-Ready: Direct integration with scikit-learn pipelines

Quick Start

Load and Preprocess a Single Spectrum

from maldiamrkit.spectrum import MaldiSpectrum

# Load spectrum from file
spec = MaldiSpectrum("data/spectrum.txt")

# Preprocess: smoothing, baseline removal, normalization
spec.preprocess()

# Optional: bin to reduce dimensions
spec.bin(bin_width=3)  # 3 Da bins

# Visualize
spec.plot(binned=True)
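
The README does not say which algorithms `preprocess()` and `bin()` use, so the following is only a generic sketch of the smoothing, baseline removal, normalization, and binning steps built on NumPy and SciPy. The function name `preprocess_sketch`, the Savitzky-Golay and rolling-minimum choices, and all window sizes and m/z limits are assumptions made for illustration, not MaldiAMRKit's implementation.

```python
import numpy as np
from scipy.ndimage import minimum_filter1d
from scipy.signal import savgol_filter


def preprocess_sketch(mz, intensity, bin_width=3.0, mz_min=2000.0, mz_max=20000.0):
    """Generic smoothing -> baseline removal -> TIC normalization -> binning."""
    # Smooth with a Savitzky-Golay filter (window and order are illustrative).
    smoothed = savgol_filter(intensity, window_length=11, polyorder=3)
    # Estimate the baseline as a rolling minimum and subtract it.
    baseline = minimum_filter1d(smoothed, size=101)
    corrected = np.clip(smoothed - baseline, 0.0, None)
    # Normalize to unit total ion current (TIC).
    total = corrected.sum()
    if total > 0:
        corrected = corrected / total
    # Sum intensities into fixed-width m/z bins.
    edges = np.arange(mz_min, mz_max + bin_width, bin_width)
    binned, _ = np.histogram(mz, bins=edges, weights=corrected)
    return edges[:-1], binned


# Synthetic spectrum: sloping baseline + one Gaussian peak near 6000 Da + noise.
rng = np.random.default_rng(0)
mz = np.linspace(2000.0, 20000.0, 18001)
intensity = (
    50.0 * np.exp(-mz / 8000.0)
    + 200.0 * np.exp(-0.5 * ((mz - 6000.0) / 5.0) ** 2)
    + rng.normal(0.0, 1.0, mz.size)
)
bin_starts, binned = preprocess_sketch(mz, intensity)
```

With 3 Da bins over 2000 to 20000 Da, the 18001 raw points collapse to 6000 features, which is the dimensionality reduction the binning step is meant to provide.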

Build a Dataset from Multiple Spectra

from maldiamrkit.dataset import MaldiSet

# Load multiple spectra with metadata
data = MaldiSet.from_directory(
    spectra_dir="data/spectra/",
    metadata_path="data/metadata.csv",
    aggregate_by={"antibiotic": "Drug", "species": "Species"},
    bin_width=3
)

# Access features and labels
X = data.X  # Feature matrix
y = data.y["Drug"]  # Target labels

Machine Learning Pipeline

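from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from maldiamrkit.peak_detector import MaldiPeakDetector

# Split the features and labels from the dataset above
# (the 80/20 split is illustrative)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create ML pipeline: peak detection, scaling, classification
pipe = Pipeline([
    ("peaks", MaldiPeakDetector(binary=False, prominence=0.05)),
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42))
])

# Train and predict
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)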
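
To make the `prominence` and `binary` parameters more concrete, here is a minimal sketch of prominence-based peak picking using `scipy.signal.find_peaks`. Whether `MaldiPeakDetector` works this way internally is an assumption; the helper `peak_features` is hypothetical and only illustrates the idea of keeping peak positions and zeroing everything else.

```python
import numpy as np
from scipy.signal import find_peaks


def peak_features(spectrum, prominence=0.05, binary=False):
    """Keep only detected peaks of a 1-D intensity vector; zero the rest."""
    # A peak's prominence is its height above the higher of the two
    # surrounding valleys, so small ripples on a shoulder are rejected.
    peaks, _ = find_peaks(spectrum, prominence=prominence)
    out = np.zeros_like(spectrum, dtype=float)
    # binary=True marks peak presence; otherwise keep the peak intensity.
    out[peaks] = 1.0 if binary else spectrum[peaks]
    return out


spectrum = np.array([0.0, 0.02, 0.0, 0.5, 0.1, 0.12, 0.1, 0.9, 0.0])
intensity_features = peak_features(spectrum)
binary_features = peak_features(spectrum, binary=True)
```

In this toy vector the local maxima at indices 1 and 5 each rise only 0.02 above their neighbouring valleys, so a threshold of 0.05 discards them and keeps the peaks at indices 3 and 7.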

Spectral Alignment (Warping)

Align spectra to correct for mass calibration drift:

from maldiamrkit.warping import Warping

# Create warping transformer with shift method
warper = Warping(
    method='shift',  # or 'linear', 'piecewise', 'dtw'
    reference='median',  # use median spectrum as reference
    max_shift=50
)

# Fit on training data and transform
warper.fit(X_train)
X_aligned = warper.transform(X_test)

# Visualize alignment results
fig, axes = warper.plot_alignment(
    X_original=X_test,
    X_aligned=X_aligned,
    indices=[0, 5, 10],  # plot multiple spectra
    xlim=(2000, 10000),  # zoom to m/z range
    show_peaks=True
)

Alignment Methods:

  • shift: Global median shift (fast, simple)
  • linear: Least-squares linear transformation
  • piecewise: Local shifts across spectrum segments (most flexible)
  • dtw: Dynamic Time Warping (best for non-linear drift)
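
As an illustration of the simplest case, a global shift can be estimated and undone as sketched below. This sketch picks the shift by maximizing the dot product with the reference (a cross-correlation search), which is a swapped-in technique and not necessarily the median-based estimate that `Warping(method='shift')` uses. The helpers `estimate_shift` and `apply_shift` are hypothetical names, and shifts are counted in bins rather than Da.

```python
import numpy as np


def estimate_shift(reference, spectrum, max_shift=50):
    """Return the integer bin shift within +/- max_shift that best matches reference."""
    shifts = np.arange(-max_shift, max_shift + 1)
    # np.roll wraps around at the edges, which is acceptable for scoring
    # as long as max_shift is small relative to the spectrum length.
    scores = [np.dot(reference, np.roll(spectrum, s)) for s in shifts]
    return int(shifts[int(np.argmax(scores))])


def apply_shift(spectrum, shift):
    """Shift a spectrum by whole bins, padding with zeros instead of wrapping."""
    out = np.zeros_like(spectrum)
    if shift > 0:
        out[shift:] = spectrum[:-shift]
    elif shift < 0:
        out[:shift] = spectrum[-shift:]
    else:
        out[:] = spectrum
    return out


# Reference with three peaks, and a copy drifted 7 bins to the right.
reference = np.zeros(200)
reference[[40, 90, 150]] = [1.0, 0.6, 0.8]
drifted = apply_shift(reference, 7)

shift = estimate_shift(reference, drifted, max_shift=50)
aligned = apply_shift(drifted, shift)
```

The `max_shift` bound plays the same role as in the `Warping` example above: it limits how far a spectrum may be moved, which keeps unrelated peaks from being matched to each other.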

For further details, see the quick guide notebook.

Contributing

Pull requests, bug reports, and feature ideas are welcome: feel free to open a PR!

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

This toolkit is inspired by and builds upon the methodology described in:

Weis, C., Cuénod, A., Rieck, B., et al. (2022). Direct antimicrobial resistance prediction from clinical MALDI-TOF mass spectra using machine learning. Nature Medicine, 28, 164–174. https://doi.org/10.1038/s41591-021-01619-9

Please consider citing this work if you find MaldiAMRKit useful.
