Toolkit to read and preprocess MALDI-TOF mass-spectra for AMR analyses.

MaldiAMRKit

A comprehensive toolkit for MALDI-TOF mass spectrometry data preprocessing and antimicrobial resistance (AMR) prediction

Installation

pip install maldiamrkit

Features

  • 📊 Spectrum Processing: Load, smooth, baseline correct, and normalize MALDI-TOF spectra
  • 📦 Dataset Management: Process multiple spectra with metadata integration
  • 🔍 Peak Detection: Automated peak finding with customizable parameters
  • 📈 Spectral Alignment (Warping): Multiple alignment methods (shift, linear, piecewise, DTW)
  • 🤖 ML-Ready: Direct integration with scikit-learn pipelines
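
The spectrum-processing steps listed above (smooth, baseline correct, normalize) can be illustrated with a dependency-free sketch. The algorithms MaldiAMRKit actually uses are not stated here, so the choices below are stand-ins picked for brevity: a moving average for smoothing, a rolling-minimum estimate for the baseline, and total-ion-current (TIC) scaling for normalization. All function names are hypothetical and are not part of the package API.

```python
def moving_average(y, window=3):
    """Smooth by averaging each point with its neighbours (window shrinks at the edges)."""
    half = window // 2
    out = []
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def subtract_rolling_min(y, window=5):
    """Crude baseline correction: subtract the local minimum around each point."""
    half = window // 2
    out = []
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        out.append(y[i] - min(y[lo:hi]))
    return out


def tic_normalize(y):
    """Scale intensities so they sum to 1 (total ion current normalization)."""
    total = sum(y)
    return [v / total for v in y] if total > 0 else list(y)


# Toy spectrum: a flat offset of 2.0 with a single peak in the middle
raw = [2.0, 2.0, 2.0, 8.0, 2.0, 2.0, 2.0]
smoothed = moving_average(raw, window=3)
corrected = subtract_rolling_min(smoothed, window=5)
normalized = tic_normalize(corrected)
```

After these three steps the constant offset is gone and the remaining intensities sum to 1, which makes spectra acquired at different overall signal levels comparable.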

Quick Start

Load and Preprocess a Single Spectrum

from maldiamrkit.spectrum import MaldiSpectrum

# Load spectrum from file
spec = MaldiSpectrum("data/spectrum.txt")

# Preprocess: smoothing, baseline removal, normalization
spec.preprocess()

# Optional: bin to reduce dimensions
spec.bin(bin_width=3)  # 3 Da bins

# Visualize
spec.plot(binned=True)
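
To make the binning step concrete, here is a minimal pure-Python sketch of fixed-width binning. It assumes that each bin sums the intensities whose m/z falls inside it and that the range defaults to 2000 to 20000 Da; both are assumptions for illustration, and `bin_spectrum` is a hypothetical helper, not the implementation behind `spec.bin`.

```python
def bin_spectrum(mz, intensity, bin_width=3.0, mz_min=2000.0, mz_max=20000.0):
    """Sum intensities into consecutive bins of `bin_width` Da starting at `mz_min`."""
    n_bins = int((mz_max - mz_min) // bin_width)
    upper = mz_min + n_bins * bin_width
    binned = [0.0] * n_bins
    for m, value in zip(mz, intensity):
        if mz_min <= m < upper:  # ignore points outside the binned range
            binned[int((m - mz_min) // bin_width)] += value
    return binned


# Four points over a 12 Da window, grouped into four 3 Da bins
mz = [2000.5, 2001.9, 2003.2, 2010.0]
intensity = [1.0, 2.0, 4.0, 8.0]
binned = bin_spectrum(mz, intensity, bin_width=3.0, mz_min=2000.0, mz_max=2012.0)
```

With the assumed default range and 3 Da bins, every spectrum maps to a vector of 6000 values, so spectra with different raw m/z grids end up with identical feature dimensions.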

Build a Dataset from Multiple Spectra

from maldiamrkit.dataset import MaldiSet

# Load multiple spectra with metadata
data = MaldiSet.from_directory(
    spectra_dir="data/spectra/",
    metadata_path="data/metadata.csv",
    aggregate_by={"antibiotic": "Drug", "species": "Species"},
    bin_width=3
)

# Access features and labels
X = data.X  # Feature matrix
y = data.y["Drug"]  # Target labels

Machine Learning Pipeline

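from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from maldiamrkit.peak_detector import MaldiPeakDetector

# Split the features and labels from the MaldiSet example above
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create ML pipeline: peak detection, scaling, classification
pipe = Pipeline([
    ("peaks", MaldiPeakDetector(binary=False, prominence=0.05)),
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42))
])

# Train on the training split and predict on the held-out split
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)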
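
The `prominence` and `binary` parameters of the peak-detection step can be illustrated with a small standalone sketch. It assumes that a peak is kept when its prominence (its height above the higher of the two surrounding valleys) reaches the threshold, and that non-peak positions are set to zero, with `binary=True` replacing the kept intensities by 1. That reading of the parameters is an assumption, and the helper names are hypothetical rather than taken from `MaldiPeakDetector`.

```python
def find_prominent_peaks(y, min_prominence):
    """Return indices of strict local maxima whose prominence is >= min_prominence."""
    peaks = []
    for i in range(1, len(y) - 1):
        if not (y[i - 1] < y[i] > y[i + 1]):
            continue
        # Walk outwards on each side until a higher point (or the edge) is met,
        # tracking the lowest value seen: these are the surrounding valleys.
        left_min, j = y[i], i - 1
        while j >= 0 and y[j] <= y[i]:
            left_min = min(left_min, y[j])
            j -= 1
        right_min, j = y[i], i + 1
        while j < len(y) and y[j] <= y[i]:
            right_min = min(right_min, y[j])
            j += 1
        prominence = y[i] - max(left_min, right_min)
        if prominence >= min_prominence:
            peaks.append(i)
    return peaks


def peaks_to_features(y, min_prominence, binary=False):
    """Zero out non-peak positions; keep the intensity (or 1.0 if binary) at peaks."""
    keep = set(find_prominent_peaks(y, min_prominence))
    return [(1.0 if binary else y[i]) if i in keep else 0.0 for i in range(len(y))]


signal = [0.0, 1.0, 0.0, 5.0, 2.0, 3.0, 0.0]
all_peaks = find_prominent_peaks(signal, min_prominence=1.0)
features = peaks_to_features(signal, min_prominence=2.0)
binary_features = peaks_to_features(signal, min_prominence=2.0, binary=True)
```

Raising the threshold discards the two small peaks and keeps only the dominant one, which is the kind of sparsification a peak-based feature step provides before scaling and classification.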

Spectral Alignment (Warping)

Align spectra to correct for mass calibration drift:

from maldiamrkit.warping import Warping

# Create warping transformer with shift method
warper = Warping(
    method='shift',  # or 'linear', 'piecewise', 'dtw'
    reference='median',  # use median spectrum as reference
    max_shift=50
)

# Fit on training data and transform
warper.fit(X_train)
X_aligned = warper.transform(X_test)

# Visualize alignment results
fig, axes = warper.plot_alignment(
    X_original=X_test,
    X_aligned=X_aligned,
    indices=[0, 5, 10],  # plot multiple spectra
    xlim=(2000, 10000),  # zoom to m/z range
    show_peaks=True
)

Alignment Methods:

  • shift: Global median shift (fast, simple)
  • linear: Least-squares linear transformation
  • piecewise: Local shifts across spectrum segments (most flexible)
  • dtw: Dynamic Time Warping (best for non-linear drift)
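
As a rough illustration of the simplest method in the list above, the sketch below estimates one global shift as the median offset between matched peaks and applies it to a binned spectrum. It assumes peaks are matched to the nearest reference peak within `max_shift` bins; the matching rule and the helper names are hypothetical, and the `Warping` class may work differently.

```python
import statistics


def estimate_median_shift(peaks, ref_peaks, max_shift=50):
    """Median offset (in bins) that moves each peak onto its nearest reference peak."""
    diffs = []
    for p in peaks:
        nearest = min(ref_peaks, key=lambda r: abs(r - p))
        if abs(nearest - p) <= max_shift:  # skip peaks with no plausible match
            diffs.append(nearest - p)
    return int(round(statistics.median(diffs))) if diffs else 0


def apply_shift(intensity, shift):
    """Move every bin by `shift` positions, dropping values that leave the range."""
    n = len(intensity)
    out = [0.0] * n
    for i, value in enumerate(intensity):
        j = i + shift
        if 0 <= j < n:
            out[j] = value
    return out


# Peak positions (bin indices) in the reference and in a drifted spectrum
ref_peaks = [10, 40, 80]
peaks = [7, 38, 77]
shift = estimate_median_shift(peaks, ref_peaks, max_shift=50)

# A drifted spectrum with a single peak at bin 2, moved back into place
spectrum = [0.0] * 10
spectrum[2] = 1.0
aligned = apply_shift(spectrum, shift)
```

Using the median rather than the mean keeps a single mismatched peak from dominating the estimate. The linear, piecewise, and DTW methods replace the single offset with progressively more flexible mappings of the m/z axis.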

For further details please see the quick guide notebook.

Contributing

Pull requests, bug reports, and feature ideas are welcome: feel free to open a PR!

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

This toolkit is inspired by and builds upon the methodology described in:

Weis, C., Cuénod, A., Rieck, B., et al. (2022). Direct antimicrobial resistance prediction from clinical MALDI-TOF mass spectra using machine learning. Nature Medicine, 28, 164–174. https://doi.org/10.1038/s41591-021-01619-9

Please consider citing this work if you find MaldiAMRKit useful.
