A comprehensive toolkit for MALDI-TOF mass spectrometry data preprocessing for antimicrobial resistance (AMR) prediction purposes

MaldiAMRKit

Installation

pip install maldiamrkit

Features

  • 📊 Spectrum Processing: Load, smooth, baseline correct, and normalize MALDI-TOF spectra
  • 📦 Dataset Management: Process multiple spectra with metadata integration
  • 🔍 Peak Detection: Automated peak finding with customizable parameters
  • 📈 Spectral Alignment (Warping): Multiple alignment methods (shift, linear, piecewise, DTW)
  • 🤖 ML-Ready: Direct integration with scikit-learn pipelines

Quick Start

Load and Preprocess a Single Spectrum

from maldiamrkit.spectrum import MaldiSpectrum

# Load spectrum from file
spec = MaldiSpectrum("data/spectrum.txt")

# Preprocess: smoothing, baseline removal, normalization
spec.preprocess()

# Optional: bin to reduce dimensions
spec.bin(bin_width=3)  # 3 Da bins

# Visualize
spec.plot(binned=True)
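To make the binning step concrete, here is a minimal, dependency-free sketch of fixed-width m/z binning followed by total-ion-current (TIC) scaling. It is not MaldiAMRKit's implementation: the helper names `bin_spectrum` and `tic_normalize`, the default m/z range, and the choice of TIC scaling are all assumptions made for illustration, since the normalization applied by `preprocess()` is not stated here.

```python
def bin_spectrum(mz, intensity, bin_width=3.0, mz_min=2000.0, mz_max=20000.0):
    """Sum intensities into fixed-width m/z bins (hypothetical helper).

    The 2000-20000 Da default range is an assumption, not a library default.
    Returns (bin_starts, binned_intensities).
    """
    n_bins = int((mz_max - mz_min) // bin_width)
    binned = [0.0] * n_bins
    upper = mz_min + n_bins * bin_width
    for m, i in zip(mz, intensity):
        if mz_min <= m < upper:
            binned[int((m - mz_min) // bin_width)] += i
    bin_starts = [mz_min + k * bin_width for k in range(n_bins)]
    return bin_starts, binned


def tic_normalize(values):
    """Scale intensities so they sum to 1 (total ion current normalization)."""
    total = sum(values)
    return [v / total for v in values] if total > 0 else list(values)


# Toy spectrum: five (m/z, intensity) points binned at 3 Da
mz = [2000.5, 2001.9, 2003.2, 2007.0, 2008.9]
intensity = [1.0, 2.0, 3.0, 4.0, 0.0]
starts, binned = bin_spectrum(mz, intensity, bin_width=3.0,
                              mz_min=2000.0, mz_max=2012.0)
normalized = tic_normalize(binned)
print(starts)  # left edge of each bin
print(binned)  # summed intensity per bin
```

Binning trades mass resolution for a fixed-length feature vector, which is what lets spectra with different sampling grids share one feature matrix.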

Build a Dataset from Multiple Spectra

from maldiamrkit.dataset import MaldiSet

# Load multiple spectra with metadata
data = MaldiSet.from_directory(
    spectra_dir="data/spectra/",
    metadata_path="data/metadata.csv",
    aggregate_by={"antibiotic": "Drug", "species": "Species"},
    bin_width=3
)

# Access features and labels
X = data.X  # Feature matrix
y = data.y["Drug"]  # Target labels

Machine Learning Pipeline

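from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from maldiamrkit.peak_detector import MaldiPeakDetector

# Create ML pipeline: peak picking, feature scaling, then a random forest
pipe = Pipeline([
    ("peaks", MaldiPeakDetector(binary=False, prominence=0.05)),
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42))
])

# Hold out a test set from X and y (built in the dataset example above)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train and predict
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)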
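As a rough illustration of what a prominence threshold does during peak picking, the sketch below finds local maxima in a plain list of intensities and keeps those whose prominence reaches a cutoff. It is an assumption-laden sketch, not the `MaldiPeakDetector` implementation: the function names are hypothetical, the reading of `binary` (1/0 flags versus peak intensities) is a guess at the parameter's meaning, and whether the library treats `prominence` as absolute or relative is not stated here.

```python
def find_peaks_by_prominence(y, min_prominence):
    """Return indices of local maxima with prominence >= min_prominence.

    Prominence is the peak height minus the higher of the two lowest
    valleys reached before meeting a strictly higher point on each side.
    """
    peaks = []
    for i in range(1, len(y) - 1):
        if not (y[i] > y[i - 1] and y[i] >= y[i + 1]):
            continue
        # Walk left until a strictly higher point, tracking the lowest valley
        left_min, j = y[i], i - 1
        while j >= 0 and y[j] <= y[i]:
            left_min = min(left_min, y[j])
            j -= 1
        # Same walk to the right
        right_min, j = y[i], i + 1
        while j < len(y) and y[j] <= y[i]:
            right_min = min(right_min, y[j])
            j += 1
        if y[i] - max(left_min, right_min) >= min_prominence:
            peaks.append(i)
    return peaks


def peak_features(y, min_prominence, binary=False):
    """Zero out non-peak positions; keep 1.0 (binary) or the intensity."""
    peaks = set(find_peaks_by_prominence(y, min_prominence))
    return [(1.0 if binary else v) if i in peaks else 0.0
            for i, v in enumerate(y)]


# Toy spectrum: the small bump at index 5 sits on the shoulder of the
# main peak and falls below the 0.05 prominence cutoff.
y = [0.0, 0.2, 0.1, 1.0, 0.3, 0.34, 0.0]
print(find_peaks_by_prominence(y, 0.05))
print(peak_features(y, 0.05, binary=True))
```

Filtering by prominence rather than raw height keeps small but well-separated peaks while discarding ripples that ride on the flank of a larger peak.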

Spectral Alignment (Warping)

Align spectra to correct for mass calibration drift:

from maldiamrkit.warping import Warping

# Create warping transformer with shift method
warper = Warping(
    method='shift',  # or 'linear', 'piecewise', 'dtw'
    reference='median',  # use median spectrum as reference
    max_shift=50
)

# Fit on training data and transform
warper.fit(X_train)
X_aligned = warper.transform(X_test)

# Visualize alignment results
fig, axes = warper.plot_alignment(
    X_original=X_test,
    X_aligned=X_aligned,
    indices=[0, 5, 10],  # plot multiple spectra
    xlim=(2000, 10000),  # zoom to m/z range
    show_peaks=True
)

Alignment Methods:

  • shift: Global median shift (fast, simple)
  • linear: Least-squares linear transformation
  • piecewise: Local shifts across spectrum segments (most flexible)
  • dtw: Dynamic Time Warping (best for non-linear drift)
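The difference between the `shift` and `linear` methods can be sketched on matched peak positions. The example below is a simplified illustration and not the `Warping` implementation: the helpers `match_peaks`, `estimate_shift`, and `estimate_linear` are hypothetical, peaks are assumed to be given as m/z positions, and `max_shift` is assumed to be in the same units as those positions.

```python
from statistics import median


def match_peaks(peaks, ref_peaks, max_shift):
    """Pair each observed peak with its nearest reference peak,
    keeping only pairs that lie within max_shift of each other."""
    pairs = []
    for p in peaks:
        nearest = min(ref_peaks, key=lambda r: abs(r - p))
        if abs(nearest - p) <= max_shift:
            pairs.append((p, nearest))
    return pairs


def estimate_shift(pairs):
    """Global shift: median of (reference - observed) over matched peaks."""
    return median(r - p for p, r in pairs)


def estimate_linear(pairs):
    """Least-squares fit of reference = slope * observed + intercept."""
    n = len(pairs)
    mean_p = sum(p for p, _ in pairs) / n
    mean_r = sum(r for _, r in pairs) / n
    sxx = sum((p - mean_p) ** 2 for p, _ in pairs)
    sxy = sum((p - mean_p) * (r - mean_r) for p, r in pairs)
    slope = sxy / sxx
    return slope, mean_r - slope * mean_p


# Toy data: the calibration error grows with m/z (4, 5, 6, 7 Da)
ref_peaks = [3000.0, 5000.0, 7000.0, 9000.0]
observed = [2996.0, 4995.0, 6994.0, 8993.0]

pairs = match_peaks(observed, ref_peaks, max_shift=50)
shift = estimate_shift(pairs)
slope, intercept = estimate_linear(pairs)

shifted = [p + shift for p in observed]
linear = [slope * p + intercept for p in observed]
print(max(abs(a - r) for a, r in zip(shifted, ref_peaks)))  # residual after shift
print(max(abs(a - r) for a, r in zip(linear, ref_peaks)))   # residual after linear fit
```

On this toy data a single global shift leaves a residual at both ends of the mass range, while the linear fit removes the m/z-dependent drift. Drift that is neither constant nor linear is the case the `piecewise` and `dtw` methods target.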

For further details please see the quick guide notebook.

Contributing

Pull requests, bug reports, and feature ideas are welcome: feel free to open a PR!

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

This toolkit is inspired by and builds upon the methodology described in:

Weis, C., Cuénod, A., Rieck, B., et al. (2022). Direct antimicrobial resistance prediction from clinical MALDI-TOF mass spectra using machine learning. Nature Medicine, 28, 164–174. https://doi.org/10.1038/s41591-021-01619-9

Please consider citing this work if you find MaldiAMRKit useful.
