Annotate LC-MS1 data, MS imaging data or pseudo MS/MS spectra using reference MS/MS libraries

ms1_id

Full-scan MS data from both LC-MS and MS imaging capture multiple ion forms, including in-source and post-source fragments. Here we leverage these fragments to structurally annotate full-scan LC-MS or MS imaging data by matching them against MS/MS spectral libraries.

ms1_id is a Python package that annotates full-scan MS data using tandem MS libraries, specifically:

  • annotate LC-MS data: mzML or mzXML files
  • annotate MS imaging data: imzML and ibd files
  • annotate pseudo-MS/MS spectra: mgf files
  • build indexed MS/MS libraries from mgf or msp files (see Flash entropy for more details)

Workflow

[Figure: annotation workflow]

Example annotations

[Figure: example annotation]

Installation

pip install ms1_id

Python 3.9+ is required. The package has been tested on macOS (14.6, M2 Max) and Linux (Ubuntu 20.04).

Usage

Note: The workflow requires indexed libraries. The indexed GNPS libraries can be downloaded and unpacked with:

wget https://github.com/Philipbear/ms1_id/releases/latest/download/indexed_gnps_libs.zip
unzip indexed_gnps_libs.zip
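
For environments without wget or unzip, the same download can be sketched in Python with the standard library only. The helper names `extract_libs` and `download_libs` are hypothetical and not part of ms1_id; the URL is the one from the wget command above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL copied from the wget command above.
LIBS_URL = "https://github.com/Philipbear/ms1_id/releases/latest/download/indexed_gnps_libs.zip"


def extract_libs(zip_path, dest_dir="."):
    """Unzip an archive into dest_dir and return the sorted member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())


def download_libs(dest_dir="."):
    """Download the indexed GNPS libraries and unzip them into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    zip_path = dest / "indexed_gnps_libs.zip"
    urllib.request.urlretrieve(LIBS_URL, zip_path)  # needs network access
    return extract_libs(zip_path, dest)
```

The returned member names show where the .pkl files ended up, which is useful because the internal layout of the archive is not described here.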

Annotate pseudo MS/MS spectra

If you have pseudo MS/MS spectra in mgf format, you can directly annotate them:

ms1_id annotate --input_file pseudo_msms.mgf --libs data/gnps.pkl data/gnps_k10.pkl --min_score 0.7 --min_matched_peak 3

Here, the spectra are searched against two indexed libraries, and the resulting tsv files are saved in the same directory as the input file.
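
If several mgf files need the same treatment, the command above can be looped from Python. This is a minimal sketch that only reuses the flags shown in the example; the function names `build_annotate_cmd` and `annotate_folder` are hypothetical, and the ms1_id executable is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path


def build_annotate_cmd(mgf_path, libs, min_score=0.7, min_matched_peak=3):
    """Return the `ms1_id annotate` argument list for one mgf file."""
    return [
        "ms1_id", "annotate",
        "--input_file", str(mgf_path),
        "--libs", *[str(lib) for lib in libs],
        "--min_score", str(min_score),
        "--min_matched_peak", str(min_matched_peak),
    ]


def annotate_folder(folder, libs):
    """Run `ms1_id annotate` once per *.mgf file found in folder."""
    for mgf_path in sorted(Path(folder).glob("*.mgf")):
        # check=True stops the loop if one annotation run fails
        subprocess.run(build_annotate_cmd(mgf_path, libs), check=True)
```

For example, `annotate_folder("spectra", ["data/gnps.pkl", "data/gnps_k10.pkl"])` would annotate every mgf file in a folder named spectra (a placeholder name).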

For more options, run:

ms1_id annotate --help

Annotate LC-MS data

To annotate LC-MS data, here is an example command:

ms1_id lcms --project_dir lc_ms --sample_dir data --ms1_id_libs data/gnps.pkl data/gnps_k10.pkl --ms2_id_libs data/gnps.pkl

Here, lc_ms is the project directory. Raw mzML or mzXML files are stored in the lc_ms/data folder. Both MS1 and MS/MS annotations will be performed, and the results can be accessed from aligned_feature_table.tsv.
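
To get a first look at the result table, a sketch like the following may help. It assumes only that aligned_feature_table.tsv is tab-separated with a header row; the column names are not listed in this README, so the code inspects them instead of hard-coding any. `read_tsv` and `count_filled` are hypothetical helper names, and the path to the table inside the project directory must be supplied by the user.

```python
import csv
from collections import Counter


def read_tsv(path):
    """Load a tab-separated table as a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_filled(rows):
    """Count, per column, how many rows hold a non-empty value."""
    filled = Counter()
    for row in rows:
        for column, value in row.items():
            if value not in (None, ""):
                filled[column] += 1
    return dict(filled)
```

Comparing the per-column counts with the total number of rows gives a quick impression of how many features received an annotation, once the relevant column has been identified in the header.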

For more options, run:

ms1_id lcms --help

Expected runtime is <3 min for a single LC-MS file. If it takes longer than 10 min, please increase the --mass_detect_int_tol parameter (default: 2e5 for Orbitraps, 5e2 for QTOFs).
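
The runtime advice above can be automated roughly as follows. This is a sketch, not part of ms1_id: `lcms_cmd` and `run_with_fallback` are hypothetical names, the 600 s timeout mirrors the 10 min rule of thumb for a single file, and the raised tolerance is a value the user has to choose for their instrument.

```python
import subprocess


def lcms_cmd(project_dir, sample_dir, ms1_libs, ms2_libs, int_tol=None):
    """Return the `ms1_id lcms` argument list, optionally with a custom tolerance."""
    cmd = [
        "ms1_id", "lcms",
        "--project_dir", project_dir,
        "--sample_dir", sample_dir,
        "--ms1_id_libs", *ms1_libs,
        "--ms2_id_libs", *ms2_libs,
    ]
    if int_tol is not None:
        cmd += ["--mass_detect_int_tol", str(int_tol)]
    return cmd


def run_with_fallback(base_args, raised_tol, timeout_s=600, runner=subprocess.run):
    """Run with the default tolerance; on timeout, rerun with raised_tol."""
    try:
        runner(lcms_cmd(**base_args), check=True, timeout=timeout_s)
        return "default"
    except subprocess.TimeoutExpired:
        # The first run was killed after timeout_s seconds; retry with a
        # higher intensity tolerance and no time limit.
        runner(lcms_cmd(**base_args, int_tol=raised_tol), check=True)
        return "raised"
```

The timeout should be scaled up when the sample directory holds many files, since the 3 min and 10 min figures refer to a single LC-MS file.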

Annotate MS imaging data

To annotate MS imaging data, here is an example command:

ms1_id msi --project_dir msi --libs data/gnps.pkl data/gnps_k10.pkl --n_cores 12

Here, msi is the project directory. Raw imzML and ibd files are stored in the msi folder, and 12 cores will be used for parallel processing. Annotation results can be accessed from ms1_id_annotations_derep.tsv.
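
The value 12 suits the machine used in the example; on other hardware the core count can be derived at run time. The sketch below is one way to do that, assuming it is acceptable to leave one core free. `pick_n_cores` and `build_msi_cmd` are hypothetical names, and only the flags from the command above are used.

```python
import os


def pick_n_cores(reserve=1):
    """Return the number of CPU cores to use, leaving `reserve` cores free."""
    total = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, total - reserve)


def build_msi_cmd(project_dir, libs, n_cores=None):
    """Return the `ms1_id msi` argument list."""
    if n_cores is None:
        n_cores = pick_n_cores()
    return [
        "ms1_id", "msi",
        "--project_dir", str(project_dir),
        "--libs", *[str(lib) for lib in libs],
        "--n_cores", str(n_cores),
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` to start the annotation.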

For more options, run:

ms1_id msi --help

Expected runtime is <5 min for a single MS imaging dataset.

Build indexed MS/MS libraries

To build your own indexed library, run:

ms1_id index --ms2db library.msp --peak_scale_k 10 --peak_intensity_power 0.5

For more options, run:

ms1_id index --help

Citation

Shipei Xing, Vincent Charron-Lamoureux, Yasin El Abiead, Huaxu Yu, Oliver Fiehn, Theodore Alexandrov, Pieter C. Dorrestein. Annotating full-scan MS data using tandem MS libraries. bioRxiv 2024.

Data

License

This project is licensed under the Apache 2.0 License (Copyright 2024 Shipei Xing).
