
A package for preprocessing mzML files for GNPS molecular networking

mzml2gnps

mzml2gnps is a Python script that preprocesses mzML files for GNPS (Global Natural Products Social Molecular Networking) molecular networking. It loads mzML experiments, corrects precursor m/z values, merges spectra by retention time, precursor m/z, and MS2 spectral similarity, filters MS2 spectra on several criteria, and exports the processed spectra to a new mzML file.

Features

  • Load mzML Experiment: Load an mzML file and create an MSExperiment object.
  • Correct Precursors: Correct the precursor m/z values to the highest intensity MS1 peak within a specified tolerance.
  • Merge Spectra: Merge spectra based on retention time, precursor m/z, and MS2 spectral similarity.
  • Filter MS2 Spectra: Filter MS2 spectra based on precursor m/z, retention time, and precursor intensity threshold.
  • Export to mzML: Export the filtered MS2 spectra to a new mzML file.
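
The precursor correction step described above can be illustrated with a minimal, dependency-free sketch. It is not the package's implementation: the function name `correct_precursor_mz`, the `(m/z, intensity)` tuple representation of MS1 peaks, and the interpretation of the tolerance as ppm are all assumptions made for illustration (the README does not state the tolerance unit).

```python
from typing import Iterable, Tuple


def correct_precursor_mz(
    precursor_mz: float,
    ms1_peaks: Iterable[Tuple[float, float]],
    tolerance_ppm: float = 20.0,
) -> float:
    """Return the m/z of the most intense MS1 peak within the tolerance window.

    Falls back to the original precursor m/z when no MS1 peak is in range.
    """
    # Assumption: the tolerance is given in ppm; convert it to an absolute window.
    window = precursor_mz * tolerance_ppm / 1e6
    candidates = [
        (mz, intensity)
        for mz, intensity in ms1_peaks
        if abs(mz - precursor_mz) <= window
    ]
    if not candidates:
        return precursor_mz
    # Pick the candidate with the highest intensity.
    best_mz, _ = max(candidates, key=lambda peak: peak[1])
    return best_mz


# Hypothetical MS1 peaks as (m/z, intensity) pairs.
ms1 = [(500.0000, 1.0e4), (500.0080, 8.0e5), (500.0500, 9.0e6)]
corrected = correct_precursor_mz(500.0050, ms1)
print(corrected)
```

In this example the peak at 500.0500 is the most intense overall but lies outside the 20 ppm window, so the sketch selects 500.0080 instead.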

Requirements

  • Python 3.10.13
  • pandas 2.0.3
  • pyopenms 3.1.0-pre-HEAD-2023-10-13

Installation

To install mzml2gnps, ensure you have Python 3.10 or later installed. Then, follow these steps:

  1. Install the package directly from PyPI:
pip install mzml2gnps
  2. Alternatively, to install from source, first clone the repository:
git clone https://github.com/yourusername/mzml2gnps.git
cd mzml2gnps
  3. Then, from the repository root, run:
pip install .

Either method automatically installs the dependencies listed in pyproject.toml.

Usage

The script can be executed from the command line with various arguments to specify the input and output paths, filtering criteria, and whether to perform precursor correction and spectra merging.

Command Line Arguments

  • --file_path: Input mzML file path or folder path containing mzML files (required).
  • --output_path: Output folder path (required).
  • --precmz: List of precursor m/z values (optional).
  • --rt: List of retention times (optional).
  • --precmz_tolerance: Precursor m/z tolerance (default: 20).
  • --rt_tolerance: Retention time tolerance (default: 0.5).
  • --precinty_thre: Precursor intensity threshold (default: 0).
  • --csv: CSV file path including precmz and rt columns (optional).
  • --correct: Flag to enable precursor correction (default: False).
  • --merge: Flag to enable spectra merging (default: False).
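
To make the argument list concrete, here is a hedged sketch of how such a command line could be declared with `argparse`. This is a hypothetical reconstruction, not the package's actual parser: the function name `build_parser`, the use of `nargs="*"` for the list arguments, the `float` types, and the `store_true` flags are assumptions. Only the flag names and defaults come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="mzml2gnps")
    parser.add_argument("--file_path", required=True,
                        help="Input mzML file or folder containing mzML files")
    parser.add_argument("--output_path", required=True,
                        help="Output folder")
    # Assumption: list values are passed space-separated after the flag.
    parser.add_argument("--precmz", type=float, nargs="*", default=None,
                        help="Precursor m/z values")
    parser.add_argument("--rt", type=float, nargs="*", default=None,
                        help="Retention times")
    parser.add_argument("--precmz_tolerance", type=float, default=20,
                        help="Precursor m/z tolerance")
    parser.add_argument("--rt_tolerance", type=float, default=0.5,
                        help="Retention time tolerance")
    parser.add_argument("--precinty_thre", type=float, default=0,
                        help="Precursor intensity threshold")
    parser.add_argument("--csv", default=None,
                        help="CSV file with precmz and rt columns")
    # Assumption: both switches are plain boolean flags that default to False.
    parser.add_argument("--correct", action="store_true",
                        help="Enable precursor correction")
    parser.add_argument("--merge", action="store_true",
                        help="Enable spectra merging")
    return parser


args = build_parser().parse_args(
    ["--file_path", "input.mzML", "--output_path", "/path/to/output",
     "--correct", "--merge"]
)
print(args.correct, args.merge, args.precmz_tolerance, args.rt_tolerance)
```

Under these assumptions, omitting `--correct` or `--merge` leaves the corresponding step disabled, which matches the documented defaults.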

Example

mzml2gnps --file_path input.mzML --output_path /path/to/output --correct --merge

This command processes the input.mzML file, corrects precursors, merges spectra, and saves the processed spectra to the specified output path.
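
The example above applies no target filter. When targets are supplied through --precmz and --rt or through --csv, the filtering described under Features could work roughly as in the sketch below. It is a simplified illustration, not the package's code: the `Ms2Spectrum` class, the `read_targets` and `matches` helpers, the ppm interpretation of the m/z tolerance, and the rule that a spectrum must match one target on both m/z and retention time are all assumptions. Retention times are assumed to use the same unit in the CSV and in the spectra.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Ms2Spectrum:
    """Hypothetical stand-in for an MS2 spectrum's precursor metadata."""
    precursor_mz: float
    rt: float
    precursor_intensity: float


def read_targets(csv_text: str) -> List[Tuple[float, float]]:
    """Parse (precmz, rt) pairs from CSV text with 'precmz' and 'rt' columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row["precmz"]), float(row["rt"])) for row in reader]


def matches(
    spectrum: Ms2Spectrum,
    targets: List[Tuple[float, float]],
    mz_tol_ppm: float = 20.0,
    rt_tol: float = 0.5,
    min_intensity: float = 0.0,
) -> bool:
    """Return True if the spectrum passes the intensity threshold and hits a target."""
    if spectrum.precursor_intensity < min_intensity:
        return False
    for target_mz, target_rt in targets:
        # Assumption: the m/z tolerance is in ppm relative to the target m/z.
        mz_ok = abs(spectrum.precursor_mz - target_mz) <= target_mz * mz_tol_ppm / 1e6
        rt_ok = abs(spectrum.rt - target_rt) <= rt_tol
        if mz_ok and rt_ok:
            return True
    return False


targets = read_targets("precmz,rt\n301.1410,5.20\n455.2900,8.75\n")
spectra = [
    Ms2Spectrum(301.1412, 5.35, 2.0e5),  # within both tolerances of target 1
    Ms2Spectrum(301.1412, 7.00, 2.0e5),  # m/z matches, retention time does not
    Ms2Spectrum(455.3500, 8.70, 1.0e6),  # retention time matches, m/z does not
]
kept = [s for s in spectra if matches(s, targets)]
print(len(kept))
```

Only the first spectrum survives, since it is the only one that agrees with a single target on both precursor m/z and retention time.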

Development

This script was developed using Python and relies on the pyopenms library for handling mzML files and MS data processing.

Contributing

If you have suggestions for how mzml2gnps could be improved, or want to report a bug, open an issue! We welcome any and all contributions.

For more, check out the Contributing Guide.

License

MIT © Yunying Xie
