A package for preprocessing mzML files for GNPS molecular networking
mzml2gnps
mzml2gnps is a Python script for preprocessing mzML files for GNPS (Global Natural Products Social Molecular Networking) molecular networking. It loads mzML experiments, corrects precursor m/z values, merges spectra by retention time, precursor m/z, and MS2 spectral similarity, filters MS2 spectra on several criteria, and exports the processed spectra to a new mzML file.
Features
- Load mzML Experiment: Load an mzML file and create an MSExperiment object.
- Correct Precursors: Correct the precursor m/z values to the highest intensity MS1 peak within a specified tolerance.
- Merge Spectra: Merge spectra based on retention time, precursor m/z, and MS2 spectral similarity.
- Filter MS2 Spectra: Filter MS2 spectra based on precursor m/z, retention time, and precursor intensity threshold.
- Export to mzML: Export the filtered MS2 spectra to a new mzML file.
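The correction and filtering steps listed above can be sketched in plain Python. This is a minimal illustration, not the package's own code: the `MS2Spectrum` record, the function names, the ppm interpretation of the m/z tolerance, and the inclusive intensity threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MS2Spectrum:
    """Hypothetical minimal record for one MS2 scan (not a pyopenms type)."""
    precursor_mz: float
    rt: float
    precursor_intensity: float


def ppm_window(mz: float, tol_ppm: float) -> float:
    """Half-width, in m/z units, of a ppm tolerance window centred on mz."""
    return mz * tol_ppm / 1e6


def correct_precursor(precursor_mz, ms1_peaks, tol_ppm=20.0):
    """Return the m/z of the most intense MS1 peak within the tolerance.

    ms1_peaks is a list of (mz, intensity) tuples. If no peak falls inside
    the window, the original precursor m/z is returned unchanged.
    """
    half_width = ppm_window(precursor_mz, tol_ppm)
    candidates = [
        (mz, intensity)
        for mz, intensity in ms1_peaks
        if abs(mz - precursor_mz) <= half_width
    ]
    if not candidates:
        return precursor_mz
    return max(candidates, key=lambda peak: peak[1])[0]


def filter_ms2(spectra, target_mz, target_rt,
               mz_tol_ppm=20.0, rt_tol=0.5, min_intensity=0.0):
    """Keep spectra that match a target m/z and retention time.

    Assumes rt and rt_tol share the same unit and that the intensity
    threshold is inclusive.
    """
    kept = []
    for spectrum in spectra:
        mz_ok = abs(spectrum.precursor_mz - target_mz) <= ppm_window(target_mz, mz_tol_ppm)
        rt_ok = abs(spectrum.rt - target_rt) <= rt_tol
        if mz_ok and rt_ok and spectrum.precursor_intensity >= min_intensity:
            kept.append(spectrum)
    return kept


# Illustrative values only.
ms1_peaks = [(500.2480, 1.0e4), (500.2510, 8.0e5), (500.3000, 9.0e6)]
corrected = correct_precursor(500.2500, ms1_peaks)

spectra = [
    MS2Spectrum(500.2510, 12.30, 5.0e4),
    MS2Spectrum(500.2510, 14.00, 5.0e4),
    MS2Spectrum(612.1000, 12.35, 5.0e4),
]
kept = filter_ms2(spectra, target_mz=500.2510, target_rt=12.4)
```

In this example the peak at 500.3000 is the most intense overall but lies outside the 20 ppm window, so the correction picks 500.2510 instead.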
Requirements
- Python 3.10.13
- pandas 2.0.3
- pyopenms 3.1.0-pre-HEAD-2023-10-13
Installation
To install mzml2gnps, ensure you have Python 3.10 or later installed. Then, follow these steps:
- Install the package directly from PyPI:
pip install mzml2gnps
- If you want to install from the source, first clone the repository:
git clone https://github.com/yourusername/mzml2gnps.git
cd mzml2gnps
- Then, use the following command to install:
pip install .
This will automatically handle the dependencies listed in pyproject.toml.
Usage
The script can be executed from the command line with various arguments to specify the input and output paths, filtering criteria, and whether to perform precursor correction and spectra merging.
Command Line Arguments
- --file_path: Input mzML file path or folder path containing mzML files (required).
- --output_path: Output folder path (required).
- --precmz: List of precursor m/z values (optional).
- --rt: List of retention times (optional).
- --precmz_tolerance: Precursor m/z tolerance (default: 20).
- --rt_tolerance: Retention time tolerance (default: 0.5).
- --precinty_thre: Precursor intensity threshold (default: 0).
- --csv: CSV file path including precmz and rt columns (optional).
- --correct: Flag to enable precursor correction (default: False).
- --merge: Flag to enable spectra merging (default: False).
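For reference, the documented arguments could be declared with `argparse` roughly as follows. This is a sketch that mirrors the names and defaults listed above. The argument types, the use of `nargs="+"` for the list options, and the `build_parser` helper are assumptions, and the package's actual parser may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented mzml2gnps options."""
    parser = argparse.ArgumentParser(prog="mzml2gnps")
    parser.add_argument("--file_path", required=True,
                        help="Input mzML file or folder containing mzML files")
    parser.add_argument("--output_path", required=True,
                        help="Output folder")
    # Assumption: list options take space-separated floats.
    parser.add_argument("--precmz", type=float, nargs="+", default=None,
                        help="Precursor m/z values")
    parser.add_argument("--rt", type=float, nargs="+", default=None,
                        help="Retention times")
    parser.add_argument("--precmz_tolerance", type=float, default=20.0,
                        help="Precursor m/z tolerance")
    parser.add_argument("--rt_tolerance", type=float, default=0.5,
                        help="Retention time tolerance")
    parser.add_argument("--precinty_thre", type=float, default=0.0,
                        help="Precursor intensity threshold")
    parser.add_argument("--csv", default=None,
                        help="CSV file with precmz and rt columns")
    # Flags default to False and become True when present.
    parser.add_argument("--correct", action="store_true",
                        help="Enable precursor correction")
    parser.add_argument("--merge", action="store_true",
                        help="Enable spectra merging")
    return parser


args = build_parser().parse_args(
    ["--file_path", "input.mzML", "--output_path", "/path/to/output",
     "--correct", "--merge"]
)
```

Parsing this argument list leaves the tolerances and threshold at their documented defaults and sets both flags.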
Example
mzml2gnps --file_path input.mzML --output_path /path/to/output --correct --merge
This command processes the input.mzML file, corrects precursors, merges spectra, and saves the processed spectra to the specified output path.
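When targets are supplied through --csv, the file needs precmz and rt columns. The sketch below writes such a file and assembles a matching command line without running it. The target values, the temporary file location, and the exact CSV layout beyond the two column names are assumptions for illustration.

```python
import csv
import shlex
import tempfile
from pathlib import Path

# Illustrative targets: precursor m/z and retention time pairs.
targets = [
    {"precmz": 500.2510, "rt": 12.4},
    {"precmz": 612.1000, "rt": 8.9},
]

# Write the targets to a CSV with the two documented column names.
csv_path = Path(tempfile.mkdtemp()) / "targets.csv"
with csv_path.open("w", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=["precmz", "rt"])
    writer.writeheader()
    writer.writerows(targets)

# Assemble the command using only the documented options.
command = [
    "mzml2gnps",
    "--file_path", "input.mzML",
    "--output_path", "/path/to/output",
    "--csv", str(csv_path),
    "--precmz_tolerance", "20",
    "--rt_tolerance", "0.5",
]

# Print the shell-quoted command; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```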
Development
This script was developed using Python and relies on the pyopenms library for handling mzML files and MS data processing.
Contributing
If you have suggestions for how mzml2gnps could be improved, or want to report a bug, open an issue! We welcome any and all contributions.
For more, check out the Contributing Guide.
License
MIT © Yunying Xie
File details
Details for the file mzml2gnps-1.0.3.tar.gz.
File metadata
- Download URL: mzml2gnps-1.0.3.tar.gz
- Upload date:
- Size: 6.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 76912b695215ec65897eb3aed7a44a32c389bad535cb6c14d74383c60735fb3d |
| MD5 | 41aa0e986295cd89d77db5b0f7e224e8 |
| BLAKE2b-256 | 636a93c24cd2518ca223a29426fa10a7cbcd219c90ec61f9ebedaf43ca6df806 |
File details
Details for the file mzml2gnps-1.0.3-py3-none-any.whl.
File metadata
- Download URL: mzml2gnps-1.0.3-py3-none-any.whl
- Upload date:
- Size: 7.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b1c6f65eed903ae79edea098bcefa54bffc79198e500df6fa397f428c360df5d |
| MD5 | d2fc505ddd15168e8127fc0f028b757c |
| BLAKE2b-256 | 6c75b99efeafeaf5773564581cc7a16eed71752d720118f551dea246f56d24de |