3DMolMS: prediction of tandem mass spectra from 3D molecular conformations

3DMolMS

CC BY-NC-SA 4.0 (free for academic use)

3D Molecular Network for Mass Spectra Prediction (3DMolMS) is a deep neural network model to predict the MS/MS spectra of compounds from their 3D conformations. This model's molecular representation, learned through MS/MS prediction tasks, can be further applied to enhance performance in other molecular-related tasks, such as predicting retention times and collision cross sections.

Read our paper in Bioinformatics | Try our online service at GNPS | Install from PyPI

Installation

3DMolMS is available on PyPI. You can install the latest version using pip:

pip install molnetpack

# PyTorch must be installed separately. 
# For CUDA 11.6, install PyTorch with the following command:
pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu116

# For CUDA 11.7, use:
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117

# For CPU-only usage, use:
pip install torch==1.13.0+cpu torchvision==0.14.0+cpu torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cpu
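
After installation, it can help to confirm that PyTorch is importable and whether a GPU is visible before creating a MolNet object. The following is a minimal sketch; the helper name `pick_device_name` is hypothetical and is not part of molnetpack.

```python
import importlib.util


def pick_device_name(gpu_index=0):
    """Return a device string for torch.device, or None if PyTorch is missing.

    Hypothetical helper for illustration; not part of molnetpack.
    """
    if importlib.util.find_spec("torch") is None:
        return None  # PyTorch is not installed in this environment
    import torch

    if torch.cuda.is_available():
        return f"cuda:{gpu_index}"
    return "cpu"


if __name__ == "__main__":
    name = pick_device_name()
    if name is None:
        print("PyTorch not found; install it with one of the commands above.")
    else:
        print("Selected device:", name)
```

The returned string can be passed to `torch.device(...)` in the usage examples that follow.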

3DMolMS can also be installed from source:

git clone https://github.com/JosieHong/3DMolMS.git
cd 3DMolMS

pip install .

Usage

To get started quickly, you can load a CSV or MGF file to predict MS/MS and then plot the predicted results.

import torch
from molnetpack import MolNet

# Set the device to CPU for CPU-only usage:
device = torch.device("cpu")

# For GPU usage, set the device as follows (replace '0' with your desired GPU index):
# gpu_index = 0
# device = torch.device(f"cuda:{gpu_index}")

# Instantiate a MolNet object
molnet_engine = MolNet(device, seed=42) # The random seed can be any integer. 

# Load input data (here we use a CSV file as an example)
molnet_engine.load_data(path_to_test_data='./test/input_msms.csv') # Increase the batch size to speed up prediction.
# molnet_engine.load_data(path_to_test_data='./test/input_msms.mgf') # MGF file is also supported
# molnet_engine.load_data(path_to_test_data='./test/input_msms.pkl') # PKL file is faster. 

# Predict MS/MS (instrument can be 'qtof' or 'orbitrap')
spectra = molnet_engine.pred_msms(path_to_results='./test/output_qtof_msms.mgf', instrument='qtof')
# You can also download a checkpoint from the releases and pass it via 'path_to_checkpoint':
# spectra = molnet_engine.pred_msms(path_to_results='./test/output_msms.mgf', path_to_checkpoint='<path to the checkpoint>')

# Plot the predicted MS/MS with 3D molecular conformation
molnet_engine.plot_msms(dir_to_img='./img/', instrument='qtof')
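
The predictions are written to an MGF file, which is a plain-text format that can be inspected without extra dependencies. Below is a minimal sketch of a generic MGF reader; the function name `read_mgf` is hypothetical, and the sample spectrum (field names and peak values) is made up for illustration rather than taken from actual 3DMolMS output.

```python
def read_mgf(lines):
    """Parse MGF text lines into a list of spectra.

    Each spectrum is a dict with 'params' (header fields) and
    'peaks' (list of (m/z, intensity) tuples).
    """
    spectra = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Header line of the form KEY=VALUE
                key, value = line.split("=", 1)
                current["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z and intensity separated by whitespace
                fields = line.split()
                if len(fields) >= 2:
                    current["peaks"].append((float(fields[0]), float(fields[1])))
    return spectra


# Illustrative sample only; not real model output.
sample = """BEGIN IONS
TITLE=demo_0
PRECURSOR_TYPE=[M+H]+
COLLISION_ENERGY=20
100.0757 35.2
118.0863 100.0
END IONS
""".splitlines()

spectra = read_mgf(sample)
print(spectra[0]["params"]["TITLE"], len(spectra[0]["peaks"]))
```

To read a file written by `pred_msms`, open it and pass the file object to `read_mgf`, since iterating over a file yields its lines.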

For CCS prediction, use the following code after instantiating a MolNet object.

# Load input data
molnet_engine.load_data(path_to_test_data='./test/input_ccs.csv')

# Pred CCS
ccs_df = molnet_engine.pred_ccs(path_to_results='./test/output_ccs.csv')
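
As described in the input-format notes, input used only for CCS prediction still needs a value in the Collision_Energy field, and any number is accepted. The sketch below fills in a placeholder using only the standard library. The function name `add_placeholder_energy` and the column names other than Collision_Energy (`ID`, `SMILES`, `Precursor_Type`) are assumptions for illustration, so check them against the demo input file.

```python
import csv
import io


def add_placeholder_energy(csv_text, value=20):
    """Return CSV text in which every row has a Collision_Energy value.

    Rows that already carry a value are left unchanged.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if "Collision_Energy" not in fieldnames:
        fieldnames.append("Collision_Energy")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if not row.get("Collision_Energy"):
            row["Collision_Energy"] = value
        writer.writerow(row)
    return out.getvalue()


# Column names other than Collision_Energy are assumed for this demo.
demo = "ID,SMILES,Precursor_Type\nmol_0,CCO,[M+H]+\n"
print(add_placeholder_energy(demo))
```

The placeholder value has no chemical meaning here; it only satisfies the expected input layout.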

For RT prediction, use the following code after instantiating a MolNet object. Because this model is trained on the METLIN-SMRT dataset, the predicted retention times apply to the same experimental conditions as that dataset.

# Load input data
molnet_engine.load_data(path_to_test_data='./test/input_rt.csv')

# Pred RT
rt_df = molnet_engine.pred_rt(path_to_results='./test/output_rt.csv')

To save the molecular embeddings, use the following code after instantiating a MolNet object.

# Load input data
molnet_engine.load_data(path_to_test_data='./test/input_savefeat.csv')

# Run inference to get the features
# (assumes save_features() returns the spectrum titles together with the feature array)
ids, features = molnet_engine.save_features()

print('Titles:', ids)
print('Features shape:', features.shape)

Sample input files in CSV and MGF format are located at ./test/demo_input.csv and ./test/demo_input.mgf, respectively. If the input is used only for CCS prediction, you may assign an arbitrary numerical value to the Collision_Energy field in the CSV file or to COLLISION_ENERGY in the MGF file. During data loading, entries that are not supported are automatically excluded. The table below lists the supported input:

| Item | Supported input |
|---|---|
| Atom number | <=300 |
| Atom types | 'C', 'O', 'N', 'H', 'P', 'S', 'F', 'Cl', 'B', 'Br', 'I', 'Na' |
| Precursor types | '[M+H]+', '[M-H]-', '[M+H-H2O]+', '[M+Na]+', '[M+2H]2+' |
| Collision energy | any number |
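
Because unsupported entries are dropped silently during loading, it can be useful to check inputs against the table above beforehand. The following is a minimal sketch that mirrors the table; it is not the library's own filtering code. The function name `unsupported_reasons` is hypothetical, and whether the atom limit counts hydrogens is an assumption here.

```python
SUPPORTED_ATOMS = {"C", "O", "N", "H", "P", "S", "F", "Cl", "B", "Br", "I", "Na"}
SUPPORTED_PRECURSORS = {"[M+H]+", "[M-H]-", "[M+H-H2O]+", "[M+Na]+", "[M+2H]2+"}
MAX_ATOMS = 300


def unsupported_reasons(atom_symbols, precursor_type, collision_energy):
    """Return a list of reasons why an entry falls outside the supported input.

    atom_symbols: element symbols of all atoms (hydrogens included, by
    assumption), e.g. as obtained from a cheminformatics toolkit.
    """
    reasons = []
    if len(atom_symbols) > MAX_ATOMS:
        reasons.append(f"too many atoms ({len(atom_symbols)} > {MAX_ATOMS})")
    unknown = sorted(set(atom_symbols) - SUPPORTED_ATOMS)
    if unknown:
        reasons.append("unsupported atom types: " + ", ".join(unknown))
    if precursor_type not in SUPPORTED_PRECURSORS:
        reasons.append(f"unsupported precursor type: {precursor_type}")
    try:
        float(collision_energy)
    except (TypeError, ValueError):
        reasons.append(f"collision energy is not a number: {collision_energy!r}")
    return reasons


# Ethanol with an [M+H]+ precursor passes every check.
ethanol = ["C", "C", "O"] + ["H"] * 6
print(unsupported_reasons(ethanol, "[M+H]+", 20))

# A silicon-containing entry with an unlisted adduct fails three checks.
for reason in unsupported_reasons(["C", "Si"], "[M+K]+", "high"):
    print(reason)
```

An empty list means the entry satisfies every row of the table under the stated assumptions.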

Documentation for running MS/MS prediction from source is in MSMS_PRED.md.

Citation

If you use 3DMolMS in your research, please cite our papers:

@article{hong20233dmolms,
  title={3DMolMS: prediction of tandem mass spectra from 3D molecular conformations},
  author={Hong, Yuhui and Li, Sujun and Welch, Christopher J and Tichy, Shane and Ye, Yuzhen and Tang, Haixu},
  journal={Bioinformatics},
  volume={39},
  number={6},
  pages={btad354},
  year={2023},
  publisher={Oxford University Press}
}
@article{hong2024enhanced,
  title={Enhanced structure-based prediction of chiral stationary phases for chromatographic enantioseparation from 3D molecular conformations},
  author={Hong, Yuhui and Welch, Christopher J and Piras, Patrick and Tang, Haixu},
  journal={Analytical Chemistry},
  volume={96},
  number={6},
  pages={2351--2359},
  year={2024},
  publisher={ACS Publications}
}

Thank you for considering 3DMolMS for your research needs!

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
