MutAIverse
MutAIverse facilitates the identification of DNA adducts from untargeted metabolomics mass spectrometry data, and predicts the potential source genotoxins responsible for forming novel or previously known adducts.
The single strong dependency for this resource is RDKit, which can be installed in a local Conda environment.
Other dependencies
- matchms==0.13.0
- hnswlib==0.8.0
- gensim==4.3.3
- pandas==1.5.3
- numpy==1.23.0
- matplotlib==3.7.1
- tqdm==4.65.0
- seaborn==0.12.2
- rdkit==2023.3.1
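Because every dependency above is pinned to an exact version, it can help to check an environment before running the package. The following is a minimal sketch using only the standard library; the `check_pins` helper is hypothetical and not part of MutAIverse. It assumes each package registers distribution metadata under the name listed above, which may not hold for a Conda-installed RDKit, and Conda may also report the RDKit version in a different format (for example with a zero-padded month).

```python
from importlib import metadata

# Pins copied from the dependency list above.
PINNED = {
    "matchms": "0.13.0",
    "hnswlib": "0.8.0",
    "gensim": "4.3.3",
    "pandas": "1.5.3",
    "numpy": "1.23.0",
    "matplotlib": "3.7.1",
    "tqdm": "4.65.0",
    "seaborn": "0.12.2",
    "rdkit": "2023.3.1",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not met.

    `found` is None when no distribution metadata exists for the package.
    `check_pins` is a hypothetical helper, not a MutAIverse function.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pin is satisfied; anything printed is worth resolving before the one-time library setup.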
Adduct Mapper module
MutAIverse provides two approaches for mapping query MS spectra against an in silico MS/MS spectral library of either experimentally validated adducts or the synthetic DNA adducts of MutAIverse.
MutAIverse Library setup
This is a one-time task that must be completed after installing the package for the first time, before the Mapper module can be used.
from MutAIverse import Mapper
Mapper.load_library()
The function downloads the library data (1.7 GB), which the Mapper module uses in all later calls.
Brute force Approach
Cosine Similarity-based mapping
from MutAIverse import Mapper
Mapper.map('bonafide_adducts', sample_file_path='/path-to-mzML-file', MS_level=1, plot=True)
Additional arguments
Parameters:
- library (str): Library to search, either bonafide_adducts or MutAIversee
- sample_file_path (str): Path to the mzML file containing mass spectrometry data.
- MS_level (int): 1 (MS spectrum) or 2 (MS/MS spectrum)
- plot (bool; default True): Whether to produce visualizations
Returns:
- Result CSV file with suffix _MutAIversee_results.csv or _bonafide_adducts_results.csv
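To make the similarity measure concrete, here is a minimal sketch of cosine similarity between two spectra given as lists of (m/z, intensity) peaks. It is not MutAIverse's implementation: the fixed-width binning, the `bin_width` parameter, and both function names are assumptions made for illustration, and the peak values in the usage lines are made up.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins: {bin_index: intensity}."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine of the angle between two binned spectra (0.0 if either is empty)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative peaks only, not real measurements.
query = [(100.2, 50.0), (150.7, 100.0)]
reference = [(100.4, 50.0), (150.1, 100.0), (200.0, 10.0)]
print(cosine_similarity(query, reference))
```

A brute-force mapping repeats this comparison for every query spectrum against every library spectrum, which is why its cost grows with the size of the library.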
Quick Search Approach
Approximate Nearest Neighbour-based mapping, which runs in two steps:
- Generation of spectral embeddings from query MS spectra
- Mapping using the HNSW index of the spectral embeddings
from MutAIverse import Mapper
Mapper.fast_map(mzml_file_path)
Additional arguments
Parameters:
- mzml_file_path (str): Path to the mzML file containing mass spectrometry data.
- level (int; default 2): 1 (MS spectrum) or 2 (MS/MS spectrum)
- k (int; default 1): Number of nearest neighbors to search for.
- ef_query (int; default 300): Parameter controlling the number of elements to visit during a query.
- Energy (int; default 0): Collision energy
Returns:
- pandas.DataFrame: DataFrame containing search results with columns ['Query_Index', 'Nearest_Neighbor_Index', 'Cosine Similarity', 'SMILES', 'COMPID', 'Structures'].
- Visualizations (density plot and histograms)
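The `k` parameter and the 'Cosine Similarity' column are easier to interpret against the exact search that an approximate index stands in for. The sketch below is plain Python, not hnswlib and not MutAIverse code; `exact_knn` and the toy vectors are hypothetical. An HNSW index aims to return the same neighbours while visiting only part of the library, and in hnswlib a larger query-time `ef` generally improves recall at the cost of speed. That `ef_query` maps onto this setting is an assumption.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def exact_knn(query, library, k=1):
    """Return the k most similar library entries as (index, similarity), best first.

    Every library vector is scored, which is the cost an approximate
    index avoids.
    """
    scored = ((index, cosine(query, vector)) for index, vector in enumerate(library))
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy two-dimensional "embeddings"; real spectral embeddings are much longer.
library = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for index, similarity in exact_knn([1.0, 0.1], library, k=2):
    print(index, round(similarity, 3))
```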
Adduct Linker module
MutAIverse can also trace a DNA adduct back to its possible source genotoxin.
Fragment-based linking
Biotransformation backtracking based on abnormalities spliced from the base nucleotides.
from MutAIverse import Linker
query_smiles = 'OC[C@H]1O[C@H](CC1O)n1c[n+](c2c1nc(N)[nH]c2=O)C1OC2C(C1O)c1c(O2)cc(c2c1oc(=O)c1c2CCC1=O)OC'
Linker.backtrace(Adduct=query_smiles)
Additional arguments
Parameters:
- Adduct (str): SMILES string of the query DNA adduct.
- knn (int; default 20): Number of nearest neighbors to narrow down the search space.
- tophit (int; default 5): Minimum number of Genotoxins to be linked.
- plot (bool; default False): Whether to draw the 2D structures of the traced SMILES in rows
- cutoff (int; default 80): Link probability (%) cutoff
Returns:
- pandas.DataFrame: DataFrame containing search results with columns ['Query', 'Fragment', 'Metabolites', 'N-Transformation', 'Genotoxin', 'Probability'].
- Visualizations (2D structures of the traced SMILES, in rows)
The module also provides a helper function, plot_trace(), that only visualizes backtrace() output, filtered by a user-supplied probability threshold.
import pandas as pd
from MutAIverse import Linker
Linker.plot_trace(file='/Path-to-Output_file.csv')
Additional arguments
Parameters:
- file (str): Output CSV file (with path) of Linker.backtrace() function
- cutoff (int; default 80): Minimum probability threshold
Returns:
- Visualizations (2D structures of the traced SMILES, in rows)
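To inspect which links would pass a given threshold without drawing anything, the output CSV can be filtered directly. This is a minimal sketch, not part of MutAIverse: `filter_links` is a hypothetical helper, and it assumes the 'Probability' column holds plain numbers in percent and that the cutoff is inclusive, neither of which is confirmed here. The rows in the usage lines are placeholders, not real results.

```python
import csv
import io


def filter_links(handle, cutoff=80.0, column="Probability"):
    """Return the CSV rows whose probability is at or above `cutoff`.

    `handle` is any open text file containing a header row with `column`.
    Assumes probabilities are stored as numbers in percent.
    """
    reader = csv.DictReader(handle)
    return [row for row in reader if float(row[column]) >= cutoff]


# Placeholder rows using the column names listed for backtrace() output.
sample = io.StringIO(
    "Query,Fragment,Metabolites,N-Transformation,Genotoxin,Probability\n"
    "query_1,fragment_A,metabolite_A,1,genotoxin_A,92.5\n"
    "query_1,fragment_B,metabolite_B,2,genotoxin_B,41.0\n"
)
for row in filter_links(sample, cutoff=80):
    print(row["Genotoxin"], row["Probability"])
```

For a file on disk, pass `open('/Path-to-Output_file.csv', newline='')` as the handle instead of the in-memory sample.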
File details
Details for the file mutaiverse-0.2.4.tar.gz.
File metadata
- Download URL: mutaiverse-0.2.4.tar.gz
- Upload date:
- Size: 39.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 60d0d2be387043de340aeb7910b2f4f5fdb152a271dd7c4631c601ed93a1d252 |
| MD5 | bd9f545a059f375c13adc8649f059fd2 |
| BLAKE2b-256 | 9322b4dface0ada9f194198069584ad1e99d760fe01120593b5f56ccc040751e |
File details
Details for the file mutaiverse-0.2.4-py3-none-any.whl.
File metadata
- Download URL: mutaiverse-0.2.4-py3-none-any.whl
- Upload date:
- Size: 41.2 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.23
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7aa7255abac85699b6f3723f776ed6bc614ac5d408dad9373cb41c3780a34613 |
| MD5 | a3c95ccddd7055c33ff644addd5bab1f |
| BLAKE2b-256 | 70ff924afd8d5f2bc6fd5016bfcf3d658d5d74156af1110b4e88639d118c413d |