SMPrecursorPrediction

SMPrecursorPredictor

An ML pipeline for predicting the precursors (starting substances) of specialised metabolites.

Installation

Manually

  1. Clone the repository and move into the directory:
git clone
cd SMPrecursorPredictor
  2. Create a conda environment and activate it:
conda create -n sm_precursor_predictor python=3.10
conda activate sm_precursor_predictor
  3. Install the dependencies:
pip install -r requirements.txt
  4. Install the package:
pip install .

PyPI

  1. Create a conda environment and activate it:
conda create -n sm_precursor_predictor python=3.10
conda activate sm_precursor_predictor
  2. Install the package from PyPI:
pip install SMPrecursorPrediction
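
To confirm that either installation route worked, the installed version can be queried with the standard library. This is a minimal sketch: `installed_version` is a hypothetical helper, not part of the package, and it assumes the distribution is registered under the name used in the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("SMPrecursorPrediction"))
```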

Making predictions

Models available:

  • Layered FP + Low Variance FS + Ridge Classifier
  • Morgan FP + Ridge Classifier
from sm_precursor_predictor import predict_precursors

# SMILES strings of the compounds to predict precursors for
smiles = [
    "[H][C@]89CN(CCc1c([nH]c2ccccc12)[C@@](C(=O)OC)(c3cc4c(cc3OC)N(C)[C@@]5([H])[C@@]"
    "(O)(C(=O)OC)[C@H](OC(C)=O)[C@]7(CC)C=CCN6CC[C@]45[C@@]67[H])C8)C[C@](O)(CC)C9",
    "COC1=C(C=CC(=C1)C2=C(C(=O)C3=C(C=C(C=C3O2)O)O)O[C@H]4[C@@H]([C@H]([C@H]([C@H](O4)CO)O)O)O)O",
]
precursors = predict_precursors(smiles, model="Layered FP + Low Variance FS + Ridge Classifier")
print(precursors)
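
Because the model is selected by its full display name, a typo in that string is easy to make. The sketch below guards against it by checking the name against the two models listed above. `AVAILABLE_MODELS` and `check_model_name` are hypothetical helpers for illustration only; the package may perform its own validation.

```python
# Hypothetical helper, not part of sm_precursor_predictor.
# The names are copied from the "Models available" list.
AVAILABLE_MODELS = (
    "Layered FP + Low Variance FS + Ridge Classifier",
    "Morgan FP + Ridge Classifier",
)


def check_model_name(model: str) -> str:
    """Return the model name unchanged if it is one of the documented models."""
    if model not in AVAILABLE_MODELS:
        options = ", ".join(repr(name) for name in AVAILABLE_MODELS)
        raise ValueError(f"Unknown model {model!r}; expected one of: {options}")
    return model
```

The checked name can then be passed on, for example as `model=check_model_name("Morgan FP + Ridge Classifier")`.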

Alternatively, read a CSV file with a column of SMILES and a column of IDs, and save the predictions to a CSV file:

from sm_precursor_predictor import predict_from_csv
predictions = predict_from_csv("path_to_csv", 
                               smiles_field="SMILES", 
                               ids_field="ID",
                               model="Layered FP + Low Variance FS + Ridge Classifier")
predictions.to_csv("path_to_save_predictions.csv")
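
The input file needs the two columns named in the call above (`ID` and `SMILES`). A minimal sketch for writing such a file with the standard library follows. The helper `write_input_csv` and the IDs `cpd_1` and `cpd_2` are made up for illustration, and the exact layout the package expects beyond those two column names is an assumption.

```python
import csv

# Hypothetical example records as (ID, SMILES) pairs; the IDs are invented.
RECORDS = [
    ("cpd_1", "CC(=CCCC(C)(C=C)O)C"),  # linalool
    ("cpd_2", "COC1=C(C=CC(=C1)C2=C(C(=O)C3=C(C=C(C=C3O2)O)O)O[C@H]4[C@@H]([C@H]([C@H]([C@H](O4)CO)O)O)O)O"),
]


def write_input_csv(path, rows):
    """Write a header row (ID, SMILES) followed by one row per compound."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["ID", "SMILES"])
        writer.writerows(rows)
```

Calling `write_input_csv("compounds.csv", RECORDS)` produces a file whose path can be passed to `predict_from_csv` with `smiles_field="SMILES"` and `ids_field="ID"`.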

Making and explaining predictions

This is only possible with one model: Morgan FP + Ridge Classifier.

Example with linalool:

from sm_precursor_predictor import get_prediction_and_explanation

prediction, images, plots = get_prediction_and_explanation(smiles="CC(=CCCC(C)(C=C)O)C", threshold=0.20)

[Figure: feature_importance]

prediction
['Geranyl diphosphate']
images[0]

[Figure: Linalool]

Download files

Source Distribution

SMPrecursorPrediction-0.0.2.tar.gz (338.2 kB), uploaded as Source

Built Distribution

SMPrecursorPrediction-0.0.2-py3-none-any.whl (338.3 kB), uploaded for Python 3

File details

Details for the file SMPrecursorPrediction-0.0.2.tar.gz.

File metadata

  • Download URL: SMPrecursorPrediction-0.0.2.tar.gz
  • Upload date:
  • Size: 338.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.14

File hashes

Hashes for SMPrecursorPrediction-0.0.2.tar.gz

Algorithm    Hash digest
SHA256       55870beb2ebded67b6a4c9a901ca774161a0958b8766a4c49e8363da2bf77810
MD5          ff9bbd409d6c97128f8280820ec231b4
BLAKE2b-256  363c1465c1c4144be9fddb6c30d158058cacd5e569f2961142b4f07d2f73c8cf
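
A downloaded file can be checked against the SHA256 digest listed above with Python's `hashlib`. The following is a minimal sketch; `sha256_of_file` and `matches_expected` are hypothetical helper names, and the local file path is whatever the download was saved as.

```python
import hashlib

# SHA256 digest of the source distribution, copied from the table above.
EXPECTED_SDIST_SHA256 = "55870beb2ebded67b6a4c9a901ca774161a0958b8766a4c49e8363da2bf77810"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SDIST_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

The same function works for the wheel by passing its SHA256 digest as `expected`.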

File details

Details for the file SMPrecursorPrediction-0.0.2-py3-none-any.whl.

File hashes

Hashes for SMPrecursorPrediction-0.0.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       455015b74bdb59ba867ccc6b6687b493416ed9a90b7666f85b5914bad4e19bdb
MD5          5993e5b47a18025a4d2317b36983d40f
BLAKE2b-256  1b30d5b82523924d55e3a23297a497ebbcc226115153aca3c7002a281a2b702b
