This tool provides methods for extracting meaningful features from drug SMILES strings for use in machine learning.

Project description

Pre-requisites

Install the RDKit library.

Usage

  • Make sure Python is installed on your system.
  • Run the following command in a command prompt:
 pip install drug-smile-fet

Example

# example.py
from dsfet import fe_1mol
import pandas as pd
train_smiles = {'DRUG_NAME': {0: 'Luminespib', 1: 'Trametinib', 2: 'Venetoclax', 3: 'Olaparib', 4: 'Axitinib'},
               'PUBCHEM_ID': {0: 135539077.0, 1: 11707110.0, 2: 49846579.0, 3: 23725625.0, 4: 6450551.0},
               'SMILES': {0: 'CCNC(=O)C1=NOC(=C1C2=CC=C(C=C2)CN3CCOCC3)C4=CC(=C(C=C4O)O)C(C)C',
                          1: 'CC1=C2C(=C(N(C1=O)C)NC3=C(C=C(C=C3)I)F)C(=O)N(C(=O)N2C4=CC=CC(=C4)NC(=O)C)C5CC5',
                          2: 'CC1(CCC(=C(C1)C2=CC=C(C=C2)Cl)CN3CCN(CC3)C4=CC(=C(C=C4)C(=O)NS(=O)(=O)C5=CC(=C(C=C5)NCC6CCOCC6)[N+](=O)[O-])OC7=CN=C8C(=C7)C=CN8)C',
                          3: 'C1CC1C(=O)N2CCN(CC2)C(=O)C3=C(C=CC(=C3)CC4=NNC(=O)C5=CC=CC=C54)F',
                          4: 'CNC(=O)C1=CC=CC=C1SC2=CC3=C(C=C2)C(=NN3)/C=C/C4=CC=CC=N4'}
               }
train_smiles_df = pd.DataFrame(data=train_smiles)

test_smile = train_smiles
test_smile_df = pd.DataFrame(test_smile)

# Example 1: NLP-based feature extraction.
# Variant that also passes a test set:
# Train, Test, feature_sequences, feature_to_token_map = fe_1mol.oneMolFeatureExtraction(
#     trainSMILES=train_smiles_df, testSMILES=train_smiles_df, ngram_list=[1, 2, 3, 4, 5, 6, 7, 8])
# The call below supplies no test set (testSMILES=None).
Train, Test, feature_sequences, feature_to_token_map = fe_1mol.oneMolFeatureExtraction(
    trainSMILES=train_smiles_df, testSMILES=None, ngram_list=[1, 2, 3, 4, 5, 6, 7, 8])

# Example 2: Morgan fingerprint-based feature extraction.
# nBits is the number of bits in the fingerprint.
result = fe_1mol.morganFingerPrint(train_smiles_df, nBits=1024)
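
The ngram_list argument in Example 1 suggests that features are built from n-grams of several sizes. The sketch below is only an illustration of that general idea using character-level n-grams. It is not the package's implementation, and the helper names char_ngrams and ngram_counts are hypothetical. The real tokenizer may differ, for example by treating multi-character atoms such as Cl as a single token.

```python
from collections import Counter


def char_ngrams(smiles, n):
    """Return every contiguous substring of length n, left to right."""
    return [smiles[i:i + n] for i in range(len(smiles) - n + 1)]


def ngram_counts(smiles, ngram_list):
    """Count character n-grams of each requested size in one SMILES string."""
    counts = Counter()
    for n in ngram_list:
        counts.update(char_ngrams(smiles, n))
    return counts


# Hypothetical usage on a short SMILES fragment with n-gram sizes 1 and 2.
counts = ngram_counts("CCNC(=O)C", ngram_list=[1, 2])
print(counts.most_common(3))
```

Larger values in ngram_list capture longer substructure patterns, at the cost of a much larger and sparser feature space.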

Note:

The input to both oneMolFeatureExtraction() and morganFingerPrint() must be a pandas DataFrame, and the name of the column holding the drug SMILES strings must be in uppercase:

  • e.g., SMILES
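
If your data uses a different capitalization for that column, one way to meet the requirement is to rename it before calling the package. The helper below is a minimal sketch and is not part of drug-smile-fet. The name smiles_rename_map is hypothetical, and it assumes exactly one column matches "SMILES" when case is ignored.

```python
def smiles_rename_map(columns, target="SMILES"):
    """Map a case variant of the SMILES column name to the uppercase form."""
    matches = [c for c in columns if str(c).strip().upper() == target]
    if not matches:
        raise KeyError(f"no column named {target!r} (case-insensitive) found")
    if len(matches) > 1:
        raise ValueError(f"ambiguous SMILES columns: {matches}")
    return {matches[0]: target}


# Hypothetical column names, as they might appear in a user's DataFrame.
mapping = smiles_rename_map(["DRUG_NAME", "PUBCHEM_ID", "Smiles"])
print(mapping)
```

With a pandas DataFrame df, the mapping can be applied as df = df.rename(columns=smiles_rename_map(df.columns)).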

Cite us at:

Rahul Sharma, & Jake Y. Chen. (2022). Drug SMILE Feature Extraction Tool (1.0.3). Zenodo. https://doi.org/10.5281/zenodo.7072304
