This tool provides methods for extracting features from drug SMILES strings for use in machine learning.

Project description

Pre-requisites

Install the RDKit library before using this package.

Usage

  • Make sure Python is installed on your system.
  • Run the following command in a terminal (CMD on Windows):
 pip install drug-smile-fet
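
After installation, a quick check that the package and its dependencies can be found might look like the sketch below. It uses only the standard library. The names rdkit and pandas are the usual import names of those libraries, dsfet is the import name used in the example on this page, and missing_modules is a hypothetical helper, not part of this package.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: rdkit (RDKit), pandas, dsfet (this package).
    missing = missing_modules(["rdkit", "pandas", "dsfet"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

If anything is reported as not installed, install it before running the example.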

Example

# example.py
from dsfet import fe_1mol
import pandas as pd
train_smiles = {'DRUG_NAME': {0: 'Luminespib', 1: 'Trametinib', 2: 'Venetoclax', 3: 'Olaparib', 4: 'Axitinib'},
               'PUBCHEM_ID': {0: 135539077.0, 1: 11707110.0, 2: 49846579.0, 3: 23725625.0, 4: 6450551.0},
               'SMILES': {0: 'CCNC(=O)C1=NOC(=C1C2=CC=C(C=C2)CN3CCOCC3)C4=CC(=C(C=C4O)O)C(C)C',
                          1: 'CC1=C2C(=C(N(C1=O)C)NC3=C(C=C(C=C3)I)F)C(=O)N(C(=O)N2C4=CC=CC(=C4)NC(=O)C)C5CC5',
                          2: 'CC1(CCC(=C(C1)C2=CC=C(C=C2)Cl)CN3CCN(CC3)C4=CC(=C(C=C4)C(=O)NS(=O)(=O)C5=CC(=C(C=C5)NCC6CCOCC6)[N+](=O)[O-])OC7=CN=C8C(=C7)C=CN8)C',
                          3: 'C1CC1C(=O)N2CCN(CC2)C(=O)C3=C(C=CC(=C3)CC4=NNC(=O)C5=CC=CC=C54)F',
                          4: 'CNC(=O)C1=CC=CC=C1SC2=CC3=C(C=C2)C(=NN3)/C=C/C4=CC=CC=N4'}
               }
train_smiles_df = pd.DataFrame(data=train_smiles)

# Optional test set; the training data is reused here for illustration.
test_smile_df = pd.DataFrame(data=train_smiles)

# To extract features for a test set as well, pass it as testSMILES:
# Train, Test, feature_sequences, feature_to_token_map = fe_1mol.oneMolFeatureExtraction(trainSMILES=train_smiles_df, testSMILES=test_smile_df, ngram_list=[1, 2, 3, 4, 5, 6, 7, 8])
Train, Test, feature_sequences, feature_to_token_map = fe_1mol.oneMolFeatureExtraction(trainSMILES=train_smiles_df, testSMILES=None, ngram_list=[1, 2, 3, 4, 5, 6, 7, 8])
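
The ngram_list argument above lists the n-gram sizes to use. To give a rough idea of what n-grams of a SMILES string look like, here is a minimal sketch that assumes character-level n-grams. This is an illustration only and not the package's implementation: the package may tokenize SMILES differently (for example, treating a two-letter atom such as Cl as one token), and char_ngrams and ngram_counts are hypothetical names.

```python
from collections import Counter


def char_ngrams(smiles, n):
    """Return every contiguous substring of length n, left to right."""
    return [smiles[i:i + n] for i in range(len(smiles) - n + 1)]


def ngram_counts(smiles, ngram_list):
    """Count character n-grams of each size in ngram_list (assumed tokenization)."""
    counts = Counter()
    for n in ngram_list:
        counts.update(char_ngrams(smiles, n))
    return counts


if __name__ == "__main__":
    # Ethanol, written as the SMILES string "CCO".
    print(char_ngrams("CCO", 2))        # ['CC', 'CO']
    print(ngram_counts("CCO", [1, 2]))
```

Under this assumption, larger values in ngram_list capture longer substructure patterns, and sizes longer than the string simply contribute nothing.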

Note:

The input to oneMolFeatureExtraction() must be a pandas DataFrame with at least two columns:

  • DRUG_NAME
  • SMILES

The column names must be in capital letters.
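
A small check before calling the method can catch a missing or lower-case column name early. The sketch below is one way to do it. missing_required_columns is a hypothetical helper, not part of this package, and it takes any iterable of column names, so a DataFrame's columns attribute can be passed directly.

```python
# Column names required by the note above, in capital letters.
REQUIRED_COLUMNS = ("DRUG_NAME", "SMILES")


def missing_required_columns(columns):
    """Return the required column names absent from `columns` (case-sensitive)."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


if __name__ == "__main__":
    print(missing_required_columns(["DRUG_NAME", "PUBCHEM_ID", "SMILES"]))  # []
    print(missing_required_columns(["drug_name", "SMILES"]))  # ['DRUG_NAME']
```

With the DataFrame from the example, the call would be missing_required_columns(train_smiles_df.columns). An empty list means both required columns are present.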

Cite us at:

Rahul Sharma, & Jake Y. Chen. (2022). Drug SMILE Feature Extraction Tool (1.0.1). Zenodo. https://doi.org/10.5281/zenodo.7072304
