One-hot encoding for simplified molecular-input line-entry system (SMILES) strings
smiles-encoder: One-hot encoding for simplified molecular-input line-entry system (SMILES) strings
smiles-encoder is a Python package that generates one-hot encodings of SMILES strings: each element of a string is represented by its own one-hot vector, so an encoded string is a sequence of such vectors.
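As a rough illustration of what "one one-hot vector per string element" means, the sketch below encodes a three-element string by hand. It does not use smiles-encoder and is not the package's implementation; the two-element alphabet, the `one_hot` helper, and the choice to treat every character as one element are all assumptions made for this example.

```python
# Hypothetical two-element alphabet, chosen only for this illustration.
alphabet = ['C', 'O']


def one_hot(element, alphabet):
    """Return a list with a 1 at the element's position and 0 elsewhere."""
    return [1 if element == candidate else 0 for candidate in alphabet]


# Ethanol, 'CCO', treated here as three single-character elements.
encoded = [one_hot(ch, alphabet) for ch in 'CCO']
print(encoded)  # [[1, 0], [1, 0], [0, 1]]
```

Each inner list has exactly one non-zero entry, and the outer list has one entry per element of the input string.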
Installation
Installation via pip:
$ pip install smiles-encoder
Installation via cloned repository:
$ git clone https://github.com/tjkessler/smiles-encoder
$ cd smiles-encoder
$ python setup.py install
smiles-encoder has no external dependencies.
Basic Usage
First, assemble a list of SMILES strings:
smiles_strings = [
    'O=Cc1ccc(O)c(OC)c1',           # Vanillin
    'CC(=O)NCCC1=CNc2c1cc(OC)cc2',  # Melatonin
    'C1CCCCC1',                     # Cyclohexane
    'C1=CC=CC=C1'                   # Benzene
]
Import the SmilesEncoder class and pass it the list of SMILES strings at initialization; the encoder uses them to construct its element dictionary:
from smiles_encoder import SmilesEncoder
encoder = SmilesEncoder(smiles_strings)
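To give a feel for what an element dictionary could contain, here is a minimal standalone sketch that maps every distinct character in the example strings to an integer index. This is an assumption about the general idea, not a description of SmilesEncoder's internals: the helper name `build_element_dictionary` is hypothetical, and the package may split strings into elements differently (for example, a two-character atom symbol such as `Cl` might count as a single element).

```python
smiles_strings = [
    'O=Cc1ccc(O)c(OC)c1',           # Vanillin
    'CC(=O)NCCC1=CNc2c1cc(OC)cc2',  # Melatonin
    'C1CCCCC1',                     # Cyclohexane
    'C1=CC=CC=C1',                  # Benzene
]


def build_element_dictionary(strings):
    """Map each distinct character (assumed element) to a stable index."""
    elements = sorted({ch for s in strings for ch in s})
    return {element: index for index, element in enumerate(elements)}


element_to_index = build_element_dictionary(smiles_strings)
print(element_to_index)
print(len(element_to_index))  # 9 distinct characters in these four strings
```

Under this character-level assumption, the length of the dictionary is also the length of every one-hot vector, which is why the dictionary has to be built from a representative set of strings before anything is encoded.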
Use the encoder to encode SMILES strings:
encoded_smiles = encoder.encode_many(smiles_strings)
# OR
encoded_smiles = [encoder.encode(s) for s in smiles_strings]
Use the encoder to decode encoded SMILES strings:
decoded_smiles = encoder.decode_many(encoded_smiles)
# OR
decoded_smiles = [encoder.decode(e) for e in encoded_smiles]
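Conceptually, decoding reverses the encoding: the position of the single 1 in each vector is looked up in the element dictionary. The sketch below shows that round trip in plain Python. It is independent of smiles-encoder; the `encode` and `decode` functions and the two-entry dictionary are hypothetical stand-ins, and the package's own return types may differ.

```python
def encode(smiles, element_to_index):
    """One-hot encode a string, treating each character as one element."""
    vectors = []
    for ch in smiles:
        vec = [0] * len(element_to_index)
        vec[element_to_index[ch]] = 1
        vectors.append(vec)
    return vectors


def decode(vectors, element_to_index):
    """Recover the string by locating the 1 in each one-hot vector."""
    index_to_element = {i: e for e, i in element_to_index.items()}
    return ''.join(index_to_element[vec.index(1)] for vec in vectors)


# Hypothetical dictionary covering only the characters in cyclohexane.
element_to_index = {'1': 0, 'C': 1}

encoded = encode('C1CCCCC1', element_to_index)
decoded = decode(encoded, element_to_index)
print(decoded == 'C1CCCCC1')  # True
```

A string containing a character missing from the dictionary would raise a `KeyError` in this sketch, which illustrates why the strings to be encoded should be covered by the set used to build the dictionary.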
Contributing, Reporting Issues and Other Support
To contribute to smiles-encoder, open a pull request. Contributions should include extensive documentation.
To report a problem with the software or to request a feature, file an issue. When reporting a problem, include any error messages, your OS/environment, and your Python version.
For additional support/questions, contact Travis Kessler (travis.j.kessler@gmail.com).
Hashes for smiles_encoder-0.1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 62de822cb7ad29552e5442d8f2e00b0e7ed6605f629cb792e02581637a5cb299
MD5 | d2805e8e2acabfef5f7f84fb430b05d9
BLAKE2b-256 | 17dd157e8384a98110377f89a91cfd9cef25b83d13c1c1ec63432593a8e43aca