smiles-encoder: One-hot encoding for simplified molecular-input line-entry system (SMILES) strings

smiles-encoder is a Python package that generates one-hot encodings of SMILES strings: each element of a string is represented by its own one-hot vector.
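
To make the idea concrete, here is a minimal standalone sketch of one-hot encoding a single SMILES string. It does not use smiles-encoder and is not the package's implementation: the function name one_hot_encode, the character-level splitting, and the sorted vocabulary order are all assumptions chosen for illustration.

```python
from typing import Dict, List


def one_hot_encode(smiles: str, vocab: Dict[str, int]) -> List[List[int]]:
    """Return one row per character, with a single 1 at that character's index."""
    rows = []
    for ch in smiles:
        row = [0] * len(vocab)
        row[vocab[ch]] = 1
        rows.append(row)
    return rows


# Hypothetical vocabulary built from the characters of one string
smiles = 'C1CCCCC1'  # Cyclohexane
vocab = {ch: idx for idx, ch in enumerate(sorted(set(smiles)))}
encoded = one_hot_encode(smiles, vocab)

# Each row contains exactly one 1, and there is one row per character
print(len(encoded), len(encoded[0]))
```

Every row sums to 1, so the encoded string is a matrix with one row per string element and one column per vocabulary entry.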

Installation

Installation via pip:

$ pip install smiles-encoder

Installation via cloned repository:

$ git clone https://github.com/tjkessler/smiles-encoder
$ cd smiles-encoder
$ python setup.py install

smiles-encoder has no external dependencies.

Basic Usage

First, assemble a list of SMILES strings:

smiles_strings = [
    'O=Cc1ccc(O)c(OC)c1',  # Vanillin
    'CC(=O)NCCC1=CNc2c1cc(OC)cc2',  # Melatonin
    'C1CCCCC1',  # Cyclohexane
    'C1=CC=CC=C1'  # Benzene
]

Import the SmilesEncoder class and pass it the list of SMILES strings at initialization; the encoder builds its element dictionary from them:

from smiles_encoder import SmilesEncoder

encoder = SmilesEncoder(smiles_strings)
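
As a rough picture of what an element dictionary could look like, the standalone sketch below maps each distinct token found in a list of SMILES strings to an integer index. It is an illustration only: the regular expression, the helper names tokenize and build_element_dict, and the sorted ordering are assumptions, and the package's own tokenization may differ.

```python
import re
from typing import Dict, List

# Assumed tokenization: bracket atoms first, then two-letter halogens,
# then any remaining single character
TOKEN_PATTERN = re.compile(r'\[[^\]]+\]|Br|Cl|.')


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into tokens using the assumed pattern."""
    return TOKEN_PATTERN.findall(smiles)


def build_element_dict(smiles_strings: List[str]) -> Dict[str, int]:
    """Map every distinct token in the input strings to a column index."""
    tokens = sorted({tok for s in smiles_strings for tok in tokenize(s)})
    return {tok: idx for idx, tok in enumerate(tokens)}


element_dict = build_element_dict(['O=Cc1ccc(O)c(OC)c1', 'C1CCCCC1', 'ClCCl'])
print(element_dict)
```

In a scheme like this, a string containing a token that was absent from the initialization list has no column to map to, which is why the encoder is given the full set of strings up front.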

Use the encoder to encode SMILES strings:

encoded_smiles = encoder.encode_many(smiles_strings)

# OR

encoded_smiles = [encoder.encode(s) for s in smiles_strings]

Use the encoder to decode encoded SMILES strings:

decoded_smiles = encoder.decode_many(encoded_smiles)

# OR

decoded_smiles = [encoder.decode(e) for e in encoded_smiles]
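
Decoding reverses the mapping: each one-hot row is turned back into the token whose index holds the 1. The sketch below shows this with a hand-written two-token vocabulary; decode_one_hot and the vocabulary are hypothetical and do not reflect how smiles-encoder stores its data internally.

```python
from typing import Dict, List


def decode_one_hot(rows: List[List[int]], vocab: Dict[str, int]) -> str:
    """Rebuild a string by looking up the token at each row's hot index."""
    index_to_token = {idx: tok for tok, idx in vocab.items()}
    return ''.join(index_to_token[row.index(1)] for row in rows)


vocab = {'1': 0, 'C': 1}  # hypothetical two-token vocabulary
encoded = [[0, 1], [1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [1, 0]]
decoded = decode_one_hot(encoded, vocab)
print(decoded)
```

A quick sanity check for any encoder of this kind is that decoding an encoded string returns the original string unchanged.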

Contributing, Reporting Issues and Other Support

To contribute to smiles-encoder, open a pull request. Contributions should include extensive documentation.

To report a problem with the software or request a feature, file an issue. When reporting a problem, include details such as error messages, your OS/environment, and your Python version.

For additional support/questions, contact Travis Kessler (travis.j.kessler@gmail.com).
