Augmentation of SELFIES encoding

augSELFIES

Augmented implementation and modified versions of Self-Referencing Embedded Strings (SELFIES).

The core distinction is one of emphasis: the original SELFIES implementation focuses on interconverting between SMILES and SELFIES strings, with molecular graphs treated largely as an afterthought, whereas here graphs and operations on them are at the forefront. Secondarily, native SELFIES re-uses chemical symbol tokens for indexing, whereas this implementation supports alternative indexing schemes.

For the original implementation of SELFIES and references, see https://github.com/aspuru-guzik-group/selfies.

===

Installation

First, install pixi on your machine and obtain a mirror of group-selfies (https://github.com/aspuru-guzik-group/group-selfies). Next, from the augselfies directory, run

pixi install 

to install default required dependencies and

pixi shell 

from the root directory of this project to install the dependencies and activate the environment.

PyPI/pip Instructions:

Installing with pip is not recommended, because it loses some of the precision that pixi provides for general dependency management. However, for ease of installation, this project may be installed from source via

pip install ".[pip]"

or from PyPI via

pip install "augselfies[pip]"

This is provided as a convenience but may cause unintended behavior.

12.02.2025: GroupSELFIES must be separately sourced at this time.
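
Because GroupSELFIES has to be sourced separately, it can be useful to confirm that both packages are visible to the active interpreter before going further. The following is a minimal sketch using only the standard library; the helper name `describe_install` is hypothetical, and the import names `augselfies` and `group_selfies` are assumptions based on the package names rather than something this README states.

```python
from importlib import metadata, util


def describe_install(package: str) -> str:
    """Report whether a top-level package can be imported, and its version if known."""
    if util.find_spec(package) is None:
        return f"{package}: not importable"
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable (e.g. from a source checkout) but no installed distribution metadata.
        version = "unknown version"
    return f"{package}: importable ({version})"


if __name__ == "__main__":
    # Import names below are assumptions; adjust them if your checkout differs.
    for name in ("augselfies", "group_selfies"):
        print(describe_install(name))
```

Run it inside the environment you intend to use (for example after `pixi shell`), since the result depends on the active interpreter.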

Usage

augSELFIES should be used as a library, predominantly for data processing.

It contains methods for converting SELFIES to numeric SELFIES (numSELFIES), implementations of which are in augselfies.numeralization.
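
To make the idea of numeric indexing concrete, here is a small sketch that does not use augselfies at all. In native SELFIES, the token(s) following a branch or ring token are read as a number rather than as an atom; the sketch rewrites those index tokens as explicit numerals. The function name `numeralize`, the `[4]`-style numeral tokens, and the simplifications (stereo ring tokens and branch tokens skipped by the grammar are not handled) are all assumptions made for illustration; the actual numSELFIES format is defined in augselfies.numeralization and may differ.

```python
import re

# Index alphabet of native SELFIES (v2): a token's position is its numeric value.
INDEX_ALPHABET = (
    "[C]", "[Ring1]", "[Ring2]", "[Branch1]", "[=Branch1]", "[#Branch1]",
    "[Branch2]", "[=Branch2]", "[#Branch2]", "[O]", "[N]", "[=N]",
    "[=C]", "[#C]", "[S]", "[P]",
)
INDEX_CODE = {token: value for value, token in enumerate(INDEX_ALPHABET)}

TOKEN = re.compile(r"\[[^\[\]]+\]")
# The trailing digit of a branch/ring token is the number of index tokens after it.
STRUCTURAL = re.compile(r"\[[=#]?(?:Branch|Ring)([123])\]")


def numeralize(selfies_string: str) -> list[str]:
    """Illustrative only: replace overloaded index tokens with explicit numerals."""
    out = []
    pending = 0  # how many upcoming tokens are index tokens
    for token in TOKEN.findall(selfies_string):
        if pending > 0:
            # Assumption: tokens outside the index alphabet count as 0.
            out.append(f"[{INDEX_CODE.get(token, 0)}]")
            pending -= 1
            continue
        out.append(token)
        match = STRUCTURAL.fullmatch(token)
        if match:
            pending = int(match.group(1))
    return out


if __name__ == "__main__":
    benzene = "[C][=C][C][=C][C][=C][Ring1][=Branch1]"
    print("".join(numeralize(benzene)))
```

In the benzene example, the final `[=Branch1]` is not a branch but the ring-closure index, which is the kind of ambiguity an alternative indexing scheme removes.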

augSELFIES also contains processes for data augmentation, which create multiple equivalent SELFIES/numSELFIES strings for the same underlying molecular graph. See augselfies.augmentation for details and implementation.
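
The principle behind this kind of augmentation is that one molecular graph can be written as many different strings depending on where the traversal starts. The sketch below illustrates that with a deliberately tiny, standard-library-only writer for SMILES-like strings, restricted to acyclic graphs with single bonds. The names `write_tree_string` and `enumerate_rootings` are hypothetical, and this is not how augselfies.augmentation is implemented; it only shows why several equivalent strings exist for one graph.

```python
def write_tree_string(atoms, bonds, root):
    """Write an acyclic, single-bonded graph as a SMILES-like string from `root`."""
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    def visit(node, parent):
        children = [n for n in neighbours[node] if n != parent]
        text = atoms[node]
        # All children but the last become parenthesised branches.
        for child in children[:-1]:
            text += "(" + visit(child, node) + ")"
        if children:
            text += visit(children[-1], node)
        return text

    return visit(root, None)


def enumerate_rootings(atoms, bonds):
    """Return the distinct strings obtained by starting from each atom in turn."""
    return sorted({write_tree_string(atoms, bonds, r) for r in range(len(atoms))})


if __name__ == "__main__":
    # Ethanol heavy atoms: C-C-O
    atoms = ["C", "C", "O"]
    bonds = [(0, 1), (1, 2)]
    print(enumerate_rootings(atoms, bonds))  # ['C(C)O', 'CCO', 'OCC']
```

All three strings describe the same graph, so a model trained on them sees the same molecule under different serializations.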

Testing

This repository is designed to be tested via pytest. Run

pixi shell --environment test
pytest --cov src 

to run all unit tests and determine current code coverage.

API Documentation

See the HTML files in /docs. To regenerate documentation, run

pixi shell --environment docs 
sphinx-build -M html docs docs/_build 

Copyright Notice

This project is MIT-licensed (see LICENSE.txt).

© 2025. Triad National Security, LLC. All rights reserved.

This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S. Department of Energy/National Nuclear Security Administration. All rights in the program are reserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear Security Administration. The Government is granted for itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare derivative works, distribute copies to the public, perform publicly and display publicly, and to permit others to do so.

This project has been approved for open-source release under number O#: O4990.
