Augmentation of SELFIES encoding

augSELFIES

Augmented implementation and modified versions of Self-Referencing Embedded Strings (SELFIES).

The core distinction is that the original SELFIES implementation focuses on interconverting between SMILES and SELFIES strings, with molecular graphs treated largely as an afterthought, whereas here graphs and operations on them are at the forefront. Secondarily, native SELFIES re-uses chemical symbol tokens for indexing, whereas this implementation supports alternative indexing schemes.
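To make the indexing point concrete, the sketch below shows how native SELFIES reads chemical symbol tokens as numbers when they follow a branch or ring token. It is an illustration only and is not part of augSELFIES: the index alphabet is assumed to match the one used by selfies 2.x, and the helper names split_tokens and index_from_tokens are hypothetical.

```python
import re

# Index alphabet assumed to match selfies 2.x; verify against your installed version.
INDEX_ALPHABET = (
    "[C]", "[Ring1]", "[Ring2]",
    "[Branch1]", "[=Branch1]", "[#Branch1]",
    "[Branch2]", "[=Branch2]", "[#Branch2]",
    "[O]", "[N]", "[=N]", "[=C]", "[#C]", "[S]", "[P]",
)
INDEX_CODE = {token: value for value, token in enumerate(INDEX_ALPHABET)}


def split_tokens(selfies_string):
    """Split a SELFIES string into its bracketed tokens (hypothetical helper)."""
    return re.findall(r"\[[^\]]*\]", selfies_string)


def index_from_tokens(tokens):
    """Read index tokens as a base-16 number, most significant token first.

    Tokens outside the index alphabet are treated as 0 (an assumption of this sketch).
    """
    base = len(INDEX_ALPHABET)
    value = 0
    for token in tokens:
        value = value * base + INDEX_CODE.get(token, 0)
    return value


# The same token "[O]" that denotes an oxygen atom elsewhere is read as 9 here.
print(index_from_tokens(split_tokens("[O]")))
# Two index tokens form a two-digit base-16 number.
print(index_from_tokens(split_tokens("[Ring1][C]")))
```

Because a token such as [C] can mean either "carbon atom" or "the number 0" depending on position, an alternative indexing scheme can remove that ambiguity by using dedicated index tokens.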

For the original implementation of SELFIES and references, see https://github.com/aspuru-guzik-group/selfies.

===

Installation

First, install pixi on your machine, along with a mirror of group-selfies (https://github.com/aspuru-guzik-group/group-selfies). Then, from the augselfies directory, run

pixi install 

to install the default required dependencies, and

pixi shell

from the root directory of this project to install any missing dependencies and activate the environment.

Usage

augSELFIES should be used as a library, predominantly for data processing.

It contains methods for converting SELFIES to numeric SELFIES (numSELFIES), implementations of which are in augselfies.numeralization.

augSELFIES also contains processes for data augmentation, creating multiple equivalent SELFIES/numSELFIES for the same underlying molecular graph. See augselfies.augmentation for details and implementation.
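The idea behind this kind of augmentation can be sketched without any chemistry library: one molecular graph can be written out as several different but equivalent strings, depending on where the traversal starts. The example below is a minimal illustration under stated assumptions, not the augselfies.augmentation implementation. It handles only acyclic, single-bonded graphs, emits SMILES rather than SELFIES, and the function name linearize is hypothetical.

```python
def linearize(atoms, bonds, start):
    """Write an acyclic, single-bonded molecular graph as a SMILES string.

    The string is produced by a depth-first traversal beginning at atom
    index `start`; different starting atoms give different strings.
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def visit(node, parent):
        children = [n for n in neighbors[node] if n != parent]
        out = atoms[node]
        # Every child except the last becomes a parenthesized branch.
        for child in children[:-1]:
            out += "(" + visit(child, node) + ")"
        # The last child continues the main chain.
        if children:
            out += visit(children[-1], node)
        return out

    return visit(start, None)


# Ethanol as a graph: C0 - C1 - O2
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

# One string per starting atom, all describing the same graph.
variants = sorted({linearize(atoms, bonds, s) for s in range(len(atoms))})
print(variants)  # ['C(C)O', 'CCO', 'OCC']
```

Each of the three strings could then be converted to SELFIES (for example with the encoder from the original selfies package) to obtain several equivalent training examples for one molecule.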

Testing

This repository is designed to be tested via pytest. Run

pixi shell --environment test
pytest --cov src 

to run all unit tests and determine current code coverage.

API Documentation

See the HTML files in /docs. To regenerate documentation, run

pixi shell --environment docs 
sphinx-build -M html docs docs/_build 

Copyright Notice

This project is MIT-licensed (see LICENSE.txt).

© 2025. Triad National Security, LLC. All rights reserved.

This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S. Department of Energy/National Nuclear Security Administration. All rights in the program are reserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear Security Administration. The Government is granted for itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare derivative works, distribute copies to the public, perform publicly and display publicly, and to permit others to do so.

This project has been approved for open-source release under number O#: O4990.
