A lightweight generative model that extends SMILES fragments into syntactically valid molecules

Chempleter

Chempleter is a lightweight generative model which utilises a simple Gated Recurrent Unit (GRU) to predict syntactically valid extensions of a provided molecular fragment. It accepts SMILES notation as input and uses SELFIES to enforce chemical syntax validity in the generated molecules.

  • What can Chempleter do?

    • Currently, Chempleter accepts an initial molecule or molecular fragment in SMILES format and generates a larger molecule that includes the initial structure, while respecting chemical syntax. It also reports a few descriptors of the generated molecule.

    • It can be used to generate a wide range of structural analogs which share the same core structure (by changing the sampling temperature) or to decorate a core scaffold iteratively (by increasing the generated token length).

    • In the future, it might be adapted to predict structures with a specific chemical property, using a regressor to rank predictions and move towards more "goal-directed" generation.
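
The effect of the sampling temperature mentioned above can be illustrated with a generic sketch. This is not Chempleter's actual implementation: the function names `softmax_with_temperature` and `sample_token` and the example scores are hypothetical, and the code only shows the general technique of dividing a model's raw scores (logits) by a temperature before sampling the next token.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for three candidate next tokens (illustrative only).
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, temperature=0.5)  # sharper distribution
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter distribution
print(cold)
print(hot)
```

A low temperature concentrates probability on the highest-scoring token, so repeated runs tend to produce similar molecules. A high temperature flattens the distribution, which is what makes a wider range of analogs likely.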

Prerequisites

  • Python >= 3.13
  • uv (optional but recommended)

Getting started

Visit Chempleter's docs.

Quick start

You can find more information about installing Chempleter (including via pip) in the installation instructions.

  • Run the GUI directly without installing (via uv):

    • On Windows:

      uvx --from chempleter chempleter-gui.exe

    • On Linux/macOS:

      uvx --from chempleter chempleter-gui

  • Install using uv

    uv pip install chempleter

  • Use the GUI

    • To start the Chempleter GUI after installing, execute in a terminal:

      uv run chempleter-gui

    • Type in the SMILES notation for the starting structure, or leave the field empty to generate random molecules. Click the GENERATE button to generate a molecule.

    • To learn more about the GUI and its options, see here.

    Or

  • Use as a Python library

    • To use Chempleter as a Python library:

      from chempleter.inference import extend

      # Extend a benzene ring; returns the molecule object, its SMILES, and its SELFIES
      generated_mol, generated_smiles, generated_selfies = extend(smiles="c1ccccc1")
      print(generated_smiles)
      # Example output:
      # C1=CC=CC=C1C2=CC=C(CN3C=NC4=CC=CC=C4C3=O)O2
      

      To draw the generated molecule:

      # Draw must be imported explicitly; "from rdkit import Chem" alone does not load it
      from rdkit.Chem import Draw
      Draw.MolToImage(generated_mol)
      
    • For details on available parameters and inference functions, see generating molecules.
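
As a quick sanity check on a generated molecule, a substructure match can confirm that the starting fragment is still present. The following is a minimal sketch using RDKit directly. The helper `contains_fragment` is a hypothetical name and not part of Chempleter's API, and the SMILES string is the example output shown above.

```python
from rdkit import Chem


def contains_fragment(generated_smiles: str, fragment_smiles: str) -> bool:
    """Return True if the fragment occurs as a substructure of the generated molecule."""
    mol = Chem.MolFromSmiles(generated_smiles)
    fragment = Chem.MolFromSmiles(fragment_smiles)
    if mol is None or fragment is None:
        raise ValueError("could not parse one of the SMILES strings")
    return mol.HasSubstructMatch(fragment)


# Example output from the snippet above, checked against the benzene input.
generated = "C1=CC=CC=C1C2=CC=C(CN3C=NC4=CC=CC=C4C3=O)O2"
print(contains_fragment(generated, "c1ccccc1"))
```

RDKit perceives aromaticity when parsing, so the Kekulé form in the generated SMILES and the aromatic form in the input fragment are compared on equal terms.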

Project structure

  • src/chempleter: Contains the Python modules listed below.
  • src/chempleter/processor.py: Contains functions for processing CSV files containing SMILES data and generating training-related files.
  • src/chempleter/dataset.py: ChempleterDataset class
  • src/chempleter/model.py: ChempleterModel class
  • src/chempleter/inference.py: Contains functions for inference
  • src/chempleter/train.py: Contains functions for training
  • src/chempleter/gui.py: Chempleter GUI built using NiceGUI
  • src/chempleter/data: Contains the trained model and vocabulary files

License

MIT License

Copyright (c) 2025-2026 Davis Thomas Daniel

Contributing

Contributions, improvements, feature ideas, and bug fixes are always welcome.
