Streamlit app to explore chemical clustering!

- ChemCluster -

ChemCluster is an interactive web application for cheminformatics and molecular analysis that forms and visualizes molecular clusters. It is built with Streamlit, RDKit, and scikit-learn.

Final project for the course Practical Programming in Chemistry — EPFL CH-200

📦 Package overview

ChemCluster is an interactive cheminformatics platform developed at EPFL in 2025 as part of the Practical Programming in Chemistry course. It is a web application for exploring and analyzing chemical structures, either one molecule at a time (through its generated conformers) or as whole datasets.
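To give a feel for the single-molecule workflow, the sketch below generates conformers with RDKit and groups them by geometry. It is not ChemCluster's own code: the function name `cluster_conformers`, the RMSD threshold, and the choice of Butina clustering are assumptions made for illustration, since this description does not say which algorithm the app applies to conformers.

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.ML.Cluster import Butina


def cluster_conformers(smiles, n_confs=20, rms_threshold=0.5, seed=42):
    """Embed conformers and group them by RMSD (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    mol = Chem.AddHs(mol)  # explicit hydrogens give more realistic 3D geometries

    # Generate up to n_confs 3D conformers; the seed makes the run reproducible
    conf_ids = list(AllChem.EmbedMultipleConfs(mol, numConfs=n_confs, randomSeed=seed))

    # Lower-triangle list of RMSD values in the order Butina expects
    rms = AllChem.GetConformerRMSMatrix(mol, prealigned=False)

    # Assumption: Butina clustering with an arbitrary 0.5 Angstrom cutoff
    clusters = Butina.ClusterData(rms, len(conf_ids), rms_threshold, isDistData=True)
    return mol, conf_ids, clusters


mol, conf_ids, clusters = cluster_conformers("CCCCO", n_confs=10)  # 1-butanol
print(f"{len(conf_ids)} conformers in {len(clusters)} clusters")
```

Each entry in `clusters` is a tuple of conformer indices, with the cluster centroid listed first.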

This tool enables users to compute key molecular properties, visualize 2D and 3D structures, and perform clustering based on molecular similarity or conformer geometry. It also offers filtering options to help select clusters matching specific physicochemical criteria.
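The property calculation mentioned above can be approximated directly with RDKit. The following is a minimal sketch, not the app's actual implementation: the helper name `compute_properties` and the choice of exactly these three descriptors are assumptions.

```python
from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, rdMolDescriptors


def compute_properties(smiles: str) -> dict:
    """Return a few common descriptors for one SMILES string (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return {
        "MW": Descriptors.MolWt(mol),              # average molecular weight, g/mol
        "logP": Crippen.MolLogP(mol),              # Wildman-Crippen logP estimate
        "TPSA": rdMolDescriptors.CalcTPSA(mol),    # topological polar surface area
    }


props = compute_properties("CCO")  # ethanol
print(props)
```

Running the same helper over every row of an uploaded file would yield the kind of descriptor table that clustering can then operate on.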

🌟 Features

  • Upload .sdf, .mol, or .csv files with SMILES
  • Compute key molecular properties (MW, logP, TPSA, etc.)
  • Visualize molecules in 2D (RDKit) and interactive 3D (py3Dmol)
  • Reduce dimensionality with PCA and auto-optimize KMeans clustering
  • Click points on the PCA plot to inspect molecules and properties
  • Export cluster data to .csv
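As an illustration of the PCA and KMeans step listed above, here is a small scikit-learn sketch. How ChemCluster "auto-optimizes" KMeans is not stated here, so picking the number of clusters by silhouette score is an assumption, as are the function name `cluster_features` and the synthetic input data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def cluster_features(features, k_range=range(2, 7), seed=0):
    """Project features to 2D with PCA, then pick k by silhouette score."""
    scaled = StandardScaler().fit_transform(features)
    coords = PCA(n_components=2, random_state=seed).fit_transform(scaled)

    best_k, best_score, best_labels = None, -1.0, None
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(coords)
        score = silhouette_score(coords, labels)  # higher means better-separated clusters
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return coords, best_k, best_labels


# Synthetic stand-in for a descriptor matrix: three well-separated groups
rng = np.random.default_rng(0)
features = np.vstack(
    [rng.normal(loc=center, scale=0.3, size=(30, 4)) for center in (0.0, 5.0, 10.0)]
)
coords, best_k, labels = cluster_features(features)
print(f"Selected k = {best_k}")
```

The returned `coords` are the 2D points one would plot, and `labels` assigns each molecule to a cluster.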

🛠️ Installation

  1. Install from PyPI:
pip install chemcluster
  2. Run the app:
chemcluster

This will open the ChemCluster interface in your browser.

To contribute or run locally from source:

git clone https://github.com/erubbia/ChemCluster.git
cd ChemCluster
conda env create -f environment.yml
conda activate chemcluster-env
pip install -e .

▶️ Testing

Testing can be done with 'pytest' or 'tox':

pytest
# or with tox
tox

📖 Usage

  • Analyze a single molecule by inputting a SMILES string or drawing the structure
  • Upload a dataset of molecules to perform PCA and clustering
  • Click on any point in the scatter plot to view its structure and properties
  • Use filters to identify clusters with desirable properties (e.g., high logP, low MW)
  • Export selected clusters as CSV files for further analysis
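The filter and export steps above can be pictured with a few lines of standard-library Python. This is only a sketch under stated assumptions: the column names, the thresholds, the helper names `clusters_matching` and `to_csv`, and the rule of filtering on per-cluster mean values are all hypothetical, and the property values are rounded, illustrative numbers rather than ChemCluster output.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Illustrative rows: one molecule per row with its cluster label and properties
rows = [
    {"smiles": "CCO", "cluster": 0, "MW": 46.07, "logP": 0.00},
    {"smiles": "CCCO", "cluster": 0, "MW": 60.10, "logP": 0.39},
    {"smiles": "c1ccccc1", "cluster": 1, "MW": 78.11, "logP": 1.69},
    {"smiles": "Cc1ccccc1", "cluster": 1, "MW": 92.14, "logP": 2.00},
]


def clusters_matching(rows, max_mw, min_logp):
    """Return IDs of clusters whose mean MW and mean logP pass both thresholds."""
    by_cluster = defaultdict(list)
    for row in rows:
        by_cluster[row["cluster"]].append(row)
    return [
        cluster_id
        for cluster_id, members in sorted(by_cluster.items())
        if mean(m["MW"] for m in members) <= max_mw
        and mean(m["logP"] for m in members) >= min_logp
    ]


def to_csv(rows, cluster_ids):
    """Serialize the rows belonging to the selected clusters as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["smiles", "cluster", "MW", "logP"])
    writer.writeheader()
    for row in rows:
        if row["cluster"] in cluster_ids:
            writer.writerow(row)
    return buffer.getvalue()


selected = clusters_matching(rows, max_mw=100.0, min_logp=1.0)
csv_text = to_csv(rows, selected)
print(csv_text)
```

Writing `csv_text` to a file gives a table that can be opened in a spreadsheet or loaded back for further analysis.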

📂 License

MIT License


👨‍🔬 Developers

  • Elisa Rubbia, Master's student in Molecular and Biological Chemistry at EPFL (GitHub: erubbia)

  • Romain Guichonnet, Master's student in Molecular and Biological Chemistry at EPFL (GitHub: Romainguich)

  • Flavia Zabala Perez, Master's student in Molecular and Biological Chemistry at EPFL (GitHub: Flaviazab)
