
A comprehensive Python package for quantitative structure-activity/property relationship (QSAR/QSPR) studies of molecules, with interactive SHAP visualization and AI-powered analysis


Surfacia

Surface Atomic Chemical Interaction Analyzer for descriptor extraction and interpretable machine learning in computational chemistry.

At a Glance

  • End-to-end workflow: SMILES -> 3D -> xTB/Gaussian -> Multiwfn -> descriptors -> ML/SHAP
  • CLI-first design for local and remote Linux/HPC usage
  • Supports full workflow mode and modular step-by-step execution
  • Includes SPES-C, a SHAP-landscape candidate-prioritization layer for high-value test-set ranking

Current Release

  • Latest stable version: 3.0.3
  • Python requirement: >= 3.9
  • License: MIT

PyPI: https://pypi.org/project/surfacia/

Quick Install (Recommended for Most Users)

# optional but recommended
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

pip install --upgrade pip
pip install surfacia

Check installation:

surfacia --help
python -c "import surfacia; print(surfacia.__version__)"

Important Compatibility Note (ML Stage)

To avoid ML parsing failures such as:

could not convert string to float: '[-3.xxxE0]'

use the validated dependency range:

  • xgboost>=2.1.4,<3.0.0
  • shap>=0.48.0,<0.49.0

Quick fix (pins the lowest versions in the validated range):

pip install --force-reinstall "xgboost==2.1.4" "shap==0.48.0"
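
Before running the ML stage, it can help to confirm that the installed versions fall inside the range above. The following is a minimal sketch and not part of Surfacia: the helper names (`parse`, `in_range`, `check_environment`) are hypothetical, and the version parser is deliberately simple (it reads only the first three numeric components and ignores suffixes such as `rc1`).

```python
from importlib import metadata

# Validated ranges copied from the compatibility note:
# (inclusive lower bound, exclusive upper bound).
VALIDATED = {
    "xgboost": ("2.1.4", "3.0.0"),
    "shap": ("0.48.0", "0.49.0"),
}


def parse(version):
    """Turn '2.1.4' into (2, 1, 4); trailing non-digits in a component are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def in_range(version, low, high):
    """True when low <= version < high, compared component by component."""
    return parse(low) <= parse(version) < parse(high)


def check_environment():
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for name, (low, high) in VALIDATED.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not in_range(installed, low, high):
            problems.append(f"{name} {installed} is outside >={low},<{high}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["xgboost and shap are within the validated range"]:
        print(line)
```

If the script reports a problem, the quick-fix command above should bring both packages back into the validated range.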

External Software Requirements

Surfacia orchestrates external quantum-chemistry tools. Ensure these commands are available in your PATH:

  • xtb
  • g16 (and formchk)
  • Multiwfn or Multiwfn_noGUI
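
A quick way to verify the PATH requirement is to look each command up with the standard library. This is a minimal sketch, not a Surfacia feature: `REQUIRED_TOOLS` and `find_tools` are hypothetical names, and the check only confirms that an executable with that name can be found, not that it runs correctly or is licensed.

```python
import shutil

# Each entry lists acceptable command names; one match per entry is enough.
REQUIRED_TOOLS = [
    ("xtb",),
    ("g16",),
    ("formchk",),
    ("Multiwfn", "Multiwfn_noGUI"),
]


def find_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return {label: resolved path or None} for each required tool group."""
    found = {}
    for names in required:
        label = " or ".join(names)
        # Take the first alternative that resolves to an executable in PATH.
        found[label] = next((path for path in map(which, names) if path), None)
    return found


if __name__ == "__main__":
    for label, path in find_tools().items():
        print(f"{label:28s} {path or 'NOT FOUND in PATH'}")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is what you want.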

Minimal Usage

Run full workflow:

surfacia workflow -i molecules.csv --test-samples "1,3"

Run ML only (for existing descriptor table):

surfacia ml-analysis -i FinalFull_Mode3_20_168.csv \
  --max-features 1 --stepreg-runs 1 \
  --train-test-split 0.85 --epoch 32 --cores 8 \
  --test-samples "1,2,3"

Launch SHAP visualization with SPES overlay:

surfacia shap-viz \
  -i Training_Set_Detailed_Final_5feats_20260209_102423.csv \
  -x ./xyz_files \
  --test-csv Test_Set_Detailed_Final_5feats_20260209_102423.csv \
  --spes-csv SPES_Test_Set_Detailed_Final_5feats_20260209_102423.csv

SPES Candidate Prioritization

SPES is not a replacement regression model. It is a post-prediction prioritization layer built on the selected Surfacia model and its SHAP landscape.

  • The raw prediction from the selected model remains the main interpretable regression output.
  • SPES-C adds a conservative ranking score for high-value candidate discovery.
  • In the interactive SHAP dashboard, users can switch the external overlay from the raw test set to the SPES test-set layer.
  • When a test set is present, the ML analysis now automatically writes SPES_Test_Set_Detailed_*.csv and matching metadata JSON files.
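
Because the training, test, and SPES tables in the shap-viz example share one filename suffix, the overlay command can be assembled from the training CSV alone. The sketch below assumes that naming pattern holds in general, which is inferred from the example filenames rather than documented; `shap_viz_command` is a hypothetical helper. It uses only the flags shown above (`-i`, `-x`, `--test-csv`, `--spes-csv`) and does not check that the files exist.

```python
import shlex
from pathlib import Path

TRAINING_PREFIX = "Training_Set_Detailed_"


def shap_viz_command(training_csv, xyz_dir):
    """Build the `surfacia shap-viz` argument list for a training CSV.

    Assumption: the companion files sit next to the training CSV and differ
    only in their prefix (Test_Set_Detailed_ / SPES_Test_Set_Detailed_).
    """
    training = Path(training_csv)
    if not training.name.startswith(TRAINING_PREFIX):
        raise ValueError(f"expected a file named {TRAINING_PREFIX}*.csv, got {training.name}")
    suffix = training.name[len(TRAINING_PREFIX):]
    test_csv = training.with_name("Test_Set_Detailed_" + suffix)
    spes_csv = training.with_name("SPES_Test_Set_Detailed_" + suffix)
    return [
        "surfacia", "shap-viz",
        "-i", str(training),
        "-x", str(xyz_dir),
        "--test-csv", str(test_csv),
        "--spes-csv", str(spes_csv),
    ]


if __name__ == "__main__":
    cmd = shap_viz_command(
        "Training_Set_Detailed_Final_5feats_20260209_102423.csv", "./xyz_files"
    )
    # Print a shell-ready command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```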

Source Install (Developers)

git clone https://github.com/sym823808458/Surfacia.git
cd Surfacia
pip install -e .

Documentation

Citation

If Surfacia helps your research, please cite the project and, once it is available, the related publication.

Contact
