Smart consensus QSAR

Project description

QSARcons - smart searching for consensus of QSAR models

QSARcons is a package designed to identify optimal consensus combinations of QSAR models. The project is motivated by the large number of available chemical descriptors and machine learning methods, which can be combined into many different QSAR models. Selecting the most effective subset of these models and combining their predictions into a consensus can improve prediction accuracy and robustness.

Motivation

1. Simple design - unlike many existing frameworks, QSARcons focuses on simplicity and ease of use. It minimizes the number of parameters a user must adjust, making QSAR model construction more accessible and intuitive.

2. Traditional QSAR - QSARcons includes a wide range of traditional molecular descriptors and machine learning algorithms, providing a transparent baseline for comparison with more advanced approaches such as deep learning or more complex QSAR workflows.

3. Universal workflow - QSARcons can be applied to any type of chemical property modeling.

Overview

QSARcons provides a two-layer workflow.

1. Model generation

Build multiple QSAR models (>100) using 2D chemical descriptors and traditional machine learning algorithms. The individual model building pipeline is kept simple, without advanced data preprocessing. Optional in-house stepwise hyperparameter optimization is available for all ML methods.

2. Consensus search

Identify the optimal subset of QSAR models using several search strategies:

  • Random search

  • Systematic search

  • Genetic search
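The search strategies listed above can be illustrated with a minimal sketch. It assumes that a consensus is the plain average of the selected models' validation predictions and that subsets are scored by RMSE; both choices are illustrative assumptions. The function names (`consensus`, `systematic_search`, `random_search`) and the toy predictions are hypothetical and are not part of the QSARcons API. Genetic search is omitted for brevity.

```python
import itertools
import random
from statistics import fmean


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred)) ** 0.5


def consensus(preds, subset):
    """Average the predictions of the selected models, molecule by molecule."""
    n_molecules = len(preds[subset[0]])
    return [fmean(preds[name][i] for name in subset) for i in range(n_molecules)]


def systematic_search(preds, y_val, max_size=3):
    """Score every subset of up to max_size models and keep the best one."""
    names = sorted(preds)
    best_subset, best_score = None, float("inf")
    for size in range(1, max_size + 1):
        for subset in itertools.combinations(names, size):
            score = rmse(y_val, consensus(preds, subset))
            if score < best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


def random_search(preds, y_val, size=2, n_iter=50, seed=0):
    """Score n_iter randomly drawn subsets of a fixed size and keep the best one."""
    rng = random.Random(seed)
    names = sorted(preds)
    best_subset, best_score = None, float("inf")
    for _ in range(n_iter):
        subset = tuple(sorted(rng.sample(names, size)))
        score = rmse(y_val, consensus(preds, subset))
        if score < best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Toy validation set: true values and the predictions of three hypothetical models.
y_val = [1.0, 2.0, 3.0, 4.0]
val_preds = {
    "model_a": [1.5, 2.5, 3.5, 4.5],  # consistently too high
    "model_b": [0.5, 1.5, 2.5, 3.5],  # consistently too low
    "model_c": [1.0, 2.0, 3.0, 6.0],  # one large error
}

best_subset, best_rmse = systematic_search(val_preds, y_val)
rand_subset, rand_rmse = random_search(val_preds, y_val)
print(best_subset, best_rmse)
```

In this toy case the opposite biases of `model_a` and `model_b` cancel, so their average beats every single model. Systematic search is exhaustive and therefore only practical for small subset sizes; random search trades that guarantee for a fixed evaluation budget.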

Installation

pip install qsarcons

QSARcons benchmarking

QSARcons can be easily benchmarked against alternative approaches. To do so, call the default pipeline function shown below. Each input is a dataframe whose first column holds the molecule SMILES and whose second column holds the molecule property (a continuous value for regression or a binary label for classification).

import polaris
from sklearn.model_selection import train_test_split
from qsarcons.cli import run_qsarcons

# Load Polaris benchmark
benchmark = polaris.load_benchmark("tdcommons/caco2-wang")
data_train, data_test = benchmark.get_train_test_split()

df_train, df_test = data_train.as_dataframe(), data_test.as_dataframe()
df_train, df_val = train_test_split(df_train, test_size=0.2, random_state=42)

# Run QSARcons
test_pred = run_qsarcons(df_train, df_val, df_test, task="regression", output_folder="results")

# Evaluate predictions
results = benchmark.evaluate(test_pred)
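
The input format described above (SMILES in the first column, property in the second) can be sketched with a small hand-made dataframe. The column names, the made-up property values, and the `check_input_frame` helper are hypothetical and used for illustration only; QSARcons itself is not called here.

```python
import pandas as pd


def check_input_frame(df: pd.DataFrame) -> None:
    """Hypothetical sanity check: column 0 holds SMILES strings, column 1 a numeric property."""
    if df.shape[1] < 2:
        raise ValueError("expected at least two columns: SMILES and property")
    if not df.iloc[:, 0].map(lambda s: isinstance(s, str)).all():
        raise ValueError("first column must contain SMILES strings")
    if not pd.api.types.is_numeric_dtype(df.iloc[:, 1]):
        raise ValueError("second column must be numeric")


# Toy regression data; the property values are made up.
df_example = pd.DataFrame(
    {
        "smiles": ["CCO", "c1ccccc1", "CC(=O)O"],
        "property": [0.31, 1.20, -0.17],
    }
)
check_input_frame(df_example)
```

For binary classification the second column would hold 0/1 labels instead of continuous values, which passes the same check.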

Colab

See an example in QSARcons pipeline.
