A library to quickly build QSAR models

Ersilia's LazyQSAR

A library to quickly build supervised models for chemistry.

Installation

Install LazyQSAR from source:

git clone https://github.com/ersilia-os/lazy-qsar.git
cd lazy-qsar
python -m pip install -e .

To use the default LazyQSAR descriptors, install the descriptors extra (the quotes keep shells such as zsh from expanding the brackets):

python -m pip install -e ".[descriptors]"

To use a light version of TuneTables as an estimator, also install:

pip install "lazyqsar[tune-tables]"
pip install "git+https://github.com/ersilia-os/TuneTablesLight.git@main"

Binary Classification

LazyQSAR's binary classifier can run either with default descriptors or with custom descriptors passed by the user.

Built-in descriptors

Instantiate the LazyBinaryQSAR class with one of the available descriptors (Chemeleon or Morgan fingerprints) and one of the available estimators (Logistic Regression, Random Forest or TuneTables), then fit the model and predict:

import lazyqsar

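# smiles_train, smiles_test: SMILES strings; y_train: binary labels (user-supplied)
model = lazyqsar.LazyBinaryQSAR(
    descriptor_type="chemeleon", model_type="logistic_regression"
)
model.fit(smiles_train, y_train)
model.save_model(model_path)  # model_path: user-chosen destination for the fitted model
y_hat = model.predict_proba(smiles_test)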
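
The snippet above assumes that smiles_train, y_train and smiles_test already exist. As a minimal sketch of one way to prepare them, the following splits a labelled list of SMILES with a seeded shuffle. The helper split_smiles, the toy molecules and their labels are hypothetical and not part of LazyQSAR.

```python
import random


def split_smiles(smiles, labels, test_fraction=0.25, seed=0):
    """Shuffle paired SMILES/labels and split them into train and test sets.

    Hypothetical helper for illustration; it is not part of LazyQSAR.
    """
    if len(smiles) != len(labels):
        raise ValueError("smiles and labels must have the same length")
    indices = list(range(len(smiles)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = max(1, round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([smiles[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([smiles[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


# Toy data with made-up binary labels, for illustration only.
smiles = ["CCO", "c1ccccc1", "CC(=O)O", "CCN", "CCCC", "O=C=O", "CCCl", "c1ccncc1"]
labels = [1, 0, 1, 1, 0, 0, 1, 0]

(smiles_train, y_train), (smiles_test, y_test) = split_smiles(smiles, labels)
print(len(smiles_train), len(smiles_test))
```

A random split is the simplest choice; for real QSAR work a scaffold-based or time-based split may give a more realistic estimate of performance.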

Custom-made descriptors

Pre-calculate your descriptors with your preferred method. We recommend the Ersilia Model Hub for this. You can pass LazyQSAR either the .h5 file generated by Ersilia or an array of descriptors.

import lazyqsar

X_train = "my_descriptors"  # path to the training descriptors
X_test = "my_descriptors"  # path to the test descriptors

model = lazyqsar.LazyBinaryClassifier(
    model_type="logistic_regression"
)
model.fit(X_train, y_train)
model.save_model(model_path)
y_hat = model.predict_proba(X_test)
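
Since the pipeline also accepts an array of descriptors, a descriptor matrix can be assembled by hand. The sketch below builds a toy matrix from simple character counts on each SMILES string. The function toy_descriptors and its features are hypothetical, chemically naive stand-ins for real descriptors such as those from the Ersilia Model Hub. Converting the result with numpy.asarray before calling fit is an assumption about the expected input type.

```python
def toy_descriptors(smiles_list, symbols=("C", "N", "O", "c", "=")):
    """Return one row per molecule: string length plus a count per symbol.

    Hypothetical features for illustration only. Counts are plain substring
    counts, so "Cl" also contributes to the "C" column.
    """
    rows = []
    for smi in smiles_list:
        row = [float(len(smi))]
        row.extend(float(smi.count(sym)) for sym in symbols)
        rows.append(row)
    return rows


# Every row has the same number of columns, as a descriptor matrix requires.
X_train = toy_descriptors(["CCO", "c1ccccc1", "CC(=O)O"])
print(X_train[0])
```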

Tests and benchmarks

In the /benchmark folder you will find the performance of the default estimators and descriptors on the TDCommons ADMET dataset. The /tests folder contains a quick implementation of the methods described above, so you can easily check the effect of any change in the code. The Bioavailability dataset is used as an example.
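
Binary classifiers are commonly compared by ROC AUC. Whether the /benchmark folder reports this exact metric is an assumption here. The sketch below computes it in plain Python as the fraction of (positive, negative) pairs ranked correctly, with ties counted as half. It expects one positive-class score per molecule, so if predict_proba returns one column per class, the positive-class column would need to be selected first (the output shape is not stated in this README).

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the probability that a positive outranks a negative.

    Illustrative O(n_pos * n_neg) implementation; not part of LazyQSAR.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Made-up labels and scores, for illustration only.
y_test = [1, 0, 1, 0]
y_hat = [0.9, 0.2, 0.4, 0.6]
print(roc_auc(y_test, y_hat))  # 3 of 4 pairs ranked correctly -> 0.75
```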

Disclaimer

This library is intended only for quick-and-dirty QSAR modeling. For more complete automated QSAR modeling, please refer to ZairaChem.

About us

Learn about the Ersilia Open Source Initiative!
