A library to quickly build QSAR models

Ersilia's LazyQSAR

A library for quickly building supervised models for chemistry.

Installation

Install LazyQSAR from source:

git clone https://github.com/ersilia-os/lazy-qsar.git
cd lazy-qsar
python -m pip install -e .

To use the default LazyQSAR descriptors, install the optional descriptors extra (the quotes keep shells such as zsh from interpreting the square brackets):

python -m pip install -e ".[descriptors]"

This command enables descriptor (featurizer) calculation. The first time you run LazyQSAR, it will download the Chemeleon and CDDD checkpoints and install other dependencies. To complete this setup upfront, run:

lazyqsar-setup

Binary Classification

LazyQSAR's binary classifier can run either with default descriptors or with custom descriptors passed by the user.

Built-in descriptors

Instantiate the LazyBinaryQSAR class with a mode of choice (fast, default, slow):

from lazyqsar.qsar import LazyBinaryQSAR

model = LazyBinaryQSAR(mode="fast")
model.fit(smiles_list=smiles_train, y=y_train)
y_hat = model.predict_proba(smiles_list=smiles_test)[:,1]
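The snippet above assumes that smiles_train, y_train and smiles_test already exist. As a minimal sketch of one way to produce them, the helper below performs a stratified train/test split using only the standard library. The function name stratified_split, its parameters and the toy data are hypothetical and not part of LazyQSAR.

```python
import random


def stratified_split(smiles, labels, test_fraction=0.2, seed=42):
    """Split SMILES and binary labels into train/test sets, preserving class balance."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        # Indices of all molecules belonging to this class
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Keep at least one molecule of each class in the test set
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    train_idx.sort()
    test_idx.sort()

    def pick(seq, ids):
        return [seq[i] for i in ids]

    return (
        pick(smiles, train_idx),
        pick(smiles, test_idx),
        pick(labels, train_idx),
        pick(labels, test_idx),
    )


# Toy data for illustration only; the labels are arbitrary, not measured activities.
smiles = ["CCO", "CCN", "CCC", "c1ccccc1", "CC(=O)O",
          "CCCl", "CCBr", "COC", "CC=O", "C1CCCCC1"]
y = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

smiles_train, smiles_test, y_train, y_test = stratified_split(smiles, y)
```

For real datasets, an established splitter (for example a scaffold-based split) may be more appropriate than a random stratified split.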

Custom-made descriptors

Pre-calculate your descriptors using your preferred method. We recommend the Ersilia Model Hub for this. The .h5 files generated by Ersilia can be passed directly to the LazyQSAR pipeline. Alternatively, pass the descriptors as an in-memory array.

from lazyqsar.agnostic import LazyBinaryClassifier

model = LazyBinaryClassifier()
model.fit(X=X_train, y=y_train)
y_hat = model.predict_proba(X=X_test)[:,1]
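Once y_hat is available, a common next step is to score it against held-out labels. The sketch below computes the ROC AUC from scratch with the rank-based (Mann-Whitney) formulation, so it needs no extra dependencies. The roc_auc helper and the toy values are illustrative assumptions, not part of LazyQSAR and not real model output.

```python
def roc_auc(y_true, y_score):
    """Rank-based ROC AUC: the fraction of (positive, negative) pairs
    in which the positive is scored higher, counting ties as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("Both classes are needed to compute ROC AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and probabilities for illustration only.
y_test = [1, 0, 1, 0]
y_hat = [0.9, 0.2, 0.4, 0.6]

auc = roc_auc(y_test, y_hat)
print(f"ROC AUC: {auc:.2f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for a quick check. For larger test sets, scikit-learn's roc_auc_score is the more practical choice.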

Using saved models at inference time

By default, models are saved as ONNX files. Once a model has been trained and saved, you can load it through an artifact class. In that case, the only essential dependency is the ONNX runtime.

To save a model, simply run:

model.save(model_dir)

This creates a folder containing ONNX files, which you can then load with the artifact class:

from lazyqsar.artifacts import LazyBinaryClassifierArtifact

model = LazyBinaryClassifierArtifact.load(model_dir)
y_hat = model.predict_proba(X=X)[:,1]
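Before loading an artifact on another machine, it can help to confirm that the saved folder actually contains ONNX files. The helper below is a small sketch using only the standard library. It assumes the files carry the .onnx extension and makes no assumption about their names or how the folder is organized; list_onnx_files is a hypothetical name, not a LazyQSAR function.

```python
from pathlib import Path


def list_onnx_files(model_dir):
    """Return the sorted relative paths of .onnx files found anywhere under model_dir."""
    root = Path(model_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Model folder not found: {root}")
    # rglob searches the folder recursively, so nested layouts are covered too
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.onnx"))
```

An empty list suggests that the save step did not complete or that model_dir points to the wrong location.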

Tests and benchmarks

Quick testing

The /tests folder contains a quick implementation of the methods described above, so you can check that the code works. It uses the Bioavailability dataset and Chemeleon descriptors as an example.

python test/test_binary_classification.py
python test/test_binary_classification.py --agnostic

Benchmarking

The benchmark repository reports the performance of the default estimators and descriptors on the TDCommons ADMET dataset. This benchmark is provisional; the team is working on a more exhaustive one.

Disclaimer

This library is intended only for quick-and-dirty QSAR modeling. For more complete automated QSAR modeling, please refer to ZairaChem.

About us

Learn about the Ersilia Open Source Initiative!
