Neural network virtual screening for antileishmanial activity against Leishmania donovani

leishmania-screen

The model was trained on a curated dataset of 6,699 compounds and achieves a ROC-AUC of 0.884 on the held-out test set.


Installation

pip install leishmania-screen

RDKit note: RDKit is listed as a dependency (rdkit>=2023.3). If your environment manages RDKit via conda, install it there first:

conda install -c conda-forge rdkit
pip install leishmania-screen
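
Before installing, it can help to confirm that RDKit is importable and recent enough. The following is a minimal sketch, not part of the package: the helper names `module_available` and `meets_minimum` are hypothetical, and the check assumes RDKit's usual `year.month.patch` version strings.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be imported in this environment."""
    return importlib.util.find_spec(name) is not None


def meets_minimum(version: str, minimum: tuple = (2023, 3)) -> bool:
    """Compare the leading 'year.month' part of an RDKit-style version string."""
    year, month = version.split(".")[:2]
    return (int(year), int(month)) >= minimum


if module_available("rdkit"):
    import rdkit  # only imported when it is actually present

    status = "OK" if meets_minimum(rdkit.__version__) else "older than 2023.3"
    print(f"RDKit {rdkit.__version__}: {status}")
else:
    print("RDKit not found; install it before leishmania-screen")
```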

Quick start — Python API

from leishmania_screen import predict

# Single compound
result = predict("CCO")
print(result.label)        # "Active" or "Inactive"
print(result.probability)  # float in [0, 1]

# Batch (a SMILES that cannot be parsed is returned with label "Invalid" and an error)
results = predict(["CCO", "CC(=O)Oc1ccccc1C(=O)O", "not_valid"])
for r in results:
    print(r.smiles, r.label, r.probability, r.error)

PredictionResult fields

Field        Type           Description
smiles       str            The input SMILES string
label        str            "Active", "Inactive", or "Invalid"
probability  float | None   Sigmoid output of the model (0–1); None for invalid inputs
error        str | None     Reason for invalidity; None for valid inputs
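
The fields above are enough to separate scored compounds from rejected inputs and to rank the predicted actives. The sketch below uses a hypothetical stand-in dataclass that only mirrors the documented fields, so it runs without the package installed. The helpers `split_results` and `rank_actives`, the example values, and the error text are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PredictionResult:
    """Hypothetical stand-in mirroring the documented fields; not the package's class."""
    smiles: str
    label: str                      # "Active", "Inactive", or "Invalid"
    probability: Optional[float]    # None for invalid inputs
    error: Optional[str]            # None for valid inputs


def split_results(results):
    """Separate scored compounds from inputs that could not be processed."""
    valid = [r for r in results if r.label != "Invalid"]
    invalid = [r for r in results if r.label == "Invalid"]
    return valid, invalid


def rank_actives(results):
    """Return predicted actives, highest probability first."""
    actives = [r for r in results if r.label == "Active"]
    return sorted(actives, key=lambda r: r.probability, reverse=True)


# Made-up values for illustration only; they are not model outputs.
demo = [
    PredictionResult("CCO", "Inactive", 0.12, None),
    PredictionResult("c1ccccc1", "Active", 0.71, None),
    PredictionResult("not_valid", "Invalid", None, "could not parse SMILES"),
    PredictionResult("CCN", "Active", 0.93, None),
]

valid, invalid = split_results(demo)
print([r.smiles for r in rank_actives(valid)])
print([(r.smiles, r.error) for r in invalid])
```

With the real package, the list returned by `predict([...])` could be passed to the same helpers, provided its objects expose these four attributes as documented.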

Quick start — Command line

# Single SMILES
leishscreen --smiles "CC(=O)Oc1ccccc1C(=O)O"

# Batch from CSV (must have a column named 'smiles')
leishscreen --file compounds.csv --output results.csv

# Batch from plain text (one SMILES per line)
leishscreen --file smiles.txt --output results.csv

# Custom column name
leishscreen --file library.csv --smiles-col SMILES --output results.csv

# Version
leishscreen --version
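
For scripted runs, the input file and the command line can be prepared from Python. This is a rough sketch using only the flags shown above. The helper names `write_smiles_csv` and `build_command` are hypothetical, and the command is assembled but not executed.

```python
import csv
import io
import shlex


def write_smiles_csv(handle, smiles_list, column="smiles"):
    """Write one SMILES per row under a single header column."""
    writer = csv.writer(handle)
    writer.writerow([column])
    for smi in smiles_list:
        writer.writerow([smi])


def build_command(infile, outfile, smiles_col=None):
    """Assemble the leishscreen argument list using only the documented flags."""
    cmd = ["leishscreen", "--file", infile]
    if smiles_col is not None:
        cmd += ["--smiles-col", smiles_col]
    cmd += ["--output", outfile]
    return cmd


# An in-memory buffer stands in for open("compounds.csv", "w", newline="").
buffer = io.StringIO()
write_smiles_csv(buffer, ["CCO", "CC(=O)Oc1ccccc1C(=O)O"])
print(buffer.getvalue())

cmd = build_command("library.csv", "results.csv", smiles_col="SMILES")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once `leishscreen` is on the PATH.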

Model details

Item                      Value
Target                    Leishmania donovani (binary: Active / Inactive)
Training dataset          6,699 compounds (2,574 active, 4,125 inactive)
Feature pipeline          218 RDKit descriptors + 2,728 fingerprint bits → 900-feature MI selection → StandardScaler → PCA (100 components)
Architecture              Linear(100→512)→BN→GELU→Drop(0.30) → Linear(512→256)→BN→GELU→Drop(0.25) → Linear(256→128)→GELU→Drop(0.15) → Linear(128→1)
Loss                      BCEWithLogitsLoss with class-imbalance pos_weight
Optimizer                 AdamW (lr=8e-4, weight_decay=2e-4) + gradient clipping
Scheduler                 ReduceLROnPlateau (factor=0.5, patience=5)
Training strategy         Multi-seed ensemble (seeds 42, 52, 62); early stopping (patience=18)
Classification threshold  0.60 (optimised on validation set: precision ≥ 0.70, max F1)
Test ROC-AUC              0.884
Test PR-AUC               0.828
Test Accuracy             0.815
Test Balanced Accuracy    0.810
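
Three entries in the table lend themselves to a short worked example: the class-imbalance weight, the multi-seed ensemble, and the 0.60 threshold. The sketch below rests on assumptions the table does not confirm: that `pos_weight` follows the common negatives-per-positive convention (computed here from the full-dataset counts rather than the training split), that the three seed models are combined by averaging their probabilities, and that the threshold is inclusive.

```python
N_ACTIVE, N_INACTIVE = 2574, 4125   # full-dataset counts from the table
THRESHOLD = 0.60                    # classification threshold from the table


def pos_weight(n_pos: int, n_neg: int) -> float:
    """Negatives per positive, a common choice for BCEWithLogitsLoss (assumed)."""
    return n_neg / n_pos


def ensemble_probability(per_seed_probs):
    """Average the per-seed sigmoid outputs (assumed combination rule)."""
    return sum(per_seed_probs) / len(per_seed_probs)


def classify(probability: float, threshold: float = THRESHOLD) -> str:
    """Label a compound; treating the threshold as inclusive is an assumption."""
    return "Active" if probability >= threshold else "Inactive"


print(round(pos_weight(N_ACTIVE, N_INACTIVE), 3))

# Made-up per-seed outputs for seeds 42, 52, 62; not real model predictions.
p = ensemble_probability([0.55, 0.70, 0.64])
print(round(p, 2), classify(p))
```

A threshold above 0.5 trades some recall for precision, which fits the stated validation criterion of precision ≥ 0.70.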

Fingerprints used

Type                 Parameters            Bits
Morgan (ECFP-like)   radius=2              512
Avalon               -                     512
Topological Torsion  -                     512
Atom Pair            -                     512
MACCS Keys           -                     167
RDKit Path-Based     minPath=5, maxPath=7  512
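
To see how these blocks could sit side by side in one feature vector, the sketch below computes the slice each fingerprint would occupy after concatenation. Concatenating in table order is an assumption, and the key names and `block_slices` helper are hypothetical. The package's actual layout may differ.

```python
# Bit lengths copied from the table; the ordering is an assumption.
FINGERPRINT_BITS = {
    "morgan": 512,
    "avalon": 512,
    "topological_torsion": 512,
    "atom_pair": 512,
    "maccs": 167,
    "rdkit_path": 512,
}


def block_slices(sizes):
    """Map each fingerprint name to its slice within the concatenated vector."""
    slices, start = {}, 0
    for name, n_bits in sizes.items():
        slices[name] = slice(start, start + n_bits)
        start += n_bits
    return slices, start


slices, total_bits = block_slices(FINGERPRINT_BITS)
for name, s in slices.items():
    print(f"{name:20s} bits {s.start:4d}..{s.stop - 1:4d}")
print("total fingerprint bits:", total_bits)
```

Summed this way the table gives 2,727 bits, one fewer than the 2,728 quoted under Model details. The source does not explain the difference.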

Citation

If you use this package in your research, please cite:

[Citation to be added upon publication]

License

MIT
