Neural network virtual screening for antileishmanial activity against Leishmania donovani

leishmania-screen

Neural network–based virtual screening for antileishmanial activity against Leishmania donovani.

The model was trained on a curated dataset of 6,699 compounds and achieves a ROC-AUC of 0.884 on the held-out test set.


Installation

pip install leishmania-screen

RDKit note: RDKit is listed as a dependency (rdkit>=2023.3). If your environment manages RDKit via conda, install it there first:

conda install -c conda-forge rdkit
pip install leishmania-screen

Quick start — Python API

from leishmania_screen import predict

# Single compound
result = predict("CCO")
print(result.label)        # "Active" or "Inactive"
print(result.probability)  # float in [0, 1]

# Batch
results = predict(["CCO", "CC(=O)Oc1ccccc1C(=O)O", "not_valid"])
for r in results:
    print(r.smiles, r.label, r.probability, r.error)

PredictionResult fields

Field        Type          Description
-----------  ------------  ----------------------------------------------------------
smiles       str           The input SMILES string
label        str           "Active", "Inactive", or "Invalid"
probability  float | None  Sigmoid output of the model (0–1); None for invalid inputs
error        str | None    Reason for invalidity; None for valid inputs
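The field table suggests that a batch can be split by label after the fact, since invalid inputs carry an "Invalid" label and an error message. The sketch below shows one way to do that. It uses a stand-in dataclass that mirrors the documented fields so it runs without the package installed; the `split_results` helper, the sample probabilities, and the error text are hypothetical and are not taken from leishmania-screen.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PredictionResult:
    """Stand-in mirroring the documented fields (not the package's own class)."""
    smiles: str
    label: str                    # "Active", "Inactive", or "Invalid"
    probability: Optional[float]  # None for invalid inputs
    error: Optional[str]          # None for valid inputs


def split_results(
    results: List[PredictionResult],
) -> Tuple[List[PredictionResult], List[PredictionResult]]:
    """Hypothetical helper: separate scored compounds from invalid inputs."""
    valid = [r for r in results if r.label != "Invalid"]
    invalid = [r for r in results if r.label == "Invalid"]
    # Rank scored compounds from highest to lowest predicted probability.
    valid.sort(key=lambda r: r.probability, reverse=True)
    return valid, invalid


# Made-up values for illustration; real ones would come from predict([...]).
results = [
    PredictionResult("CCO", "Inactive", 0.12, None),
    PredictionResult("CC(=O)Oc1ccccc1C(=O)O", "Active", 0.71, None),
    PredictionResult("not_valid", "Invalid", None, "could not parse SMILES"),
]

valid, invalid = split_results(results)
for r in valid:
    print(f"{r.smiles}\t{r.label}\t{r.probability:.2f}")
for r in invalid:
    print(f"skipped {r.smiles!r}: {r.error}")
```

With the real package, the list returned by `predict([...])` would take the place of the hand-built `results` list.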

Quick start — Command line

# Single SMILES
leishscreen --smiles "CC(=O)Oc1ccccc1C(=O)O"

# Batch from CSV (must have a column named 'smiles')
leishscreen --file compounds.csv --output results.csv

# Batch from plain text (one SMILES per line)
leishscreen --file smiles.txt --output results.csv

# Custom column name
leishscreen --file library.csv --smiles-col SMILES --output results.csv

# Version
leishscreen --version
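The CSV mode expects a column named 'smiles' unless --smiles-col says otherwise. As a minimal sketch, the following writes such a file with Python's standard csv module. The output location (a temporary directory) and the two example compounds are arbitrary choices for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Example compounds; replace with your own library.
smiles_list = ["CCO", "CC(=O)Oc1ccccc1C(=O)O"]

# Write to a temporary directory so the sketch leaves no files behind in cwd.
csv_path = Path(tempfile.mkdtemp()) / "compounds.csv"

with csv_path.open("w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["smiles"])  # header the CLI looks for by default
    for smi in smiles_list:
        writer.writerow([smi])

# Read the file back to confirm the header and row count.
with csv_path.open(newline="") as fh:
    rows = list(csv.DictReader(fh))

print(csv_path, len(rows), "rows")
```

The resulting file can then be passed to `leishscreen --file` as in the commands above.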

Model details

Item                      Value
------------------------  ------------------------------------------------------------
Target                    Leishmania donovani (binary: Active / Inactive)
Training dataset          6,699 compounds (2,574 active, 4,125 inactive)
Feature pipeline          218 RDKit descriptors + 2,728 fingerprint bits → 900-feature MI selection → StandardScaler → PCA (100 components)
Architecture              Linear(100→512)→BN→GELU→Drop(0.30) → Linear(512→256)→BN→GELU→Drop(0.25) → Linear(256→128)→GELU→Drop(0.15) → Linear(128→1)
Loss                      BCEWithLogitsLoss with class-imbalance pos_weight
Optimizer                 AdamW (lr=8e-4, weight_decay=2e-4) + gradient clipping
Scheduler                 ReduceLROnPlateau (factor=0.5, patience=5)
Training strategy         Multi-seed ensemble (seeds 42, 52, 62); early stopping (patience=18)
Classification threshold  0.60 (chosen on the validation set: maximum F1 subject to precision ≥ 0.70)
Test ROC-AUC              0.884
Test PR-AUC               0.828
Test Accuracy             0.815
Test Balanced Accuracy    0.810
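For readers who want to see the architecture row as code, here is a minimal PyTorch sketch built from the table. It is a reconstruction under assumptions, not the package's source: "BN" is read as BatchNorm1d and "Drop" as Dropout, the pos_weight value is taken as inactive/active counts from the full dataset, the `>=` comparison at the threshold is a guess, and `build_classifier` is a hypothetical name. The gradient-clipping norm and how the three seed models are combined are not stated, so they are left out.

```python
import torch
from torch import nn


def build_classifier(in_features: int = 100) -> nn.Sequential:
    """Hypothetical reconstruction of the layer stack in the table."""
    return nn.Sequential(
        nn.Linear(in_features, 512), nn.BatchNorm1d(512), nn.GELU(), nn.Dropout(0.30),
        nn.Linear(512, 256), nn.BatchNorm1d(256), nn.GELU(), nn.Dropout(0.25),
        nn.Linear(256, 128), nn.GELU(), nn.Dropout(0.15),
        nn.Linear(128, 1),  # single logit; the sigmoid is applied outside the model
    )


torch.manual_seed(42)
model = build_classifier()

# Training objects with the listed hyperparameters.
# Assumption: pos_weight = n_inactive / n_active (4,125 / 2,574).
criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([4125 / 2574]))
optimizer = torch.optim.AdamW(model.parameters(), lr=8e-4, weight_decay=2e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=5)

# Inference on random stand-in features (4 compounds, 100 PCA components).
model.eval()
x = torch.randn(4, 100)
with torch.no_grad():
    probs = torch.sigmoid(model(x).squeeze(1))

# Assumption: probabilities at or above 0.60 are labelled "Active".
labels = ["Active" if p >= 0.60 else "Inactive" for p in probs.tolist()]
print(probs.shape, labels)
```

The model is untrained here, so the printed labels carry no meaning; the point is only the tensor shapes and the order of the layers.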

Fingerprints used

Type                 Parameters            Bits
-------------------  --------------------  ----
Morgan (ECFP-like)   radius=2              512
Avalon               -                     512
Topological Torsion  -                     512
Atom Pair            -                     512
MACCS Keys           -                     167
RDKit Path-Based     minPath=5, maxPath=7  512
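As a rough illustration of the rows that list explicit parameters, the sketch below generates Morgan, RDKit path-based, and MACCS fingerprints with RDKit's own generators. It assumes bit-vector (not count) fingerprints and says nothing about how the package itself calls RDKit; the Avalon, Topological Torsion, and Atom Pair rows are omitted because only their bit lengths are given.

```python
from rdkit import Chem
from rdkit.Chem import MACCSkeys, rdFingerprintGenerator

# Aspirin, reused from the quick-start examples.
mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")

# Morgan (ECFP-like), radius 2, folded to 512 bits.
morgan_gen = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=512)
morgan_fp = morgan_gen.GetFingerprint(mol)

# RDKit path-based fingerprint over paths of 5 to 7 bonds, 512 bits.
path_gen = rdFingerprintGenerator.GetRDKitFPGenerator(minPath=5, maxPath=7, fpSize=512)
path_fp = path_gen.GetFingerprint(mol)

# MACCS keys have a fixed length in RDKit.
maccs_fp = MACCSkeys.GenMACCSKeys(mol)

for name, fp in [("Morgan", morgan_fp), ("RDKit path", path_fp), ("MACCS", maccs_fp)]:
    print(name, fp.GetNumBits(), "bits,", fp.GetNumOnBits(), "set")
```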

Citation

If you use this package in your research, please cite:

[Citation to be added upon publication]

License

MIT
