Alethio Therapeutics Python Toolkit

alethiotx

Alethio Therapeutics Python Toolkit - A growing collection of open-source computational tools used by Alethio Therapeutics.

Overview

alethiotx is a modular Python package providing specialized tools for therapeutic research and drug discovery. Currently, the package features the Artemis module for drug target prioritization using public knowledge graphs. Additional modules and capabilities will be added in future releases.

Current Modules

Artemis Module (alethiotx.artemis)

The Artemis module enables accessible and scalable drug target prioritization by integrating clinical trial data, the Therapeutic Target Database (TTD), pathway information, and machine learning models. It leverages public knowledge graphs to prioritize therapeutic targets across multiple disease areas.

Artemis Module Features

  • Clinical Trials: Query and analyze clinical trials data from ClinicalTrials.gov
  • TTD: Match clinical interventions with TTD drug information and targets
  • Pathway Genes: Retrieve and analyze pathway genes using the GeneShot API
  • Target Scoring: Calculate clinical target scores for drug targets based on trial phases and approvals
  • Machine Learning Pipeline: Built-in cross-validation for target prediction
  • Multi-Disease Support: Pre-configured for breast, lung, prostate, melanoma, bowel cancer, diabetes, and cardiovascular disease
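
To make the Target Scoring idea concrete, here is a minimal sketch of one way a per-target score could be built from trial phases and approvals. The `PHASE_WEIGHTS` table, the `APPROVAL_BONUS` value, the record layout, and the `score_targets` helper are all assumptions made for illustration; this is not the scoring scheme that `get_clinical_scores()` implements.

```python
from collections import defaultdict

# Hypothetical weights: later phases count more. Not the Artemis scheme.
PHASE_WEIGHTS = {"PHASE1": 1, "PHASE2": 2, "PHASE3": 3, "PHASE4": 4}
APPROVAL_BONUS = 5  # hypothetical bonus for a trial of an approved drug


def score_targets(records, include_approved=True):
    """Sum phase weights per target over (target, phase, approved) records."""
    scores = defaultdict(int)
    for target, phase, approved in records:
        # Unknown phase labels contribute nothing.
        scores[target] += PHASE_WEIGHTS.get(phase, 0)
        if include_approved and approved:
            scores[target] += APPROVAL_BONUS
    return dict(scores)


# Toy records, invented for this example only.
records = [
    ("ERBB2", "PHASE3", True),
    ("ERBB2", "PHASE2", False),
    ("CDK4", "PHASE1", False),
]
toy_scores = score_targets(records)
print(toy_scores)
```

Summing over trials means a target studied in many late-stage trials ranks above one seen in a single early trial, which is the intuition behind scoring by clinical development stage.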

Future Modules

Additional modules for various aspects of drug discovery and therapeutic research are planned for future releases. Stay tuned!

Installation

pip install alethiotx
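
After installing, it can be handy to confirm which version is present in the active environment. The sketch below uses only the standard library; the `installed_version` helper is a hypothetical name introduced here and is not part of alethiotx.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the version string if alethiotx is installed, otherwise None.
print(installed_version("alethiotx"))
```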

Quick Start

Note: The examples below demonstrate the Artemis module functionality. As new modules are added to the package, they will have their own usage examples.

1. Retrieve Clinical Trials Data

from alethiotx.artemis import get_clinical_trials, ttd, get_clinical_scores

# Query clinical trials for a specific indication
breast_trials = get_clinical_trials(search='Breast Cancer', last_6_years=True)

# Match trials with TTD to get target information
ttd_data = ttd(breast_trials)

# Calculate clinical development scores
scores = get_clinical_scores(ttd_data, include_approved=True)
print(scores.head())

2. Load Pre-computed Clinical Scores

from alethiotx.artemis import load_clinical_scores

# Load clinical scores for multiple diseases
breast, lung, prostate, melanoma, bowel, diabetes, cardio = load_clinical_scores(date='2025-11-11')

3. Pathway Gene Analysis

from alethiotx.artemis import get_pathway_genes, load_pathway_genes

# Query GeneShot for disease-associated genes
aml_genes = get_pathway_genes("acute myeloid leukemia")
print(aml_genes.loc["FLT3", ["gene_count", "rank"]])

# Get top pathway genes for diseases
breast_pg, lung_pg, prostate_pg, melanoma_pg, bowel_pg, diabetes_pg, cardio_pg = load_pathway_genes(n=100)
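
The `n=100` argument suggests that pathway genes are ranked and truncated to the top entries. As a rough illustration of that idea, the sketch below ranks genes by an association count and keeps the first `n`. It works on a plain dictionary rather than the table returned by `get_pathway_genes()`; the `top_n_genes` helper and the counts are invented for this example, and the tie-breaking rule is an assumption.

```python
def top_n_genes(gene_counts, n=100):
    """Return the n genes with the highest counts, highest first.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(gene_counts.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:n]]


# Invented counts, for illustration only.
toy_counts = {"FLT3": 40, "NPM1": 35, "DNMT3A": 35, "TP53": 12}
print(top_n_genes(toy_counts, n=3))
```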

4. Machine Learning Pipeline

from alethiotx.artemis import pre_model, cv_pipeline, roc_curve
import pandas as pd

# Prepare your knowledge graph features (X) and clinical scores (y).
# X, y, and pathway_genes are placeholders: define them from your own data first.
result = pre_model(X, y, pathway_genes=pathway_genes, bins=3)

# Run cross-validation pipeline
scores = cv_pipeline(X, y, n_iterations=10, scoring='roc_auc')
print(f"Mean AUC: {sum(scores)/len(scores):.3f}")

# Generate ROC curves
mean_auc = roc_curve(result['X'], result['y_binary'], n_splits=5, classifier='rf')
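
For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen positive example receives a higher predicted score than a randomly chosen negative one. The standard-library sketch below computes it directly from that definition. The `roc_auc` helper is a hypothetical name used for illustration and is not how the Artemis pipeline computes its scores; the labels and predictions are invented.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    positives = [s for label, s in zip(y_true, y_score) if label == 1]
    negatives = [s for label, s in zip(y_true, y_score) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy labels and predicted scores, invented for this example.
y_true = [1, 0, 1, 0, 1]
y_score = [0.9, 0.4, 0.35, 0.2, 0.8]
print(f"AUC: {roc_auc(y_true, y_score):.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of positives from negatives. The pairwise loop is quadratic, so it is suited to explanation rather than large datasets.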

5. Visualize Gene Overlaps with UpSet Plots

from alethiotx.artemis import load_clinical_scores, load_pathway_genes, prepare_upset, create_upset_plot

# Load clinical scores or pathway genes for multiple diseases
breast, lung, prostate, melanoma, bowel, diabetes, cardio = load_clinical_scores()

# Prepare data for UpSet plot (mode='ct' for clinical targets)
upset_data = prepare_upset(breast, lung, prostate, melanoma, bowel, diabetes, cardio, mode='ct')

# Create and display the UpSet plot
plot = create_upset_plot(upset_data, min_subset_size=5)
plot.plot()

# For pathway genes, use mode='pg'
breast_pg, lung_pg, prostate_pg, melanoma_pg, bowel_pg, diabetes_pg, cardio_pg = load_pathway_genes(n=100)
upset_data_pg = prepare_upset(breast_pg, lung_pg, prostate_pg, melanoma_pg, bowel_pg, diabetes_pg, cardio_pg, mode='pg')
plot_pg = create_upset_plot(upset_data_pg, min_subset_size=10)
plot_pg.plot()

Supported Disease Indications (Artemis Module)

The Artemis module includes built-in support for:

  • Myeloproliferative Neoplasm (MPN)
  • Breast Cancer
  • Lung Cancer
  • Prostate Cancer
  • Bowel Cancer (Colorectal)
  • Melanoma
  • Diabetes Mellitus Type 2
  • Cardiovascular Disease

Artemis Module API Reference

Data Loading & Processing

  • get_clinical_trials() - Retrieve clinical trials from ClinicalTrials.gov
  • ttd() - Match trials with TTD drug/target data
  • get_clinical_scores() - Calculate per-target clinical development scores
  • load_clinical_scores() - Load pre-computed clinical scores from S3
  • get_pathway_genes() - Query Ma'ayan Lab's GeneShot API for gene associations
  • load_pathway_genes() - Load the top pre-computed pathway genes for each supported disease

Data Preparation

  • get_all_targets() - Extract unique target genes from score lists
  • cut_clinical_scores() - Filter scores by threshold
  • find_overlapping_genes() - Identify genes present in multiple datasets
  • uniquify_clinical_scores() - Remove overlapping genes from clinical scores
  • uniquify_pathway_genes() - Remove overlapping genes from pathway lists
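
The overlap and uniquify steps listed above can be pictured with plain Python sets. This is a minimal sketch of the idea only: the `overlapping_genes` and `uniquify` helpers, their signatures, and the toy gene sets are assumptions for illustration and do not mirror the actual signatures of `find_overlapping_genes()` or the `uniquify_*` functions.

```python
from collections import Counter


def overlapping_genes(gene_sets):
    """Return genes that appear in two or more of the named gene sets."""
    counts = Counter(gene for genes in gene_sets.values() for gene in set(genes))
    return {gene for gene, count in counts.items() if count >= 2}


def uniquify(gene_sets):
    """Drop every overlapping gene so each set keeps only its own genes."""
    shared = overlapping_genes(gene_sets)
    return {name: set(genes) - shared for name, genes in gene_sets.items()}


# Toy gene sets, invented for illustration.
toy_sets = {
    "breast": {"ERBB2", "ESR1", "TP53"},
    "lung": {"EGFR", "TP53", "KRAS"},
    "bowel": {"KRAS", "APC"},
}
print(sorted(overlapping_genes(toy_sets)))
print({name: sorted(genes) for name, genes in uniquify(toy_sets).items()})
```

Removing shared genes leaves disease-specific sets, which keeps a gene from counting as evidence for several indications at once.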

Machine Learning

  • pre_model() - Prepare datasets for ML model training
  • cv_pipeline() - Cross-validation pipeline with customizable classifiers
  • roc_curve() - Generate ROC curves and report the mean AUC

Visualization

  • prepare_upset() - Prepare disease-related data for UpSet plot visualization
  • create_upset_plot() - Create UpSet plots for visualizing gene set intersections across diseases
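
An UpSet plot shows, for each combination of sets, how many items belong to exactly that combination. The sketch below computes those exclusive intersection sizes with the standard library, which may help when interpreting the plot. The `exclusive_intersections` helper and the toy gene sets are assumptions for illustration; this is not what `prepare_upset()` returns.

```python
from collections import Counter


def exclusive_intersections(gene_sets):
    """Count genes by the exact combination of sets they belong to."""
    membership = {}
    for name, genes in gene_sets.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(name)
    return Counter(frozenset(names) for names in membership.values())


# Toy gene sets, invented for illustration.
toy_sets = {
    "breast": {"ERBB2", "TP53", "PIK3CA"},
    "lung": {"EGFR", "TP53", "KRAS"},
    "bowel": {"KRAS", "APC", "TP53"},
}
counts = exclusive_intersections(toy_sets)
# Largest combinations first, then alphabetical by set names.
for combo, size in sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(combo), size)
```

In this picture, a threshold such as `min_subset_size` would presumably hide combinations whose count falls below the given value.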

Data Storage (Artemis Module)

The Artemis module uses AWS S3 for storing pre-computed data:

s3://alethiotx-artemis/data/
├── clinical_targets/{date}/{disease}.csv
├── pathway_genes/{date}/{disease}.csv
└── ttd/{date}
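
Given the layout above, the location of a per-disease file follows from a date and a disease name. The sketch below builds such URIs. The helper names are hypothetical, and the disease label `breast` is an assumed file name; the actual file names in the bucket may differ.

```python
BUCKET_PREFIX = "s3://alethiotx-artemis/data"


def clinical_targets_uri(date, disease):
    """Build the URI of a clinical-targets CSV for one date and disease."""
    return f"{BUCKET_PREFIX}/clinical_targets/{date}/{disease}.csv"


def pathway_genes_uri(date, disease):
    """Build the URI of a pathway-genes CSV for one date and disease."""
    return f"{BUCKET_PREFIX}/pathway_genes/{date}/{disease}.csv"


# "breast" is an assumed file name, used here only as an example.
print(clinical_targets_uri("2025-11-11", "breast"))
print(pathway_genes_uri("2025-11-11", "breast"))
```

With s3fs installed, pandas can read an `s3://` URI through `pandas.read_csv`, provided the bucket permits access from your environment.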

Requirements

  • Python >= 3.9
  • requests
  • scikit-learn
  • pandas
  • numpy
  • matplotlib
  • setuptools
  • fsspec
  • s3fs
  • upsetplot

Citation

If you use the Artemis module in your research, please cite:

Artemis: public knowledge graphs enable accessible and scalable drug target discovery
Vladimir Kiselev, Alethio Therapeutics

For other modules, citation information will be provided as they are released.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Vladimir Kiselev
Email: vlad.kiselev@alethiomics.com

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


Current Focus: Artemis - Enabling accessible and scalable drug target discovery through public knowledge graphs.
Coming Soon: Additional modules for expanded drug discovery capabilities.
