Alethio Therapeutics Python Toolkit

alethiotx

Alethio Therapeutics Python Toolkit - A growing collection of open-source computational tools used by Alethio Therapeutics.

Overview

alethiotx is a modular Python package providing specialized tools for therapeutic research and drug discovery. Currently, the package features the Artemis module for drug target prioritization using public knowledge graphs. Additional modules and capabilities will be added in future releases.

Current Modules

Artemis Module (alethiotx.artemis)

The Artemis module enables accessible and scalable drug target prioritization by integrating clinical trial data, the Therapeutic Target Database (TTD), pathway information, and machine learning models. It leverages public knowledge graphs to prioritize therapeutic targets across multiple disease areas.

Artemis Module Features

  • Clinical Trials: Query and analyze clinical trials data from ClinicalTrials.gov
  • TTD: Match clinical interventions with TTD drug information and targets
  • Pathway Genes: Retrieve and analyze pathway genes using GeneShot API
  • Target Scoring: Calculate clinical target scores for drug targets based on trial phases and approvals
  • Machine Learning Pipeline: Built-in cross-validation for target prediction
  • Multi-Disease Support: Pre-configured for breast, lung, prostate, melanoma, bowel cancer, diabetes, and cardiovascular disease

Future Modules

Additional modules for various aspects of drug discovery and therapeutic research are planned for future releases. Stay tuned!

Installation

pip install alethiotx

Quick Start

Note: The examples below demonstrate the Artemis module functionality. As new modules are added to the package, they will have their own usage examples.

1. Retrieve Clinical Trials Data

from alethiotx.artemis import get_clinical_trials, ttd, get_clinical_scores

# Query clinical trials for a specific indication
breast_trials = get_clinical_trials(search='Breast Cancer', last_6_years=True)

# Match trials with TTD to get target information
ttd_data = ttd(breast_trials)

# Calculate clinical development scores
scores = get_clinical_scores(ttd_data, include_approved=True)
print(scores.head())
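
To make the idea of a score "based on trial phases and approvals" concrete, here is a minimal sketch in plain Python. It is not the implementation behind get_clinical_scores(): the phase weights, the record format, and the score_targets helper are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical weights (an assumption, not the package's values):
# later phases count more, and an approval counts most.
PHASE_WEIGHTS = {"PHASE1": 1, "PHASE2": 2, "PHASE3": 3, "APPROVED": 4}


def score_targets(records, include_approved=True):
    """Sum phase weights per target gene over (target, phase) records."""
    scores = defaultdict(int)
    for target, phase in records:
        if phase == "APPROVED" and not include_approved:
            continue
        scores[target] += PHASE_WEIGHTS.get(phase, 0)
    # Highest-scoring targets first
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Toy records, made up for the example
records = [
    ("ERBB2", "PHASE3"),
    ("ERBB2", "APPROVED"),
    ("CDK4", "PHASE2"),
    ("ESR1", "PHASE1"),
]
toy_scores = score_targets(records)
print(toy_scores)
```

In this toy scheme a target seen in several trials accumulates weight, so a target with both a late-phase trial and an approval ranks first.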

2. Load Pre-computed Clinical Scores

from alethiotx.artemis import load_clinical_scores

# Load clinical scores for multiple diseases
breast, lung, prostate, melanoma, bowel, diabetes, cardio = load_clinical_scores(date='2025-11-11')

3. Pathway Gene Analysis

from alethiotx.artemis import get_pathway_genes, load_pathway_genes

# Query GeneShot for disease-associated genes
aml_genes = get_pathway_genes("acute myeloid leukemia")
print(aml_genes.loc["FLT3", ["gene_count", "rank"]])

# Get top pathway genes for diseases
breast_pg, lung_pg, prostate_pg, melanoma_pg, bowel_pg, diabetes_pg, cardio_pg = load_pathway_genes(n=100)
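
The n=100 argument above suggests a "top n genes" selection. The sketch below shows one plausible way to take the top n genes from per-gene counts. The top_genes helper, the tie-breaking rule, and the counts are assumptions for illustration and may not match what load_pathway_genes() does internally.

```python
def top_genes(gene_counts, n=100):
    """Return the n gene symbols with the highest counts.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(gene_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gene for gene, _ in ranked[:n]]


# Toy counts, made up for the example (not real GeneShot output)
toy_counts = {"FLT3": 120, "NPM1": 95, "DNMT3A": 95, "TP53": 60}
top3 = top_genes(toy_counts, n=3)
print(top3)
```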

4. Machine Learning Pipeline

from alethiotx.artemis import pre_model, cv_pipeline, roc_curve
import pandas as pd

# X (knowledge graph features), y (clinical scores), and pathway_genes are
# supplied by you; they are not defined in this snippet
result = pre_model(X, y, pathway_genes=pathway_genes, bins=3)

# Run cross-validation pipeline
scores = cv_pipeline(X, y, n_iterations=10, scoring='roc_auc')
print(f"Mean AUC: {sum(scores)/len(scores):.3f}")

# Generate ROC curves
mean_auc = roc_curve(result['X'], result['y_binary'], n_splits=5, classifier='rf')
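
For readers unfamiliar with the 'roc_auc' metric used above, the following sketch computes it by hand and averages it over folds. It is a plain-Python illustration of the metric only; the roc_auc helper and the fold data are hypothetical and are not part of alethiotx.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy (labels, predicted scores) for two held-out folds, made up for the example
fold_results = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.7]),
    ([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.1]),
]
fold_aucs = [roc_auc(y_true, y_score) for y_true, y_score in fold_results]
toy_mean_auc = sum(fold_aucs) / len(fold_aucs)
print(f"Mean AUC: {toy_mean_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to every positive target being ranked above every negative one.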

5. Visualize Gene Overlaps with UpSet Plots

from alethiotx.artemis import (
    load_clinical_scores,
    load_pathway_genes,
    prepare_upset,
    create_upset_plot,
)

# Load clinical scores or pathway genes for multiple diseases
breast, lung, prostate, melanoma, bowel, diabetes, cardio = load_clinical_scores()

# Prepare data for UpSet plot (mode='ct' for clinical targets)
upset_data = prepare_upset(breast, lung, prostate, melanoma, bowel, diabetes, cardio, mode='ct')

# Create and display the UpSet plot
plot = create_upset_plot(upset_data, min_subset_size=5)
plot.plot()

# For pathway genes, use mode='pg'
breast_pg, lung_pg, prostate_pg, melanoma_pg, bowel_pg, diabetes_pg, cardio_pg = load_pathway_genes(n=100)
upset_data_pg = prepare_upset(breast_pg, lung_pg, prostate_pg, melanoma_pg, bowel_pg, diabetes_pg, cardio_pg, mode='pg')
plot_pg = create_upset_plot(upset_data_pg, min_subset_size=10)
plot_pg.plot()
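
An UpSet plot shows how many genes fall into each exact combination of sets. As a rough sketch of the counting behind such a plot, the code below tallies those combinations in plain Python. The exclusive_intersections helper and the gene sets are assumptions for illustration and do not reproduce prepare_upset().

```python
from collections import Counter


def exclusive_intersections(gene_sets):
    """Count genes per exact combination of sets they belong to."""
    membership = {}
    for name, genes in gene_sets.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(name)
    return Counter(frozenset(names) for names in membership.values())


# Toy gene sets, made up for the example
toy_sets = {
    "breast": {"ERBB2", "ESR1", "TP53"},
    "lung": {"EGFR", "TP53", "KRAS"},
    "bowel": {"KRAS", "TP53", "APC"},
}
counts = exclusive_intersections(toy_sets)

# Keep only combinations with at least 2 genes, analogous in spirit
# to a minimum subset size filter
large = {combo: k for combo, k in counts.items() if k >= 2}
print(large)
```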

Supported Disease Indications (Artemis Module)

The Artemis module includes built-in support for:

  • Myeloproliferative Neoplasm (MPN)
  • Breast Cancer
  • Lung Cancer
  • Prostate Cancer
  • Bowel Cancer (Colorectal)
  • Melanoma
  • Diabetes Mellitus Type 2
  • Cardiovascular Disease

Artemis Module API Reference

Data Loading & Processing

  • get_clinical_trials() - Retrieve clinical trials from ClinicalTrials.gov
  • ttd() - Match trials with TTD drug/target data
  • get_clinical_scores() - Calculate per-target clinical development scores
  • load_clinical_scores() - Load pre-computed clinical scores from S3
  • get_pathway_genes() - Query Ma'ayan Lab's GeneShot API for gene associations
  • load_pathway_genes() - Retrieve pathway gene data

Data Preparation

  • get_all_targets() - Extract unique target genes from score lists
  • cut_clinical_scores() - Filter scores by threshold
  • find_overlapping_genes() - Identify genes present in multiple datasets
  • uniquify_clinical_scores() - Remove overlapping genes from clinical scores
  • uniquify_pathway_genes() - Remove overlapping genes from pathway lists
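
The overlap-related helpers above can be pictured with ordinary set logic. The sketch below is an assumption about the general idea, not the package's code: the overlapping and uniquify functions are hypothetical stand-ins, and the real find_overlapping_genes() and uniquify_*() functions may take different inputs or behave differently.

```python
from collections import Counter


def overlapping(gene_lists):
    """Return the genes that appear in more than one list."""
    seen = Counter(g for genes in gene_lists for g in set(genes))
    return {g for g, k in seen.items() if k > 1}


def uniquify(gene_lists):
    """Drop every overlapping gene from each list, preserving order."""
    shared = overlapping(gene_lists)
    return [[g for g in genes if g not in shared] for genes in gene_lists]


# Toy lists, made up for the example
toy_lists = [["TP53", "ERBB2"], ["TP53", "EGFR"], ["APC", "KRAS"]]
shared_genes = overlapping(toy_lists)
unique_lists = uniquify(toy_lists)
print(shared_genes, unique_lists)
```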

Machine Learning

  • pre_model() - Prepare datasets for ML model training
  • cv_pipeline() - Cross-validation pipeline with customizable classifiers

Visualization

  • prepare_upset() - Prepare disease-related data for UpSet plot visualization
  • create_upset_plot() - Create UpSet plots for visualizing gene set intersections across diseases

Data Storage (Artemis Module)

The Artemis module uses AWS S3 for storing pre-computed data:

s3://alethiotx-artemis/data/
├── clinical_targets/{date}/{disease}.csv
├── pathway_genes/{date}/{disease}.csv
└── ttd/{date}
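
Given the layout above, a path to a specific file can be assembled as in the sketch below. The data_path helper is hypothetical, and the disease file name "breast" is an assumption, since the exact file names are not listed here. The ttd entry is left out because its full path is not given.

```python
BUCKET_PREFIX = "s3://alethiotx-artemis/data"


def data_path(kind, date, disease):
    """Build the S3 path for a per-disease CSV under the documented layout."""
    if kind not in {"clinical_targets", "pathway_genes"}:
        raise ValueError(f"unsupported data kind: {kind!r}")
    return f"{BUCKET_PREFIX}/{kind}/{date}/{disease}.csv"


# "breast" is an assumed file name, used only for illustration
path = data_path("clinical_targets", "2025-11-11", "breast")
print(path)

# With pandas and s3fs installed, an s3:// URL can be passed to
# pandas.read_csv(path), subject to bucket access permissions.
```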

Requirements

  • Python >= 3.9
  • requests
  • scikit-learn
  • pandas
  • numpy
  • matplotlib
  • setuptools
  • fsspec
  • s3fs
  • upsetplot
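
If an import fails after installation, a quick check like the following can show which of the listed requirements are missing from the current environment. This is an optional sketch; the missing_requirements helper is hypothetical and only the standard library is used.

```python
import sys
from importlib.util import find_spec

# Import names for the listed requirements
# (scikit-learn is imported as "sklearn")
REQUIRED = [
    "requests", "sklearn", "pandas", "numpy", "matplotlib",
    "setuptools", "fsspec", "s3fs", "upsetplot",
]


def missing_requirements(modules=REQUIRED):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


python_ok = sys.version_info >= (3, 9)
missing = missing_requirements()
print(f"Python >= 3.9: {python_ok}; missing modules: {missing}")
```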

Citation

If you use the Artemis module in your research, please cite:

Artemis: public knowledge graphs enable accessible and scalable drug target discovery
Vladimir Kiselev, Alethio Therapeutics

For other modules, citation information will be provided as they are released.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Vladimir Kiselev
Email: vlad.kiselev@alethiomics.com

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


Current Focus: Artemis - Enabling accessible and scalable drug target discovery through public knowledge graphs.
Coming Soon: Additional modules for expanded drug discovery capabilities.
