Pan-cancer detection of T-cell clonal expansion from single-cell RNA sequencing

scXpand: Pan-cancer Detection of T-cell Clonal Expansion

Detect T-cell clonal expansion from single-cell RNA sequencing data without paired TCR sequencing


Figure: scXpand datasets overview.

A framework for predicting T-cell clonal expansion from single-cell RNA sequencing data.

Manuscript in preparation - detailed methodology and benchmarks coming soon.

scXpand follows the scverse ecosystem (https://scverse.org/) standards for single-cell analysis tools.

View full documentation for comprehensive guides and API reference.


Features

  • Multiple Model Architectures:
    • Autoencoder-based: Encoder-decoder with reconstruction and classification heads
    • MLP: Multi-layer perceptron
    • LightGBM: Gradient boosted decision trees
    • Linear Models: Logistic regression and support vector machines
  • Scalable Processing: Handles millions of cells with memory-efficient data streaming from disk during training
  • Automated Hyperparameter Optimization: Built-in Optuna integration for model tuning

Installation

Installing the Published Package: For regular use

scXpand is available in two variants to match your hardware:

NVIDIA GPU with CUDA Support

  • Using pip:
    pip install --upgrade scxpand-cuda --extra-index-url https://download.pytorch.org/whl/cu128
    
  • Using uv:
    uv pip install --upgrade scxpand-cuda --extra-index-url https://download.pytorch.org/whl/cu128 --index-strategy unsafe-best-match
    

CPU, Apple Silicon, or Non-CUDA GPUs

  • Using pip:
    pip install --upgrade scxpand
    
  • Using uv:
    uv pip install --upgrade scxpand
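
After installing, it can be useful to confirm which of the two variants is present in the active environment. The sketch below uses only the standard library and the distribution names from the install commands above (`scxpand` and `scxpand-cuda`); the helper name `installed_scxpand_variants` is hypothetical and not part of scXpand.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_scxpand_variants():
    """Return {distribution name: version string or None} for both variants."""
    found = {}
    for dist in ("scxpand", "scxpand-cuda"):
        try:
            found[dist] = version(dist)
        except PackageNotFoundError:
            found[dist] = None  # this variant is not installed here
    return found


if __name__ == "__main__":
    for name, ver in installed_scxpand_variants().items():
        print(f"{name}: {ver or 'not installed'}")
```

If both variants report a version, the environment probably contains two conflicting installs, and removing one of them is likely the safer choice.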
    
Local Development Setup: For contributing or working with the source code

If you want to contribute or work with the latest source code, follow these steps:

  1. Clone the repository:

    git clone https://github.com/yizhak-lab-ccg/scXpand.git
    cd scXpand
    
  2. Install uv (if not already installed):

    # Follow the installation guide: https://docs.astral.sh/uv/getting-started/installation/
    # Verify installation:
    uv --version
    
  3. Install in development mode:

    uv pip install -e ".[dev]"
    
  4. Install pre-commit hooks:

    pre-commit install
    
  5. For PyTorch installation (optional):

    Using uv with automatic backend selection (Recommended):

    uv pip install torch torchvision torchaudio --torch-backend=auto
    

    This automatically detects your system's optimal PyTorch backend (CUDA if available, otherwise CPU/MPS).

See the full installation guide for detailed setup instructions.
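
To check which backend the PyTorch install from step 5 actually ended up with, a short probe such as the following can help. It is a minimal sketch that relies only on standard PyTorch calls (`torch.cuda.is_available` and `torch.backends.mps.is_available`); the function name `pick_device` is hypothetical, and this is not necessarily how scXpand selects its device internally.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so no accelerator is usable
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds may lack the MPS backend module entirely.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```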


Quick Start

import scxpand
# Make sure that "your_data.h5ad" includes only T cells for the results to be meaningful
# Ensure that "your_data.var_names" are provided as Ensembl IDs (as the pre-trained models were trained using this gene representation)
# Please refer to our documentation for more information

# List available pre-trained models
scxpand.list_pretrained_models()

# Run inference with automatic model download
results = scxpand.run_inference(
    model_name="pan_cancer_autoencoder",  # default model
    data_path="your_data.h5ad"
)

# Access predictions
predictions = results.predictions
if results.has_metrics:
    print(f"AUROC: {results.get_auroc():.3f}")

See our Tutorial Notebook for a complete example with data preprocessing, T-cell filtering, gene ID conversion, and model application using a real breast cancer dataset.
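
Because the pre-trained models expect Ensembl IDs in `var_names`, a quick sanity check before inference can catch datasets that still use gene symbols. The sketch below assumes human data (IDs of the form `ENSG` followed by 11 digits, optionally with a `.version` suffix); the helper `fraction_ensembl_ids` is hypothetical, and whether scXpand accepts versioned IDs is not stated here.

```python
import re

# Human Ensembl gene IDs: "ENSG" + 11 digits, optionally followed by ".<version>".
ENSEMBL_GENE_ID = re.compile(r"^ENSG\d{11}(\.\d+)?$")


def fraction_ensembl_ids(var_names):
    """Return the fraction of names that look like human Ensembl gene IDs."""
    names = list(var_names)
    if not names:
        return 0.0
    hits = sum(1 for name in names if ENSEMBL_GENE_ID.match(str(name)))
    return hits / len(names)


# Two Ensembl-style IDs and two gene symbols.
example = ["ENSG00000141510", "ENSG00000010610.10", "CD8A", "TP53"]
print(fraction_ensembl_ids(example))  # 0.5
```

With an AnnData object loaded through `anndata.read_h5ad`, passing `adata.var_names` to the helper should work the same way. A fraction well below 1.0 suggests the gene ID conversion covered in the tutorial is still needed.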


Documentation

Setup & Getting Started:

Using Pre-trained Models:

Training Your Own Models:

Understanding Results:

📖 Full Documentation - Complete guides, API reference, and interactive tutorials


License

This project is licensed under the MIT License – see the LICENSE file for details.


Citation

If you use scXpand in your research, please cite:

Shorer, O., Amit, R., and Yizhak, K. (2025). scXpand: Pan-cancer detection of T-cell clonal expansion from single-cell RNA sequencing without paired single-cell TCR sequencing. Preprint at bioRxiv, https://doi.org/10.1101/2025.09.14.676069.

BibTeX
@article{shorer2025scxpand,
  title={scXpand: Pan-cancer detection of T-cell clonal expansion from single-cell RNA sequencing without paired single-cell TCR sequencing},
  author={Shorer, Ofir and Amit, Ron and Yizhak, Keren},
  year={2025},
  journal={bioRxiv},
  doi={10.1101/2025.09.14.676069}
}

This project was created for the benefit of the scientific community worldwide, with a special dedication to the cancer research community.

We hope you'll find this repository helpful, and we warmly welcome any requests or suggestions - please don't hesitate to reach out!
