Subtype Discovery using Splicing

Project description

SPECTRA: Unsupervised Analysis of Alternative Splicing

✂️ About

SPECTRA (Splicing-based Pattern Extraction and Clustering using TRAnscriptomics) is an end-to-end pipeline for discovering patient subtypes based on alternative splicing. It is a modernized and optimized version of the original OncoSplice algorithm.

SPECTRA uses an iterative clustering strategy to identify stable and dominant splicing patterns across patient samples. The new implementation improves speed, accuracy, and modularity, and integrates into both command-line workflows and interactive Python environments.

SPECTRA Workflow

📌 Installation

SPECTRA can be installed as a Python package via pip. We recommend using a conda environment with Python 3.12.

pip3 install splicespectrax

📚 Documentation

Detailed documentation and tutorials on how to perform SubSplice analysis are provided on ReadTheDocs.

👩‍🏫 Tutorial

Example datasets: PSI files for each TCGA cancer can be downloaded from here.

SubSplice can be used in two ways:

  • As a command-line tool for end-to-end execution
  • As a modular workflow, where individual functions are called step-by-step. See the tutorial here.

See the tutorials and example scripts for each approach:

Command-Line Interface (CLI)

Run the entire pipeline with a single command using main.py. This is ideal for processing multiple datasets and for automated workflows.

Modular Usage

Import and run individual components such as preprocessing, clustering, or visualization in a custom step-by-step analysis.
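
As a rough illustration of what a step-by-step preprocessing stage can look like, the sketch below filters splicing events by variance and then median-imputes the missing values, using only NumPy. It does not call the SPECTRA package: the function names `top_variance_events` and `median_impute`, the events-by-samples matrix layout, and the order of the two steps are assumptions made for this example.

```python
import numpy as np

def median_impute(psi):
    """Replace NaNs in each row (one splicing event) with that row's median."""
    psi = np.asarray(psi, dtype=float)
    medians = np.nanmedian(psi, axis=1, keepdims=True)
    return np.where(np.isnan(psi), medians, psi)

def top_variance_events(psi, n_keep):
    """Return sorted row indices of the n_keep most variable events, ignoring NaNs."""
    variances = np.nanvar(np.asarray(psi, dtype=float), axis=1)
    order = np.argsort(variances)[::-1]
    return np.sort(order[:n_keep])

# Toy PSI matrix (hypothetical values): rows are events, columns are samples.
psi = np.array([
    [0.10, 0.90, np.nan, 0.80],
    [0.50, 0.50, 0.50, np.nan],
    [0.20, np.nan, 0.30, 0.25],
])

keep = top_variance_events(psi, n_keep=2)   # drop the least variable event
filtered = median_impute(psi[keep])         # fill remaining gaps per event
print(keep)
print(filtered)
```

The actual pipeline also applies intercorrelation-based filtering and redundancy removal, which this sketch leaves out.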

📝 Overview of Modules

Module                               Description
main.py                              Entry point for running the complete SPECTRA pipeline. Handles argument parsing and execution flow.
round_wrapper.py                     Wraps a single iteration of clustering (SPECTRA performs 3 iterations by default).
preprocess.py                        Performs variance-based and intercorrelation-based filtering of splicing events prior to clustering.
remove_redundancy.py                 Removes redundant splicing events based on intra-gene correlation.
feature_selection.py                 Implements PCA-based feature selection, similar to the splice-ICGS method in the original OncoSplice.
median_impute.py                     Imputes missing values in the splicing matrix using the median for each event.
visualizations.py                    Generates visual summaries, including splicing event annotation bar plots and cluster heatmaps.
determine_rank.py                    Automatically determines the optimal NMF rank (if not user-specified).
run_nmf.py                           Performs NMF clustering and assigns multi-label cluster memberships.
metadata_analysis.py                 Analyzes and annotates differential splicing events across clusters.
linear_svm.py                        Applies a linear SVM for final cluster assignment.
correlation_depletion.py             Identifies and depletes splicing events associated with a clustering round.
correlation_depletion_vectorized.py  A faster version of correlation_depletion.py using imputed values and optimized calculations.
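
To make the NMF clustering step more concrete, here is a minimal NumPy sketch of non-negative matrix factorization with the standard multiplicative updates, followed by a hard cluster assignment per sample. It is not the code in run_nmf.py: the function name `nmf`, the toy matrix, the fixed rank of 2, and the single-label argmax assignment (SPECTRA assigns multi-label memberships) are simplifying assumptions for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (events x samples) as W @ H."""
    rng = np.random.default_rng(seed)
    n_events, n_samples = V.shape
    W = rng.random((n_events, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Normalize each component so H coefficients are comparable across components.
    scale = W.sum(axis=0)
    return W / scale, H * scale[:, None]

# Toy matrix (hypothetical values): two groups of samples with opposite patterns.
V = np.array([
    [0.9, 0.8, 0.1, 0.1],
    [0.8, 0.9, 0.1, 0.2],
    [0.1, 0.2, 0.9, 0.8],
    [0.1, 0.1, 0.8, 0.9],
])

W, H = nmf(V, rank=2)
labels = H.argmax(axis=0)  # strongest component per sample
print(labels)
```

A multi-label variant could instead keep every component whose coefficient in `H` exceeds a chosen threshold; the threshold would be a further assumption.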

📖 Citation

Coming soon — citation information for referencing SPECTRA in publications.
