Python version of R package SLIDE

loveslide

A Python interface to the SLIDE framework for latent factor discovery and statistical inference.


📘 Overview

loveslide wraps key components of the original SLIDE R package in a Python interface, making SLIDE easier to incorporate into machine learning pipelines and bioinformatics workflows.

SLIDE (Significant Latent Factor Interaction Discovery and Exploration) combines:

  • LOVE: A latent factor discovery algorithm using model-based overlapping clustering.
  • Knockoffs: For statistically rigorous identification of significant standalone and interacting latent factors.

This Python implementation keeps its R underpinnings via rpy2 and is designed to be modular, extensible, and usable both from the command line and from Python scripts or notebooks.


🔗 Related Repositories


🚀 Installation

Set up a compatible Python environment:

module load anaconda3/2022.10
conda create -n loveslide_env python=3.9 r-base
conda activate loveslide_env
pip install loveslide
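
Because loveslide calls R through rpy2, both an R installation and the rpy2 package need to be visible from the activated environment. The following is a minimal sketch of a pre-flight check using only the standard library. The helper name `check_environment` is hypothetical and not part of loveslide.

```python
import importlib.util
import shutil


def check_environment():
    """Report whether R and the relevant Python packages are visible.

    Hypothetical helper for illustration; not part of the loveslide API.
    """
    return {
        # Path to the R executable, or None if R is not on PATH.
        "R": shutil.which("R"),
        # True if the package can be located (checked without importing it).
        "rpy2": importlib.util.find_spec("rpy2") is not None,
        "loveslide": importlib.util.find_spec("loveslide") is not None,
    }


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

If `R` is reported as `None` or `rpy2` as `False`, the conda environment is probably not activated or was created without `r-base`.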

If needed, activate the environment used during development instead:

# On the cluster:
source activate /ix3/djishnu/alw399/envs/rhino

⚡ Quick Start

📿 Command Line

python slide.py \
  --x_path /path/to/your/features.csv \
  --y_path /path/to/your/labels.csv \
  --out_path /path/to/output/

If you are not running from the src/loveslide directory, use full paths, including the path to slide.py.
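
The same command can be launched from a Python script, which is convenient when several runs are queued. This is a sketch under the assumption that slide.py accepts exactly the three flags shown above. The helper names `build_slide_command` and `run_slide` are hypothetical.

```python
import subprocess
import sys


def build_slide_command(x_path, y_path, out_path, script="slide.py"):
    """Assemble the slide.py command line as an argument list."""
    return [
        sys.executable,          # the interpreter of the active environment
        str(script),             # pass a full path when not in src/loveslide
        "--x_path", str(x_path),
        "--y_path", str(y_path),
        "--out_path", str(out_path),
    ]


def run_slide(x_path, y_path, out_path, script="slide.py"):
    """Run slide.py and raise CalledProcessError if it exits non-zero."""
    cmd = build_slide_command(x_path, y_path, out_path, script=script)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the command; call run_slide(...) to actually execute it.
    print(" ".join(build_slide_command(
        "/path/to/your/features.csv",
        "/path/to/your/labels.csv",
        "/path/to/output/",
    )))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.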


🧪 In a Notebook

import loveslide

from loveslide import OptimizeSLIDE

input_params = {
    'x_path': '/path/to/features.csv',
    'y_path': '/path/to/labels.csv',
    'fdr': 0.1,
    'thresh_fdr': 0.1,
    'spec': 0.2,
    'y_factor': True,
    'niter': 500,
    'SLIDE_top_feats': 20,
    'rep_CV': 50,
    'pure_homo': True,
    'delta': [0.01],
    'lambda': [0.5, 0.1],
    'out_path': '/path/to/output/'
}

slider = OptimizeSLIDE(input_params)
slider.run_pipeline(verbose=True, n_workers=1)
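
When several analyses share most settings, it can help to keep the example values in one place and override only what changes. The sketch below assumes nothing about loveslide beyond the dictionary keys shown above; `DEFAULT_PARAMS` and `make_params` are hypothetical names, and whether loveslide itself falls back to these values when a key is omitted is not stated here. Overrides are passed as a dictionary because `lambda` is a reserved word in Python and cannot be used as a keyword argument.

```python
import copy

# Values copied from the example dictionary above (illustrative, not
# confirmed to be loveslide's built-in defaults).
DEFAULT_PARAMS = {
    "fdr": 0.1,
    "thresh_fdr": 0.1,
    "spec": 0.2,
    "y_factor": True,
    "niter": 500,
    "SLIDE_top_feats": 20,
    "rep_CV": 50,
    "pure_homo": True,
    "delta": [0.01],
    "lambda": [0.5, 0.1],
}


def make_params(x_path, y_path, out_path, overrides=None):
    """Return a full parameter dictionary with selected values overridden."""
    overrides = dict(overrides or {})
    unknown = sorted(set(overrides) - set(DEFAULT_PARAMS))
    if unknown:
        raise KeyError(f"unknown parameter(s): {unknown}")
    # Deep copy so the list-valued entries are not shared between runs.
    params = copy.deepcopy(DEFAULT_PARAMS)
    params.update(overrides)
    params.update({"x_path": x_path, "y_path": y_path, "out_path": out_path})
    return params


params = make_params(
    "/path/to/features.csv",
    "/path/to/labels.csv",
    "/path/to/output/",
    {"lambda": [0.1], "niter": 100},
)
print(params["lambda"], params["niter"], params["delta"])  # [0.1] 100 [0.01]
```

The resulting dictionary can be passed to `OptimizeSLIDE` in the same way as `input_params`.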

🔬 Pipeline Overview

The run_pipeline() method follows three key stages:

🧩 Stage 1: Latent Factor Discovery

  • LOVE Algorithm: Identifies overlapping latent factors in the data.
  • Output: Latent factor matrix (z_matrix) and factor loadings.
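
In the LOVE setting, a "pure" variable loads on exactly one latent factor, while other variables may load on several, which is what makes the clusters overlapping. The sketch below only illustrates that distinction on a made-up loading matrix. It is not the estimation procedure LOVE runs, and the function name and toy values are hypothetical.

```python
def split_pure_and_mixed(loadings, tol=1e-12):
    """Partition feature indices by how many factors they load on.

    loadings: list of rows, one per feature, one loading per factor.
    Returns (pure, mixed): pure maps feature index -> its single factor
    index; mixed lists features with nonzero loadings on 2+ factors.
    Features with no nonzero loading appear in neither.
    """
    pure, mixed = {}, []
    for i, row in enumerate(loadings):
        active = [k for k, a in enumerate(row) if abs(a) > tol]
        if len(active) == 1:
            pure[i] = active[0]
        elif len(active) > 1:
            mixed.append(i)
    return pure, mixed


# Toy loading matrix: 5 features x 2 factors (values invented).
toy = [
    [1.0, 0.0],   # pure, factor 0
    [-1.0, 0.0],  # pure, factor 0 (sign flipped)
    [0.0, 1.0],   # pure, factor 1
    [0.6, 0.4],   # loads on both factors
    [0.0, 0.0],   # loads on neither
]
pure, mixed = split_pure_and_mixed(toy)
print(pure)   # {0: 0, 1: 0, 2: 1}
print(mixed)  # [3]
```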

📊 Stage 2: Statistical Inference with Knockoffs

  • Identifies significant standalone and interacting latent factors.
  • Controls False Discovery Rate (FDR) to maintain statistical rigor.
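
To make the FDR control concrete, here is a sketch of the generic knockoff+ selection rule: given one statistic W per candidate (large positive values favor the real variable over its knockoff copy), pick the smallest threshold whose estimated false discovery proportion is at most the target. This is the textbook rule, not necessarily what the SLIDE R code does internally (which may, for example, aggregate over repeated runs). The function name and toy statistics are hypothetical.

```python
def knockoff_plus_select(w_stats, fdr=0.1):
    """Return (threshold, selected_indices) under the knockoff+ rule.

    For each candidate threshold t, the false discovery proportion is
    estimated as (1 + #{W_j <= -t}) / max(1, #{W_j >= t}).
    """
    candidates = sorted({abs(w) for w in w_stats if w != 0})
    for t in candidates:
        n_neg = sum(1 for w in w_stats if w <= -t)
        n_pos = sum(1 for w in w_stats if w >= t)
        if (1 + n_neg) / max(1, n_pos) <= fdr:
            return t, [j for j, w in enumerate(w_stats) if w >= t]
    # No threshold meets the target: select nothing.
    return float("inf"), []


# Invented statistics for ten candidate factors.
w = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, -0.5, 0.3, -0.2, 0.1]
print(knockoff_plus_select(w, fdr=0.2))  # (2.0, [0, 1, 2, 3, 4, 5])
print(knockoff_plus_select(w, fdr=0.1))  # (inf, [])
```

The second call shows a property of the rule worth keeping in mind when choosing `fdr`: with a target of 0.1, at least ten selections are needed before the estimate (1 + 0) / 10 can meet it, so small candidate sets may yield no discoveries.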

📈 Stage 3: Visualization

  • Diagnostic plots
  • Top genes/features for each latent factor (|loading| > 0.05)
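
The feature selection described in the last bullet can be sketched as follows: keep features whose absolute loading exceeds 0.05, rank them by absolute loading, and keep the first few. Whether loveslide applies the `SLIDE_top_feats` cap after the threshold in exactly this order is an assumption, and the function name and toy data are hypothetical.

```python
def top_features(loadings, feature_names, threshold=0.05, top_n=20):
    """Pick the strongest features per latent factor.

    loadings: dict mapping factor name -> list of loadings, aligned
    with feature_names. Returns factor name -> list of (feature, loading)
    pairs sorted by decreasing absolute loading.
    """
    result = {}
    for factor, values in loadings.items():
        kept = [
            (name, a)
            for name, a in zip(feature_names, values)
            if abs(a) > threshold
        ]
        kept.sort(key=lambda pair: abs(pair[1]), reverse=True)
        result[factor] = kept[:top_n]
    return result


names = ["geneA", "geneB", "geneC", "geneD"]
toy = {
    "Z1": [0.90, -0.30, 0.02, 0.00],
    "Z2": [0.00, 0.04, -0.70, 0.10],
}
print(top_features(toy, names, top_n=2))
# {'Z1': [('geneA', 0.9), ('geneB', -0.3)], 'Z2': [('geneC', -0.7), ('geneD', 0.1)]}
```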

⚙️ Parameters

| Name | Type | Description | Default/Example |
|---|---|---|---|
| x_path | str | Path to feature matrix CSV | Required |
| y_path | str | Path to response/labels CSV | Required |
| fdr | float | Knockoff FDR threshold | 0.1 |
| thresh_fdr | float | FDR threshold in LOVE | 0.1 |
| spec | float | Minimum reproducibility for a factor | 0.2 |
| y_factor | bool | Treat y as categorical | True |
| niter | int | Iterations for LOVE | 500 |
| SLIDE_top_feats | int | Number of top features to plot | 20 |
| rep_CV | int | Repeats for cross-validation | 50 |
| pure_homo | bool | Use pure variables with loadings = 1 | True |
| delta | list | Regularization parameters | [0.01] |
| lambda | list | Penalty parameters | [0.5, 0.1] |
| out_path | str | Output directory | Required |
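
Since a mistyped key or a string where a number is expected may only surface deep inside the R code, it can be useful to check a parameter dictionary against the table before starting a run. The sketch below encodes the names and types from the table. The function `validate_params` is hypothetical, and the choice to accept integers for float fields is an assumption.

```python
EXPECTED_TYPES = {
    "x_path": str, "y_path": str, "out_path": str,
    "fdr": float, "thresh_fdr": float, "spec": float,
    "y_factor": bool, "pure_homo": bool,
    "niter": int, "SLIDE_top_feats": int, "rep_CV": int,
    "delta": list, "lambda": list,
}
REQUIRED = ("x_path", "y_path", "out_path")


def _matches(value, expected):
    """Type check that does not let booleans pass as numbers."""
    if expected is bool:
        return isinstance(value, bool)
    if isinstance(value, bool):
        return False  # bool is a subclass of int; reject it for int/float
    if expected is float:
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def validate_params(params):
    """Return a list of human-readable problems (empty if none found)."""
    problems = [
        f"missing required parameter: {key}"
        for key in REQUIRED if key not in params
    ]
    for key, value in params.items():
        expected = EXPECTED_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown parameter: {key}")
        elif not _matches(value, expected):
            problems.append(
                f"{key}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


print(validate_params({"x_path": "features.csv", "niter": "500"}))
# ['missing required parameter: y_path', 'missing required parameter: out_path', 'niter: expected int, got str']
```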

🏗️ Project Structure

SLIDE_py/
├── src/
│   ├── loveslide/             # Main Python & R wrappers
│   │   ├── slide.py           # Main entry point
│   │   ├── love.py
│   │   ├── knockoffs.py
│   │   ├── ...
│   │   ├── LOVE-master/       # (Legacy) Original LOVE code
│   │   └── LOVE-SLIDE/        # Customized LOVE implementation for SLIDE
├── dist/
├── example/
├── ...

🧠 Design Notes

  • Core statistical inference is done using R scripts via rpy2.
  • Python acts as an orchestration layer to allow integration into ML workflows.
  • Most plotting is done in R (e.g., pheatmap, ggplot2).
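
For readers unfamiliar with rpy2, the orchestration pattern described above looks roughly like the sketch below: Python converts its data to R objects, calls an R function, and converts the result back. The example calls R's built-in `mean` rather than any SLIDE routine, and the helper `r_mean` is hypothetical. It returns `None` when rpy2 or R is not available so that it can be tried safely.

```python
def r_mean(values):
    """Compute a mean in R from Python via rpy2.

    Returns None when rpy2 is missing or R cannot be located.
    """
    try:
        import rpy2.robjects as ro
    except Exception:  # rpy2 not installed, or R not found at import time
        return None
    r_vector = ro.FloatVector(values)   # Python list -> R numeric vector
    r_result = ro.r["mean"](r_vector)   # look up R's mean() and call it
    return float(r_result[0])           # R returns a length-1 vector


if __name__ == "__main__":
    print(r_mean([1.0, 2.0, 3.0]))
```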

📌 Known Limitations and TODOs

  • YAML → dictionary conversion for easier parameter management
  • Extend y_factor handling to non-binary variables
  • Parallelization of knockoff inference (e.g., in select_short_freq)
  • Correlation networks visualization using networkx

📢 Citation & Contact

If you use loveslide in your work, please cite the original R implementation and this repository. For bugs or feature requests, please open an issue on GitHub.


