Python version of R package SLIDE
loveslide
A Python interface to the SLIDE framework for latent factor discovery and statistical inference.
Overview
loveslide wraps key components of the original SLIDE R package into a user-friendly Python interface, making it easier to incorporate into machine learning pipelines and bioinformatics workflows.
SLIDE (Significant Latent Factor Interaction Discovery and Exploration) combines:
- LOVE: A latent factor discovery algorithm using model-based overlapping clustering.
- Knockoffs: For statistically rigorous identification of significant standalone and interacting latent factors.
This Python implementation retains R underpinnings via rpy2 and is structured to be modular, extensible, and accessible from both the command line and within Python scripts or notebooks.
Related Repositories
- Original R package: https://github.com/jishnu-lab/SLIDE
- Python wrapper: https://github.com/alw399/SLIDE_py
Installation
Set up a compatible Python environment:
module load anaconda3/2022.10
conda create -n loveslide_env python=3.9 r-base
conda activate loveslide_env
pip install loveslide
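Because the package calls into R through rpy2, it can help to confirm that both the Python modules and an R executable are visible before running anything. The following is a minimal sketch of such a check; the helper name `check_environment` is hypothetical and not part of loveslide.

```python
import importlib.util
import shutil


def check_environment(modules=("loveslide", "rpy2"), executables=("R",)):
    """Report which required modules and executables can be found.

    Hypothetical helper for illustration; not part of loveslide.
    """
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        report[name] = importlib.util.find_spec(name) is not None
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH
        report[exe] = shutil.which(exe) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Any entry reported as missing points to a step above that likely did not complete in the active environment.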
If needed, activate the environment used during development:
# On the cluster:
source activate /ix3/djishnu/alw399/envs/rhino
Quick Start
Command Line
python slide.py \
    --x_path /path/to/your/features.csv \
    --y_path /path/to/your/labels.csv \
    --out_path /path/to/output/
Use full paths if not running from the src/loveslide directory.
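The same command can be launched from a Python script, for example when queuing several runs. This is a rough sketch using only the three flags shown above; the function names `build_slide_command` and `run_slide` are made up for illustration, and the `script` argument should point at wherever slide.py lives on your system.

```python
import subprocess
import sys
from pathlib import Path


def build_slide_command(x_path, y_path, out_path, script="slide.py"):
    """Assemble the argument list for the slide.py command-line entry point."""
    return [
        sys.executable, str(Path(script)),
        "--x_path", str(x_path),
        "--y_path", str(y_path),
        "--out_path", str(out_path),
    ]


def run_slide(x_path, y_path, out_path, script="slide.py"):
    """Run slide.py in a subprocess and raise if it exits with an error."""
    cmd = build_slide_command(x_path, y_path, out_path, script)
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.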
In a Notebook
import loveslide
from loveslide import OptimizeSLIDE
input_params = {
    'x_path': '/path/to/features.csv',
    'y_path': '/path/to/labels.csv',
    'fdr': 0.1,
    'thresh_fdr': 0.1,
    'spec': 0.2,
    'y_factor': True,
    'niter': 500,
    'SLIDE_top_feats': 20,
    'rep_CV': 50,
    'pure_homo': True,
    'delta': [0.01],
    'lambda': [0.5, 0.1],
    'out_path': '/path/to/output/'
}
slider = OptimizeSLIDE(input_params)
slider.run_pipeline(verbose=True, n_workers=1)
Pipeline Overview
The run_pipeline() method follows three key stages:
Stage 1: Latent Factor Discovery
- LOVE Algorithm: Identifies overlapping latent factors in the data.
- Output: Latent factor matrix (z_matrix) and factor loadings.
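To make the idea of overlapping factors concrete, the sketch below assumes the usual LOVE formulation, in which features are modeled as X = AZ + E and a "pure" variable is a feature whose row in the loading matrix A has exactly one non-zero entry. The function name `pure_variables` and the toy matrix are illustrative only; this is not the estimation procedure, just a way to read a loading matrix once it exists.

```python
def pure_variables(loadings, tol=1e-12):
    """Map each factor index to the features that load on that factor alone."""
    pure = {}
    for feature, row in enumerate(loadings):
        # indices of the factors this feature loads on
        nonzero = [k for k, value in enumerate(row) if abs(value) > tol]
        if len(nonzero) == 1:
            pure.setdefault(nonzero[0], []).append(feature)
    return pure


# Toy loading matrix: 5 features (rows) by 2 factors (columns)
A = [
    [1.0, 0.0],   # pure for factor 0
    [-1.0, 0.0],  # pure for factor 0 (negative sign)
    [0.0, 1.0],   # pure for factor 1
    [0.4, 0.6],   # overlapping: loads on both factors
    [0.0, 0.0],   # loads on no factor
]
print(pure_variables(A))  # {0: [0, 1], 1: [2]}
```

Features such as the fourth row, which load on several factors at once, are what makes the clustering overlapping.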
Stage 2: Statistical Inference with Knockoffs
- Identifies significant standalone and interacting latent factors.
- Controls False Discovery Rate (FDR) to maintain statistical rigor.
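For intuition about how knockoffs control the FDR, here is a minimal sketch of the generic knockoff+ selection rule applied to a vector of knockoff statistics W (large positive values favor the real variable over its knockoff). It assumes W has already been computed and is not taken from the SLIDE code, which may differ in detail; the function names are hypothetical.

```python
def knockoff_threshold(W, q=0.1, offset=1):
    """Smallest t with (offset + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    offset=1 gives the knockoff+ rule; returns inf when no t qualifies.
    """
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        negatives = sum(1 for w in W if w <= -t)
        positives = sum(1 for w in W if w >= t)
        if (offset + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def select(W, q=0.1, offset=1):
    """Indices of the variables whose statistic reaches the threshold."""
    t = knockoff_threshold(W, q, offset)
    return [j for j, w in enumerate(W) if w >= t]


# Toy statistics: eleven clearly positive values and one small negative one
W = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, 1.8, 1.5, 1.2, 1.0, 0.8, -0.5]
print(knockoff_threshold(W, q=0.1))  # 0.8
print(select(W, q=0.1))              # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

With q = 0.1 the rule needs at least ten positive statistics above the threshold for every allowed "+1", which is why small problems often select nothing at strict FDR levels.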
Stage 3: Visualization
- Diagnostic plots
- Top genes/features for each latent factor (|loading| > 0.05)
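The feature ranking behind such a plot can be sketched in a few lines. The example assumes a loading matrix with one row per feature and one column per factor; the function name `top_features` and the gene names are hypothetical, and the package's own plotting code may select features differently.

```python
def top_features(loadings, feature_names, threshold=0.05, top_n=20):
    """For each factor, keep features with |loading| > threshold, largest first."""
    n_factors = len(loadings[0]) if loadings else 0
    result = {}
    for k in range(n_factors):
        ranked = sorted(
            ((name, row[k]) for name, row in zip(feature_names, loadings)
             if abs(row[k]) > threshold),
            key=lambda item: abs(item[1]),
            reverse=True,
        )
        result[k] = ranked[:top_n]  # cap the list, as SLIDE_top_feats does for plots
    return result


# Toy example: 3 features by 2 factors
names = ["geneA", "geneB", "geneC"]
loadings = [
    [0.9, 0.0],
    [-0.3, 0.02],
    [0.04, 0.7],
]
print(top_features(loadings, names))
# {0: [('geneA', 0.9), ('geneB', -0.3)], 1: [('geneC', 0.7)]}
```

Ranking by absolute value keeps strongly negative loadings, which are as informative as positive ones.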
Parameters
| Name | Type | Description | Default/Example |
|---|---|---|---|
| x_path | str | Path to feature matrix CSV | Required |
| y_path | str | Path to response/labels CSV | Required |
| fdr | float | Knockoff FDR threshold | 0.1 |
| thresh_fdr | float | FDR threshold in LOVE | 0.1 |
| spec | float | Minimum reproducibility for a factor | 0.2 |
| y_factor | bool | Treat y as categorical | True |
| niter | int | Iterations for LOVE | 500 |
| SLIDE_top_feats | int | Number of top features to plot | 20 |
| rep_CV | int | Repeats for cross-validation | 50 |
| pure_homo | bool | Use pure variables with loadings = 1 | True |
| delta | list | Regularization parameters | [0.01] |
| lambda | list | Penalty parameters | [0.5, 0.1] |
| out_path | str | Output directory | Required |
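A small helper can fill in the values from the Default/Example column and catch missing or misspelled keys before the pipeline starts. This is a sketch only: `build_params` is a hypothetical name, and whether loveslide itself falls back to these values when a key is omitted is not confirmed here.

```python
import copy

# Values copied from the Default/Example column of the table above
DEFAULTS = {
    "fdr": 0.1, "thresh_fdr": 0.1, "spec": 0.2, "y_factor": True,
    "niter": 500, "SLIDE_top_feats": 20, "rep_CV": 50, "pure_homo": True,
    "delta": [0.01], "lambda": [0.5, 0.1],
}
REQUIRED = ("x_path", "y_path", "out_path")


def build_params(user_params):
    """Merge user-supplied values over the example defaults and check them."""
    missing = [key for key in REQUIRED if key not in user_params]
    if missing:
        raise ValueError(f"missing required parameter(s): {', '.join(missing)}")
    unknown = sorted(set(user_params) - set(DEFAULTS) - set(REQUIRED))
    if unknown:
        raise ValueError(f"unknown parameter(s): {', '.join(unknown)}")
    params = {**copy.deepcopy(DEFAULTS), **user_params}
    for key in ("delta", "lambda"):
        if not isinstance(params[key], list):
            params[key] = [params[key]]  # accept a single value for convenience
    return params


params = build_params({
    "x_path": "/path/to/features.csv",
    "y_path": "/path/to/labels.csv",
    "out_path": "/path/to/output/",
    "delta": 0.05,
})
print(params["delta"], params["niter"])  # [0.05] 500
```

The helper takes a dictionary rather than keyword arguments because `lambda` is a reserved word in Python and cannot be written as a keyword argument name.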
Project Structure
SLIDE_py/
├── src/
│   ├── loveslide/              # Main Python & R wrappers
│   │   ├── slide.py            # Main entry point
│   │   ├── love.py
│   │   ├── knockoffs.py
│   │   ├── ...
│   │   ├── LOVE-master/        # (Legacy) Original LOVE code
│   │   └── LOVE-SLIDE/         # Customized LOVE implementation for SLIDE
├── dist/
├── example/
└── ...
Design Notes
- Core statistical inference is done using R scripts via rpy2.
- Python acts as an orchestration layer to allow integration into ML workflows.
- Most plotting is done in R (e.g., pheatmap, ggplot2).
Known Limitations and TODOs
- Conversion between YAML and dictionaries for easier parameter management
- Extend y_factor handling to non-binary variables
- Parallelization of knockoff inference (e.g., in select_short_freq)
- Correlation network visualization using networkx
Citation & Contact
If you use loveslide in your work, please cite the original R implementation and this repository. For bugs or feature requests, please open an issue on GitHub.
- Homepage: SLIDE_py on GitHub (https://github.com/alw399/SLIDE_py)
- Issues: report an issue on the SLIDE_py GitHub repository
- Authors:
  - Ally Wang (alw399@pitt.edu)
  - Swapnil Keshari (swk25@pitt.edu)