BSVAE: Biologically Structured Variational Autoencoder
BSVAE is a PyTorch package for Structured Factor Variational Autoencoders (StructuredFactorVAE).
It is designed for gene expression modeling with biological priors, integrating protein–protein interaction (PPI) networks and sparsity constraints for interpretable latent representations.
Features
- Structured VAE architecture (StructuredFactorVAE)
  - Factorized encoder/decoder with group sparsity
  - Optional Laplacian regularization from PPI networks
- Dataset utilities
  - Load gene expression matrices (GeneExpression)
  - Support for CSV full-matrix mode or pre-split train/test mode
- Biological priors
  - Fetch and cache STRING PPI networks by NCBI TaxID
  - Map gene symbols / Ensembl IDs to protein IDs using MyGene.info or BioMart
- Training and evaluation
  - Unified training loop (Trainer)
  - Evaluation with reconstruction, KL, sparsity, and Laplacian penalties (Evaluator)
- Reproducibility
  - Save/load models + metadata (modelIO)
  - Configurable hyperparameters via hyperparam.ini
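To make the Laplacian regularization listed above more concrete, here is a minimal sketch in plain Python. It is not BSVAE's implementation: the function names, the toy gene loadings, and the use of a plain L1 term as a stand-in for the sparsity penalty are all assumptions made for illustration. For an unweighted PPI graph with Laplacian L = D - A, the quadratic form tr(WᵀLW) equals the sum of squared differences between the loadings of interacting genes, which is what the sketch computes.

```python
from typing import Dict, List, Sequence, Tuple


def laplacian_penalty(
    weights: Dict[str, Sequence[float]],
    edges: List[Tuple[str, str]],
) -> float:
    """Sum of squared loading differences over PPI edges.

    Equals tr(W^T L W) for an unweighted graph, provided each undirected
    edge appears exactly once in `edges`.
    """
    total = 0.0
    for gene_a, gene_b in edges:
        if gene_a not in weights or gene_b not in weights:
            continue  # skip interactions involving genes absent from the model
        row_a, row_b = weights[gene_a], weights[gene_b]
        total += sum((a - b) ** 2 for a, b in zip(row_a, row_b))
    return total


def l1_penalty(weights: Dict[str, Sequence[float]]) -> float:
    """Plain L1 norm of all loadings (assumed stand-in for the sparsity term)."""
    return sum(abs(w) for row in weights.values() for w in row)


# Hypothetical toy loadings: one row per gene, one column per latent factor.
weights = {
    "TP53": [1.0, 0.0],
    "MDM2": [0.5, 0.0],
    "GAPDH": [0.0, 2.0],
}
edges = [("TP53", "MDM2")]  # a single interacting pair

# Strengths mirror the l1_strength / lap_strength keys in hyperparam.ini.
l1_strength, lap_strength = 1e-3, 1e-4
regularizer = (
    l1_strength * l1_penalty(weights)
    + lap_strength * laplacian_penalty(weights, edges)
)
```

Interacting genes with similar loadings contribute little to the penalty, so the term nudges connected genes toward the same latent factors. In a PyTorch model the same quantity would typically be computed on tensors so that gradients flow through it.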
Installation
Clone the repo and install with Poetry (recommended):
git clone https://github.com/YOUR-LAB/BSVAE.git
cd BSVAE
poetry install
Or using pip:
pip install -e .
Dependencies:
- Python 3.9+
- PyTorch ≥ 2.0
- pandas, numpy, scikit-learn
- networkx, scipy
- mygene (for gene annotation)
Quickstart
1. Prepare gene expression data
BSVAE expects genes × samples CSVs.
- Full-matrix mode: Provide expr.csv with all samples; a 10-fold CV split is then created.
- Pre-split mode: Provide a directory with X_train.csv and X_test.csv.
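The 10-fold split in full-matrix mode can be pictured as a partition of the sample columns. The sketch below is an assumption about how such a split might look, not the package's own code: the helper name `kfold_sample_splits`, the seed, and the synthetic sample IDs are hypothetical.

```python
import random
from typing import List, Tuple


def kfold_sample_splits(
    sample_ids: List[str], n_splits: int = 10, seed: int = 42
) -> List[Tuple[List[str], List[str]]]:
    """Return one (train_ids, test_ids) pair per fold."""
    if n_splits < 2 or n_splits > len(sample_ids):
        raise ValueError("n_splits must be between 2 and the number of samples")
    shuffled = list(sample_ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    # Deal samples round-robin so fold sizes differ by at most one.
    folds = [shuffled[i::n_splits] for i in range(n_splits)]
    splits = []
    for i, test_ids in enumerate(folds):
        train_ids = [s for j, fold in enumerate(folds) if j != i for s in fold]
        splits.append((train_ids, test_ids))
    return splits


# In a genes x samples CSV, the header row holds the sample IDs
# (after the leading gene-identifier column).
header = ["gene"] + [f"sample_{i}" for i in range(25)]
splits = kfold_sample_splits(header[1:])
```

Because the matrix is genes × samples, the split selects columns rather than rows; each sample lands in exactly one test fold.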
2. Train a model
poetry run python -m bsvae.main exp1 \
--gene-expression-filename data/expr.csv \
--epochs 50 \
--latent-dim 10 \
--ppi-taxid 9606
Results (checkpoints, logs, metadata) are saved under results/exp1/.
3. Evaluate a trained model
poetry run python -m bsvae.main exp1 \
--gene-expression-filename data/expr.csv \
--is-eval-only
Configuration
Hyperparameters can be set via hyperparam.ini:
[Custom]
seed = 42
no_cuda = False
epochs = 100
batch_size = 64
latent_dim = 10
hidden_dims = [128, 64]
dropout = 0.1
l1_strength = 1e-3
lap_strength = 1e-4
Override from CLI if needed:
--epochs 50 --latent-dim 20
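As a rough illustration of how INI defaults and CLI overrides can be combined, the sketch below parses the [Custom] section shown above with the standard library and then applies the two flags from the override example. The function `load_hyperparams` is hypothetical and only mirrors the documented behavior; BSVAE's own parsing code may differ.

```python
import argparse
import ast
import configparser

INI_TEXT = """
[Custom]
seed = 42
no_cuda = False
epochs = 100
batch_size = 64
latent_dim = 10
hidden_dims = [128, 64]
dropout = 0.1
l1_strength = 1e-3
lap_strength = 1e-4
"""


def load_hyperparams(ini_text, cli_args=None):
    """Read typed defaults from the INI text, then apply CLI overrides."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["Custom"]
    params = {
        "seed": section.getint("seed"),
        "no_cuda": section.getboolean("no_cuda"),
        "epochs": section.getint("epochs"),
        "batch_size": section.getint("batch_size"),
        "latent_dim": section.getint("latent_dim"),
        # Lists are stored as text, so parse them as Python literals.
        "hidden_dims": ast.literal_eval(section["hidden_dims"]),
        "dropout": section.getfloat("dropout"),
        "l1_strength": section.getfloat("l1_strength"),
        "lap_strength": section.getfloat("lap_strength"),
    }
    cli = argparse.ArgumentParser()
    cli.add_argument("--epochs", type=int)
    cli.add_argument("--latent-dim", type=int)  # stored as latent_dim
    overrides = cli.parse_args(cli_args or [])
    for key, value in vars(overrides).items():
        if value is not None:  # only flags that were actually passed
            params[key] = value
    return params


params = load_hyperparams(INI_TEXT, ["--epochs", "50", "--latent-dim", "20"])
```

Here `params["epochs"]` becomes 50 and `params["latent_dim"]` becomes 20, while every other value keeps its INI default.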
PPI Priors
BSVAE supports automatic download & caching of STRING v12.0 PPI networks.
- Supported species (via NCBI TaxID):
  - Human (9606)
  - Mouse (10090)
  - Rat (10116)
  - Fly (7227)
- Cache location defaults to ~/.bsvae/ppi (override via --ppi-cache).
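The caching behavior can be sketched as a lookup from TaxID to a file under the cache directory. Everything below is an illustrative assumption: the helper names are hypothetical, and the cached file name simply follows STRING's own naming convention for protein link files, which BSVAE may or may not reuse.

```python
from pathlib import Path

# Species listed in this README, keyed by NCBI TaxID.
SUPPORTED_TAXIDS = {9606: "Human", 10090: "Mouse", 10116: "Rat", 7227: "Fly"}
STRING_VERSION = "12.0"


def ppi_cache_path(taxid: int, cache_dir: str = "~/.bsvae/ppi") -> Path:
    """Return where the PPI file for `taxid` would live in the cache."""
    if taxid not in SUPPORTED_TAXIDS:
        raise ValueError(f"Unsupported TaxID: {taxid}")
    # Assumed file name, modeled on STRING's <taxid>.protein.links.v<ver>.txt.gz
    filename = f"{taxid}.protein.links.v{STRING_VERSION}.txt.gz"
    return Path(cache_dir).expanduser() / filename


def needs_download(taxid: int, cache_dir: str = "~/.bsvae/ppi") -> bool:
    """True when no cached copy exists yet, so a download would be needed."""
    return not ppi_cache_path(taxid, cache_dir).exists()


human_path = ppi_cache_path(9606)
```

Keying the cache on TaxID and STRING version means a network is fetched once per species and reused across experiments.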
Citation
If you use BSVAE in your research, please cite:
@article{Benjamin2025bsvae,
title={Structured Factor Variational Autoencoder with Biological Priors},
author={Kynon J. M. Benjamin},
year={2025},
journal={N/A}
}
License
This project is licensed under the MIT License.