BSVAE: Biologically Structured Variational Autoencoder

BSVAE is a PyTorch package for Structured Factor Variational Autoencoders (StructuredFactorVAE).
It is designed for gene expression modeling with biological priors, integrating protein–protein interaction (PPI) networks and sparsity constraints for interpretable latent representations.


Features

  • Structured VAE architecture (StructuredFactorVAE)
    • Factorized encoder/decoder with group sparsity
    • Optional Laplacian regularization from PPI networks
  • Dataset utilities
    • Load gene expression matrices (GeneExpression)
    • Support for CSV full-matrix mode or pre-split train/test mode
  • Biological priors
    • Fetch and cache STRING PPI networks by NCBI TaxID
    • Map gene symbols / Ensembl IDs to protein IDs using MyGene.info or BioMart
  • Training and evaluation
    • Unified training loop (Trainer)
    • Evaluation with reconstruction, KL, sparsity, and Laplacian penalties (Evaluator)
  • Reproducibility
    • Save/load models + metadata (modelIO)
    • Configurable hyperparameters via hyperparam.ini
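The evaluation terms listed above (reconstruction, KL, sparsity) can be illustrated with a small NumPy sketch. This is not BSVAE's actual loss code: the function names, the squared-error reconstruction term, and the plain L1 penalty (standing in for the group sparsity the package describes) are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), averaged over the batch."""
    kl_per_sample = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)
    return float(kl_per_sample.mean())

def l1_penalty(weights):
    """Sum of absolute weights; a simple stand-in for a group-sparsity term."""
    return float(np.abs(weights).sum())

def vae_objective(x, x_hat, mu, logvar, decoder_weights, l1_strength=1e-3):
    """Hypothetical objective: squared-error reconstruction + KL + weighted L1."""
    recon = float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))
    kl = gaussian_kl(mu, logvar)
    sparsity = l1_strength * l1_penalty(decoder_weights)
    return {"recon": recon, "kl": kl, "sparsity": sparsity,
            "total": recon + kl + sparsity}

# Toy batch: 2 samples, 3 genes, 2 latent dimensions.
x = np.array([[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]])
mu = np.zeros((2, 2))
logvar = np.zeros((2, 2))
weights = np.array([[1.0, -2.0], [0.0, 3.0]])
terms = vae_objective(x, x, mu, logvar, weights)
```

With a perfect reconstruction and a posterior equal to the standard normal prior, only the sparsity term contributes to the total.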

Installation

Clone the repo and install with Poetry (recommended):

git clone https://github.com/YOUR-LAB/BSVAE.git
cd BSVAE
poetry install

Or install with pip:

pip install -e .

Dependencies:

  • Python 3.9+
  • PyTorch ≥ 2.0
  • pandas, numpy, scikit-learn
  • networkx, scipy
  • mygene (for gene annotation)

Quickstart

1. Prepare gene expression data

BSVAE expects CSV files laid out as genes × samples (genes in rows, samples in columns).

  • Full-matrix mode: Provide expr.csv containing all samples; a 10-fold cross-validation split is created automatically.
  • Pre-split mode: Provide a directory containing X_train.csv and X_test.csv.
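As a rough illustration of the two input modes, the sketch below builds a toy genes × samples matrix, holds out one of ten sample folds, and writes X_train.csv and X_test.csv. It assumes gene identifiers sit in the first CSV column and that splitting happens over samples (columns); the fold logic is a plain NumPy stand-in, not BSVAE's own splitter.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Toy genes x samples matrix: rows are genes, columns are samples.
rng = np.random.default_rng(0)
genes = [f"gene{i}" for i in range(5)]
samples = [f"sample{j}" for j in range(20)]
expr = pd.DataFrame(rng.normal(size=(5, 20)), index=genes, columns=samples)

# Shuffle sample (column) positions and cut them into 10 equal folds.
order = rng.permutation(expr.shape[1])
folds = np.array_split(order, 10)

# Use fold 0 as the held-out test set; the other nine folds form the training set.
test_cols = folds[0]
train_cols = np.concatenate(folds[1:])
X_train = expr.iloc[:, train_cols]
X_test = expr.iloc[:, test_cols]

# Write the pre-split files; the gene identifiers become the first CSV column.
out_dir = Path(tempfile.mkdtemp())
X_train.to_csv(out_dir / "X_train.csv")
X_test.to_csv(out_dir / "X_test.csv")

# Reading back with index_col=0 restores the genes x samples layout.
reloaded = pd.read_csv(out_dir / "X_train.csv", index_col=0)
```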

2. Train a model

poetry run python -m bsvae.main exp1 \
    --gene-expression-filename data/expr.csv \
    --epochs 50 \
    --latent-dim 10 \
    --ppi-taxid 9606

Results (checkpoints, logs, metadata) are saved under results/exp1/.

3. Evaluate a trained model

poetry run python -m bsvae.main exp1 \
    --gene-expression-filename data/expr.csv \
    --is-eval-only

Configuration

Hyperparameters can be set via hyperparam.ini:

[Custom]
seed = 42
no_cuda = False
epochs = 100
batch_size = 64
latent_dim = 10
hidden_dims = [128, 64]
dropout = 0.1
l1_strength = 1e-3
lap_strength = 1e-4
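To see how the values in this file map to Python types, here is a minimal sketch using the standard-library configparser. It is not BSVAE's own loader; the helper parse_value and the use of ast.literal_eval to decode lists, booleans, and floats are assumptions for illustration.

```python
import ast
import configparser

INI_TEXT = """
[Custom]
seed = 42
no_cuda = False
epochs = 100
batch_size = 64
latent_dim = 10
hidden_dims = [128, 64]
dropout = 0.1
l1_strength = 1e-3
lap_strength = 1e-4
"""

def parse_value(raw):
    """Decode a raw INI string as a Python literal; fall back to the string itself."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Every option in [Custom] becomes a typed entry in a plain dict.
hyperparams = {key: parse_value(value) for key, value in parser["Custom"].items()}
```

Here hidden_dims is decoded as a list of integers, no_cuda as a boolean, and the two regularization strengths as floats.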

Override individual values from the CLI if needed:

--epochs 50 --latent-dim 20
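One common way to get this behavior is to feed the file's values in as argparse defaults, so a flag given on the command line wins and anything omitted falls back to the file. The sketch below shows that pattern only; build_parser is a hypothetical helper, and BSVAE's real argument handling may differ.

```python
import argparse

# Values as they might be loaded from hyperparam.ini.
defaults = {"epochs": 100, "latent_dim": 10}

def build_parser(defaults):
    """Build a parser whose defaults come from the config file values."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=defaults["epochs"])
    parser.add_argument("--latent-dim", type=int, default=defaults["latent_dim"])
    return parser

# Both flags given: both config values are overridden.
args = build_parser(defaults).parse_args(["--epochs", "50", "--latent-dim", "20"])

# Only one flag given: latent_dim keeps its config value.
partial = build_parser(defaults).parse_args(["--epochs", "50"])
```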

PPI Priors

BSVAE supports automatic download & caching of STRING v12.0 PPI networks.

  • Supported species (via NCBI TaxID):
    • Human (9606)
    • Mouse (10090)
    • Rat (10116)
    • Fly (7227)
  • Cache location defaults to ~/.bsvae/ppi (override via --ppi-cache).
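For intuition about how a PPI network can act as a prior, the sketch below builds a graph Laplacian L = D - A from an edge list and evaluates the penalty tr(WᵀLW), which equals the sum of squared differences between the weight vectors of interacting genes. Which weights BSVAE penalizes and how it scales the term (lap_strength) are not shown here; the function names and the toy network are assumptions.

```python
import numpy as np

def graph_laplacian(n_nodes, edges):
    """Unnormalized Laplacian L = D - A of an undirected, unweighted graph."""
    adjacency = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        adjacency[i, j] = 1.0
        adjacency[j, i] = 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def laplacian_penalty(weights, laplacian):
    """tr(W^T L W): small when connected genes have similar weight rows."""
    return float(np.trace(weights.T @ laplacian @ weights))

# Toy network: genes 0-1 and 1-2 interact; gene 3 has no interactions.
edges = [(0, 1), (1, 2)]
L = graph_laplacian(4, edges)

# Rows are genes, columns are latent factors.
W = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [5.0, 5.0]])

penalty = laplacian_penalty(W, L)

# Equivalent edge-wise form: sum over edges of ||w_i - w_j||^2.
edge_sum = float(sum(np.sum((W[i] - W[j]) ** 2) for i, j in edges))
```

Genes 0 and 1 share identical weights and add nothing to the penalty, the 1-2 edge contributes the full amount, and the isolated gene 3 is unconstrained regardless of its weights.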

Prefetch PPI cache from the CLI

Use the lightweight downloader to cache a STRING network ahead of training:

poetry run bsvae-download-ppi --taxid 9606 --cache-dir ~/.bsvae/ppi
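The caching idea behind this command can be sketched as "download only if the file is not already in the cache directory". The code below is not the implementation of bsvae-download-ppi: the cache file name, the helper names, and the STRING URL pattern are assumptions (check the URL against the STRING download page before relying on it). The demo swaps in a stub downloader so nothing is fetched over the network.

```python
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

# Assumed URL pattern for STRING v12.0 link files; verify before use.
STRING_URL = (
    "https://stringdb-downloads.org/download/protein.links.v12.0/"
    "{taxid}.protein.links.v12.0.txt.gz"
)

def cached_ppi_path(taxid, cache_dir):
    """Hypothetical cache layout: one gzipped link file per TaxID."""
    return Path(cache_dir).expanduser() / f"{taxid}.protein.links.v12.0.txt.gz"

def fetch_ppi(taxid, cache_dir, download=urlretrieve):
    """Return the cached file, calling `download(url, dest)` only on a cache miss."""
    path = cached_ppi_path(taxid, cache_dir)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        download(STRING_URL.format(taxid=taxid), path)
    return path

# Demo with a stub downloader that records calls and writes a placeholder file.
calls = []

def stub_download(url, dest):
    calls.append(url)
    Path(dest).write_bytes(b"placeholder")

cache_dir = tempfile.mkdtemp()
first = fetch_ppi(9606, cache_dir, download=stub_download)   # cache miss
second = fetch_ppi(9606, cache_dir, download=stub_download)  # cache hit
```

The second call returns the same path without invoking the downloader, which is the behavior that makes prefetching ahead of training worthwhile.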

Citation

If you use BSVAE in your research, please cite:

@article{Benjamin2025bsvae,
  title={Structured Factor Variational Autoencoder with Biological Priors},
  author={Kynon J. M. Benjamin},
  year={2025},
  journal={N/A}
}

License

This project is licensed under the MIT License.
