BSVAE: Biologically Structured Variational Autoencoder

BSVAE is a PyTorch package for Structured Factor Variational Autoencoders (StructuredFactorVAE).
It is designed for gene expression modeling with biological priors, integrating protein–protein interaction (PPI) networks and sparsity constraints for interpretable latent representations.


Features

  • Structured VAE architecture (StructuredFactorVAE)
    • Factorized encoder/decoder with group sparsity
    • Optional Laplacian regularization from PPI networks
  • Dataset utilities
    • Load gene expression matrices (GeneExpression)
    • Support for CSV full-matrix mode or pre-split train/test mode
  • Biological priors
    • Fetch and cache STRING PPI networks by NCBI TaxID
    • Map gene symbols / Ensembl IDs to protein IDs using MyGene.info or BioMart
  • Training and evaluation
    • Unified training loop (Trainer)
    • Evaluation with reconstruction, KL, sparsity, and Laplacian penalties (Evaluator)
  • Reproducibility
    • Save/load models + metadata (modelIO)
    • Configurable hyperparameters via hyperparam.ini

Installation

Clone the repo and install with Poetry (recommended):

git clone https://github.com/YOUR-LAB/BSVAE.git
cd BSVAE
poetry install

Or using pip:

pip install -e .

Dependencies:

  • Python 3.9+
  • PyTorch ≥ 2.0
  • pandas, numpy, scikit-learn
  • networkx, scipy
  • mygene (for gene annotation)

Quickstart

1. Prepare gene expression data

BSVAE expects genes × samples CSVs.

  • Full-matrix mode: Provide expr.csv containing all samples; BSVAE then creates a 10-fold cross-validation split.
  • Pre-split mode: Provide a directory containing X_train.csv and X_test.csv.
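
For pre-split mode, the two files can be produced from a single matrix with a few lines of Python. The sketch below is not part of BSVAE: the helper name split_expression_csv, the 10% test fraction, and the layout (gene IDs in the first column, one column per sample, samples split by column) are assumptions made for illustration.

```python
import csv
import random
from pathlib import Path


def split_expression_csv(expr_path, out_dir, test_fraction=0.1, seed=42):
    """Split a genes x samples CSV by sample (column) into train/test files.

    Hypothetical helper: assumes row 0 is a header and column 0 holds gene IDs.
    """
    with open(expr_path, newline="") as handle:
        rows = list(csv.reader(handle))
    header, body = rows[0], rows[1:]

    # Shuffle sample column indices reproducibly; column 0 is the gene ID.
    sample_idx = list(range(1, len(header)))
    random.Random(seed).shuffle(sample_idx)
    n_test = max(1, round(test_fraction * len(sample_idx)))
    test_cols = sorted(sample_idx[:n_test])
    train_cols = sorted(sample_idx[n_test:])

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, cols in (("X_train.csv", train_cols), ("X_test.csv", test_cols)):
        with open(out_dir / name, "w", newline="") as handle:
            writer = csv.writer(handle)
            for row in [header] + body:
                # Keep the gene ID column plus the selected sample columns.
                writer.writerow([row[0]] + [row[c] for c in cols])
    return train_cols, test_cols
```

Calling split_expression_csv("data/expr.csv", "data/split") would write data/split/X_train.csv and data/split/X_test.csv. Check the resulting files against the layout your BSVAE version expects before training.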

2. Train a model

poetry run python -m bsvae.main exp1 \
    --gene-expression-filename data/expr.csv \
    --epochs 50 \
    --latent-dim 10 \
    --ppi-taxid 9606

  • Results (checkpoints, logs, metadata) are saved under results/exp1/.

3. Evaluate a trained model

poetry run python -m bsvae.main exp1 \
    --gene-expression-filename data/expr.csv \
    --is-eval-only

Configuration

Hyperparameters can be set via hyperparam.ini:

[Custom]
seed = 42
no_cuda = False
epochs = 100
batch_size = 64
latent_dim = 10
hidden_dims = [128, 64]
dropout = 0.1
l1_strength = 1e-3
lap_strength = 1e-4

Override them from the CLI if needed:

--epochs 50 --latent-dim 20
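
The precedence described above (file defaults first, CLI flags on top) can be illustrated with the standard library alone. This is a minimal sketch, not BSVAE's actual loader: the function names load_hyperparams and apply_cli_overrides are hypothetical, and parsing values with ast.literal_eval is an assumption about how entries such as hidden_dims could be read.

```python
import argparse
import ast
import configparser

# Same keys as the hyperparam.ini example above, trimmed for brevity.
INI_TEXT = """
[Custom]
seed = 42
no_cuda = False
epochs = 100
latent_dim = 10
hidden_dims = [128, 64]
l1_strength = 1e-3
"""


def load_hyperparams(ini_text, section="Custom"):
    """Parse one INI section into a dict of typed Python values."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # literal_eval converts "42", "False", "[128, 64]", "1e-3" to Python values.
    return {key: ast.literal_eval(value) for key, value in parser[section].items()}


def apply_cli_overrides(params, argv):
    """Return a copy of params with any CLI flags that were given applied."""
    cli = argparse.ArgumentParser()
    cli.add_argument("--epochs", type=int)
    cli.add_argument("--latent-dim", type=int)  # stored as latent_dim
    args = cli.parse_args(argv)
    merged = dict(params)
    for key, value in vars(args).items():
        if value is not None:  # only flags the user actually passed
            merged[key] = value
    return merged


params = apply_cli_overrides(
    load_hyperparams(INI_TEXT), ["--epochs", "50", "--latent-dim", "20"]
)
```

Here params keeps seed, hidden_dims, and l1_strength from the file, while epochs and latent_dim take the values passed on the command line.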

PPI Priors

BSVAE supports automatic download & caching of STRING v12.0 PPI networks.

  • Supported species (via NCBI TaxID):

    • Human (9606)
    • Mouse (10090)
    • Rat (10116)
    • Fly (7227)
  • Cache location defaults to ~/.bsvae/ppi (override via --ppi-cache).
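
To make the role of the PPI network concrete, here is a small sketch of a graph Laplacian penalty of the kind commonly used with interaction networks. It is an illustration under stated assumptions, not BSVAE's implementation: the README does not say whether BSVAE uses the unnormalized Laplacian L = D - A, whether edges are weighted by STRING scores, or which weights are penalized. The function names and the genes x latent-factors weight matrix are hypothetical.

```python
def laplacian_from_edges(genes, edges):
    """Return the unnormalized graph Laplacian L = D - A as a nested list."""
    index = {gene: i for i, gene in enumerate(genes)}
    n = len(genes)
    lap = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        # Skip self-loops and proteins absent from the expression matrix.
        if a == b or a not in index or b not in index:
            continue
        i, j = index[a], index[b]
        if lap[i][j] != 0.0:
            continue  # ignore duplicate edges
        lap[i][j] = lap[j][i] = -1.0  # off-diagonal: minus adjacency
        lap[i][i] += 1.0              # diagonal: node degree
        lap[j][j] += 1.0
    return lap


def laplacian_penalty(weights, lap):
    """Compute tr(W^T L W) for a genes x latent-factors weight matrix W."""
    n, k = len(weights), len(weights[0])
    total = 0.0
    for col in range(k):
        for i in range(n):
            for j in range(n):
                total += weights[i][col] * lap[i][j] * weights[j][col]
    return total


genes = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C")]
lap = laplacian_from_edges(genes, edges)
penalty = laplacian_penalty([[1.0], [1.0], [3.0]], lap)
```

For an unweighted graph, tr(W^T L W) equals the sum over edges of the squared difference between the connected genes' weights, so interacting genes are pushed toward similar loadings. In the toy example the penalty is (1 - 1)^2 + (1 - 3)^2 = 4. In a training loss this term would be scaled by a coefficient such as lap_strength.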


Citation

If you use BSVAE in your research, please cite:

@article{Benjamin2025bsvae,
  title={Structured Factor Variational Autoencoder with Biological Priors},
  author={Kynon J. M. Benjamin},
  year={2025},
  journal={N/A}
}

License

This project is licensed under the MIT License.
