
MuVI: A multi-view latent variable model with domain-informed structured sparsity for integrating noisy feature sets.


MuVI

A multi-view latent variable model with domain-informed structured sparsity that integrates noisy domain expertise in the form of feature sets.

Examples | Paper | BibTeX


Basic usage

The MuVI class is the main entry point for loading the data and performing inference:

import numpy as np
import pandas as pd
import anndata as ad
import mudata as md
import muvi

# Load processed input data (missing values are allowed)
rna_df = pd.read_csv(...)   # shape: n_samples x n_rna_features
prot_df = pd.read_csv(...)  # shape: n_samples x n_prot_features

# Load prior feature sets, e.g. gene sets
gene_sets = muvi.fs.from_gmt(...)
gene_sets_mask = gene_sets.to_mask(rna_df.columns)  # binary mask

# Create a MuVI object by passing both input data and prior information
model = muvi.MuVI(
    observations={"rna": rna_df, "prot": prot_df},
    prior_masks={"rna": gene_sets_mask},
    ...,
    device=device,
)

# Alternatively, create a MuVI model from AnnData (single-view)
rna_adata = ad.AnnData(rna_df, dtype=np.float32)
rna_adata.varm["gene_sets_mask"] = gene_sets_mask.T
model = muvi.tl.from_adata(
    rna_adata,
    prior_mask_key="gene_sets_mask",
    ...,
    device=device,
)

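# Alternatively, create a MuVI model from MuData (multi-view)
# Build the second view from the protein data frame loaded above
prot_adata = ad.AnnData(prot_df.astype(np.float32))
mdata = md.MuData({"rna": rna_adata, "prot": prot_adata})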
model = muvi.tl.from_mudata(
    mdata,
    prior_mask_key="gene_sets_mask",
    ...,
    device=device,
)

# Fit the model
model.fit(batch_size, n_epochs, ...)

# Continue with the downstream analysis (see below)
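
To make the notion of a binary prior mask concrete, here is a minimal sketch that builds one by hand with pandas. It does not use the MuVI API: the gene set names, members and feature names are made up, and the layout (one row per feature set, one column per feature) is an assumption inferred from the transpose in the AnnData example above.

```python
import pandas as pd

# Hypothetical gene sets (names and members are made up for illustration)
gene_sets = {
    "SET_A": {"GENE1", "GENE2"},
    "SET_B": {"GENE2", "GENE4"},
}

# Hypothetical feature names, standing in for rna_df.columns
features = ["GENE1", "GENE2", "GENE3"]

# One row per feature set, one column per feature;
# an entry is True where the feature is annotated to the set
mask = pd.DataFrame(
    [[feature in members for feature in features] for members in gene_sets.values()],
    index=list(gene_sets.keys()),
    columns=features,
)

print(mask)
# Transposing gives one row per feature, the orientation expected by AnnData's .varm
print(mask.T.shape)
```

Genes that appear in a set but not in the data (GENE4 here) simply drop out of the mask, and features without any annotation (GENE3) end up with an all-False column.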

Saving & loading models

MuVI provides a versioned, pickle-free, lightweight serialization format:

# Save the model
model.save("path/to/dir")

# Load the model later
loaded_model = muvi.MuVI.load("path/to/dir")

The directory contains:

  • metadata.json – model metadata (JSON)
  • params.npz – variational guide parameters
  • structure.npz – observations, priors, covariates, factor order & signs
  • factor.h5ad / cov.h5ad – optional cached AnnData objects (if present)

This format is reproducible, stable across MuVI versions, and does not rely on Python pickle.
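
The general idea of a versioned, pickle-free directory format can be sketched with JSON and NumPy alone. This is an illustration under stated assumptions, not MuVI's actual implementation: `save_state`, `load_state`, the `format_version` field and the example parameter names are all hypothetical; only the file names `metadata.json` and `params.npz` are taken from the listing above.

```python
import json
import tempfile
from pathlib import Path

import numpy as np

FORMAT_VERSION = 1  # hypothetical version tag, not MuVI's real one


def save_state(directory, metadata, params):
    """Write metadata as JSON and named arrays as an .npz archive."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    payload = {"format_version": FORMAT_VERSION, **metadata}
    (directory / "metadata.json").write_text(json.dumps(payload, indent=2))
    np.savez(directory / "params.npz", **params)


def load_state(directory):
    """Read the directory back, refusing unknown versions and pickled arrays."""
    directory = Path(directory)
    metadata = json.loads((directory / "metadata.json").read_text())
    if metadata.get("format_version") != FORMAT_VERSION:
        raise ValueError(f"Unsupported format version: {metadata.get('format_version')}")
    # allow_pickle=False makes NumPy raise instead of unpickling object arrays
    with np.load(directory / "params.npz", allow_pickle=False) as archive:
        params = {name: archive[name] for name in archive.files}
    return metadata, params


# Round trip through a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    save_state(tmp, {"n_factors": 2}, {"loc": np.zeros((2, 3)), "scale": np.ones((2, 3))})
    metadata, params = load_state(tmp)

print(metadata)
print(sorted(params))
```

Keeping the metadata in plain JSON means a saved model can be inspected without importing the library, and the explicit version field gives the loader a place to reject or migrate older layouts.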

Submodules

The package consists of three additional submodules for analysing the results post-training:

  • muvi.tl provides tools for downstream analysis, e.g.,
    • computing the variance explained across all factors and views with muvi.tl.variance_explained
    • testing the significance of the association between the prior feature sets and the inferred factors with muvi.tl.test
    • clustering the latent space, e.g. with muvi.tl.leiden
    • saving the model with muvi.tl.save in order to muvi.tl.load it at a later point in time
  • muvi.pl works in tandem with muvi.tl by providing visualization methods such as
    • muvi.pl.variance_explained (see above)
    • plotting the latent space via muvi.pl.tsne, muvi.pl.scatter or muvi.pl.stripplot
    • investigating factors in terms of their inferred loadings with muvi.pl.inspect_factor
  • muvi.fs provides the data structures and methods for loading, processing and storing the prior information from feature sets
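
As a rough illustration of what "variance explained" per factor means for a single view, the sketch below uses the common coefficient-of-determination form, R² = 1 − SS_res / SS_tot, on plain NumPy arrays. It assumes the data are centered and skips missing entries; the function name `r2_per_factor` is hypothetical, and muvi.tl.variance_explained may differ in its details.

```python
import numpy as np


def r2_per_factor(y, z, w):
    """Fraction of variance in y explained by each factor on its own.

    y: (n_samples, n_features) centered data, NaN marks missing values
    z: (n_samples, n_factors) factor scores
    w: (n_factors, n_features) factor loadings
    """
    observed = ~np.isnan(y)
    # Total sum of squares over observed entries (data assumed centered)
    ss_tot = np.sum(np.where(observed, y, 0.0) ** 2)
    scores = []
    for k in range(z.shape[1]):
        # Rank-one reconstruction from factor k only
        y_hat = np.outer(z[:, k], w[k])
        residual = np.where(observed, y - y_hat, 0.0)
        scores.append(1.0 - np.sum(residual**2) / ss_tot)
    return np.array(scores)


# Simulated single view with two factors and a little noise
rng = np.random.default_rng(0)
z = rng.normal(size=(100, 2))
w = rng.normal(size=(2, 20))
y = z @ w + 0.1 * rng.normal(size=(100, 20))
y[0, 0] = np.nan  # missing values are ignored

print(r2_per_factor(y, z, w))
```

For multiple views, the same function can be applied to each view's data and loadings with the shared factor scores, giving a factors-by-views table of the kind muvi.pl.variance_explained visualizes.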

Tutorials

Check out our basic tutorial to get familiar with MuVI, or jump straight to a single-cell multiome analysis!

R users can readily export a trained MuVI model into R with a single line of code and resume the analysis with the MOFA2 package.

muvi.ext.save_as_hdf5(model, "muvi.hdf5", save_metadata=True)

See this vignette for more details!

Installation

We suggest using conda to manage your environments and pip to install muvi as a Python package. Follow these steps to get muvi up and running:

  1. Create a Python environment in conda:
conda create -n muvi python=3.12
  2. Activate the freshly created environment:
conda activate muvi
  3. Install muvi with pip:
python3 -m pip install muvi
  4. Alternatively, install the latest development version from GitHub with pip:
python3 -m pip install git+https://github.com/MLO-lab/MuVI.git

Make sure to install a GPU version of PyTorch to significantly speed up the inference.
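
A quick way to check whether the installed PyTorch build can see a GPU is sketched below. The helper name `pick_device` is hypothetical, and passing the resulting string as the `device` argument used in the examples above is an assumption; `torch.cuda.is_available()` is the standard PyTorch check.

```python
def pick_device():
    """Return "cuda" if a CUDA-enabled PyTorch build sees a GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all; fall back to the CPU label
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

If this prints "cpu" on a machine with a GPU, the installed PyTorch is most likely a CPU-only build.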

Citation

If you use MuVI in your work, please cite the following paper:

Encoding Domain Knowledge in Multi-view Latent Variable Models: A Bayesian Approach with Structured Sparsity

Arber Qoku and Florian Buettner

International Conference on Artificial Intelligence and Statistics (AISTATS) 2023

https://proceedings.mlr.press/v206/qoku23a.html
