
Load and interpret MOFA models

MOFA+ in Python

Work with trained factor models in Python.

This library provides convenience functions for loading and visualizing, in Python, factor models trained with MOFA+. For more information on the multi-omics factor analysis v2 (MOFA+) framework, please see the MOFA+ GitHub repository.

Getting started

Installation

pip install git+https://github.com/gtca/mofax

Training a factor model

Please see the MOFA+ GitHub repository for more information on training the factor models with MOFA+.

Loading the model

Import the module and create a connection to the HDF5 file with the trained model:

import mofax as mfx

model = mfx.mofa_model("trained_mofaplus_model.hdf5")

The connection is opened in read-only mode by default. Close it at the end of the working session by calling the close() method on the model object:

model.close()
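
To make sure close() is called even when an error interrupts the session, the model can be wrapped in contextlib.closing from the standard library, which works with any object that has a close() method. The sketch below uses FakeModel, a hypothetical stand-in that only tracks whether it was closed, so that it runs without a trained model file; with mofax, mfx.mofa_model("trained_mofaplus_model.hdf5") would presumably take its place.

```python
from contextlib import closing


class FakeModel:
    """Hypothetical stand-in for a mofax model; it only tracks open/closed state."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def close(self):
        self.closed = True


# closing() calls .close() when the block exits, even if the body raises.
with closing(FakeModel("trained_mofaplus_model.hdf5")) as model:
    assert not model.closed  # the connection is usable inside the block

print(model.closed)  # True
```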

Model object

The model object is an instance of the mofa_model class, which wraps the HDF5 connection. It provides a simple way to access parts of the trained model, such as the expectations of the factors and of their loadings (weights), without traversing the HDF5 file manually.

Model methods

Properties of the model, such as model.shape, typically return simple data structures (e.g. tuples, lists or dictionaries):

model.shape
# returns (10138, 1124)
#         cells^  ^features

Methods typically return more complex structures. For example, model.get_cells() returns the cell -> group assignment as a pandas.DataFrame and can restrict the result to specific groups or views of the model.

model.get_cells().head()
# returns a pandas.DataFrame object:
# 	group	cell
# 0	T_CD4	AATCCTGCACATCGCC-1
# 1	T_CD4	AAGACGTGTGATGCCC-1
# 2	T_CD4	AAGGAGCGTCGGCATG-1
# 3	T_CD4	AATCCGTCACGAGACG-1
# 4	T_CD4	ACACCGAGGAGGTTGA-1

Use model.metadata to get the metadata table. It is shorthand for samples_metadata; features_metadata is available as well.

model.metadata.head()
# returns a pandas.DataFrame object:
#                     group  n_genes
# sample
# AATCCTGCACATCGCC-1  T_CD4     1087
# AAGACGTGTGATGCCC-1  T_CD4     1836
# AAGGAGCGTCGGCATG-1  T_CD4     2216
# AATCCGTCACGAGACG-1  T_CD4     1615
# ACACCGAGGAGGTTGA-1  T_CD4     1800

To get the expectations of the W (weights) and Z (factors) matrices, use get_weights() and get_factors(), respectively. Pass df=True to get the expectations as a pandas DataFrame rather than a 2D NumPy array.

model.get_factors(factors=range(3), df=True).head()
#                      Factor1   Factor2   Factor3
# AATCCTGCACATCGCC-1  0.012582 -0.093512 -0.011228
# AAGACGTGTGATGCCC-1  0.001091 -0.027217 -0.011331
# AAGGAGCGTCGGCATG-1 -0.015097  0.093493 -0.010593
# AATCCGTCACGAGACG-1 -0.046222  0.225920  0.010083
# ACACCGAGGAGGTTGA-1  0.011766 -0.055964 -0.011298
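
For orientation, W and Z are related through the factorization that MOFA+ fits. The symbols N (cells), D_m (features in view m) and K (factors) are introduced here for illustration and do not appear in the mofax API; the group structure and the noise term are omitted for brevity.

```latex
Y^{m} \approx Z \, (W^{m})^{\top},
\qquad Z \in \mathbb{R}^{N \times K},
\qquad W^{m} \in \mathbb{R}^{D_m \times K}
```

Under this reading, each row of Z corresponds to a cell and each column to a factor, which matches the layout of the data frame shown above.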

Utility functions

A few utility functions are provided as well, such as calculate_factor_r2, which calculates the variance explained by a factor.
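
To make "variance explained by a factor" concrete, here is a minimal NumPy sketch of the usual coefficient of determination for a single factor, computed on toy data. The helper factor_r2 and the toy matrices are hypothetical; this is not the mofax implementation, which may treat groups, views and missing values differently.

```python
import numpy as np


def factor_r2(Y, Z, W, k):
    """Fraction of the variance of Y explained by factor k alone.

    Y: (samples, features) data, assumed centered per feature
    Z: (samples, factors) factor values
    W: (features, factors) weights
    """
    Y_hat = np.outer(Z[:, k], W[:, k])  # rank-1 reconstruction from factor k
    ss_res = np.sum((Y - Y_hat) ** 2)   # what factor k fails to capture
    ss_tot = np.sum(Y ** 2)             # total variation of the centered data
    return float(1.0 - ss_res / ss_tot)


# Toy data: 4 samples, 3 features, 2 orthogonal factors, no noise.
Z = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
W = np.array([[2.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
Y = Z @ W.T

r2 = [factor_r2(Y, Z, W, k) for k in range(Z.shape[1])]
print([round(v, 3) for v in r2])  # [0.889, 0.111]
```

Because the two toy factors are orthogonal and the data are noise-free, the two values add up to 1; with real data the per-factor values generally do not sum exactly to the total variance explained.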

Plotting functions

A few basic plots can be constructed with the provided plotting functions, such as plot_factors and plot_weights. They rely on, and are limited by, the plotting functionality of Seaborn.

Please check the notebooks for detailed examples. Some of the implemented plots are demonstrated below.

[Figure: mofax plots]

Contributions

If you work with MOFA+ models in Python, you might find mofax useful. Please consider contributing to this module by suggesting missing functionality in issues or by submitting pull requests.
