
A comprehensive framework for integrating multi-omics data with neural network embeddings.


BioNeuralNet: Multi-Omics Integration with Graph Neural Networks


Welcome to BioNeuralNet 1.0.8


BioNeuralNet is a Python framework for integrating and analyzing multi-omics data using Graph Neural Networks (GNNs). It provides tools for network construction, embedding generation, clustering, and disease prediction, all within a modular, scalable, and reproducible pipeline.

[Figure: BioNeuralNet workflow]


1. Installation

BioNeuralNet supports Python 3.10, 3.11 and 3.12.

1.1. Install BioNeuralNet

pip install bioneuralnet

1.2. Install PyTorch and PyTorch Geometric

BioNeuralNet relies on PyTorch for GNN computations. Install PyTorch separately:

  • PyTorch (CPU):

    pip install torch torchvision torchaudio
    
  • PyTorch Geometric:

    pip install torch_geometric
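
After installation, a quick check can confirm that the packages are importable. This is a minimal sketch using only the standard library; the helper name `check_installed` is hypothetical and is not part of BioNeuralNet.

```python
import importlib.util


def check_installed(packages):
    """Return a {package_name: bool} map of which top-level packages are importable."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


# Report the status of the three packages installed above.
status = check_installed(["bioneuralnet", "torch", "torch_geometric"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If `torch` is found, `torch.cuda.is_available()` reports whether a CUDA device is visible to PyTorch.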
    

For GPU acceleration, please refer to the official PyTorch and PyTorch Geometric installation guides.

2. BioNeuralNet Core Features

For an end-to-end example of BioNeuralNet, see the Quick Start and TCGA-BRCA Dataset guides.

Network Embedding

  • Given a multi-omics network as input, BioNeuralNet generates node embeddings with Graph Neural Networks (GNNs).
  • Supported architectures include GCN, GAT, GraphSAGE, and GIN.
  • Outputs can be obtained as native tensors or converted to pandas DataFrames for easy analysis and visualization.
  • Embeddings unlock numerous downstream applications, including disease prediction, enhanced subject representation, clustering, and more.
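
To make the idea of a GNN embedding concrete, the sketch below applies a single GCN propagation step, ReLU(D^-1/2 (A + I) D^-1/2 X W), in plain NumPy and wraps the result in a pandas DataFrame. It is an illustration only and not BioNeuralNet's implementation: the function name `gcn_layer`, the feature names, and the random weights are assumptions, and edge weights are assumed to be non-negative.

```python
import numpy as np
import pandas as pd


def gcn_layer(adjacency, features, weights):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    degree = a_hat.sum(axis=1)                      # >= 1 for non-negative weights
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)


rng = np.random.default_rng(0)
nodes = ["gene_a", "gene_b", "protein_c", "metab_d"]  # hypothetical feature names
adjacency = np.array(
    [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [1, 0, 0, 1],
     [0, 1, 1, 0]],
    dtype=float,
)
features = rng.normal(size=(4, 3))  # 3 input attributes per node
weights = rng.normal(size=(3, 2))   # untrained weights, 2 output dimensions

embeddings = pd.DataFrame(
    gcn_layer(adjacency, features, weights),
    index=nodes,
    columns=["dim_0", "dim_1"],
)
print(embeddings.shape)
```

In a trained model the weight matrix is learned rather than random, and several such layers are usually stacked.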

Graph Clustering

  • Identify functional modules or communities using correlated clustering methods (e.g., CorrelatedPageRank, CorrelatedLouvain, HybridLouvain) that integrate phenotype correlation to extract biologically relevant modules [1].
  • Clustering methods can be applied to any network representation, allowing flexible analysis across different domains.
  • All clustering components return either raw partition dictionaries or induced subnetwork adjacency matrices (as DataFrames) for visualization.
  • Use cases include feature selection, biomarker discovery, and network-based analysis.
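
One way to picture how phenotype correlation can steer a graph algorithm is a personalized PageRank whose teleport distribution favors phenotype-correlated features. The sketch below is a generic power-iteration version on synthetic data, not the library's CorrelatedPageRank; the function name and all data are assumptions, and every node is assumed to have at least one edge.

```python
import numpy as np


def personalized_pagerank(adjacency, teleport, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank with a custom teleport distribution."""
    # Column-normalize so each column sums to 1 (assumes no isolated nodes).
    transition = adjacency / adjacency.sum(axis=0)
    teleport = teleport / teleport.sum()
    rank = np.full(adjacency.shape[0], 1.0 / adjacency.shape[0])
    for _ in range(max_iter):
        new_rank = alpha * (transition @ rank) + (1 - alpha) * teleport
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank


rng = np.random.default_rng(0)
n_samples, n_features = 50, 6
omics = rng.normal(size=(n_samples, n_features))
phenotype = omics[:, 0] + 0.1 * rng.normal(size=n_samples)  # feature 0 drives the phenotype

# Network: absolute feature-feature correlation without self-loops.
adjacency = np.abs(np.corrcoef(omics, rowvar=False))
np.fill_diagonal(adjacency, 0.0)

# Teleport weights: absolute correlation of each feature with the phenotype.
relevance = np.abs(
    [np.corrcoef(omics[:, j], phenotype)[0, 1] for j in range(n_features)]
)

scores = personalized_pagerank(adjacency, relevance)
print(scores.round(3))
```

Nodes with high scores are candidate seeds for a phenotype-relevant module; thresholding or sweeping over the ranked nodes would then produce the actual cluster.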

Downstream Tasks

Subject Representation

  • Integrate node embeddings back into omics data to enrich subject-level profiles by weighting features with the learned embedding.
  • This embedding-enriched data can be used for downstream tasks such as disease prediction or biomarker discovery.
  • The result can be returned as a DataFrame or a PyTorch tensor, fitting naturally into downstream analyses.
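
The weighting step can be sketched with pandas alone. The example below collapses each feature's embedding to a single scalar (the mean across dimensions, chosen here purely for illustration) and scales the matching omics column by it. The function name `enrich_omics` and the toy data are assumptions, and the library may reduce embeddings differently.

```python
import pandas as pd


def enrich_omics(omics, node_embeddings):
    """Scale each omics column by a scalar weight derived from its node embedding."""
    # Collapse each node's embedding to one scalar (mean across dimensions).
    weights = node_embeddings.mean(axis=1)
    # Align on feature names so columns and embedding rows match up.
    weights = weights.reindex(omics.columns)
    if weights.isna().any():
        raise ValueError("Every omics feature needs an embedding row.")
    return omics.mul(weights, axis=1)


# Rows are subjects, columns are features (toy values).
omics = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]], index=["s1", "s2"], columns=["f1", "f2"]
)
# Rows are features, columns are embedding dimensions.
node_embeddings = pd.DataFrame([[0.5, 1.5], [2.0, 4.0]], index=["f1", "f2"])

enriched = enrich_omics(omics, node_embeddings)
print(enriched)
```

If a tensor is needed, `torch.as_tensor(enriched.to_numpy())` converts the result for PyTorch.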

Disease Prediction for Multi-Omics Network (DPMON) [2]

  • End-to-end classification pipeline for disease prediction using Graph Neural Network embeddings.
  • DPMON supports hyperparameter tuning; when enabled, it searches for the best configuration for the given data.
  • This approach, along with native pandas integration across modules, ensures that BioNeuralNet can be easily incorporated into your analysis workflows.

Metrics

  • Visualize embeddings, feature variance, clustering comparison, and network structure in 2D.
  • Evaluate embedding quality and clustering relevance using correlation with phenotype.
  • Performance benchmarking tools for classification tasks using various models.
  • Useful for assessing feature importance, validating network structure, and comparing cluster outputs.
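
As an illustration of scoring cluster relevance by correlation with phenotype, the sketch below summarizes a module by its first principal component and reports the absolute Pearson correlation with the phenotype. The function name and the synthetic data are assumptions; this is not the library's metrics API.

```python
import numpy as np


def module_phenotype_correlation(module_features, phenotype):
    """Absolute Pearson correlation between a module's first PC and the phenotype."""
    centered = module_features - module_features.mean(axis=0)
    # SVD of the centered data: the first PC scores are U[:, 0] * S[0].
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pc1 = u[:, 0] * s[0]
    return abs(np.corrcoef(pc1, phenotype)[0, 1])


rng = np.random.default_rng(0)
signal = rng.normal(size=100)
phenotype = signal + 0.1 * rng.normal(size=100)

# A module whose three features all track the signal, and one that is pure noise.
related_module = np.column_stack(
    [signal + 0.1 * rng.normal(size=100) for _ in range(3)]
)
unrelated_module = rng.normal(size=(100, 3))

related_score = module_phenotype_correlation(related_module, phenotype)
unrelated_score = module_phenotype_correlation(unrelated_module, phenotype)
print(f"related: {related_score:.3f}, unrelated: {unrelated_score:.3f}")
```

The absolute value is taken because the sign of a principal component is arbitrary.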

Utilities

  • Build graphs using k-NN similarity, Pearson/Spearman correlation, RBF kernels, mutual information, or soft-thresholding.
  • Filter and preprocess omics or clinical data by variance, correlation, random forest importance, or ANOVA F-test.
  • Tools for network pruning, feature selection, and data cleaning.
  • Quickly summarize datasets with variance, zero-fraction, expression level, or correlation overviews.
  • Includes conversion tools for RData and integrated logging.
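
Two of the operations listed above, variance filtering and correlation-based graph construction, can be sketched in a few lines of pandas. The function names `variance_filter` and `correlation_graph`, the threshold, and the synthetic data are assumptions for illustration; they are not the signatures of BioNeuralNet's utilities.

```python
import numpy as np
import pandas as pd


def variance_filter(omics, top_k):
    """Keep the top_k highest-variance columns."""
    keep = omics.var().nlargest(top_k).index
    return omics[keep]


def correlation_graph(omics, threshold=0.5, method="pearson"):
    """Weighted adjacency matrix from absolute pairwise feature correlation."""
    corr = omics.corr(method=method).abs().to_numpy(copy=True)
    corr[corr < threshold] = 0.0  # drop weak edges
    np.fill_diagonal(corr, 0.0)   # no self-loops
    return pd.DataFrame(corr, index=omics.columns, columns=omics.columns)


rng = np.random.default_rng(0)
base = rng.normal(size=100)
omics = pd.DataFrame({
    "a": base,
    "b": base + 0.1 * rng.normal(size=100),  # strongly correlated with "a"
    "c": rng.normal(size=100),               # independent feature
    "d": 0.01 * rng.normal(size=100),        # near-constant feature
})

filtered = variance_filter(omics, top_k=3)  # drops the near-constant "d"
adjacency = correlation_graph(filtered, threshold=0.5)
print(adjacency.round(2))
```

Passing `method="spearman"` to `DataFrame.corr` switches to rank correlation, which is less sensitive to outliers.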

External Tools

  • Graph Construction:
    • BioNeuralNet provides additional tools in the bioneuralnet.external_tools module.
    • Includes support for SmCCNet (Sparse Multiple Canonical Correlation Network), an R-based tool for constructing phenotype-informed correlation networks [3].
    • These tools are optional but enhance BioNeuralNet’s graph construction capabilities and are recommended for more integrative or exploratory workflows.

3. Example: SmCCNet + DPMON for Disease Prediction

import pandas as pd
from bioneuralnet.external_tools import SmCCNet
from bioneuralnet.downstream_task import DPMON
from bioneuralnet.datasets import DatasetLoader

# Step 1: Load your data or use one of the provided datasets
example = DatasetLoader("example1")
omics_proteins = example.data["X1"]
omics_metabolites = example.data["X2"]
phenotype_data = example.data["Y"]
clinical_data = example.data["clinical_data"]

# Step 2: Network Construction
smccnet = SmCCNet(
    phenotype_df=phenotype_data,
    omics_dfs=[omics_proteins, omics_metabolites],
    data_types=["protein", "metabolite"],
    kfold=5,
    summarization="PCA",
)
global_network, clusters = smccnet.run()
print("Adjacency matrix generated.")

# Step 3: Disease Prediction (DPMON)
dpmon = DPMON(
    adjacency_matrix=global_network,
    omics_list=[omics_proteins, omics_metabolites],
    phenotype_data=phenotype_data,
    clinical_data=clinical_data,
    model="GCN",
)
predictions = dpmon.run()
print("Disease phenotype predictions:\n", predictions)

4. Documentation and Tutorials

  • Full documentation: BioNeuralNet Documentation

  • Jupyter Notebook Examples:

  • Tutorials include:

    • Multi-omics graph construction.
    • GNN embeddings for disease prediction.
    • Subject representation with integrated embeddings.
    • Clustering using Hybrid Louvain and Correlated PageRank.
  • API details are available in the API Reference.

5. Frequently Asked Questions (FAQ)

  • Does BioNeuralNet support GPU acceleration? Yes, install PyTorch with CUDA support.

  • Can I use my own omics network? Yes, you can provide a custom network as an adjacency matrix instead of using SmCCNet.

  • What clustering methods are supported? BioNeuralNet supports Correlated Louvain, Hybrid Louvain, and Correlated PageRank.
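
Before passing a custom network, it can help to sanity-check it. The sketch below assumes the network is a square, symmetric pandas DataFrame whose row and column labels are omics feature names; these requirements are reasonable assumptions for an undirected network, not a documented BioNeuralNet contract, and `validate_adjacency` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def validate_adjacency(adjacency, omics_features):
    """Check that a custom network is a usable adjacency DataFrame."""
    if adjacency.shape[0] != adjacency.shape[1]:
        raise ValueError("Adjacency matrix must be square.")
    if list(adjacency.index) != list(adjacency.columns):
        raise ValueError("Row and column labels must match and share one order.")
    values = adjacency.to_numpy()
    if not np.allclose(values, values.T):
        raise ValueError("Adjacency matrix must be symmetric for an undirected network.")
    missing = set(adjacency.index) - set(omics_features)
    if missing:
        raise ValueError(f"Nodes missing from the omics data: {sorted(missing)}")
    return adjacency


features = ["f1", "f2", "f3"]  # hypothetical omics feature names
custom_network = pd.DataFrame(
    [[0.0, 0.8, 0.0],
     [0.8, 0.0, 0.3],
     [0.0, 0.3, 0.0]],
    index=features,
    columns=features,
)
validated = validate_adjacency(custom_network, features)
print("Custom network accepted:", validated.shape)
```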

For more FAQs, please visit our FAQ page.

6. Acknowledgments

BioNeuralNet integrates multiple open-source libraries. We acknowledge key dependencies:

  • PyTorch - GNN computations and deep learning models.
  • PyTorch Geometric - Graph-based learning for multi-omics.
  • NetworkX - Graph data structures and algorithms.
  • Scikit-learn - Feature selection and evaluation utilities.
  • pandas & numpy - Core data processing tools.
  • ray[tune] - Hyperparameter tuning for GNN models.
  • matplotlib - Data visualization.
  • cptac - Dataset handling for clinical proteomics.
  • python-louvain - Community detection algorithms.
  • statsmodels - Statistical models and hypothesis testing (e.g., ANOVA, regression).

We also acknowledge R-based tools for external network construction:

  • SmCCNet - Sparse multiple canonical correlation network.

7. Testing and Continuous Integration

  • Run Tests Locally:

    pytest --cov=bioneuralnet --cov-report=html
    open htmlcov/index.html
    
  • Continuous Integration: GitHub Actions runs automated tests on every commit.

8. Contributing

We welcome contributions! To get started:

git clone https://github.com/UCD-BDLab/BioNeuralNet.git
cd BioNeuralNet
pip install -r requirements-dev.txt
pre-commit install
pytest

How to Contribute

  • Fork the repository, create a new branch, and implement your changes.
  • Add tests and documentation for any new features.
  • Submit a pull request with a clear description of your changes.

9. License

BioNeuralNet is distributed under the MIT License.

10. Contact

11. References

[1] Abdel-Hafiz, M., Najafi, M., et al. "Significant Subgraph Detection in Multi-omics Networks for Disease Pathway Identification." Frontiers in Big Data, 5 (2022). DOI: 10.3389/fdata.2022.894632

[2] Hussein, S., Ramos, V., et al. "Learning from Multi-Omics Networks to Enhance Disease Prediction: An Optimized Network Embedding and Fusion Approach." In 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Lisbon, Portugal, 2024, pp. 4371-4378. DOI: 10.1109/BIBM62325.2024.10822233

[3] Liu, W., Vu, T., Konigsberg, I. R., Pratte, K. A., Zhuang, Y., and Kechris, K. J. "Network-Based Integration of Multi-Omics Data for Biomarker Discovery and Phenotype Prediction." Bioinformatics, 39(5), btat204 (2023). DOI: 10.1093/bioinformatics/btat204
