A comprehensive framework for integrating multi-omics data with neural network embeddings.

BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool

Welcome to BioNeuralNet 1.1.0

BioNeuralNet is a Python framework for integrating and analyzing multi-omics data using Graph Neural Networks (GNNs). It provides tools for network construction, embedding generation, clustering, and disease prediction, all within a modular, scalable, and reproducible pipeline.

BioNeuralNet Workflow

Documentation

BioNeuralNet Documentation & Examples

1. Installation

BioNeuralNet supports Python 3.10, 3.11 and 3.12.

1.1. Install BioNeuralNet

pip install bioneuralnet

1.2. Install PyTorch and PyTorch Geometric

BioNeuralNet relies on PyTorch for GNN computations. Install PyTorch separately:

  • PyTorch (CPU):

    pip install torch torchvision torchaudio
    
  • PyTorch Geometric:

    pip install torch_geometric
    

For GPU acceleration, install a CUDA-enabled build of PyTorch; refer to the official PyTorch and PyTorch Geometric installation guides.
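After installing, a quick check can confirm that the interpreter and packages are in place. The sketch below is not part of BioNeuralNet; `check_environment` is a hypothetical helper that uses only the standard library, and it calls `torch.cuda.is_available()` only when PyTorch is actually installed.

```python
import importlib.util
import sys

SUPPORTED_MINORS = {(3, 10), (3, 11), (3, 12)}  # versions listed in this README


def check_environment(packages=("torch", "torch_geometric", "bioneuralnet")):
    """Report whether the interpreter and key packages look usable."""
    report = {"python_supported": sys.version_info[:2] in SUPPORTED_MINORS}
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    if report.get("torch"):
        import torch  # imported lazily so the check works without PyTorch

        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False` on a machine with a GPU, the installed PyTorch build is most likely CPU-only.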

2. BioNeuralNet Core Features

For an end-to-end example of BioNeuralNet, see the Quick Start and TCGA-BRCA Dataset guides.

Network Embedding

  • Given a multi-omics network as input, BioNeuralNet generates node embeddings using Graph Neural Networks (GNNs).
  • Available embedding methods include GCN, GAT, GraphSAGE, and GIN.
  • Outputs can be obtained as native tensors or converted to pandas DataFrames for easy analysis and visualization.
  • Embeddings unlock numerous downstream applications, including disease prediction, enhanced subject representation, clustering, and more.
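To make the idea of a GNN embedding concrete, here is a minimal sketch of a single GCN propagation step, ReLU(D^-1/2 (A + I) D^-1/2 X W), written in plain Python so it runs without PyTorch. It is an illustration of the general technique, not BioNeuralNet's implementation; the toy network, the fixed weights, and the `gcn_layer` name are assumptions, and edge weights are assumed non-negative.

```python
import math


def gcn_layer(adjacency, features, weights):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own signal
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Symmetric normalization by node degree (assumes non-negative edge weights)
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    a_norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)] for i in range(n)]

    def matmul(left, right):
        return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)] for row in left]

    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]


# Toy network: 3 omics features, feature 0 linked to features 1 and 2
adjacency = [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 2 input attributes per node
weights = [[0.5, -0.5], [0.25, 0.75]]            # fixed here; learned in practice
embeddings = gcn_layer(adjacency, features, weights)
```

Each row of `embeddings` is the new representation of one node, mixing its own attributes with those of its neighbors. In a real pipeline the weights are learned and several such layers are stacked.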

Graph Clustering

  • Identify functional modules or communities using correlated clustering methods (e.g., CorrelatedPageRank, CorrelatedLouvain, HybridLouvain) that integrate phenotype correlation to extract biologically relevant modules [1].
  • Clustering methods can be applied to any network representation, allowing flexible analysis across different domains.
  • All clustering components return either raw partition dictionaries or induced subnetwork adjacency matrices (as DataFrames) for visualization.
  • Use cases include feature selection, biomarker discovery, and network-based analysis.
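The relationship between a partition dictionary and induced subnetworks can be sketched as follows. This is a generic illustration rather than the library's code: the nested-dictionary adjacency format, the node names, and the `induced_subnetworks` helper are assumptions made for the example.

```python
from collections import defaultdict


def induced_subnetworks(adjacency, partition):
    """Split a weighted network into one sub-adjacency per cluster.

    adjacency: {node: {neighbor: weight}}; partition: {node: cluster_id}.
    """
    members = defaultdict(list)
    for node, cluster_id in partition.items():
        members[cluster_id].append(node)

    subnetworks = {}
    for cluster_id, nodes in members.items():
        keep = set(nodes)
        # Keep only edges whose two endpoints are in the same cluster
        subnetworks[cluster_id] = {
            node: {nbr: w for nbr, w in adjacency.get(node, {}).items() if nbr in keep}
            for node in nodes
        }
    return subnetworks


adjacency = {
    "geneA": {"geneB": 0.9, "protX": 0.1},
    "geneB": {"geneA": 0.9},
    "protX": {"geneA": 0.1, "protY": 0.8},
    "protY": {"protX": 0.8},
}
partition = {"geneA": 0, "geneB": 0, "protX": 1, "protY": 1}
modules = induced_subnetworks(adjacency, partition)
```

Here the weak `geneA`/`protX` edge crosses the two clusters, so it is dropped from both induced subnetworks.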

Downstream Tasks

Subject Representation

  • Integrate node embeddings back into omics data to enrich subject-level profiles by weighting features with the learned embedding.
  • This embedding-enriched data can be used for downstream tasks such as disease prediction or biomarker discovery.
  • The result can be returned as a DataFrame or a PyTorch tensor, fitting naturally into downstream analyses.
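One simple way to weight features with learned embeddings is sketched below. It assumes each feature's embedding vector is collapsed to a single scalar by taking its mean; that reduction, the data, and the `enrich_omics` helper are illustrative choices and may differ from what BioNeuralNet does internally.

```python
def enrich_omics(omics_rows, feature_names, embeddings):
    """Scale each omics feature by a scalar weight derived from its embedding."""
    # Assumption: collapse each embedding vector to its mean; other reductions are possible
    weights = {name: sum(vec) / len(vec) for name, vec in embeddings.items()}
    return [
        [value * weights[name] for value, name in zip(row, feature_names)]
        for row in omics_rows
    ]


feature_names = ["geneA", "geneB"]
omics_rows = [[2.0, 4.0], [1.0, 3.0]]  # subjects x features
embeddings = {"geneA": [0.5, 1.5], "geneB": [0.0, 0.5]}
enriched = enrich_omics(omics_rows, feature_names, embeddings)
```

The enriched table keeps the original shape (subjects by features), so it can replace the raw omics data in any downstream model.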

Disease Prediction for Multi-Omics Network (DPMON) [2]

  • End-to-end classification pipeline for disease prediction using Graph Neural Network embeddings.
  • DPMON supports hyperparameter tuning; when enabled, it searches for the best configuration for the given data.
  • This approach, along with native pandas integration across modules, ensures that BioNeuralNet can be easily incorporated into your analysis workflows.
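The idea behind hyperparameter tuning can be illustrated with a plain exhaustive grid search. This is a stand-in technique chosen for brevity, not DPMON's tuning procedure (the project lists ray[tune] as its tuning dependency); the search space, the parameter names, and the toy scoring function are all hypothetical.

```python
from itertools import product


def grid_search(search_space, evaluate):
    """Return the configuration with the highest score from an exhaustive grid."""
    names = list(search_space)
    best_config, best_score = None, float("-inf")
    for values in product(*(search_space[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical search space and stand-in objective, for illustration only
search_space = {"hidden_dim": [16, 32, 64], "learning_rate": [0.1, 0.01, 0.001]}


def toy_validation_score(config):
    # Placeholder for "train the model, return validation accuracy"
    return -abs(config["hidden_dim"] - 32) - abs(config["learning_rate"] - 0.01)


best_config, best_score = grid_search(search_space, toy_validation_score)
```

In practice the `evaluate` callable would train a model with the given configuration and return a validation metric, which is why dedicated tuning libraries schedule and prune trials instead of trying every combination.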

Metrics

  • Visualize embeddings, feature variance, clustering comparison, and network structure in 2D.
  • Evaluate embedding quality and clustering relevance using correlation with phenotype.
  • Benchmark classification performance across various models.
  • Useful for assessing feature importance, validating network structure, and comparing cluster outputs.

Utilities

  • Build graphs using k-NN similarity, Pearson/Spearman correlation, RBF kernels, mutual information, or soft-thresholding.
  • Filter and preprocess omics or clinical data by variance, correlation, random forest importance, or ANOVA F-test.
  • Tools for network pruning, feature selection, and data cleaning.
  • Quickly summarize datasets with variance, zero-fraction, expression level, or correlation overviews.
  • Includes conversion tools for RData and integrated logging.

External Tools

  • Graph Construction:
    • BioNeuralNet provides additional tools in the bioneuralnet.external_tools module.
    • Includes support for SmCCNet (Sparse Multiple Canonical Correlation Network), an R-based tool for constructing phenotype-informed correlation networks [3].
    • These tools are optional but enhance BioNeuralNet’s graph construction capabilities and are recommended for more integrative or exploratory workflows.

3. Example: SmCCNet + DPMON for Disease Prediction

import pandas as pd
from bioneuralnet.external_tools import SmCCNet
from bioneuralnet.downstream_task import DPMON
from bioneuralnet.datasets import DatasetLoader

# Step 1: Load your data or use one of the provided datasets
Example = DatasetLoader("example1")
omics_proteins = Example.data["X1"]
omics_metabolites = Example.data["X2"]
phenotype_data = Example.data["Y"]
clinical_data = Example.data["clinical_data"]

# Step 2: Network Construction
smccnet = SmCCNet(
    phenotype_df=phenotype_data,
    omics_dfs=[omics_proteins, omics_metabolites],
    data_types=["protein", "metabolite"],
    kfold=5,
    summarization="PCA",
)
global_network, clusters = smccnet.run()
print("Adjacency matrix generated.")

# Step 3: Disease Prediction (DPMON)
dpmon = DPMON(
    adjacency_matrix=global_network,
    omics_list=[omics_proteins, omics_metabolites],
    phenotype_data=phenotype_data,
    clinical_data=clinical_data,
    model="GCN",
)
predictions = dpmon.run()
print("Disease phenotype predictions:\n", predictions)

4. Documentation and Tutorials

  • Full documentation: BioNeuralNet Documentation

  • Jupyter Notebook Examples:

  • Tutorials include:

    • Multi-omics graph construction.
    • GNN embeddings for disease prediction.
    • Subject representation with integrated embeddings.
    • Clustering using Hybrid Louvain and Correlated PageRank.
  • API details are available in the API Reference.

5. Frequently Asked Questions (FAQ)

  • Does BioNeuralNet support GPU acceleration? Yes. Install PyTorch with CUDA support.

  • Can I use my own omics network? Yes, you can provide a custom network as an adjacency matrix instead of using SmCCNet.

  • What clustering methods are supported? BioNeuralNet supports Correlated Louvain, Hybrid Louvain, and Correlated PageRank.

For more FAQs, please visit our FAQ page.
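When supplying a custom adjacency matrix, a few sanity checks can catch common problems early. The sketch below assumes an undirected network, so the matrix should be square and symmetric with unique labels; these requirements and the `validate_adjacency` helper are assumptions for illustration, not checks documented by BioNeuralNet.

```python
def validate_adjacency(matrix, labels, tolerance=1e-9):
    """Return a list of problems found in a square, labelled adjacency matrix."""
    problems = []
    n = len(labels)
    if len(set(labels)) != n:
        problems.append("labels are not unique")
    if len(matrix) != n or any(len(row) != n for row in matrix):
        problems.append("matrix is not square or does not match the labels")
        return problems  # further checks need a well-formed matrix
    for i in range(n):
        for j in range(i + 1, n):
            if abs(matrix[i][j] - matrix[j][i]) > tolerance:
                problems.append(f"asymmetric weight between {labels[i]} and {labels[j]}")
    return problems


labels = ["geneA", "geneB", "protX"]
good = [[0.0, 0.9, 0.1], [0.9, 0.0, 0.0], [0.1, 0.0, 0.0]]
bad = [[0.0, 0.9, 0.1], [0.2, 0.0, 0.0], [0.1, 0.0, 0.0]]
good_problems = validate_adjacency(good, labels)
bad_problems = validate_adjacency(bad, labels)
```

An empty list means the matrix passed every check; otherwise each entry names one problem to fix before passing the network on.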

6. Acknowledgments

BioNeuralNet integrates multiple open-source libraries. We acknowledge key dependencies:

  • PyTorch - GNN computations and deep learning models.
  • PyTorch Geometric - Graph-based learning for multi-omics.
  • NetworkX - Graph data structures and algorithms.
  • Scikit-learn - Feature selection and evaluation utilities.
  • pandas & numpy - Core data processing tools.
  • ray[tune] - Hyperparameter tuning for GNN models.
  • matplotlib - Data visualization.
  • cptac - Dataset handling for clinical proteomics.
  • python-louvain - Community detection algorithms.
  • statsmodels - Statistical models and hypothesis testing (e.g., ANOVA, regression).

We also acknowledge R-based tools for external network construction:

  • SmCCNet - Sparse multiple canonical correlation network.

7. Testing and Continuous Integration

  • Run Tests Locally:

    pytest --cov=bioneuralnet --cov-report=html
    open htmlcov/index.html
    
  • Continuous Integration: GitHub Actions runs automated tests on every commit.

8. Contributing

We welcome contributions! To get started:

git clone https://github.com/UCD-BDLab/BioNeuralNet.git
cd BioNeuralNet
pip install -r requirements-dev.txt
pre-commit install
pytest

How to Contribute

  • Fork the repository, create a new branch, and implement your changes.
  • Add tests and documentation for any new features.
  • Submit a pull request with a clear description of your changes.

9. License

BioNeuralNet is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0). See the LICENSE file for details.

10. Contact

11. References

[1] Abdel-Hafiz, M., Najafi, M., et al. "Significant Subgraph Detection in Multi-omics Networks for Disease Pathway Identification." Frontiers in Big Data, 5 (2022). DOI: 10.3389/fdata.2022.894632

[2] Hussein, S., Ramos, V., et al. "Learning from Multi-Omics Networks to Enhance Disease Prediction: An Optimized Network Embedding and Fusion Approach." In 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Lisbon, Portugal, 2024, pp. 4371-4378. DOI: 10.1109/BIBM62325.2024.10822233

[3] Liu, W., Vu, T., Konigsberg, I. R., Pratte, K. A., Zhuang, Y., and Kechris, K. J. "Network-Based Integration of Multi-Omics Data for Biomarker Discovery and Phenotype Prediction." Bioinformatics, 39(5), btat204 (2023). DOI: 10.1093/bioinformatics/btat204
