A comprehensive framework for integrating multi-omics data with neural network embeddings.

BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool

Welcome to BioNeuralNet 1.0.9

BioNeuralNet is a Python framework for integrating and analyzing multi-omics data using Graph Neural Networks (GNNs). It provides tools for network construction, embedding generation, clustering, and disease prediction, all within a modular, scalable, and reproducible pipeline.

Documentation

BioNeuralNet Documentation & Examples

1. Installation

BioNeuralNet supports Python 3.10, 3.11 and 3.12.

1.1. Install BioNeuralNet

pip install bioneuralnet

1.2. Install PyTorch and PyTorch Geometric

BioNeuralNet relies on PyTorch for GNN computations. Install PyTorch separately:

  • PyTorch (CPU):

    pip install torch torchvision torchaudio
    
  • PyTorch Geometric:

    pip install torch_geometric
    

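For GPU acceleration, install the CUDA-enabled builds of PyTorch and PyTorch Geometric as described in their official installation guides.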
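After installing, a quick environment check can catch an unsupported interpreter or a missing package early. The sketch below uses only the standard library; the helper names `python_supported` and `find_missing` are illustrative and not part of BioNeuralNet, and the import names are assumed from the pip commands above.

```python
import importlib.util
import sys

# Python versions this README lists as supported.
SUPPORTED_PYTHONS = {(3, 10), (3, 11), (3, 12)}


def python_supported(version=None):
    """Return True if the (major, minor) version is one the README lists."""
    major_minor = tuple(version or sys.version_info)[:2]
    return major_minor in SUPPORTED_PYTHONS


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python supported:", python_supported())
    # Import names assumed from the pip commands above.
    print("Missing:", find_missing(["bioneuralnet", "torch", "torch_geometric"]))
```

An empty "Missing" list suggests the three packages are importable; it does not confirm that a GPU build of PyTorch is active.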

2. BioNeuralNet Core Features

For an end-to-end example of BioNeuralNet, see the Quick Start and TCGA-BRCA Dataset guides.

Network Embedding

  • Given a multi-omics network as input, BioNeuralNet generates node embeddings with Graph Neural Networks (GNNs).
  • Available architectures include GCN, GAT, GraphSAGE, and GIN.
  • Outputs can be obtained as native tensors or converted to pandas DataFrames for easy analysis and visualization.
  • Embeddings unlock numerous downstream applications, including disease prediction, enhanced subject representation, clustering, and more.
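To make the idea of a GNN embedding concrete, here is a minimal sketch of a single GCN-style propagation step, ReLU(D^-1/2 (A + I) D^-1/2 H W), written in plain Python so it runs without PyTorch. It is not BioNeuralNet's implementation: the function names are illustrative, the weights are fixed rather than learned, and edge weights are assumed to be non-negative.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(A_norm @ features @ weights)."""
    propagated = matmul(matmul(normalize_adjacency(adj), features), weights)
    return [[max(0.0, value) for value in row] for row in propagated]


# Toy network: three omics features connected in a path 0 - 1 - 2.
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
layer_weights = [[0.5, -0.5], [0.25, 0.75]]  # fixed here; learned in practice

embeddings = gcn_layer(adjacency, node_features, layer_weights)
```

Each row of `embeddings` is the new representation of one node, mixing its own features with those of its neighbours. Stacking several such layers and training the weights is what a GCN does in practice.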

Graph Clustering

  • Identify functional modules or communities using correlated clustering methods (e.g., CorrelatedPageRank, CorrelatedLouvain, HybridLouvain) that integrate phenotype correlation to extract biologically relevant modules [1].
  • Clustering methods can be applied to any network representation, allowing flexible analysis across different domains.
  • All clustering components return either raw partition dictionaries or induced subnetwork adjacency matrices (as DataFrames) for visualization.
  • Use cases include feature selection, biomarker discovery, and network-based analysis.

Downstream Tasks

Subject Representation

  • Integrate node embeddings back into omics data to enrich subject-level profiles by weighting features with the learned embedding.
  • This embedding-enriched data can be used for downstream tasks such as disease prediction or biomarker discovery.
  • The result can be returned as a DataFrame or a PyTorch tensor, fitting naturally into downstream analyses.
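The weighting step described above can be sketched as follows. This is an assumption-laden illustration rather than BioNeuralNet's code: each node embedding is collapsed to a single weight by averaging its dimensions (other reductions are possible), and the helper name `enrich_omics` is hypothetical.

```python
from statistics import mean


def enrich_omics(omics_rows, feature_names, embeddings):
    """Scale each omics feature by a scalar weight derived from its node embedding.

    omics_rows: one list of feature values per subject.
    embeddings: feature name -> embedding vector.
    """
    # Assumption: collapse each embedding to one weight by averaging its dimensions.
    weights = [mean(embeddings[name]) for name in feature_names]
    return [[value * w for value, w in zip(row, weights)] for row in omics_rows]


feature_names = ["protein_a", "protein_b"]
omics_rows = [[2.0, 4.0], [1.0, 3.0]]  # two subjects
embeddings = {"protein_a": [0.5, 1.5], "protein_b": [0.25, 0.25]}

enriched = enrich_omics(omics_rows, feature_names, embeddings)
# enriched == [[2.0, 1.0], [1.0, 0.75]]
```

The enriched table keeps the original shape (subjects by features), which is why it can be passed to the same downstream models as the raw omics data.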

Disease Prediction for Multi-Omics Network (DPMON) [2]

  • End-to-end classification pipeline for disease prediction using Graph Neural Network embeddings.
  • DPMON supports hyperparameter tuning; when enabled, it searches for the best configuration for the given data.
  • This approach, along with native pandas integration across modules, ensures that BioNeuralNet can be easily incorporated into your analysis workflows.
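Hyperparameter tuning amounts to evaluating candidate configurations and keeping the best one. The sketch below shows a plain exhaustive grid search to illustrate that idea. It is not how DPMON tunes models (the acknowledgments list ray[tune] for that), and the parameter names `hidden_dim` and `lr` and the scoring function are hypothetical.

```python
from itertools import product


def grid_search(search_space, evaluate):
    """Try every combination in search_space and return the best (config, score)."""
    names = list(search_space)
    best_config, best_score = None, float("-inf")
    for values in product(*(search_space[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_validation_score(config):
    """Hypothetical stand-in for training a model and scoring it on held-out data."""
    return -abs(config["hidden_dim"] - 64) / 64 - abs(config["lr"] - 0.01)


search_space = {"hidden_dim": [16, 64, 128], "lr": [0.1, 0.01, 0.001]}
best_config, best_score = grid_search(search_space, toy_validation_score)
```

In a real run, `evaluate` would train the GNN with the given configuration and return a validation metric, which makes each call expensive; that cost is the reason dedicated tuning libraries schedule and prune trials instead of searching exhaustively.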

Metrics

  • Visualize embeddings, feature variance, clustering comparison, and network structure in 2D.
  • Evaluate embedding quality and clustering relevance using correlation with phenotype.
  • Performance benchmarking tools for classification tasks using various models.
  • Useful for assessing feature importance, validating network structure, and comparing cluster outputs.

Utilities

  • Build graphs using k-NN similarity, Pearson/Spearman correlation, RBF kernels, mutual information, or soft-thresholding.
  • Filter and preprocess omics or clinical data by variance, correlation, random forest importance, or ANOVA F-test.
  • Tools for network pruning, feature selection, and data cleaning.
  • Quickly summarize datasets with variance, zero-fraction, expression level, or correlation overviews.
  • Includes conversion tools for RData and integrated logging.
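Two of the utilities listed above, variance filtering and RBF-kernel graph construction, can be sketched with the standard library alone. The function names, the `gamma` and `threshold` parameters, and the use of population variance are assumptions made for illustration; BioNeuralNet's own utilities may differ in signature and defaults.

```python
import math
from statistics import pvariance


def filter_by_variance(omics, min_variance):
    """Keep features (name -> per-subject values) whose variance meets the cutoff."""
    return {name: vals for name, vals in omics.items() if pvariance(vals) >= min_variance}


def rbf_graph(omics, gamma=1.0, threshold=0.0):
    """Weight each feature pair by exp(-gamma * squared distance); drop weak edges."""
    names = sorted(omics)
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sq_dist = sum((x - y) ** 2 for x, y in zip(omics[a], omics[b]))
            weight = math.exp(-gamma * sq_dist)
            if weight > threshold:
                edges[(a, b)] = weight
    return edges


omics = {
    "gene_a": [1.0, 2.0, 3.0],
    "gene_b": [1.1, 2.1, 2.9],
    "gene_c": [5.0, 5.0, 5.0],  # constant across subjects, so variance is zero
}
kept = filter_by_variance(omics, min_variance=0.1)
edges = rbf_graph(kept, gamma=1.0, threshold=0.5)
```

Filtering before building the graph keeps uninformative features such as `gene_c` out of the network, and the threshold controls how sparse the resulting graph is.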

External Tools

  • Graph Construction:
    • BioNeuralNet provides additional tools in the bioneuralnet.external_tools module.
    • Includes support for SmCCNet (Sparse Multiple Canonical Correlation Network), an R-based tool for constructing phenotype-informed correlation networks [3].
    • These tools are optional but enhance BioNeuralNet’s graph construction capabilities and are recommended for more integrative or exploratory workflows.

3. Example: SmCCNet + DPMON for Disease Prediction

import pandas as pd
from bioneuralnet.external_tools import SmCCNet
from bioneuralnet.downstream_task import DPMON
from bioneuralnet.datasets import DatasetLoader

# Step 1: Load your data or use one of the provided datasets
example = DatasetLoader("example1")
omics_proteins = example.data["X1"]
omics_metabolites = example.data["X2"]
phenotype_data = example.data["Y"]
clinical_data = example.data["clinical_data"]

# Step 2: Network Construction
smccnet = SmCCNet(
    phenotype_df=phenotype_data,
    omics_dfs=[omics_proteins, omics_metabolites],
    data_types=["protein", "metabolite"],
    kfold=5,
    summarization="PCA",
)
global_network, clusters = smccnet.run()
print("Adjacency matrix generated.")

# Step 3: Disease Prediction (DPMON)
dpmon = DPMON(
    adjacency_matrix=global_network,
    omics_list=[omics_proteins, omics_metabolites],
    phenotype_data=phenotype_data,
    clinical_data=clinical_data,
    model="GCN",
)
predictions = dpmon.run()
print("Disease phenotype predictions:\n", predictions)

4. Documentation and Tutorials

  • Full documentation: BioNeuralNet Documentation

  • Jupyter Notebook Examples:

  • Tutorials include:

    • Multi-omics graph construction.
    • GNN embeddings for disease prediction.
    • Subject representation with integrated embeddings.
    • Clustering using Hybrid Louvain and Correlated PageRank.
  • API details are available in the API Reference.

5. Frequently Asked Questions (FAQ)

  • Does BioNeuralNet support GPU acceleration? Yes. Install a CUDA-enabled build of PyTorch.

  • Can I use my own omics network? Yes, you can provide a custom network as an adjacency matrix instead of using SmCCNet.

  • What clustering methods are supported? BioNeuralNet supports Correlated Louvain, Hybrid Louvain, and Correlated PageRank.

For more FAQs, please visit our FAQ page.
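For the custom-network case, a network stored as a weighted edge list first has to be turned into an adjacency matrix. The sketch below does this in plain Python and checks symmetry; the helper names are illustrative and not part of BioNeuralNet, and the network is assumed to be undirected.

```python
def build_adjacency(nodes, weighted_edges):
    """Return a symmetric adjacency matrix (list of rows) ordered like nodes."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for source, target, weight in weighted_edges:
        i, j = index[source], index[target]
        matrix[i][j] = weight
        matrix[j][i] = weight  # undirected network, so mirror the entry
    return matrix


def is_symmetric(matrix):
    """Check that the matrix is square and equal to its transpose."""
    n = len(matrix)
    return all(len(row) == n for row in matrix) and all(
        matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i)
    )


nodes = ["gene_a", "gene_b", "metabolite_x"]
edges = [("gene_a", "gene_b", 0.8), ("gene_b", "metabolite_x", 0.3)]
adjacency = build_adjacency(nodes, edges)
```

Wrapping the result as `pd.DataFrame(adjacency, index=nodes, columns=nodes)` gives a labelled matrix. Whether the node labels must match the omics column names exactly is best confirmed in the API Reference.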

6. Acknowledgments

BioNeuralNet integrates multiple open-source libraries. We acknowledge key dependencies:

  • PyTorch - GNN computations and deep learning models.
  • PyTorch Geometric - Graph-based learning for multi-omics.
  • NetworkX - Graph data structures and algorithms.
  • Scikit-learn - Feature selection and evaluation utilities.
  • pandas & numpy - Core data processing tools.
  • ray[tune] - Hyperparameter tuning for GNN models.
  • matplotlib - Data visualization.
  • cptac - Dataset handling for clinical proteomics.
  • python-louvain - Community detection algorithms.
  • statsmodels - Statistical models and hypothesis testing (e.g., ANOVA, regression).

We also acknowledge R-based tools for external network construction:

  • SmCCNet - Sparse multiple canonical correlation network.

7. Testing and Continuous Integration

  • Run Tests Locally:

    pytest --cov=bioneuralnet --cov-report=html
    open htmlcov/index.html
    
  • Continuous Integration: GitHub Actions runs automated tests on every commit.

8. Contributing

We welcome contributions! To get started:

git clone https://github.com/UCD-BDLab/BioNeuralNet.git
cd BioNeuralNet
pip install -r requirements-dev.txt
pre-commit install
pytest

How to Contribute

  • Fork the repository, create a new branch, and implement your changes.
  • Add tests and documentation for any new features.
  • Submit a pull request with a clear description of your changes.

9. License

BioNeuralNet is distributed under the MIT License.

10. Contact

11. References

[1] Abdel-Hafiz, M., Najafi, M., et al. "Significant Subgraph Detection in Multi-omics Networks for Disease Pathway Identification." Frontiers in Big Data, 5 (2022). DOI: 10.3389/fdata.2022.894632

[2] Hussein, S., Ramos, V., et al. "Learning from Multi-Omics Networks to Enhance Disease Prediction: An Optimized Network Embedding and Fusion Approach." In 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Lisbon, Portugal, 2024, pp. 4371-4378. DOI: 10.1109/BIBM62325.2024.10822233

[3] Liu, W., Vu, T., Konigsberg, I. R., Pratte, K. A., Zhuang, Y., & Kechris, K. J. "Network-Based Integration of Multi-Omics Data for Biomarker Discovery and Phenotype Prediction." Bioinformatics, 39(5), btat204 (2023). DOI: 10.1093/bioinformatics/btat204
