
A package to generate and interpret biologically informed neural networks.


Biologically Informed Neural Network (BINN)


BINN documentation is available here.

The BINN package creates a sparse neural network from a pathway file and an input file. The examples in the docs use the Reactome pathway database and a proteomic dataset to generate the network. The package also lets you train the network and interpret it using SHAP, and it includes plotting functions for generating Sankey plots. The article presenting the BINN can be found here.


Installation

BINN can be installed via pip

pip install binn

The package can also be cloned with git and installed from source.

git clone git@github.com:InfectionMedicineProteomics/BINN.git
pip install -e BINN/

Usage

First, a network is created. This is the network that will be used to create the sparse BINN.

from binn import BINN, Network
import pandas as pd

input_data = pd.read_csv("../data/test_qm.tsv", sep="\t")
translation = pd.read_csv("../data/translation.tsv", sep="\t")
pathways = pd.read_csv("../data/pathways.tsv", sep="\t")

network = Network(
    input_data=input_data,
    pathways=pathways,
    mapping=translation,
    verbose=True
)

The BINN can thereafter be generated using the network:

binn = BINN(
    pathways=network,
    n_layers=4,
    dropout=0.2,
    validate=False,
)

An sklearn wrapper is also available:

from binn import BINNClassifier

binn = BINNClassifier(
    pathways=network,
    n_layers=4,
    dropout=0.2,
    validate=True,
    epochs=10,
    threads=10,
)

This generates the following PyTorch sequential model:

Sequential(
  (Layer_0): Linear(in_features=446, out_features=953, bias=True)
  (BatchNorm_0): BatchNorm1d(953, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (Dropout_0): Dropout(p=0.2, inplace=False)
  (Tanh 0): Tanh()
  (Layer_1): Linear(in_features=953, out_features=455, bias=True)
  (BatchNorm_1): BatchNorm1d(455, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (Dropout_1): Dropout(p=0.2, inplace=False)
  (Tanh 1): Tanh()
  (Layer_2): Linear(in_features=455, out_features=162, bias=True)
  (BatchNorm_2): BatchNorm1d(162, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (Dropout_2): Dropout(p=0.2, inplace=False)
  (Tanh 2): Tanh()
  (Layer_3): Linear(in_features=162, out_features=28, bias=True)
  (BatchNorm_3): BatchNorm1d(28, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (Dropout_3): Dropout(p=0.2, inplace=False)
  (Tanh 3): Tanh()
  (Output layer): Linear(in_features=28, out_features=2, bias=True)
)
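For a sense of scale, the layer sizes in the printout above can be used to count how many parameters the Linear layers would hold if every layer were fully connected. This is only a rough sketch: the helper name dense_linear_params is made up for illustration, and the count is an upper bound, since the BINN is sparse and the way the sparsity is applied is not shown in this printout.

```python
# Layer sizes read off the Sequential printout above:
# 446 inputs, four hidden layers, 2 outputs.
layer_sizes = [446, 953, 455, 162, 28, 2]


def dense_linear_params(sizes):
    """Return (weights, biases) summed over consecutive fully connected layers.

    Hypothetical helper for illustration only; not part of the binn package.
    """
    weights = sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))
    biases = sum(sizes[1:])
    return weights, biases


weights, biases = dense_linear_params(layer_sizes)
print(f"dense weights: {weights:,}")  # 936,955
print(f"biases:        {biases:,}")   # 1,600
```

The BatchNorm1d layers add their own learnable scale and shift parameters on top of this, which the sketch leaves out.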

Example input

Data - this file should contain a column with the feature names (a quantification matrix or any matrix with an input column, in this case "Protein"). The feature names need to map to the input layer of the BINN, either directly or through a translation file.

Protein
P00746
P00746
P04004
P27348
P02751
...

Pathways file - this file should contain the mapping used to create the connectivity in the hidden layers.

target source
R-BTA-109581 R-BTA-109606
R-BTA-109581 R-BTA-169911
R-BTA-109581 R-BTA-5357769
R-BTA-109581 R-BTA-75153
R-BTA-109582 R-BTA-140877
...
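To illustrate how an edge list like this can define connectivity, the sketch below turns the five rows shown above into a binary mask with one row per source and one column per target. It uses only the standard library and is not the package's own code. It assumes the file is tab-separated (as in the read_csv calls under Usage) and that each row is read as an edge from source to target; the function names read_edges and connectivity_mask are hypothetical.

```python
import csv
import io

# The five example rows from the pathways file above (tab-separated).
PATHWAYS_TSV = """target\tsource
R-BTA-109581\tR-BTA-109606
R-BTA-109581\tR-BTA-169911
R-BTA-109581\tR-BTA-5357769
R-BTA-109581\tR-BTA-75153
R-BTA-109582\tR-BTA-140877
"""


def read_edges(text):
    """Parse the TSV text into a list of (source, target) pairs."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [(row["source"], row["target"]) for row in reader]


def connectivity_mask(edges):
    """Build a 0/1 matrix: rows are sources, columns are targets.

    Assumption: an entry is 1 only where the pathways file lists that
    source/target pair, and 0 everywhere else.
    """
    sources = sorted({s for s, _ in edges})
    targets = sorted({t for _, t in edges})
    edge_set = set(edges)
    mask = [[1 if (s, t) in edge_set else 0 for t in targets] for s in sources]
    return sources, targets, mask


edges = read_edges(PATHWAYS_TSV)
sources, targets, mask = connectivity_mask(edges)
for name, row in zip(sources, mask):
    print(name, row)
```

In this toy excerpt only 5 of the 10 possible source/target pairs are connected, which is the kind of sparsity a pathway file imposes on a layer.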

Translation file - this file is optional, but is useful if some translation is needed to map the input features to the pathways in the hidden layers. In this case, it is used to map proteins (UniProt IDs) to pathways (Reactome IDs).

input translation
A0A075B6P5 R-HSA-166663
A0A075B6P5 R-HSA-173623
A0A075B6P5 R-HSA-198933
A0A075B6P5 R-HSA-202733
A0A075B6P5 R-HSA-2029481
...
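Before building a network, it can help to check which input features the translation file actually covers. The sketch below is a minimal standard-library illustration, not part of the binn package: it groups the translation rows shown above by input ID and then splits the example "Protein" values into mapped and unmapped. The helper names build_translation and split_by_coverage are hypothetical, and the file is assumed to be tab-separated.

```python
import csv
import io
from collections import defaultdict

# The five example rows from the translation file above (tab-separated).
TRANSLATION_TSV = """input\ttranslation
A0A075B6P5\tR-HSA-166663
A0A075B6P5\tR-HSA-173623
A0A075B6P5\tR-HSA-198933
A0A075B6P5\tR-HSA-202733
A0A075B6P5\tR-HSA-2029481
"""

# The "Protein" values from the example data file above (one is repeated).
DATA_PROTEINS = ["P00746", "P00746", "P04004", "P27348", "P02751"]


def build_translation(text):
    """Group translation rows into {input ID: [pathway IDs]}."""
    mapping = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        mapping[row["input"]].append(row["translation"])
    return dict(mapping)


def split_by_coverage(features, mapping):
    """Split unique features into those with and without a translation."""
    unique = list(dict.fromkeys(features))  # drop duplicates, keep order
    mapped = [f for f in unique if f in mapping]
    unmapped = [f for f in unique if f not in mapping]
    return mapped, unmapped


translation = build_translation(TRANSLATION_TSV)
mapped, unmapped = split_by_coverage(DATA_PROTEINS, translation)
print("pathways for A0A075B6P5:", translation["A0A075B6P5"])
print("mapped:", mapped)
print("unmapped:", unmapped)
```

Both excerpts are truncated, so none of the proteins listed in the data excerpt appear among the translation rows shown, and the sketch reports all four unique proteins as unmapped. Running the same check on complete files would show which features would have no entry point into the network.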

Plotting

Plotting a subgraph starting from a node generates a Sankey plot of that pathway:

[Figure: Pathway sankey]

A complete Sankey plot may look like this:

[Figure: Complete sankey]

Testing

The software has been tested on desktop machines running Windows 10 and Linux (Ubuntu). Small networks are not RAM-intensive, and all experiments have run comfortably with 16 GB of RAM.

Cite

If you use this package, please cite:

Hartman, E., Scott, A.M., Karlsson, C. et al. Interpreting biologically informed neural networks for enhanced proteomic biomarker discovery and pathway analysis. Nat Commun 14, 5359 (2023). https://doi.org/10.1038/s41467-023-41146-4

Contributors

Erik Hartman, infection medicine proteomics, Lund University

Aaron Scott, infection medicine proteomics, Lund University

Contact

Erik Hartman - erik.hartman@hotmail.com
