Multilayer networks for biological multimodal data fusion and analysis.

BioFusion

A tool for multimodal biological data integration and analysis with the help of multilayer networks.

This repository contains code developed during a collaboration between Fujitsu Research of Europe and the Barcelona Supercomputing Center.

Installation

You can install the package from PyPI:

pip install biofusion

For developers, to install the latest version of the package, run the following command from the package root directory:

pip install -e .

End-to-end example

1. Set up the project

1.1. Install `uv` package manager

Follow the official uv installation instructions. For Linux/macOS, the command is:

curl -LsSf https://astral.sh/uv/install.sh | sh

1.2. Create project dir and corresponding Python environment

mkdir biofusion-demo
cd biofusion-demo
uv venv --python=3.12.9

The last command creates a .venv folder containing a local Python environment. Activate it:

source .venv/bin/activate

Install the biofusion package:

uv pip install biofusion
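
To confirm that the installation succeeded, one option is to query the installed distribution metadata from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and not part of BioFusion:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if biofusion is installed in the active environment,
# otherwise prints None.
print(installed_version("biofusion"))
```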

2. Create the data files

2.1. Create the `data` folder

mkdir data

2.2. Populate the data folder

In the root of the project, create a notebook (e.g. 01_demo.ipynb). Open the notebook in your favorite IDE (e.g. VS Code) and select the Jupyter kernel from the environment created above. You are now ready to generate synthetic data for testing the community detection algorithms. Enter and run the following cells in the notebook:

from BioFusion.utils import generate_and_save_graphs
# each layer/graph is described by a tuple of parameters:
# the first element is the number of unique nodes, the second is the probability of
# an edge between two random nodes, and the third is the label string
graph_params = [(300, 0.2, ""), (500, 0.2, ""), (400, 0.2, ""), (300, 0.4, "")]
# all generated graphs are stored in the directory below as `1.csv`, ..., `<N>.csv`,
# where <N> is the number of tuples in the list `graph_params`
path_dir_to = "./data/"
generate_and_save_graphs(graph_params, path_dir_to)
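
The parameters above (a node count and an edge probability per layer) describe a random graph in the Erdős–Rényi style. How `generate_and_save_graphs` builds and serialises each layer is not shown here, so the following is only a stand-alone sketch of the idea. The function names and the two-column CSV layout are assumptions made for illustration, not the BioFusion file format:

```python
import csv
import itertools
import random


def random_edge_list(n_nodes, p_edge, seed=None):
    """Return the edges of a G(n, p) random graph as (u, v) pairs with u < v."""
    rng = random.Random(seed)
    return [
        (u, v)
        for u, v in itertools.combinations(range(n_nodes), 2)
        if rng.random() < p_edge  # keep each possible edge with probability p_edge
    ]


def save_edge_list(edges, path):
    """Write edges to a CSV file with a hypothetical source/target header."""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["source", "target"])
        writer.writerows(edges)


edges = random_edge_list(300, 0.2, seed=0)
print(len(edges))  # number of sampled edges, roughly 0.2 * 300 * 299 / 2
```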

2.3. Create the output folder

Create a folder to store the results of the analysis:

mkdir out

After running the commands in this section, the project contains the following files:

biofusion-demo$ tree
.
├── 01_demo.ipynb
├── data
│   ├── 1.csv
│   ├── 2.csv
│   ├── 3.csv
│   └── 4.csv
└── out

3. Run community detection

Import required dependencies:

import os
from BioFusion.cmmd import cmmd

Define the layers of the multilayer network:

prefix = "./data/"
input_layers = [prefix + x for x in os.listdir(prefix) if x.endswith(".csv")]
# sort the input layers; os.listdir returns file names in arbitrary order
input_layers.sort()
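
Note that a plain `sort()` orders strings lexicographically, so with ten or more layers `10.csv` would come before `2.csv`. If the numeric order of the layers matters, a natural-sort key is one way to handle it. This is a sketch; `natural_key` is a hypothetical helper, not part of BioFusion:

```python
import re


def natural_key(path):
    """Split a path into text and integer chunks so that 2.csv sorts before 10.csv."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", path)]


layers = ["./data/10.csv", "./data/2.csv", "./data/1.csv"]
print(sorted(layers))                   # lexicographic: 1, 10, 2
print(sorted(layers, key=natural_key))  # numeric: 1, 2, 10
```

With the four layers generated in this example, both orderings coincide.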

Define parameters of the community detection algorithm:

gamma_min = 0
gamma_max = 10
gamma_step = 0.5
path_to_communities = "./out/"
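
The three `gamma_*` values presumably define a sweep over evenly spaced gamma values. How `cmmd` enumerates them internally (for example, whether `gamma_max` itself is included) is not stated here, so the sketch below only illustrates the grid these numbers would imply under an explicit assumption. The function `gamma_grid` and its `include_max` flag are hypothetical:

```python
import math


def gamma_grid(gamma_min, gamma_max, gamma_step, include_max=True):
    """List evenly spaced gamma values from gamma_min up to gamma_max."""
    if gamma_step <= 0:
        raise ValueError("gamma_step must be positive")
    # Count whole steps with a small tolerance so float rounding
    # does not drop the last value.
    n_steps = math.floor((gamma_max - gamma_min) / gamma_step + 1e-9)
    values = [gamma_min + i * gamma_step for i in range(n_steps + 1)]
    if not include_max and values and math.isclose(values[-1], gamma_max):
        values.pop()
    return values


grid = gamma_grid(0, 10, 0.5)
print(len(grid), grid[0], grid[-1])
```

Under the inclusive assumption, the settings above give 21 values; excluding the upper bound gives 20.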

Run the community detection algorithm:

cmmd_output = cmmd(
    nodelist = None,
    input_layers = input_layers,
    gamma_min = gamma_min,
    gamma_max = gamma_max,
    gamma_step = gamma_step,
    path_to_communities = path_to_communities,
    distmethod = "hamming")

The output of the algorithm is stored in the ./out folder.
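
The `distmethod = "hamming"` argument suggests that nodes are compared by the Hamming distance between their community assignments. As an illustration of that distance only (not of the BioFusion implementation), the sketch below counts the positions at which two label sequences differ. The gene names and label values are made up for the example:

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical community labels of three nodes at four gamma values.
trajectories = {
    "geneA": [0, 0, 1, 2],
    "geneB": [0, 0, 1, 3],
    "geneC": [1, 2, 4, 5],
}

distances = {
    (u, v): hamming(trajectories[u], trajectories[v])
    for u, v in combinations(sorted(trajectories), 2)
}
print(distances)
```

Nodes that stay in the same community across most gamma values end up with a small distance, as `geneA` and `geneB` do here.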

The whole script:

import os
from BioFusion.utils import generate_and_save_graphs
from BioFusion.cmmd import cmmd

graph_params = [(300, 0.2, ""), (500, 0.2, ""), (400, 0.2, ""), (300, 0.4, "")]

path_dir_to = "./data/"
generate_and_save_graphs(graph_params, path_dir_to)
prefix = "./data/"

input_layers = [prefix + x for x in os.listdir(prefix) if x.endswith(".csv")]
input_layers.sort()

gamma_min = 0
gamma_max = 10
gamma_step = 0.5

path_to_communities = "./out/"

cmmd_output = cmmd(
    nodelist = None,
    input_layers = input_layers,
    gamma_min = gamma_min,
    gamma_max = gamma_max,
    gamma_step = gamma_step,
    path_to_communities = path_to_communities,
    distmethod = "hamming")
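
The script above assumes that the ./data/ and ./out/ folders already exist. When running it outside the step-by-step walkthrough, the folders can be created from Python first. This is a minimal sketch with a hypothetical helper `ensure_dirs`, demonstrated in a temporary directory so that nothing is written to the project:

```python
import tempfile
from pathlib import Path


def ensure_dirs(root, names=("data", "out")):
    """Create the named sub-directories under root if they are missing."""
    created = []
    for name in names:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created


with tempfile.TemporaryDirectory() as tmp:
    created = ensure_dirs(tmp)
    created_names = sorted(p.name for p in created)
    all_exist = all(p.is_dir() for p in created)

print(created_names, all_exist)
```

In the project itself, calling `ensure_dirs(".")` before the rest of the script would have the same effect as the two `mkdir` commands.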

Organisation

The directory structure is as follows:

.
|-- data
|   |-- GeneCelltypes
|   |   |-- gene_celltypes_all_common.txt
|   |   |-- gene_celltypes_all_common_cnv.txt
|   |   |-- gene_celltypes_all_common_rna.txt
|   |   |-- gene_celltypes_all_unique.txt
|   |   |-- gene_celltypes_all_unique_cnv.txt
|   |   `-- gene_celltypes_all_unique_rna.txt
|   |-- MultilayerCommunities
|   |   |-- <BSC-community-trajectories.tsv>
|   |   `-- <BSC-distance-matrix.tsv>
|   |-- MultilayerGraphs
|   |   |-- <BSC-MLN-layer-1.json>
|   |   |-- :
|   |   `-- <BSC-MLN-layer-5.json>
|   |-- TCGA_BRCA_Dic_Hover_files
|   |   `-- TCGA-E2-A1B6-01A-03-TSC.f0917d61-c963-42cf-86c7-48b1e70c662d.pt
|   |-- TopGenesWSI
|   |   |-- common_genes
|   |   |   |-- box_level
|   |   |   |   `-- TCGA-E2-A1B6-01A-03-TSC.f0917d61-c963-42cf-86c7-48b1e70c662d
|   |   |   |       `-- stats.csv
|   |   |   `-- wsi_level
|   |   `-- unique_genes
|   |       |-- box_level
|   |       `-- wsi_level
|   |-- cnv.csv
|   `-- rna.csv
|-- outputs
|   |-- TCGA_BRCA_spatial
|   |-- TCGA_Gene_Graphs
|   `-- TopGenesMLN
|-- scripts
|   |-- create_gene_graph.py
|   |-- create_gene_list.py
|   |-- get_WSI_celltype_weights.py
|   `-- get_WSI_gene_info.py
|-- README.md
`-- requirements.txt

Usage

The Python scripts can be run from the /scripts directory after installing all necessary Python modules as listed in requirements.txt.

The following scripts are provided:

create_gene_list.py
- Description: This script finds the set of genes that are common to the MLN and the genomic data (CNV or RNA). Files in /data/GeneCelltypes with the suffix "_cnv" or "_rna" are generated by this script.
- Input: /data/GeneCelltypes, /data/cnv.csv
- Output: /data/GeneCelltypes

get_WSI_gene_info.py
- Description: This script/module reads the top genes from WSI patches and retrieves gene associations and significant neighbourhood communities from the multilayer network.
- Input: /data/TopGenesWSI
- Output: /outputs/TopGenesMLN

get_WSI_celltype_weights.py
- Description: This script takes WSI graphs (where patches correspond to groups of nodes), gene-celltype associations, and bulk RNA data, and produces heatmaps of approximated spatial gene expression.
- Input: /data/TCGA_BRCA_Dic_Hover_files, /data/GeneCelltypes, /data/rna.csv
- Output: /outputs/TCGA_BRCA_spatial

create_gene_graph.py
- Description: This script takes the genomic data (CNV or RNA) and the MLN graphs (along with the computed Louvain community-based Hamming distance matrix) and generates a hierarchical-clustering-based similarity matrix for the genes, plus a gene graph whose edge attributes reflect the gene-gene similarities.
- Input: /data/cnv.csv, /data/MultilayerGraphs, /data/MultilayerCommunities
- Output: /outputs/TCGA_Gene_Graphs
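
The description of create_gene_list.py amounts to intersecting two gene sets. The script itself and its file formats are not reproduced here, so the following is only a sketch of that idea with made-up gene lists. The function name `common_genes` is an assumption for illustration:

```python
def common_genes(network_genes, data_genes):
    """Return the genes present in both collections, sorted for reproducibility."""
    return sorted(set(network_genes) & set(data_genes))


mln_genes = ["TP53", "BRCA1", "EGFR", "MYC"]  # hypothetical nodes of the network
cnv_genes = ["BRCA1", "MYC", "PTEN"]          # hypothetical genes in the CNV table

print(common_genes(mln_genes, cnv_genes))  # ['BRCA1', 'MYC']
```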

Release history

0.0.6 (this version): Mar 25, 2025
0.0.5: Mar 25, 2025
0.0.4: Mar 25, 2025
0.0.3: Mar 17, 2025
0.0.2: Jan 31, 2025
0.0.1: Jan 31, 2025

Download files

Source Distribution

biofusion-0.0.6.tar.gz (113.2 kB)

Uploaded Mar 25, 2025 Source

Built Distribution

biofusion-0.0.6-py3-none-any.whl (113.0 kB)

Uploaded Mar 25, 2025 Python 3

File details

Details for the file biofusion-0.0.6.tar.gz.

File metadata

Download URL: biofusion-0.0.6.tar.gz
Upload date: Mar 25, 2025
Size: 113.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for biofusion-0.0.6.tar.gz
Algorithm	Hash digest
SHA256	`abae6e55901b0ff8be624a3cf982f178d185705fe0771c1709ae1d581c495b1c`
MD5	`1edbbacda2d6dc49120021c7ffe5dff7`
BLAKE2b-256	`62746a368b99ded0b779b7745f088407d9c49a0f53c14c2b98f151b0da19685c`
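
A downloaded archive can be checked against the SHA256 digest listed above with Python's `hashlib`. This is a minimal sketch; the helper names are illustrative, and the file path passed to `verify_file` is whatever location the archive was saved to:

```python
import hashlib
import io

# SHA256 digest of biofusion-0.0.6.tar.gz as listed above.
EXPECTED_SHA256 = "abae6e55901b0ff8be624a3cf982f178d185705fe0771c1709ae1d581c495b1c"


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_hex):
    """Return True if the file's SHA256 digest matches expected_hex."""
    with open(path, "rb") as fh:
        return sha256_of_stream(fh) == expected_hex.lower()


# Small in-memory demonstration of the hashing helper.
print(sha256_of_stream(io.BytesIO(b"abc")))
```

For the source distribution, the check would be `verify_file("biofusion-0.0.6.tar.gz", EXPECTED_SHA256)`.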

File details

Details for the file biofusion-0.0.6-py3-none-any.whl.

File metadata

Download URL: biofusion-0.0.6-py3-none-any.whl
Upload date: Mar 25, 2025
Size: 113.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for biofusion-0.0.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4b9141d35b04626e6a95e4337ee2ea030c2674ae082115ac8a937f2585c00f54`
MD5	`172f850ace219884687e8e068b667035`
BLAKE2b-256	`316382e052383373f5b1f6d5c3b9e65bb044a9e961125e2f4ef4d16fd6c87a90`
