CFI: Cell Functionality and Interaction Analysis Tool
Author: Jakub Kubiś
Polish Academy of Sciences
Description
CFI (Cell Functionality and Interaction Analysis Tool) is an analytical platform designed for the investigation of cellular functions, intracellular gene interactions, and intercellular communication, including ligand–receptor interactions and adhesion junctions.
The tool integrates the advanced enrichment and interaction analysis capabilities of GEDSpy with the single-cell data processing functionalities provided by JDtI.
CFI extends these capabilities by enabling the identification of direct cell–cell interactions, dominant biological processes, and gene–gene interaction networks within individual cells, along with comprehensive visualization of these relationships.
CFI is designed to streamline the interpretation of biological data, enabling researchers to perform in-depth analyses of cellular functions and interactions for drug target discovery.
Included databases:
- Gene Ontology (GO-TERM)
- KEGG (Kyoto Encyclopedia of Genes and Genomes)
- Reactome
- HPA (Human Protein Atlas)
- NCBI
- STRING
- IntAct
- CellTalkDB
- CellPhoneDB
If you use CFI, please remember to cite both CFI and the original sources of the data you utilized in your work.
For interaction network analyses, it is recommended to use the JVectorGraph library, which makes it easy to adjust and customise graph and network visualisations from the Python side.
📚 Table of Contents
- Installation
- Documentation
- Example usage
Installation
Run in the command line:
pip install cfi-toolkit
Documentation
Documentation for classes and functions is available here 👉 Documentation 📄
Example usage
1. Cell functionality
1.1. Create project
# ------------------------------------------------------------
# Import required standard and project-specific libraries
# ------------------------------------------------------------
import os
from jdti import COMPsc # JDtI module for handling single-cell projects
from cfi import CellFunCon # Cell functional connectivity / enrichment analysis
# ------------------------------------------------------------
# Load single-cell sequencing data using the JDtI framework
# ------------------------------------------------------------
# Define the project directory containing input data
# and specify the sample identifiers to be loaded
jseq_object = COMPsc.project_dir(
os.path.join(os.getcwd(), "data"), # path to data directory
["s1"] # list of project/sample IDs
)
# Load sparse expression matrices from the project structure
# normalized_data=True ensures that pre-normalized counts are used
jseq_object.load_sparse_from_projects(normalized_data=True)
# ------------------------------------------------------------
# Initialize CellFunCon analysis object
# ------------------------------------------------------------
# Create a CellFunCon instance using the loaded single-cell dataset.
# This object enables downstream functional enrichment,
# interaction analysis, and pathway inference.
instance = CellFunCon(jseq_object)
1.2. Calculate cell marker genes
# ------------------------------------------------------------
# Required step before functional enrichment analysis
# ------------------------------------------------------------
# Identify cell-type–specific marker genes based on:
# - minimum expression threshold (min_exp)
# - minimum fraction of expressing cells (min_pct)
# - parallel computation across multiple processes (n_proc)
#
# The resulting marker set is used as input for downstream
# functional enrichment and pathway analysis.
instance.calculate_cells_markers(
min_exp=0,
min_pct=0.05,
n_proc=10
)
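The role of the two thresholds can be illustrated with a small, self-contained sketch. It does not call CFI; `expressing_fraction` and `filter_genes` are hypothetical helpers, the expression values are made up, and the assumption that a cell counts as "expressing" when its value is strictly above `min_exp` is ours, not taken from the CFI source.

```python
# Hypothetical helpers for illustration only; not part of the CFI API.
def expressing_fraction(values, min_exp=0):
    """Fraction of cells whose expression is strictly above min_exp (assumed rule)."""
    if not values:
        return 0.0
    return sum(1 for v in values if v > min_exp) / len(values)


def filter_genes(expression, min_exp=0, min_pct=0.05):
    """Keep genes that are expressed in at least min_pct of the cells."""
    return [
        gene
        for gene, values in expression.items()
        if expressing_fraction(values, min_exp) >= min_pct
    ]


# Made-up expression values for 20 cells of one population
expression = {
    "GENE_A": [0] * 19 + [2.1],        # expressed in 1 of 20 cells
    "GENE_B": [0] * 20,                # never expressed
    "GENE_C": [1.5] * 10 + [0] * 10,   # expressed in 10 of 20 cells
}

kept = filter_genes(expression, min_exp=0, min_pct=0.05)
print(kept)
```

Raising `min_pct` removes genes that are detected in only a handful of cells, which usually shortens the marker list passed on to enrichment.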
1.3. Marker gene enrichment analysis
# ------------------------------------------------------------
# Perform functional enrichment analysis for each cell population
# ------------------------------------------------------------
# This step evaluates overrepresentation of biological functions,
# pathways, and gene sets using previously identified marker genes.
#
# Parameters:
# - p_value : significance threshold for enrichment results
# - log_fc : minimum log fold-change required for marker inclusion
# - top_max : maximum number of top-ranked markers used per cell type
#
# The output provides functionally annotated cell states
# for downstream biological interpretation.
instance.enrich_cells_fucntionality(
p_value=0.05,
log_fc=0.25,
top_max=500
)
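How the three parameters might combine can be sketched without CFI. The function `select_markers` and the record fields `gene`, `p_val`, and `log_fc` are hypothetical names chosen for this illustration, and ranking by p-value before applying `top_max` is an assumption; the library may rank differently.

```python
# Hypothetical helper for illustration only; not part of the CFI API.
def select_markers(markers, p_value=0.05, log_fc=0.25, top_max=500):
    """Filter markers by significance and effect size, then keep the top ones."""
    passing = [
        m for m in markers
        if m["p_val"] <= p_value and m["log_fc"] >= log_fc
    ]
    # Assumed ranking: most significant markers first
    passing.sort(key=lambda m: m["p_val"])
    return [m["gene"] for m in passing[:top_max]]


# Made-up marker statistics for one cell population
markers = [
    {"gene": "G1", "p_val": 0.001, "log_fc": 1.2},
    {"gene": "G2", "p_val": 0.20, "log_fc": 2.0},    # fails the p-value filter
    {"gene": "G3", "p_val": 0.01, "log_fc": 0.10},   # fails the log_fc filter
    {"gene": "G4", "p_val": 0.0001, "log_fc": 0.8},
]

selected = select_markers(markers, p_value=0.05, log_fc=0.25, top_max=500)
print(selected)
```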
- GO-TERM
data = instance.get_enrichment_data(
data_type = 'GO-TERM',
p_value = 0.05,
test = 'FISH',
adj = 'BH',
parent_inc = False,
top_n = 50)
| parent | parent_genes | parent_pval_FISH | parent_pval_BIN | parent_n | parent_pct | child | child_genes | child_pval_FISH | child_pval_BIN | child_n | child_pct | parent_name | child_name | cell | source |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GO:0060090 | ['LETMD1', 'TNRC6A'] | 0.01397 | 0.01402 | 2 | 0.04348 | GO:0030674 | ['TNRC6A', 'SRRT', 'CRADD', 'SLMAP'] | 3.22e-05 | 3.30e-05 | 4 | 0.08696 | MF : molecular adaptor activity | MF : protein-macromolecule adaptor activity | STRIATUM_1 # s1 | GO-TERM |
| GO:0032991 | ['CLU','SRRT','HSP90AA1','SNX4','CHEK1','PPP3CB'] | 9.38e-05 | 9.49e-05 | 6 | 0.13043 | GO:0005955 | ['PPP3CB'] | 0.00459 | 0.00458 | 1 | 0.02174 | CC : protein-containing complex | CC : calcineurin complex | STRIATUM_1 # s1 | GO-TERM |
| GO:0032991 | ['CLU','SRRT','HSP90AA1','SNX4','CHEK1','PPP3CB'] | 9.38e-05 | 9.49e-05 | 6 | 0.13043 | GO:0031428 | ['FBL'] | 0.00535 | 0.00535 | 1 | 0.02174 | CC : protein-containing complex | CC : box C/D methylation guide snoRNP complex | STRIATUM_1 # s1 | GO-TERM |
| GO:0032991 | ['CLU','SRRT','HSP90AA1','SNX4','CHEK1','PPP3CB'] | 9.38e-05 | 9.49e-05 | 6 | 0.13043 | GO:0005664 | ['ORC6'] | 0.00611 | 0.00611 | 1 | 0.02174 | CC : protein-containing complex | CC : nuclear origin of replication recognition complex | STRIATUM_1 # s1 | GO-TERM |
| GO:0032991 | ['CLU','SRRT','HSP90AA1','SNX4','CHEK1','PPP3CB'] | 9.38e-05 | 9.49e-05 | 6 | 0.13043 | GO:0008287 | ['PPP3CB'] | 0.00611 | 0.00611 | 1 | 0.02174 | CC : protein-containing complex | CC : protein serine/threonine phosphatase complex | STRIATUM_1 # s1 | GO-TERM |
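For readers unfamiliar with the statistics behind these columns, the sketch below shows one common way such values are obtained. It assumes that `test = 'FISH'` refers to a one-sided Fisher's exact test and `adj = 'BH'` to the Benjamini-Hochberg correction; that reading is ours and is not confirmed by the text above. The function names and the background sizes are hypothetical, and only the standard library is used.

```python
from math import comb


def fisher_overrepresentation(k, n, K, N):
    """One-sided Fisher's exact p-value, P(X >= k), via the hypergeometric tail.

    k: marker genes annotated with the term
    n: marker genes submitted for the cell population
    K: background genes annotated with the term
    N: all background genes
    """
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up example: 46 marker genes, background of 20000 genes,
# three terms with (hits among markers, annotated genes in background)
terms = {"term_a": (4, 150), "term_b": (1, 40), "term_c": (6, 900)}
raw = [fisher_overrepresentation(k, 46, K, 20000) for k, K in terms.values()]
adj = benjamini_hochberg(raw)
for name, p, q in zip(terms, raw, adj):
    print(f"{name}: p={p:.3g}, adjusted={q:.3g}")
```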
Visualization
from cfi import encrichment_cell_heatmap
fig = encrichment_cell_heatmap(data = data,
fig_size = (3,3),
sets = None,
top_n = 3,
test = 'FISH',
adj = 'BH',
parent_inc = False,
font_size = 16,
clustering = 'ward',
scale = True)
- KEGG
data = instance.get_enrichment_data(
data_type = 'KEGG',
test = 'FISH',
adj = 'BH',
parent_inc = False,
top_n = 50)
| 2nd | 2nd_genes | 2nd_pval | 2nd_n | 2nd_pct | 3rd | 3rd_genes | 3rd_pval | 3rd_n | 3rd_pct | cell | source |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Neurodegenerative disease | ['ATXN10','VDAC2','PSMA3','APC','SETX','CREBBP','PSMD8','PSMD7','NDUFA4'] | 1.10e-05 | 9 | 0.093 | Spinocerebellar ataxia | ['ATXN10','VDAC2','PSMA3','PSMD8','PSMD7'] | 7.51e-06 | 5 | 0.052 | STRIATUM_2 # s1 | KEGG |
| Neurodegenerative disease | ['ATXN10','VDAC2','PSMA3','APC','SETX','CREBBP','PSMD8','PSMD7','NDUFA4'] | 1.10e-05 | 9 | 0.093 | Huntington disease | ['VDAC2','PSMA3','CREBBP','PSMD8','PSMD7','NDUFA4'] | 2.08e-05 | 6 | 0.062 | STRIATUM_2 # s1 | KEGG |
| Neurodegenerative disease | ['ATXN10','VDAC2','PSMA3','APC','SETX','CREBBP','PSMD8','PSMD7','NDUFA4'] | 1.10e-05 | 9 | 0.093 | Alzheimer disease | ['VDAC2','PSMA3','APC','PSMD8','PSMD7','NDUFA4'] | 8.20e-05 | 6 | 0.062 | STRIATUM_2 # s1 | KEGG |
| Neurodegenerative disease | ['ATXN10','VDAC2','PSMA3','APC','SETX','CREBBP','PSMD8','PSMD7','NDUFA4'] | 1.10e-05 | 9 | 0.093 | Prion disease | ['VDAC2','PSMA3','PSMD8','PSMD7','NDUFA4'] | 1.23e-04 | 5 | 0.052 | STRIATUM_2 # s1 | KEGG |
| Neurodegenerative disease | ['ATXN10','VDAC2','PSMA3','APC','SETX','CREBBP','PSMD8','PSMD7','NDUFA4'] | 1.10e-05 | 9 | 0.093 | Parkinson disease | ['VDAC2','PSMA3','PSMD8','PSMD7','NDUFA4'] | 1.77e-04 | 5 | 0.052 | STRIATUM_2 # s1 | KEGG |
Visualization
from cfi import encrichment_cell_heatmap
fig = encrichment_cell_heatmap(data = data,
fig_size = (3,3),
sets = None,
top_n = 3,
test = 'FISH',
adj = 'BH',
parent_inc = False,
font_size = 16,
clustering = 'ward',
scale = True)
[!NOTE] In this case, no significant KEGG terms were detected for the second cell type.
To show cells that have no significant terms, pass instance.get_included_cells() as the sets argument so that both cells are displayed at the top of the heatmap.
Visualization
from cfi import encrichment_cell_heatmap
fig2 = encrichment_cell_heatmap(data = data,
fig_size = (3,3),
sets = instance.get_included_cells(),
top_n = 3,
test = 'FISH',
adj = 'BH',
parent_inc = False,
font_size = 16,
clustering = 'ward',
scale = True)
- REACTOME
data = instance.get_enrichment_data(
data_type = 'REACTOME',
p_value = 0.05,
test = 'FISH',
adj = 'BH',
parent_inc = False,
top_n = 50)
| pathway | genes | p-value | n | pct | top level pathway | top genes | cell | source |
|---|---|---|---|---|---|---|---|---|
| Resistance of ERBB2 KD mutants to tesevatinib | [HSP90AA1] | 0.00306 | 1 | 0.0217 | Disease | [RNGTT,HSP90AA1,HMGB1] | STRIATUM_1 # s1 | REACTOME |
| Resistance of ERBB2 KD mutants to osimertinib | [HSP90AA1] | 0.00306 | 1 | 0.0217 | Disease | [RNGTT,HSP90AA1,HMGB1] | STRIATUM_1 # s1 | REACTOME |
| Resistance of ERBB2 KD mutants to AEE788 | [HSP90AA1] | 0.00306 | 1 | 0.0217 | Disease | [RNGTT,HSP90AA1,HMGB1] | STRIATUM_1 # s1 | REACTOME |
| Apoptosis induced DNA fragmentation | [HMGB1] | 0.00991 | 1 | 0.0217 | Programmed Cell Death | [HSP90AA1,HMGB1] | STRIATUM_1 # s1 | REACTOME |
| Regulation of CDH11 mRNA translation by miRNAs | [TNRC6A] | 0.00839 | 1 | 0.0217 | Cell-Cell communication | [TNRC6A] | STRIATUM_1 # s1 | REACTOME |
Visualization
from cfi import encrichment_cell_heatmap
fig = encrichment_cell_heatmap(data = data,
fig_size = (3,3),
sets = instance.get_included_cells(),
top_n = 3,
test = 'FISH',
adj = 'BH',
parent_inc = False,
font_size = 16,
clustering = 'ward',
scale = True)
- Specificity (HPA)
data = instance.get_enrichment_data(
data_type = 'specificity',
p_value = 1,
test = 'FISH',
adj = 'BH',
parent_inc = False,
top_n = 50)
| specificity | genes | p-value | n | pct | cell | source |
|---|---|---|---|---|---|---|
| Cardiomyocytes | [SLMAP] | 0.116 | 1 | 0.0217 | STRIATUM_1 # s1 | specificity |
| Proximal tubular cells | [ALDH6A1] | 0.098 | 1 | 0.0217 | STRIATUM_1 # s1 | specificity |
| granulocytes | [CLU, ALDH6A1, TIMP2] | 0.132 | 3 | 0.0652 | STRIATUM_1 # s1 | specificity |
| liver | [ALDH6A1] | 0.288 | 1 | 0.0217 | STRIATUM_1 # s1 | specificity |
| kidney | [ALDH6A1] | 0.146 | 1 | 0.0217 | STRIATUM_1 # s1 | specificity |
| Schwann cells | [GRIK3] | 0.030 | 1 | 0.0217 | STRIATUM_1 # s1 | specificity |
Visualization
from cfi import encrichment_cell_heatmap
fig = encrichment_cell_heatmap(data = data,
fig_size = (3,3),
sets = instance.get_included_cells(),
top_n = 3,
test = 'FISH',
adj = 'BH',
parent_inc = False,
font_size = 16,
clustering = 'ward',
scale = True)
1.4. Intracellular gene interactions
View available cell names
# ------------------------------------------------------------
# List all cell populations included in the dataset
# ------------------------------------------------------------
# Useful for verifying dataset structure before selecting
# specific cell populations for further investigation.
instance.get_included_cells()
Retrieve interaction data for the selected cell
cell_int = instance.get_gene_interactions('STRIATUM_1 # s1')
| A | B | interaction_type | connection_type | source |
|---|---|---|---|---|
| CLU | CLU | physical association | protein -> protein | Alzheimers |
| CLU | HSP90AA1 | physical association | protein -> protein | Alzheimers |
| FBL | KRR1 | | gene -> gene | STRING |
| FBL | KRR1 | | protein -> protein | STRING |
| KRR1 | FBL | | gene -> gene | STRING |
| KRR1 | FBL | | protein -> protein | STRING |
Visualization
from cfi import gene_interaction_network
fig5 = gene_interaction_network(idata = cell_int, min_con = 2)
from JVG import JVG
nt = JVG.NxEditor(fig5)
nt.edit()
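The effect of a connection threshold on an edge list like the one above can be sketched in plain Python. This is only an illustration: it assumes that `min_con` means the minimum number of distinct partners a gene needs in order to be drawn, which is our reading and not confirmed by the text above. `filter_by_min_connections` is a hypothetical helper, and the edges are the A/B pairs from the example table.

```python
from collections import defaultdict


# Hypothetical helper for illustration only; not part of the CFI API.
def filter_by_min_connections(edges, min_con=2):
    """Keep undirected edges whose two genes each have >= min_con distinct partners."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # self-interactions do not count as partners here
        neighbours[a].add(b)
        neighbours[b].add(a)
    keep = {gene for gene, partners in neighbours.items() if len(partners) >= min_con}
    # Collapse A->B and B->A into one undirected edge
    return sorted(
        {tuple(sorted((a, b))) for a, b in edges if a != b and a in keep and b in keep}
    )


# A/B pairs taken from the example table above
edges = [("CLU", "CLU"), ("CLU", "HSP90AA1"), ("FBL", "KRR1"), ("KRR1", "FBL")]

print(filter_by_min_connections(edges, min_con=1))
print(filter_by_min_connections(edges, min_con=2))
```

With only these few example rows no gene has two distinct partners, so the stricter threshold leaves an empty edge list; a full interaction table would normally retain the densely connected genes.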
1.5. Cell-cell interactions
Calculate cell interactions
# ------------------------------------------------------------
# Infer functional and molecular connections between cell populations
# ------------------------------------------------------------
# This step identifies potential interactions based on
# inferred molecular communication signals
#
# The resulting network describes relationships between cells
# and can be used for downstream visualization, pathway analysis,
# or biological interpretation of tissue organization.
instance.calculate_cell_connections()
Get data
cell_con = instance.get_cell_connections()
| interaction | directionality | classification | modulatory_effect | interactor1 | interactor2 | cell1 | cell2 |
|---|---|---|---|---|---|---|---|
| CDH2 -> CDH2 | Adhesion-Adhesion | Adhesion by Cadherin | | CDH2 | CDH2 | STRIATUM_1 | STRIATUM_2 |
| CDH6 -> CDH6 | Adhesion-Adhesion | Adhesion by Cadherin | | CDH6 | CDH6 | STRIATUM_1 | STRIATUM_2 |
| CDH7 -> CDH7 | Adhesion-Adhesion | Adhesion by Cadherin | | CDH7 | CDH7 | STRIATUM_1 | STRIATUM_2 |
| COL11A1 -> ITGA1+ITGB1 | Adhesion-Adhesion | Adhesion by Collagen/Integrin | | COL11A1 | ITGA1 | STRIATUM_1 | STRIATUM_2 |
| COL11A1 -> ITGA1+ITGB1 | Adhesion-Adhesion | Adhesion by Collagen/Integrin | | COL11A1 | ITGB1 | STRIATUM_1 | STRIATUM_2 |
Visualization
from cfi import draw_cell_conections
fig = draw_cell_conections(cell_con)
from JVG import JVG
nt = JVG.NxEditor(fig)
nt.edit()
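Before plotting, it can help to summarise how many distinct interactions link each pair of cells. The sketch below works on a plain list of dictionaries with the same column names as the example table; whether `cell_con` can be converted to such records directly is an assumption, and `summarise_connections` is a hypothetical helper, not a CFI function.

```python
from collections import Counter


# Hypothetical helper for illustration only; not part of the CFI API.
def summarise_connections(rows):
    """Count unique interactions per (cell1, cell2, directionality)."""
    seen = set()
    counts = Counter()
    for row in rows:
        key = (row["cell1"], row["cell2"], row["directionality"], row["interaction"])
        if key in seen:
            continue  # multi-subunit receptors appear once per subunit
        seen.add(key)
        counts[key[:3]] += 1
    return counts


# Rows copied from the example table above (only the columns used here)
rows = [
    {"interaction": "CDH2 -> CDH2", "directionality": "Adhesion-Adhesion",
     "cell1": "STRIATUM_1", "cell2": "STRIATUM_2"},
    {"interaction": "CDH6 -> CDH6", "directionality": "Adhesion-Adhesion",
     "cell1": "STRIATUM_1", "cell2": "STRIATUM_2"},
    {"interaction": "CDH7 -> CDH7", "directionality": "Adhesion-Adhesion",
     "cell1": "STRIATUM_1", "cell2": "STRIATUM_2"},
    {"interaction": "COL11A1 -> ITGA1+ITGB1", "directionality": "Adhesion-Adhesion",
     "cell1": "STRIATUM_1", "cell2": "STRIATUM_2"},
    {"interaction": "COL11A1 -> ITGA1+ITGB1", "directionality": "Adhesion-Adhesion",
     "cell1": "STRIATUM_1", "cell2": "STRIATUM_2"},
]

summary = summarise_connections(rows)
for (cell1, cell2, direction), n in summary.items():
    print(f"{cell1} -> {cell2} [{direction}]: {n} interactions")
```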
1.6. Saving & loading project
Saving current project
instance.save_project('project')
Loading previously saved project
from cfi import CellFunCon
instance = CellFunCon.load_project('project.psc')
2. Comparison of cell interaction sets
2.1. Create projects
# ------------------------------------------------------------
# Import required libraries
# ------------------------------------------------------------
import os
from jdti import COMPsc # JDtI module for single-cell project handling
from cfi import CellFunCon # Functional analysis and cell interaction inference
# ------------------------------------------------------------
# Load single-cell datasets for two experimental conditions
# ------------------------------------------------------------
# Each COMPsc object represents an independent project/sample
# loaded from the same data directory but identified by
# different sample IDs (e.g., control vs. case).
jseq_object1 = COMPsc.project_dir(
os.path.join(os.getcwd(), "data"),
["s1"] # first dataset / condition
)
jseq_object1.load_sparse_from_projects(normalized_data=True)
jseq_object2 = COMPsc.project_dir(
os.path.join(os.getcwd(), "data"),
["s2"] # second dataset / condition
)
jseq_object2.load_sparse_from_projects(normalized_data=True)
# ------------------------------------------------------------
# Initialize CellFunCon analysis objects for comparative study
# ------------------------------------------------------------
# Separate instances enable:
# - independent marker detection
# - condition-specific functional enrichment
# - downstream comparison of cell functionality and interactions
instance1 = CellFunCon(jseq_object1)
instance2 = CellFunCon(jseq_object2)
2.2. Calculate interactions
# ------------------------------------------------------------
# Calculate cell–cell connection networks for each dataset
# ------------------------------------------------------------
# This step reconstructs potential functional and molecular
# interactions between cell populations independently for:
# - dataset 1 (e.g., disease / treated condition)
# - dataset 2 (e.g., control)
instance1.calculate_cell_connections()
instance2.calculate_cell_connections()
2.3. Comparison analysis
# ------------------------------------------------------------
# Compare cell–cell interaction networks between conditions
# ------------------------------------------------------------
# This step performs a comparative analysis of inferred
# cell–cell connections across multiple biological conditions.
#
# Each entry in `instances_dict` represents a fully processed
# CellFunCon object with reconstructed interaction networks.
# Here, we contrast:
# - healthy condition
# - disease condition
from cfi import compare_connections
instances_dict = {
"healthy": instance2,
"disease": instance1
}
comparison = compare_connections(instances_dict=instances_dict,
cells_compartment = None,
connection_type = ['Adhesion-Adhesion',
'Gap-Gap',
'Ligand-Ligand',
'Ligand-Receptor',
'Receptor-Receptor',
'Undefined'])
| feature | p-value | pct_valid | pct_ctrl | FC | log(FC) | norm_diff | group |
|---|---|---|---|---|---|---|---|
| HSP90B1 | 4.75e-14 | 0.164 | 0.935 | 0.160 | -2.65 | -6.58 | healthy |
| CANX | 5.90e-14 | 0.091 | 0.848 | 0.102 | -3.30 | -5.53 | healthy |
| ARPC5 | 1.34e-11 | 0.055 | 0.739 | 0.080 | -3.64 | -4.93 | healthy |
| CALR | 5.47e-09 | 0.182 | 0.761 | 0.226 | -2.14 | -4.41 | healthy |
| APLP2 | 1.16e-08 | 0.073 | 0.609 | 0.115 | -3.12 | -3.73 | healthy |
| ITGB1 | 1.44e-08 | 0.055 | 0.587 | 0.098 | -3.36 | -3.61 | healthy |
Visualization
from cfi import volcano_plot_conections
fig = volcano_plot_conections(
deg_data = comparison,
p_adj = True,
top = 25,
top_rank = "p_value",
p_val = 0.05,
lfc = 0.25,
rescale_adj = True,
image_width = 12,
image_high = 12,
)
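The thresholds passed to the volcano plot can be read as a simple classification rule, sketched below with the rows of the comparison table. This is an illustration under stated assumptions: `classify_feature` is a hypothetical helper, the log fold change is taken as the base-2 logarithm of FC (which matches the log(FC) column up to rounding), and raw p-values are used although the call above sets `p_adj = True`.

```python
import math


# Hypothetical helper for illustration only; not part of the CFI API.
def classify_feature(p, fc, p_val=0.05, lfc=0.25):
    """Label a feature as 'up', 'down', or 'ns' (not significant)."""
    log_fc = math.log2(fc)
    if p > p_val or abs(log_fc) < lfc:
        return "ns"
    return "up" if log_fc > 0 else "down"


# (feature, p-value, FC) copied from the comparison table above
table = [
    ("HSP90B1", 4.75e-14, 0.160),
    ("CANX", 5.90e-14, 0.102),
    ("ARPC5", 1.34e-11, 0.080),
    ("CALR", 5.47e-09, 0.226),
    ("APLP2", 1.16e-08, 0.115),
    ("ITGB1", 1.44e-08, 0.098),
]

labels = {name: classify_feature(p, fc) for name, p, fc in table}
for name, label in labels.items():
    print(name, label)
```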
An example analysis pipeline is available here → Example file
Have fun JBS©