pflex is a benchmarking toolkit for evaluating CRISPR screen results against biological functional standards. The toolkit computes gene-level and module-level performance metrics, helping researchers systematically assess the biological relevance and resolution of their CRISPR screening data.
pFLEX
Abstract
Genetic networks derived from omics data are a powerful tool for systematic gene function prediction. Evaluating the performance of such predictions is crucial for judging both the data and the computational pipeline used for network construction, but unbalanced functional standards often cause hidden evaluation biases. To visualize and mitigate such biases, we previously developed the R package FLEX. Here, we present the pFLEX genetic network benchmarking tool as a Python library with new and improved functionality. pFLEX improves overall runtime 4.1- to 15.8-fold. It offers additional evaluation metrics that allow easy comparison of precision-recall performance between genetic networks at module or pathway resolution. We demonstrate the utility of pFLEX for evaluating tissue-specific co-essentiality networks and data normalization strategies of the Cancer Dependency Map, as well as cell line-specific Perturb-Seq-derived networks. These analyses illustrate the need for module-resolved precision-recall metrics for sensitive and fast evaluation of genetic networks.
Features
- Precision-recall curve generation for ranked gene lists
- Evaluation using CORUM-derived modules, GO terms, and pathways
- Module-level resolution analysis and visualization
- Easy integration into CRISPR screen workflows
- Packaged DepMap example inputs filtered to CORUM genes
Installation
pFLEX is developed and tested with Python 3.10. We recommend installing it in a dedicated Python 3.10 environment to keep the package and its scientific Python dependencies separate from other projects.
Create a conda environment and install uv:
conda create -n p310 python=3.10
conda activate p310
pip install uv
Install pFLEX via pip:
uv pip install pflex
or:
pip install pflex
or install pFLEX via git to develop the package locally:
git clone https://github.com/tyasird/pFLEX.git
cd pFLEX
uv pip install -e .
Usage
Full documentation is available at https://tyasird.github.io/pFLEX/.
Input Data
pFLEX expects each input dataset as a matrix with genes in rows and screens, samples, or cell lines in columns.
| Gene | ACH-000014 | ACH-000219 | ACH-000274 |
|---|---|---|---|
| A2M | -0.125 | -0.215 | 0.065 |
| AATF | 0.042 | -0.088 | -0.016 |
| BCL6 | -0.019 | 0.112 | -0.074 |
CSV, Excel, and Parquet files are supported. Parquet is recommended for larger matrices.
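As a minimal sketch of the layout described above, the following standard-library code writes the toy table to a CSV file and reads it back. The helper names `write_matrix` and `read_matrix` are hypothetical and not part of pFLEX, and treating the first column header as `Gene` simply follows the example table; the required column name is an assumption.

```python
import csv
import tempfile
from pathlib import Path

# Toy values copied from the example table above.
header = ["Gene", "ACH-000014", "ACH-000219", "ACH-000274"]
rows = [
    ["A2M", -0.125, -0.215, 0.065],
    ["AATF", 0.042, -0.088, -0.016],
    ["BCL6", -0.019, 0.112, -0.074],
]


def write_matrix(path, header, rows):
    """Write a genes-by-samples matrix as CSV, one gene per row (hypothetical helper)."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def read_matrix(path):
    """Read the CSV back into {gene: {sample: value}} (hypothetical helper)."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        samples = reader.fieldnames[1:]  # every column after the gene column
        return {
            record["Gene"]: {sample: float(record[sample]) for sample in samples}
            for record in reader
        }


out_path = Path(tempfile.mkdtemp()) / "toy_matrix.csv"
write_matrix(out_path, header, rows)
matrix = read_matrix(out_path)
```

The resulting file path could then be used as the `path` entry of an input dataset.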
The packaged example inputs are real DepMap 25Q2 tissue subsets filtered to genes present in CORUM:
- skin_cell_lines_corum_genes.parquet: 3,465 genes x 75 cell lines
- soft_tissue_cell_lines_corum_genes.parquet: 3,465 genes x 46 cell lines
Use flex.example_input_path() to resolve packaged example inputs:
import pflex as flex

inputs = {
    "Skin": {
        "path": flex.example_input_path("skin_cell_lines_corum_genes.parquet"),
        "sort": "high",
        "color": "#4E79A7",
    },
    "Soft Tissue": {
        "path": flex.example_input_path("soft_tissue_cell_lines_corum_genes.parquet"),
        "sort": "high",
        "color": "#F28E2B",
    },
}
Configuration
config = {
    "functional_standard": "CORUM",
    "min_genes_in_module": 2,
    "min_genes_per_module_analysis": 2,
    "output_folder": "output",
    "analysis_genes": "shared",
    "jaccard": True,
    "preprocessing": {
        "fill_na": True,
    },
    "corr_function": "numpy_without_mask",
    "per_module": {
        "n_jobs": 8,
    },
    "plotting": {
        "save_plot": True,
        "output_type": "png",
    },
}
Common choices:
- functional_standard: "CORUM", "GOBP", "PATHWAY", or a custom .csv path
- analysis_genes: "shared" or "dataset_specific"
- sort: "high" or "low" per input dataset
- preprocessing.fill_na: fill missing values with gene means
- corr_function: "numpy", "numpy_without_mask", "numba", or "pandas"
- per_module.n_jobs: worker count for per-module analysis
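To make two of these choices concrete, here is a small standard-library sketch, not pFLEX's own code. It assumes that "shared" restricts the analysis to genes present in every input dataset, and it shows gene-mean filling on toy data where missing values are represented as `None`. Both helper names are hypothetical, and genes with no observed values are left unchanged because this README does not say how pFLEX handles them.

```python
from statistics import mean


def shared_genes(datasets):
    """Genes present in every dataset (assumed meaning of analysis_genes="shared")."""
    gene_sets = [set(matrix) for matrix in datasets.values()]
    return sorted(set.intersection(*gene_sets))


def fill_with_gene_means(matrix):
    """Replace missing values (None) with the mean of that gene's observed values."""
    filled = {}
    for gene, values in matrix.items():
        observed = [v for v in values if v is not None]
        if not observed:
            # Nothing to average; leave the row as-is (illustrative choice).
            filled[gene] = list(values)
            continue
        gene_mean = mean(observed)
        filled[gene] = [gene_mean if v is None else v for v in values]
    return filled


# Toy matrices: {gene: [value per cell line]}; the numbers are made up.
skin = {"A2M": [0.5, None, 1.5], "AATF": [0.0, 0.25, 0.5]}
soft_tissue = {"AATF": [0.1, 0.2, None], "BCL6": [None, None, None]}
datasets = {"Skin": skin, "Soft Tissue": soft_tissue}

shared = shared_genes(datasets)
filled_skin = fill_with_gene_means(skin)
```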
Analysis Flow
flex.initialize(config)
data, common_genes = flex.load_datasets(inputs)
terms, _ = flex.load_functional_standard()

# Per-dataset analysis
for name, dataset in data.items():
    corr = flex.perform_corr(dataset, config["corr_function"])
    flex.pra(name, corr, is_corr=True)
    flex.pra_per_module(name, corr, is_corr=True)
    flex.module_contributions(name)
    flex.mpr_prepare(name)

# Plots and tables across all datasets
flex.plot_precision_recall_curve()
flex.plot_auc_scores()
flex.plot_significant_modules()
flex.plot_per_module_scatter(n_top=10)
flex.plot_per_module_scatter_by_size(n_top=10)
flex.plot_module_contributions()
flex.plot_mpr_summary()
flex.save_results_to_csv()
See the User Guide for a detailed explanation of every input field, configuration key, function, return value, and output.
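For intuition about what the correlation and precision-recall steps evaluate, the sketch below works through a toy case in plain Python: correlate gene profiles, rank gene pairs, and score the ranking against module co-membership. It is a simplified illustration, not the pFLEX implementation. All function names are hypothetical, the data are made up, and every pair that is not co-annotated is treated as a negative, which may differ from how pFLEX defines negatives.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (undefined for constant profiles)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranked_pairs(profiles, sort="high"):
    """Rank all gene pairs by correlation; sort="high" puts the largest values first."""
    scores = {
        (g1, g2): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }
    return sorted(scores, key=scores.get, reverse=(sort == "high"))


def precision_recall(pairs, modules):
    """Return (recall, precision) at each true positive while walking down the ranking."""
    positives = {
        pair
        for members in modules.values()
        for pair in combinations(sorted(members), 2)
    } & set(pairs)  # only count positives that can appear in the ranking
    curve, true_pos = [], 0
    for rank, pair in enumerate(pairs, start=1):
        if pair in positives:
            true_pos += 1
            curve.append((true_pos / len(positives), true_pos / rank))
    return curve


# Toy gene profiles across four cell lines and a toy two-module standard.
profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, -1.0, 1.0, -1.0],
}
modules = {"complex_1": {"A", "B"}, "complex_2": {"C", "D"}}

ranking = ranked_pairs(profiles, sort="high")
curve = precision_recall(ranking, modules)
```

In this toy case the two co-annotated pairs rank first and second, so precision stays at 1.0 until recall reaches 1.0. Real data rarely separate this cleanly, which is where per-module metrics become informative.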
Quickstart
import pflex as flex

inputs = {
    "Skin": {
        "path": flex.example_input_path("skin_cell_lines_corum_genes.parquet"),
        "sort": "high",
        "color": "#4E79A7",
    },
    "Soft Tissue": {
        "path": flex.example_input_path("soft_tissue_cell_lines_corum_genes.parquet"),
        "sort": "high",
        "color": "#F28E2B",
    },
}

config = {
    "functional_standard": "CORUM",
    "output_folder": "output",
    "analysis_genes": "shared",
    "jaccard": True,
    "preprocessing": {
        "fill_na": True,
    },
    "corr_function": "numpy_without_mask",
}

flex.initialize(config)
data, _ = flex.load_datasets(inputs)

for name, dataset in data.items():
    corr = flex.perform_corr(dataset, config["corr_function"])
    flex.pra(name, corr, is_corr=True)

flex.plot_precision_recall_curve()
flex.plot_auc_scores()
For a runnable full workflow, see src/pflex/examples/basic_usage.py.
License
MIT