A package for inference and simulation of gene regulatory networks
Project description
CARDAMOM
CARDAMOM is an executable gene regulatory network (GRN) inference method, adapted to time-course scRNA-seq datasets. The algorithm consists in calibrating the parameters of a mechanistic model of gene expression: the calibrated model can then be simulated, which allows to reproduce the dataset used for inference. The inference method has been introduced in [1]. It has been benchmarked along with other GRN inference tools and applied to a real dataset in [2]. The simulation part is based on the Harissa package.
Dependencies
The package depends on standard scientific libraries numpy and scipy. It also depends on numba which accelerates the inference method as well as the simulations from harissa. The package alive_progress is used to show the progression of simulate_data. Finally, umap-learn and matplotlib are used for the function visualize_data. They can be installed using pip:
pip install umap-learn alive_progress harissa==3.0.7
Directory structure
For each analysis, the user must create a dedicated directory myproject containing the following folders:
cardamom/can be empty before inference, will contain the results of theinfer_network.pyscript;Data/must contain two files, one namedpanel_real.txtcontaining the dataset, and one namedpanel_genes.txtcontaining the gene names;Rates/must contain a filedegradation_rates.txtthat will be used by thesimulate_data.pyscript;Results/can be empty before inference, will contain the results of thevisualize_data.pyscript.
An example is tests/Network4 which corresponds to one of the networks used in [2].
Before using CARDAMOM
Prior to analyzing an experimental scRNA-seq data with CARDAMOM, we recommend that you perform the following steps:
- Select a list of relevant genes;
- Define degradation rates for the corresponding mRNAs and proteins;
- Generate the count table.
1. Select a list of relevant genes
This is a critical and not so easy task. You should aim to obtain a list in the hundred range. CARDAMOM can perform with more but the resulting output GRN will be less and less interpretable as the number of genes grows. For the time being, we recommend starting with say 50 genes, and add more if needed (i.e. if the overall data generation is improved).
The selection step can be performed using any combination of the following: (1) genes known in the litterature to be important for the process being studied; (2) HVG from custom algorithms like scran or Seurat; (3) “dynamically important” genes; (4) differentially expressed genes using a basic t-test between two time points; (5) most delta-entropic genes between two time points.
This selected list of genes will have to be incremented by adding on the first line a “gene” called “Stimulus”. It should take a zero value for the 0h time point and 1 for all the others (see below). The list of gene names should be saved as myproject/Data/panel_genes.txt.
2. Degradation rates of mRNAs and proteins
You can find half-lives for mRNAs and proteins in human in two papers: Blumberg et al. (2021) and Li et al. (2021). Only “short-lived” protein half-lives are estimated in the Li paper. In any case, the “maximum” half-life of proteins will be set by the cell cycle duration. So make sure that no protein has a larger half-life that the cell cycle duration. A compiled table can be found here.
You will then have to generate the degradation rates from the half-lives using the following formula: degradation_rate = log(2)/half_life. The list of degradation rates should be saved as myproject/Rates/degradation_rates.txt.
3. Generate the count table
CARDAMOM will need a count table in .txt format with cells as columns and genes as rows. The first row must correspond to the time points at which the cells are sampled, and the first column to the index of each gene. Then, each line represents the mRNA counts associated to a gene for each cell at each timepoint. Note that the second line corresponds to the stimulus, which is set to 0 at t = 0h and to 1 at t > 0h (see [1] section 5.1 for details). CARDAMOM will be expecting integer values (mRNA counts).
Tutorial
1. Calibrate the model from a reference dataset
Run the following script for calibrating the model from myproject/Data/panel.txt:
python infer_network.py -i [myproject]
The output consists of the following files saved in myproject/cardamom/ where G is the number of genes including stimulus and T is the number of data time points:
basal_t.npyMatrix of size T × G × 1 containing GRN basal parameters inferred at each timepoint;inter_t.npyMatrix of size T × G × G containing GRN interaction parameters inferred at each timepoint;basal.npyMatrix of size G × 1 withbasal = basal_t[-1]containing final basal parameters;inter.npyMatrix of size G × G withinter = inter_t[-1]containing final interaction parameters;kmin.npyVector of size G × 1 containing the minimal burst frequency for each gene;kmax.npyVector of size G × 1 containing the maximal burst frequency for each gene;bet.npyVector of size G × 1 containing the inverse burst size parameter for each gene;data_bool.npyMatrix of the same size as the data, used for initializing simulations.
2. Simulate a dataset from an inferred network
Run the following script for simulating the model from the parameters stored in myproject/cardamom:
python simulate_data.py -i [myproject]
The output is saved as myproject/Data/panel_simulated.txt.
3. Compare simulations to the reference dataset
Run the following script for comparing UMAP representations of myproject/Data/panel_real.txt and myproject/Data/panel_simulated.txt:
python visualize_data.py -i [myproject]
The output images are saved in myproject/Results/.
References
[1] E. Ventre. Reverse engineering of a mechanistic model of gene expression using metastability and temporal dynamics. In Silico Biology, 2021.
[2] E. Ventre, U. Herbach et al. One model fits all: Combining inference and simulation of gene regulatory networks. PLOS Computational Biology, 2023.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file cardamom-0.0.3.tar.gz.
File metadata
- Download URL: cardamom-0.0.3.tar.gz
- Upload date:
- Size: 7.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
99205ce94649e270094b239d298f575c72a4588257882e95841de8fc1dc496af
|
|
| MD5 |
fc9efcaa0567f98b4ee1d37b00bc9929
|
|
| BLAKE2b-256 |
32f8b7dc873a2948051216e9bfbe5f2c5f976325188d5d60bf393ef91a6c748c
|
Provenance
The following attestation bundles were made for cardamom-0.0.3.tar.gz:
Publisher:
pypi.yml on eliasventre/cardamom
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
cardamom-0.0.3.tar.gz -
Subject digest:
99205ce94649e270094b239d298f575c72a4588257882e95841de8fc1dc496af - Sigstore transparency entry: 188405484
- Sigstore integration time:
-
Permalink:
eliasventre/cardamom@cf956ba670a352f3c50e60b901128cac635827cc -
Branch / Tag:
refs/tags/v0.0.3 - Owner: https://github.com/eliasventre
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
pypi.yml@cf956ba670a352f3c50e60b901128cac635827cc -
Trigger Event:
release
-
Statement type:
File details
Details for the file cardamom-0.0.3-py3-none-any.whl.
File metadata
- Download URL: cardamom-0.0.3-py3-none-any.whl
- Upload date:
- Size: 10.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9e48f186c9cff9e47eb773dd2d97a7440fb4f47fbded2aeeae059eaf083c3a22
|
|
| MD5 |
d1f8910323ba8b3b88231daedc0ade0d
|
|
| BLAKE2b-256 |
4bbd36a8dcdf6bf846db1cb05c7da26e07f32b4394de0c2c84d6a27a8689c489
|
Provenance
The following attestation bundles were made for cardamom-0.0.3-py3-none-any.whl:
Publisher:
pypi.yml on eliasventre/cardamom
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
cardamom-0.0.3-py3-none-any.whl -
Subject digest:
9e48f186c9cff9e47eb773dd2d97a7440fb4f47fbded2aeeae059eaf083c3a22 - Sigstore transparency entry: 188405493
- Sigstore integration time:
-
Permalink:
eliasventre/cardamom@cf956ba670a352f3c50e60b901128cac635827cc -
Branch / Tag:
refs/tags/v0.0.3 - Owner: https://github.com/eliasventre
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
pypi.yml@cf956ba670a352f3c50e60b901128cac635827cc -
Trigger Event:
release
-
Statement type: