Simulation and inference of gene regulatory networks based on transcriptional bursting

Harissa 🌶

This is a Python package for both simulation and inference of gene regulatory networks from single-cell data. Its name comes from ‘HARtree approximation for Inference along with a Stochastic Simulation Algorithm.’ It is implemented in the context of a mechanistic approach to gene regulatory network inference from single-cell data, based upon an underlying stochastic dynamical model driven by the transcriptional bursting phenomenon.

Main functionalities:

  1. Network inference interpreted as calibration of a dynamical model;
  2. Data simulation (typically scRNA-seq) from the same dynamical model.

Other available tools:

  • Basic GRN visualization (directed graphs with positive or negative edge weights);
  • Binarization of scRNA-seq data (using gene-specific thresholds derived from the calibrated dynamical model).
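To make the idea of a directed graph with signed edge weights concrete, here is a minimal sketch that stores a small network as a weight matrix and lists its edges. The matrix values, the gene names, and the convention that `weights[i, j]` is the effect of gene i on gene j are assumptions made for illustration; this is not Harissa's visualization API.

```python
import numpy as np

genes = ["g1", "g2", "g3"]

# Hypothetical signed weight matrix: weights[i, j] is the effect of gene i on gene j
# (this orientation is an assumption of the sketch, not taken from the package)
weights = np.array([
    [0.0, 0.0, -1.5],
    [2.0, 0.0, 0.0],
    [0.0, -0.8, 0.0],
])

# Each nonzero entry is a directed edge; its sign gives the interaction type
edges = [
    (genes[i], genes[j], "activation" if weights[i, j] > 0 else "inhibition", float(weights[i, j]))
    for i, j in zip(*np.nonzero(weights))
]

for source, target, kind, weight in edges:
    print(f"{source} -> {target}: {kind} (weight {weight:+.1f})")
```

A zero entry means no edge, so sparse networks stay easy to read from the edge list.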

The current version of Harissa has benefited from improvements introduced within Cardamom, which can be seen as an alternative method for the inference part. The two inference methods remain complementary at this stage and may be merged into the same package in the future. They were both evaluated in a recent benchmark.

Installation

Harissa can be installed using pip:

pip install harissa

This command will also check for all required dependencies (see below) and install them if necessary. If the installation is successful, all scripts in the tests folder should run smoothly (note that network4.py must be run before test_binarize.py).

Basic usage

from harissa import NetworkModel
model = NetworkModel()

# Inference
model.fit(data)

# Simulation
sim = model.simulate(time)

Here data should be a two-dimensional array of single-cell gene expression counts, where each row represents a cell and each column represents a gene, except for the first column, which contains experimental time points. A toy example is:

import numpy as np

data = np.array([
    #t g1 g2 g3
    [0, 4, 1, 0], # Cell 1
    [0, 5, 0, 1], # Cell 2
    [1, 1, 2, 4], # Cell 3
    [1, 2, 0, 8], # Cell 4
    [1, 0, 0, 3], # Cell 5
])
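In practice, time labels and counts often come as separate arrays. The following is a minimal sketch, assuming hypothetical `times` and `counts` arrays, of assembling them into the layout described above and running a few sanity checks. It uses only NumPy and does not call Harissa.

```python
import numpy as np

# Hypothetical inputs: one time label per cell and a cells-by-genes count matrix
times = np.array([0, 0, 1, 1, 1])
counts = np.array([
    [4, 1, 0],
    [5, 0, 1],
    [1, 2, 4],
    [2, 0, 8],
    [0, 0, 3],
])

# Put the time points in column 0, followed by one column per gene
data = np.column_stack([times, counts])

# Basic sanity checks on the expected layout
assert data.ndim == 2
assert data.shape == (counts.shape[0], counts.shape[1] + 1)
assert np.all(data[:, 1:] >= 0)  # counts cannot be negative

n_cells, n_genes = data.shape[0], data.shape[1] - 1
print(f"{n_cells} cells, {n_genes} genes, time points: {np.unique(data[:, 0])}")
```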

The time argument for simulations is either a single time or a list of time points. For example, a single-cell trajectory (not available from scRNA-seq) from t = 0h to t = 10h can be simulated using:

time = np.linspace(0, 10, 1000)

The sim output stores mRNA and protein levels as attributes sim.m and sim.p, respectively (each row is a time point and each column is a gene).

About the data

The inference algorithm specifically exploits time-course data, where single-cell profiling is performed at a number of time points after a stimulus (see this paper for an example with real data). Each group of cells collected at the same experimental time t_k forms a snapshot of the biological heterogeneity at time t_k. Due to the destructive nature of the measurement process, successive snapshots are made of different cells. Such data is therefore different from so-called ‘pseudotime’ trajectories, which attempt to reorder cells according to some smoothness hypotheses.
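The snapshot structure can be made explicit by grouping cells by their time label. This is a minimal sketch on the toy array from the basic usage section, using only NumPy; the `snapshots` dictionary is a hypothetical helper for illustration, not something Harissa requires.

```python
import numpy as np

# Toy dataset: column 0 holds time points, remaining columns hold gene counts
data = np.array([
    [0, 4, 1, 0],
    [0, 5, 0, 1],
    [1, 1, 2, 4],
    [1, 2, 0, 8],
    [1, 0, 0, 3],
])

# One snapshot per experimental time point, each made of different cells
time_points = np.unique(data[:, 0])
snapshots = {t.item(): data[data[:, 0] == t, 1:] for t in time_points}

for t, cells in snapshots.items():
    print(f"t = {t}: {cells.shape[0]} cells, mean counts per gene = {cells.mean(axis=0)}")
```

Rows within a snapshot carry no ordering information, and no cell appears in more than one snapshot.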

Tutorial

Please see the notebooks for introductory examples, or the tests folder for basic usage scripts. To get an idea of the main features, you can start by running the notebooks in order:

  • Notebook 1: simulate a basic repressilator network with 3 genes;
  • Notebook 2: perform network inference from a small dataset with 4 genes;
  • Notebook 3: compare two branching pathways with 4 genes from both ‘single-cell’ and ‘bulk’ viewpoints.

Numerical acceleration

Both inference and simulation can be accelerated with Numba through the use_numba option:

# Inference
model.fit(data, use_numba=True)

# Simulation
sim = model.simulate(time, use_numba=True)

The use_numba option is not activated by default for simulations because compilation takes time (~8 s). Once compiled, the simulation is much more efficient (~20 times faster), which typically pays off for large numbers of genes and/or cells.
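As a rough guide to when the compilation cost pays off, the two approximate figures quoted above can be combined into a break-even estimate. This sketch assumes that compilation is paid once per run and that the speedup applies uniformly to the whole simulation. Both are simplifications, and actual timings depend on the machine and the model.

```python
compile_time = 8.0  # seconds, approximate figure quoted above
speedup = 20.0      # approximate figure quoted above


def total_time_with_numba(t_python):
    """Estimated wall time with Numba for a run taking t_python seconds without it."""
    return compile_time + t_python / speedup


# Numba wins when compile_time + t / speedup < t,
# i.e. when t > compile_time / (1 - 1 / speedup)
break_even = compile_time / (1.0 - 1.0 / speedup)
print(f"Break-even at about {break_even:.1f} s of uncompiled simulation time")

for t_python in (1.0, 10.0, 600.0):
    print(f"{t_python:7.1f} s without Numba -> {total_time_with_numba(t_python):6.1f} s with Numba")
```

Under these assumptions, short simulations are better left uncompiled, while long ones recover the compilation cost many times over.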

Dependencies

The package depends on the standard scientific libraries numpy and scipy. Optionally, it can load numba to accelerate the inference procedure (used by default) and the simulation procedure (not used by default). The packages matplotlib and networkx are optional dependencies used for network visualization.
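To see which of these packages are present in the current environment, a small check with the standard library can help. This is a minimal sketch; the `is_installed` helper is hypothetical and not part of Harissa.

```python
from importlib.util import find_spec

required = ["numpy", "scipy"]
optional = ["numba", "matplotlib", "networkx"]


def is_installed(name):
    """Return True if a top-level package can be located without importing it."""
    return find_spec(name) is not None


status = {name: is_installed(name) for name in required + optional}
missing_required = [name for name in required if not status[name]]

for name, found in status.items():
    kind = "required" if name in required else "optional"
    print(f"{name:12s} ({kind}): {'found' if found else 'missing'}")
```

Because `find_spec` only locates a package, the check stays fast even when numba is installed.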

Citation

If you use Harissa in your work, please cite this paper (also available on arXiv).
