GPU-based SCENIC analysis with RegDiffusion and accelerated AUCell

flashscenic

License: MIT | Python 3.9+

GPU-accelerated SCENIC workflow for gene regulatory network analysis. Seconds instead of hours.

flashscenic replaces the bottleneck steps in the SCENIC pipeline with GPU-powered alternatives: RegDiffusion for GRN inference and vectorized PyTorch implementations of AUCell and cisTarget. The result is a complete GRN analysis pipeline that scales to 20,000 genes and millions of cells, running in seconds on a single GPU.

Installation

pip install flashscenic

For documentation development:

pip install flashscenic[docs]

Requirements: Python 3.9+, PyTorch with CUDA support (CPU fallback available).

Quick Start

import flashscenic as fs

# Run the full pipeline in one call
# exp_matrix: (n_cells, n_genes) log-transformed numpy array
# gene_names: list of gene symbols matching columns
result = fs.run_flashscenic(exp_matrix, gene_names, species='human')

# Results
auc_scores = result['auc_scores']       # (n_cells, n_regulons)
regulon_names = result['regulon_names']  # regulon labels

Required resource files (TF lists, ranking databases, motif annotations) are downloaded automatically on first run.
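
The Quick Start assumes exp_matrix is already a log-transformed (n_cells, n_genes) array. As a minimal sketch, assuming you start from raw counts, one common preparation is library-size normalization followed by log1p. The target sum of 10,000, the synthetic counts, and the placeholder gene names below are illustrative assumptions; the text above does not say which normalization flashscenic expects.

```python
import numpy as np

# Synthetic raw counts for illustration only: 100 cells x 50 genes.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=2.0, size=(100, 50)).astype(np.float32)
gene_names = [f"GENE{i}" for i in range(counts.shape[1])]  # placeholder symbols

# Scale each cell to the same total count (assumed target: 10,000),
# guarding against empty cells, then apply log(1 + x).
totals = counts.sum(axis=1, keepdims=True)
totals[totals == 0] = 1.0
exp_matrix = np.log1p(counts / totals * 1e4)

# Columns of exp_matrix must line up with gene_names.
assert exp_matrix.shape[1] == len(gene_names)
```

The resulting exp_matrix and gene_names have the shapes that the Quick Start comments describe.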

Downloading Data

flashscenic can automatically download the cistarget resource files needed for motif-based pruning:

import flashscenic as fs

# Download human v10 resources (default)
resources = fs.download_data(species='human', version='v10')
print(resources)

# Download mouse resources
resources = fs.download_data(species='mouse')

# List all available resource sets
for rs in fs.list_available_resources():
    print(f"{rs.datasource}/{rs.species}/{rs.version}")

Files are cached in ./flashscenic_data/ by default; files that are already present are not downloaded again on subsequent calls.

Supported species and versions

Species      Version                 Source
human        v10 (recommended), v9   Aertslab
mouse        v10, v9                 Aertslab
drosophila   v10                     Aertslab
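
If you want to fail early on an unsupported combination, the table above can be encoded as a small lookup. This is only a sketch: SUPPORTED and is_supported are hypothetical names, not part of flashscenic, and fs.list_available_resources() remains the authoritative source.

```python
# Species/version pairs copied from the table above (hypothetical lookup).
SUPPORTED = {
    "human": ("v10", "v9"),
    "mouse": ("v10", "v9"),
    "drosophila": ("v10",),
}

def is_supported(species: str, version: str = "v10") -> bool:
    """Return True if the species/version pair appears in the table."""
    return version in SUPPORTED.get(species, ())
```

For example, is_supported("drosophila", "v9") is False because only v10 is listed for drosophila.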

Step-by-Step Usage

For more control, you can run each pipeline step individually:

import numpy as np
import torch
import flashscenic as fs

# 1. GRN Inference (using RegDiffusion separately)
import regdiffusion as rd
trainer = rd.RegDiffusionTrainer(exp_matrix)
trainer.train()
adj_matrix = trainer.get_adj()

# 2. Filter to known TFs and sparsify
# (load your TF list, subset adj_matrix rows, zero out weak edges)

# 3. Module filtering
filtered_adj = fs.select_topk_targets(adj_matrix, k=50, device='cuda')
filtered_adj, tf_mask = fs.filter_by_min_targets(
    filtered_adj, min_targets=20, min_fraction=0.8
)

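# 4. cisTarget pruning
# Note: `modules` and `tf_names` are not defined earlier in this snippet;
# they are expected to come from the TF filtering and module steps (2-3).
pruner = fs.CisTargetPruner(device='cuda')
pruner.load_database(['db_500bp.feather', 'db_10kb.feather'])
pruner.load_annotations('motifs.tbl', filter_for_annotation=True)
regulons = pruner.prune_modules(modules, tf_names, gene_names)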

# 5. AUCell scoring
regulon_adj = fs.regulons_to_adjacency(regulons, gene_names)
auc_scores = fs.get_aucell(exp_matrix, regulon_adj, k=50, device='cuda')
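
Step 2 above is left as a comment, and step 3 relies on a top-k filter. The NumPy sketch below illustrates both ideas under stated assumptions: rows of the adjacency matrix are regulators and columns are targets (the comment says to subset rows), edges are ranked by absolute weight, and the threshold value is arbitrary. filter_tfs_and_sparsify and keep_topk_per_row are hypothetical helpers written for this example; they are not flashscenic functions and may not match how fs.select_topk_targets works internally.

```python
import numpy as np

def filter_tfs_and_sparsify(adj, gene_names, tf_list, threshold):
    """Keep rows for known TFs and zero out edges weaker than threshold."""
    gene_names = np.asarray(gene_names)
    tf_mask = np.isin(gene_names, list(tf_list))
    tf_adj = adj[tf_mask].copy()
    tf_adj[np.abs(tf_adj) < threshold] = 0.0
    return tf_adj, gene_names[tf_mask].tolist()

def keep_topk_per_row(adj, k):
    """Keep the k largest-magnitude entries in each row; zero the rest (k >= 1)."""
    k = min(k, adj.shape[1])
    out = np.zeros_like(adj)
    # argpartition places the k largest-magnitude columns first (unordered).
    idx = np.argpartition(-np.abs(adj), k - 1, axis=1)[:, :k]
    rows = np.arange(adj.shape[0])[:, None]
    out[rows, idx] = adj[rows, idx]
    return out

# Toy example: 3 genes as regulators (rows) x 4 genes as targets (columns).
adj = np.array([[0.0, 0.9, 0.1, 0.5],
                [0.3, 0.0, 0.2, 0.1],
                [0.8, 0.05, 0.0, 0.6]])
row_genes = ["TF1", "G2", "TF3"]
tf_adj, tf_names = filter_tfs_and_sparsify(adj, row_genes, ["TF1", "TF3"], threshold=0.2)
top2 = keep_topk_per_row(tf_adj, k=2)
```

In this toy case only the TF1 and TF3 rows survive, and each keeps its two strongest targets. The returned TF labels might serve as the tf_names list used in step 4, though the original snippet does not say how that list is built.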

Pipeline Parameters

run_flashscenic exposes all tunable parameters with stage-based prefixes:

Prefix        Stage              Key Parameters
grn_          RegDiffusion       grn_n_steps, grn_sparsity_threshold
module_       Module filtering   module_k, module_min_targets, module_min_fraction
pruning_      cisTarget          pruning_rank_threshold, pruning_nes_threshold, pruning_merge_strategy
annotation_   Motif filtering    annotation_motif_similarity_fdr, annotation_orthologous_identity
aucell_       AUCell scoring     aucell_k, aucell_auc_threshold, aucell_batch_size

Example with custom parameters:

result = fs.run_flashscenic(
    exp_matrix, gene_names,
    species='mouse',
    module_k=100,
    module_min_targets=10,
    module_min_fraction=None,  # disable fraction filter
    pruning_nes_threshold=2.5,
    device='cpu',
)
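
To make the prefix convention concrete, the sketch below groups keyword arguments by their stage prefix. It illustrates the naming scheme only; split_by_stage is a hypothetical helper, and nothing here is taken from how run_flashscenic actually routes its parameters.

```python
STAGE_PREFIXES = ("grn_", "module_", "pruning_", "annotation_", "aucell_")

def split_by_stage(**kwargs):
    """Group prefixed keyword arguments by pipeline stage."""
    stages = {prefix.rstrip("_"): {} for prefix in STAGE_PREFIXES}
    other = {}  # arguments without a stage prefix, e.g. species or device
    for name, value in kwargs.items():
        for prefix in STAGE_PREFIXES:
            if name.startswith(prefix):
                stages[prefix.rstrip("_")][name[len(prefix):]] = value
                break
        else:
            other[name] = value
    return stages, other

# Same arguments as the custom-parameter example above.
stages, other = split_by_stage(
    species="mouse",
    module_k=100,
    module_min_targets=10,
    module_min_fraction=None,
    pruning_nes_threshold=2.5,
    device="cpu",
)
```

Here the three module_ arguments land in stages["module"], pruning_nes_threshold lands in stages["pruning"], and species and device stay in other.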

Core API

Function / Class          Description
run_flashscenic()         Full pipeline in one call
download_data()           Download cistarget resource files
get_aucell()              GPU-accelerated AUCell scoring
CisTargetPruner           GPU cisTarget motif pruning
select_topk_targets()     Top-k module filtering
filter_by_min_targets()   Min-target module filtering
regulons_to_adjacency()   Convert regulons to adjacency matrix
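
For readers unfamiliar with AUCell, the idea behind the scoring step can be sketched in a few lines of NumPy: rank the genes in each cell by expression, look only at the top fraction of that ranking, and measure how early each regulon's genes are recovered. The function below is a simplified illustration, not flashscenic's get_aucell. The name aucell_sketch, the default auc_threshold of 0.05, and the normalization by cutoff times regulon size are assumptions made for this example.

```python
import numpy as np

def aucell_sketch(exp_matrix, regulon_mask, auc_threshold=0.05):
    """Simplified AUCell-style score per cell and regulon.

    exp_matrix:   (n_cells, n_genes) expression values
    regulon_mask: (n_regulons, n_genes) boolean membership
    Returns:      (n_cells, n_regulons) scores in [0, 1]
    """
    n_genes = exp_matrix.shape[1]
    cutoff = max(1, int(round(auc_threshold * n_genes)))
    # Rank genes per cell from most to least expressed; keep the top `cutoff`.
    top = np.argsort(-exp_matrix, axis=1, kind="stable")[:, :cutoff]
    # hits[r, c, i] is True if the i-th ranked gene of cell c is in regulon r.
    hits = regulon_mask[:, top]
    # Recovery curve: cumulative number of regulon genes found by each rank.
    recovery = np.cumsum(hits, axis=2)
    # Simplified normalization (assumption): divide by cutoff * regulon size.
    sizes = np.maximum(regulon_mask.sum(axis=1), 1)
    auc = recovery.sum(axis=2) / (cutoff * sizes[:, None])
    return auc.T

# One cell with 20 genes; gene 0 is the most expressed, gene 19 the least.
exp = np.arange(20, 0, -1, dtype=float)[None, :]
mask = np.zeros((2, 20), dtype=bool)
mask[0, [0, 1]] = True    # regulon 0: the two most expressed genes
mask[1, [18, 19]] = True  # regulon 1: the two least expressed genes
scores = aucell_sketch(exp, mask, auc_threshold=0.25)
```

In this toy case the regulon made of the top-ranked genes scores high, while the regulon made of the lowest-ranked genes scores zero because none of its genes fall within the cutoff.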

License

MIT License. See LICENSE for details.
