An adaptation of gene set enrichment analysis to clustered single cell data.
NOTE: The method for this module is still under active development, not finalized, and should not be used.
scGSEA
Description: scGSEA is an extension of ssGSEA designed to improve the assessment of pathway activity in single-cell data by addressing sparsity and ensuring stable enrichment scoring. This module is intended to be run after the Seurat.Clustering module, and takes the Seurat RDS file produced by that module as input.
Authors: John Jun; UCSD - Mesirov Lab, UCSD
Contact: Forum Link.
Summary
scGSEA is an extension of ssGSEA tailored for single-cell data analysis. It addresses the challenges of sparsity and unreliable enrichment scoring by employing specialized normalization methods and scoring metrics. By utilizing scGSEA, scientists can explore and interpret pathway activity and functional alterations within heterogeneous populations of cells, thereby advancing our understanding of complex biological systems.
Parameters
Name | Description | Default Value |
---|---|---|
input_file * | File to be read in RDS format | |
chip_file | Chip file used for conversion to gene symbols | |
gene_set_database_file * | Gene set data in GMT format | |
output_file_name * | The basename to use for output file | scGSEA_scores |
* required
Input Files
input_file
This is the Seurat RDS file from the Seurat.Clustering module.
chip_file
This parameter’s drop-down allows you to select CHIP files from the Molecular Signatures Database (MSigDB) on the GSEA website. This drop-down provides access to only the most current version of MSigDB. You can also upload your own file(s) in CHIP format.
gene_set_database_file
- This parameter’s drop-down allows you to select gene sets from the Molecular Signatures Database (MSigDB) on the GSEA website. This drop-down provides access to only the most current version of MSigDB. You can also upload your own gene set file(s) in GMT format.
- If you want to use files from an earlier version of MSigDB you will need to download them from the archived releases on the website.
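For orientation, GMT is a tab-delimited text format in which each line holds a gene set name, a description, and then the member genes. The following is a minimal Python sketch of reading such a file before uploading it. The function name `parse_gmt` and the example gene sets are hypothetical illustrations and are not part of scGSEA.

```python
from typing import Dict, List


def parse_gmt(text: str) -> Dict[str, List[str]]:
    """Parse GMT-formatted text into {gene_set_name: [gene, ...]}.

    Hypothetical helper for illustration; not part of the scGSEA module.
    """
    gene_sets: Dict[str, List[str]] = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(
                f"line {line_number}: expected a name, a description, "
                "and at least one gene"
            )
        name, _description, *genes = fields
        # Drop empty fields left by trailing tabs.
        gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


# Made-up example content, shown only to illustrate the layout.
example = (
    "SET_A\thypothetical description\tTP53\tMYC\tEGFR\n"
    "SET_B\tna\tCD3E\tCD4\n"
)
sets = parse_gmt(example)
print(sorted(sets))
```

A check like this can catch files that are space-delimited or missing the description column before they are submitted to the module.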
output_file_name
The prefix used for the names of the output GCT and CSV files. If unspecified, the output prefix is set to scGSEA_scores. The output CSV and GCT files contain the projection of the input dataset onto a space of gene set enrichment scores.
cluster_data_label
The name of the metadata label within the input Seurat object. This label is used to access the annotations by which cells are aggregated. The default value for this parameter is seurat_clusters, which is the metadata label for the cluster annotations generated by the Seurat.Clustering module. Use the default value when using the RDS file generated by the Seurat.Clustering module.
Output Files
<output_file_name>.csv
A gene set by cell cluster matrix of scGSEA scores.
<output_file_name>.gct
A gene set by cell cluster matrix of scGSEA scores. The HeatmapViewer module accepts this file as input for generating heatmap visualizations.
cluster_expression.csv
A gene by cell cluster matrix of normalized gene expression levels.
stdout.txt
The standard output from the script.
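As a rough illustration of working with the score matrix, the Python sketch below reports the highest-scoring gene set for each cluster. It assumes the CSV has a header row of cluster labels and one row per gene set, with the gene set name in the first column. That exact layout is an assumption and is not confirmed by this documentation. The function name and the sample values are hypothetical.

```python
import csv
import io
from typing import Dict, Tuple


def top_gene_set_per_cluster(csv_text: str) -> Dict[str, str]:
    """Return {cluster_label: name of the highest-scoring gene set}.

    Assumes rows are gene sets and columns are clusters, with the
    gene set name in the first column (an assumed layout).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    clusters = header[1:]  # first header cell labels the gene set column
    best: Dict[str, Tuple[float, str]] = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene_set, raw_scores = row[0], row[1:]
        for cluster, raw in zip(clusters, raw_scores):
            score = float(raw)
            if cluster not in best or score > best[cluster][0]:
                best[cluster] = (score, gene_set)
    return {cluster: gene_set for cluster, (_, gene_set) in best.items()}


# Made-up scores for illustration only; not real scGSEA output.
sample = "gene_set,0,1\nSET_A,0.5,-0.2\nSET_B,0.1,0.7\n"
result = top_gene_set_per_cluster(sample)
print(result)
```

To apply this to a real run, read the contents of the output CSV file into a string and pass it to the function, after confirming that the file's layout matches the assumption above.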
For more details, please refer to the full documentation.