GUI tools for gene set enrichment analysis

GSEA GUI Toolset

GSEA GUI Toolset provides graphical user interfaces (GUIs) for Gene Set Enrichment Analysis (GSEA) and visualization, covering the complete workflow from data preparation through analysis to result presentation.

Key Features

This toolset includes three core modules:

  1. Enrichment Analysis: Supports Over-Representation Analysis (ORA) based on the hypergeometric distribution, as well as GSEA.
  2. Visualization: Generates dot plots, bar plots, and GSEA-specific enrichment score plots.
  3. GMT Generator: Converts custom annotation tables into standard GMT-format gene set files.
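To make the ORA idea concrete, here is a minimal sketch of a one-sided hypergeometric test written with the Python standard library only. The function name and argument names are illustrative assumptions and are not taken from gseagui; the tool's own implementation may differ.

```python
from math import comb


def hypergeom_pvalue(overlap: int, background: int, set_size: int, query_size: int) -> float:
    """Return P(X >= overlap) for a hypergeometric draw.

    background : number of genes in the background
    set_size   : number of background genes annotated to the pathway
    query_size : number of genes in the query list
    overlap    : number of query genes annotated to the pathway
    """
    upper = min(set_size, query_size)
    total = comb(background, query_size)
    # Sum the probability mass of every outcome at least as extreme as the observed overlap.
    favourable = sum(
        comb(set_size, i) * comb(background - set_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 20 background genes, 5 in the pathway, 4 query genes, 3 of them hits.
p_value = hypergeom_pvalue(overlap=3, background=20, set_size=5, query_size=4)
print(p_value)
```

A small p-value indicates that the pathway is over-represented in the query list relative to the background.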

Installation

Install via PyPI

pip install gseagui

Quick Start

After installation, launch each tool with the following commands:

# Launch the main launcher (contains entry points for all tools)
gseagui

# Launch the enrichment analysis tool separately
gseanrichment

# Launch the visualization tool separately
gseaplotter

# Launch the GMT generator separately
gmtgenerator

Detailed Usage Guide & Input File Formats

1. Enrichment Analysis

This module performs gene set enrichment analysis.

Input File Formats

A. Annotation File

  • Format: TSV (Tab-Separated Values) or TXT.
  • Content: Must contain at least two columns:
    • Gene/Protein ID Column: Unique identifier.
    • Annotation Column: Corresponding pathway or functional annotation (e.g., GO Term, KEGG Pathway).
  • Multiple Annotations: If a gene has multiple annotations, separate them with a delimiter (e.g., |).
  • Example (columns are tab-separated):
GeneID Pathway
GeneA Pathway1|Pathway2
GeneB Pathway1
GeneC Pathway3
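As an illustration of how such an annotation table maps to gene sets, the sketch below reads the example above and groups gene IDs by annotation. The function `build_gene_sets` is a hypothetical helper written for this example, not part of gseagui.

```python
import csv
import io
from collections import defaultdict


def build_gene_sets(handle, id_col, annot_col, sep="|"):
    """Group gene IDs by annotation term from a tab-separated table."""
    gene_sets = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        # A single cell may hold several annotations joined by the separator.
        for term in row[annot_col].split(sep):
            term = term.strip()
            if term:
                gene_sets[term].add(row[id_col])
    return dict(gene_sets)


example = (
    "GeneID\tPathway\n"
    "GeneA\tPathway1|Pathway2\n"
    "GeneB\tPathway1\n"
    "GeneC\tPathway3\n"
)
gene_sets = build_gene_sets(io.StringIO(example), "GeneID", "Pathway")
print(gene_sets)
```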

B. Gene List File

  • Format: TSV (Tab-Separated Values) or TXT.
  • Content:
    • Gene Column (Required): Gene IDs to be analyzed.
    • Rank Column (Required for GSEA): Numerical values used for GSEA ranking (e.g., log2FoldChange, t-statistic).
    • Group Column (Optional): Specify a group column to analyze each group separately.
  • Example (columns are tab-separated):
GeneID Log2FC Cluster
GeneA 2.5 Group1
GeneB -1.2 Group1
GeneC 0.5 Group2
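The sketch below shows one way a gene list like the example above could be split by group and sorted into ranked lists for GSEA. The helper `load_ranked_genes` and the fallback group label "all" are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def load_ranked_genes(handle, gene_col, rank_col, group_col=None):
    """Return {group: [(gene, score), ...]} with each list sorted by descending score."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        # Without a group column, all genes go into one list.
        key = row[group_col] if group_col else "all"
        groups[key].append((row[gene_col], float(row[rank_col])))
    return {
        group: sorted(genes, key=lambda item: item[1], reverse=True)
        for group, genes in groups.items()
    }


example = (
    "GeneID\tLog2FC\tCluster\n"
    "GeneA\t2.5\tGroup1\n"
    "GeneB\t-1.2\tGroup1\n"
    "GeneC\t0.5\tGroup2\n"
)
ranked_by_group = load_ranked_genes(io.StringIO(example), "GeneID", "Log2FC", "Cluster")
print(ranked_by_group)
```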

C. GMT File

  • A standard GMT-format file, typically generated by the GMT Generator or downloaded from MSigDB.
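For reference, each GMT line holds a gene set name, a description field, and then the member genes, all separated by tabs. A minimal reader could look like the sketch below; `parse_gmt` is a hypothetical helper, and the example lines are made up.

```python
def parse_gmt(lines):
    """Parse GMT lines into {gene_set_name: [gene, ...]}."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # A valid line needs a name, a description, and at least one gene.
        if len(fields) < 3:
            continue
        name, _description, *genes = fields
        gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


example_lines = [
    "Pathway1\tNA\tGeneA\tGeneB\n",
    "Pathway2\tNA\tGeneA\n",
]
parsed = parse_gmt(example_lines)
print(parsed)
```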

Steps

  1. In the "Annotation Processing" tab, load the Annotation File and select the corresponding ID column and annotation column.
  2. Click the "Create Gene Sets" button to build the background library for analysis.
  3. Switch to the "Enrichment Analysis" tab.
  4. Select Input Method:
    • From File: Load the gene list file, select the gene column, rank column (if doing GSEA), and group column (optional).
    • Direct Input: Paste the gene list into the text box.
  5. Select Analysis Method: Hypergeometric or GSEA.
    • For GSEA, you can adjust the minimum/maximum gene set size.
  6. Set the output directory and prefix, then click "Run Enrichment Analysis".
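The minimum/maximum gene set size option in step 5 can be pictured as a filter like the one below. This is a sketch only: the default bounds of 15 and 500 are placeholders, and counting set members after intersecting with the ranked gene list is an assumption about how the size is measured.

```python
def filter_gene_sets_by_size(gene_sets, ranked_genes, min_size=15, max_size=500):
    """Keep gene sets whose overlap with the ranked list falls within [min_size, max_size]."""
    universe = set(ranked_genes)
    kept = {}
    for name, genes in gene_sets.items():
        # Only genes that appear in the ranked list can contribute to the score.
        overlap = set(genes) & universe
        if min_size <= len(overlap) <= max_size:
            kept[name] = overlap
    return kept


# Hypothetical data with small bounds so the effect is visible.
sets = {"Pathway1": ["GeneA", "GeneB", "GeneX"], "Pathway2": ["GeneA"]}
ranked = ["GeneA", "GeneB", "GeneC"]
kept_sets = filter_gene_sets_by_size(sets, ranked, min_size=2, max_size=10)
print(kept_sets)
```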

Output Results

  • TSV File: Contains detailed statistical results of the enrichment analysis (P-value, FDR, NES, etc.).
  • Pickle (.pkl) File: (Optional) Saves the complete analysis object for subsequent plotting in the visualization tool.
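The FDR column reports p-values adjusted for multiple testing. The source does not say which correction is applied; the sketch below shows the widely used Benjamini-Hochberg procedure as an assumed example of how such values are commonly derived.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four pathways.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(fdr)
```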

2. Visualization

This module visualizes enrichment analysis results.

Input File Formats

A. TSV Result File

  • Usually generated by the enrichment analysis module; files that follow a general enrichment analysis result format also work.
  • Required Columns (automatically identified, can also be manually specified):
    • Term/Pathway Name
    • P-value or Adjusted P-value (used for color mapping)
    • Count/Size (used for dot size)
    • NES (used for GSEA result display)

B. Pickle (.pkl) File

  • GSEA result object file generated by the enrichment analysis module.

Features and Chart Types

1. Dot Plot

  • Usage: Display enriched pathways and their significance.
  • Parameters:
    • Column: Numerical column mapped to dot size or color.
    • X/Group: X-axis grouping.
    • Hue: Color mapping column.
    • Threshold: Filter out non-significant pathways.
    • Style: Adjustable dot size scaling, shape, color map (Colormap).
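The threshold and mapping parameters above can be thought of as a small data preparation step before plotting. The sketch below filters rows by a significance threshold and derives a -log10 value for color mapping. The column names "Adjusted P-value" and "Count", the 0.05 default, and the -log10 transform are all assumptions chosen for illustration, not the tool's actual settings.

```python
import math


def prepare_dot_plot_rows(rows, p_col="Adjusted P-value", size_col="Count", threshold=0.05):
    """Drop non-significant rows and add plotting values for color and dot size."""
    kept = []
    for row in rows:
        p = float(row[p_col])
        if p > threshold:
            continue
        kept.append({
            **row,
            # Clamp to avoid log10(0) when a p-value underflows to zero.
            "neg_log10_p": -math.log10(max(p, 1e-300)),
            "size": float(row[size_col]),
        })
    # Most significant pathways first.
    kept.sort(key=lambda r: r["neg_log10_p"], reverse=True)
    return kept


rows = [
    {"Term": "Pathway2", "Adjusted P-value": "0.01", "Count": "4"},
    {"Term": "Pathway3", "Adjusted P-value": "0.2", "Count": "9"},
    {"Term": "Pathway1", "Adjusted P-value": "0.001", "Count": "7"},
]
plot_rows = prepare_dot_plot_rows(rows)
print([r["Term"] for r in plot_rows])
```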

2. Bar Plot

  • Usage: Display the top pathways in a simple chart.
  • Parameters: Supports custom colors, legend position, etc.

3. GSEA Plot

  • Availability: Only available after loading a PKL file.
  • Function: Draws classic GSEA running enrichment score plots.
  • Interaction: Select one or more terms from the list to plot. You can show or hide the ranking metric and adjust the legend position and font size.
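For readers unfamiliar with the curve these plots show, here is a minimal sketch of the classic weighted running enrichment score. It is written from the standard GSEA definition, not from gseagui's code, and the function name and `weight` parameter are illustrative.

```python
def running_enrichment_score(ranked, gene_set, weight=1.0):
    """Return (curve, es) for a ranked list of (gene, score) sorted by descending score."""
    hits = [gene in gene_set for gene, _ in ranked]
    n_hits = sum(hits)
    n_miss = len(ranked) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_weights = [abs(score) ** weight if hit else 0.0
                   for (_, score), hit in zip(ranked, hits)]
    total = sum(hit_weights)
    if total == 0:
        raise ValueError("all hit scores are zero")
    running, curve = 0.0, []
    for w, hit in zip(hit_weights, hits):
        # Step up on a hit (weighted by score), step down by a fixed amount on a miss.
        running += w / total if hit else -1.0 / n_miss
        curve.append(running)
    # The enrichment score is the point of maximum deviation from zero.
    es = max(curve, key=abs)
    return curve, es


# Hypothetical ranked list and gene set.
ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0)]
curve, es = running_enrichment_score(ranked, {"A", "B"})
print(curve, es)
```

The curve is what the plot draws along the ranked list; the enrichment score is its most extreme value.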

3. GMT Generator

This module creates custom GMT files.

Input File Formats

  • Same as the Annotation File in Enrichment Analysis.

Steps

  1. Load the annotation file (TSV/TXT).
  2. Select ID Column and Annotation Column.
  3. Annotation Processing Settings:
    • Enable Annotation Split: If a cell contains multiple annotations (e.g., GO:001|GO:002), check this and set the separator (e.g., |).
    • Min Gene Count: Filter out pathways with too few genes.
    • Invalid Values: Exclude rows containing specified invalid values (e.g., nan, None).
  4. Set the output directory and click "Generate GMT File".

Output Results

  • .gmt File: Standard format; can be used directly in this toolset's GSEA analysis or in the GSEA software.
    • Format (tab-separated): PathwayName NA Gene1 Gene2 Gene3 ...
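The settings described in the steps above (annotation split, minimum gene count, invalid values) can be sketched as a single conversion function. `build_gmt_lines` and its defaults are hypothetical, and treating invalid values as whole-cell matches is an assumption about how the filter behaves.

```python
from collections import defaultdict


def build_gmt_lines(rows, sep="|", min_genes=1, invalid=("nan", "None", "")):
    """Turn (gene, annotation) pairs into GMT lines: name, 'NA', then genes, tab-separated."""
    sets = defaultdict(list)
    for gene, annotation in rows:
        # Skip rows whose ID or annotation is one of the invalid placeholders.
        if gene in invalid or annotation in invalid:
            continue
        for term in annotation.split(sep):
            term = term.strip()
            if term and term not in invalid and gene not in sets[term]:
                sets[term].append(gene)
    return [
        "\t".join([term, "NA", *genes])
        for term, genes in sorted(sets.items())
        if len(genes) >= min_genes
    ]


rows = [("GeneA", "GO:001|GO:002"), ("GeneB", "GO:001"), ("GeneC", "nan")]
gmt_lines = build_gmt_lines(rows, min_genes=2)
print(gmt_lines)
```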

License

This project is licensed under the BSD License.
