GUI tools for gene set enrichment analysis

GSEA GUI Toolset

GSEA GUI Toolset is a set of graphical user interfaces (GUIs) for Gene Set Enrichment Analysis (GSEA) and visualization. It covers the complete workflow, from data preparation through analysis to result presentation.

Key Features

This toolset includes three core modules:

  1. Enrichment Analysis: Supports Over-Representation Analysis (ORA) based on the hypergeometric distribution, as well as GSEA.
  2. Visualization: Supports generating Dot Plots, Bar Plots, and GSEA-specific enrichment score plots.
  3. GMT Generator: Helps users convert custom annotation tables into standard GMT format gene set files.

Installation

Install via PyPI

pip install gseagui

Quick Start

After installation, you can launch each tool using the following commands:

# Launch the main launcher (contains entry points for all tools)
gseagui

# Launch the enrichment analysis tool separately
gseanrichment

# Launch the visualization tool separately
gseaplotter

# Launch the GMT generator separately
gmtgenerator

Detailed Usage Guide & Input File Formats

1. Enrichment Analysis

This module is used to perform gene enrichment analysis.

Input File Formats

A. Annotation File

  • Format: TSV (Tab-Separated Values) or TXT.
  • Content: Must contain at least two columns:
    • Gene/Protein ID Column: Unique identifier.
    • Annotation Column: Corresponding pathway or functional annotation (e.g., GO Term, KEGG Pathway).
  • Multiple Annotations: If a gene has multiple annotations, list them in the same cell, joined by a separator (e.g., |).
  • Example:
GeneID Pathway
GeneA Pathway1|Pathway2
GeneB Pathway1
GeneC Pathway3
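
As an illustration of how such a table maps onto gene sets, here is a minimal sketch using only the Python standard library. It is not the toolset's own code: the function name build_gene_sets and its parameters are hypothetical, and the file is assumed to be tab-separated with the header shown in the example above.

```python
import csv
import io
from collections import defaultdict


def build_gene_sets(tsv_text, id_col="GeneID", ann_col="Pathway", sep="|"):
    """Map each annotation term to the set of gene IDs that carry it.

    Hypothetical helper for illustration; column names are assumptions.
    """
    gene_sets = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        gene = row[id_col].strip()
        # One cell may hold several annotations joined by the separator.
        for term in row[ann_col].split(sep):
            term = term.strip()
            if gene and term:
                gene_sets[term].add(gene)
    return dict(gene_sets)


example = (
    "GeneID\tPathway\n"
    "GeneA\tPathway1|Pathway2\n"
    "GeneB\tPathway1\n"
    "GeneC\tPathway3\n"
)
gene_sets = build_gene_sets(example)
for term in sorted(gene_sets):
    print(term, sorted(gene_sets[term]))
```

Each term ends up with the genes annotated to it, so GeneA appears in both Pathway1 and Pathway2.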

B. Gene List File

  • Format: TSV (Tab-Separated Values) or TXT.
  • Content:
    • Gene Column (Required): Gene IDs to be analyzed.
    • Rank Column (Required for GSEA): Numerical values used for GSEA ranking (e.g., log2FoldChange, t-statistic).
    • Group Column (Optional): If you need to analyze different groups separately, you can specify a group column.
  • Example:
GeneID Log2FC Cluster
GeneA 2.5 Group1
GeneB -1.2 Group1
GeneC 0.5 Group2
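
The sketch below shows one way a gene list with rank and group columns could be turned into per-group ranked lists, which is the shape of input GSEA needs. The function name ranked_genes_by_group and the column names are assumptions taken from the example above, not part of the toolset's API.

```python
import csv
import io
from collections import defaultdict


def ranked_genes_by_group(tsv_text, gene_col="GeneID", rank_col="Log2FC",
                          group_col="Cluster"):
    """Return {group: [(gene, score), ...]} sorted by descending score.

    Hypothetical helper; assumes a tab-separated table with a header row.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        groups[row[group_col]].append((row[gene_col], float(row[rank_col])))
    # GSEA walks the list from the highest to the lowest ranking metric.
    return {
        group: sorted(pairs, key=lambda pair: pair[1], reverse=True)
        for group, pairs in groups.items()
    }


example = (
    "GeneID\tLog2FC\tCluster\n"
    "GeneA\t2.5\tGroup1\n"
    "GeneB\t-1.2\tGroup1\n"
    "GeneC\t0.5\tGroup2\n"
)
ranked = ranked_genes_by_group(example)
print(ranked)
```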

C. GMT File

  • Standard GMT format file, usually generated by the GMT Generator or downloaded from MSigDB.
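
For readers who want to inspect a GMT file outside the GUI, the following is a small sketch of a parser. It relies only on the standard GMT layout (set name, description, then gene IDs, all tab-separated); the function name parse_gmt is hypothetical.

```python
def parse_gmt(lines):
    """Parse GMT lines into {set_name: [gene, ...]}.

    Hypothetical helper; skips lines that lack a name, a description,
    and at least one gene.
    """
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        name, _description, *genes = fields
        gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


gmt_lines = [
    "Pathway1\tNA\tGeneA\tGeneB\n",
    "Pathway2\tNA\tGeneA\n",
]
parsed = parse_gmt(gmt_lines)
print(parsed)
```

To read a file from disk, the same function can be given an open file handle, for example parse_gmt(open(path, encoding="utf-8")), where path is a placeholder for your own file.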

Steps

  1. In the "Annotation Processing" tab, load the Annotation File and select the corresponding ID column and annotation column.
  2. Click the "Create Gene Sets" button to build the background library for analysis.
  3. Switch to the "Enrichment Analysis" tab.
  4. Select Input Method:
    • From File: Load the gene list file, select the gene column, rank column (if doing GSEA), and group column (optional).
    • Direct Input: Paste the gene list into the text box.
  5. Select Analysis Method: Hypergeometric or GSEA.
    • For GSEA, you can adjust the minimum/maximum gene set size.
  6. Set the output directory and prefix, then click "Run Enrichment Analysis".
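
To make the Hypergeometric option more concrete, here is a minimal sketch of the one-sided over-representation test for a single gene set, written with the standard library only. It illustrates the statistic in general and is not claimed to be the toolset's implementation; the function name and argument names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(overlap, background_size, set_size, query_size):
    """P(X >= overlap) for X ~ Hypergeometric.

    background_size: number of genes in the background
    set_size:        genes in the background that belong to the gene set
    query_size:      genes in the submitted gene list
    overlap:         genes in the list that belong to the gene set
    """
    upper = min(set_size, query_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(set_size, i) * comb(background_size - set_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Assumed toy numbers: 20 background genes, a pathway of 5 genes,
# a query list of 4 genes, 3 of which fall in the pathway.
p_value = hypergeom_pvalue(overlap=3, background_size=20, set_size=5, query_size=4)
print(p_value)
```

A small p-value means the overlap is larger than expected if the gene list had been drawn at random from the background.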

Output Results

  • TSV File: Contains detailed statistical results of the enrichment analysis (P-value, FDR, NES, etc.).
  • Pickle (.pkl) File: (Optional) Saves the complete analysis object for subsequent plotting in the visualization tool.
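
The FDR column corrects p-values for multiple testing. The README does not state which correction the tool applies, so the sketch below assumes the widely used Benjamini-Hochberg procedure purely for illustration; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative only; the correction used by the tool is an assumption.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(fdr)
```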

2. Visualization

This module is used to visualize the results of enrichment analysis.

Input File Formats

A. TSV Result File

  • Usually generated by the Enrichment Analysis module; a TSV that follows a general enrichment analysis result format can also be used.
  • Required Columns (automatically identified, can also be manually specified):
    • Term/Pathway Name
    • P-value or Adjusted P-value (used for color mapping)
    • Count/Size (used for dot size)
    • NES (used for GSEA result display)

B. Pickle (.pkl) File

  • GSEA result object file generated by the enrichment analysis module.

Features and Chart Types

1. Dot Plot

  • Usage: Display enriched pathways and their significance.
  • Parameters:
    • Column: Numerical column mapped to dot size or color.
    • X/Group: X-axis grouping.
    • Hue: Color mapping column.
    • Threshold: Filter out non-significant pathways.
    • Style: Adjustable dot size scaling, shape, color map (Colormap).
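
The Threshold parameter can be pictured as a filter applied before plotting. The following sketch shows that idea on rows read from a result table: it keeps significant terms, sorts them, and derives a -log10 value that is commonly used for color or axis mapping. The column names ("Term", "Adjusted P-value") and the function name are assumptions, not the tool's actual identifiers.

```python
import math


def prepare_dot_plot_rows(rows, p_col="Adjusted P-value", threshold=0.05,
                          top_n=10):
    """Keep rows at or below the threshold, most significant first.

    Hypothetical helper; adds a 'neg_log10_p' field to each kept row.
    """
    kept = [row for row in rows if float(row[p_col]) <= threshold]
    kept.sort(key=lambda row: float(row[p_col]))
    prepared = []
    for row in kept[:top_n]:
        p = max(float(row[p_col]), 1e-300)  # guard against log10(0)
        prepared.append(dict(row, neg_log10_p=-math.log10(p)))
    return prepared


rows = [
    {"Term": "T1", "Adjusted P-value": "0.001", "Count": "12"},
    {"Term": "T2", "Adjusted P-value": "0.2", "Count": "3"},
    {"Term": "T3", "Adjusted P-value": "0.01", "Count": "7"},
]
plot_rows = prepare_dot_plot_rows(rows)
print([row["Term"] for row in plot_rows])
```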

2. Bar Plot

  • Usage: Display the top pathways as a simple bar chart.
  • Parameters: Supports custom colors, legend position, etc.

3. GSEA Plot

  • Availability: Only available after loading a PKL file.
  • Function: Draw classic GSEA Running Enrichment Score plots.
  • Interaction: Select one or multiple Terms from the list to plot. Supports showing/hiding Ranking metric, adjusting legend position and font size.
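
For context on what a Running Enrichment Score plot shows, here is a sketch of the classic weighted running-sum statistic for one gene set. It follows the commonly described GSEA formulation and is not extracted from this toolset; the function name and the weight parameter are assumptions for illustration.

```python
def running_enrichment_score(ranked, gene_set, weight=1.0):
    """Return (curve, es) for a ranked list of (gene, score) pairs.

    ranked must already be sorted by descending score. The curve steps up
    at genes in the set (weighted by |score|**weight) and down elsewhere;
    es is the curve value with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene, _ in ranked]
    n_miss = len(ranked) - sum(hits)
    hit_total = sum(
        abs(score) ** weight for (_, score), is_hit in zip(ranked, hits) if is_hit
    )
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    running = 0.0
    curve = []
    for (_, score), is_hit in zip(ranked, hits):
        if is_hit:
            running += abs(score) ** weight / hit_total
        else:
            running -= 1.0 / n_miss
        curve.append(running)
    es = max(curve, key=abs)
    return curve, es


ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0)]
curve, es = running_enrichment_score(ranked, {"A", "B"})
print(curve, es)
```

In this toy input both set members sit at the top of the ranking, so the curve peaks early and then declines back toward zero.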

3. GMT Generator

This module is used to create custom GMT files.

Input File Formats

  • Same as the Annotation File in Enrichment Analysis.

Steps

  1. Load the annotation file (TSV/TXT).
  2. Select ID Column and Annotation Column.
  3. Annotation Processing Settings:
    • Enable Annotation Split: If a cell contains multiple annotations (e.g., GO:001|GO:002), check this and set the separator (e.g., |).
    • Min Gene Count: Filter out pathways with too few genes.
    • Invalid Values: Exclude rows containing specific placeholder strings (e.g., nan, None).
  4. Set the output directory and click "Generate GMT File".

Output Results

  • .gmt File: Standard format; can be used directly by the Enrichment Analysis module or by other GSEA software.
    • Format (tab-separated fields): PathwayName NA Gene1 Gene2 Gene3 ..., where NA fills the description field.
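
The conversion described in this section can be sketched in a few lines of standard-library Python. This is an approximation for illustration, not the generator's source: the function name annotation_to_gmt and its parameters are hypothetical, and invalid values are treated here as whole-cell placeholders that are skipped.

```python
import csv
import io
from collections import defaultdict


def annotation_to_gmt(tsv_text, id_col, ann_col, sep="|", min_genes=1,
                      invalid=("nan", "None", "")):
    """Convert a tab-separated annotation table into GMT lines.

    Hypothetical helper mirroring the settings above: annotation split,
    minimum gene count, and invalid-value filtering.
    """
    gene_sets = defaultdict(list)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        gene = row[id_col].strip()
        terms = row[ann_col].split(sep) if sep else [row[ann_col]]
        for term in (t.strip() for t in terms):
            if gene in invalid or term in invalid:
                continue
            if gene not in gene_sets[term]:
                gene_sets[term].append(gene)
    # One line per gene set: name, "NA" as description, then the genes.
    return [
        "\t".join([term, "NA", *genes])
        for term, genes in gene_sets.items()
        if len(genes) >= min_genes
    ]


table = (
    "GeneID\tGO\n"
    "GeneA\tGO:001|GO:002\n"
    "GeneB\tGO:001\n"
    "GeneC\tnan\n"
)
gmt_out = annotation_to_gmt(table, "GeneID", "GO", min_genes=2)
print(gmt_out)
```

With min_genes=2, GO:002 is dropped because only GeneA carries it, and the nan row contributes nothing.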

License

This project is licensed under the BSD License.
