GUI tools for gene set enrichment analysis

GSEA GUI Toolset

GSEA GUI Toolset is a Gene Set Enrichment Analysis (GSEA) and visualization toolset. It provides graphical user interfaces (GUIs) for the complete workflow, from data preparation through analysis to result presentation.

Key Features

This toolset includes three core modules:

  1. Enrichment Analysis: Supports Over-Representation Analysis (ORA) based on the hypergeometric distribution, as well as GSEA.
  2. Visualization: Supports generating Dot Plots, Bar Plots, and GSEA-specific enrichment score plots.
  3. GMT Generator: Helps users convert custom annotation tables into standard GMT format gene set files.
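
To make the ORA idea concrete, here is a minimal sketch of a hypergeometric upper-tail test written with the Python standard library only. It is not taken from the gseagui source; the function name, parameter names, and the numbers in the example are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_pvalue(overlap: int, background: int, set_size: int, query_size: int) -> float:
    """Upper-tail probability P(X >= overlap) under the hypergeometric model.

    background: number of genes in the background
    set_size:   number of background genes annotated to the pathway
    query_size: number of genes in the query list
    overlap:    number of query genes annotated to the pathway
    """
    upper = min(set_size, query_size)
    total = comb(background, query_size)
    tail = sum(
        comb(set_size, i) * comb(background - set_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 background genes, a 5-gene pathway,
# and a 4-gene query list that shares 3 genes with the pathway.
p_value = hypergeom_pvalue(overlap=3, background=20, set_size=5, query_size=4)
print(f"{p_value:.4f}")
```

A small p-value indicates that the overlap is larger than expected when the query genes are drawn at random from the background.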

Installation

Install via PyPI

pip install gseagui

Quick Start

After installation, you can launch each tool using the following commands:

# Launch the main launcher (contains entry points for all tools)
gseagui

# Launch the enrichment analysis tool separately
gseanrichment

# Launch the visualization tool separately
gseaplotter

# Launch the GMT generator separately
gmtgenerator

Detailed Usage Guide & Input File Formats

1. Enrichment Analysis

This module is used to perform gene enrichment analysis.

Input File Formats

A. Annotation File

  • Format: TSV (Tab-Separated Values) or TXT.
  • Content: Must contain at least two columns:
    • Gene/Protein ID Column: Unique identifier.
    • Annotation Column: Corresponding pathway or functional annotation (e.g., GO Term, KEGG Pathway).
  • Multiple Annotations: If a gene has multiple annotations, join them with a separator (e.g., |).
  • Example (columns are tab-separated):
GeneID Pathway
GeneA Pathway1|Pathway2
GeneB Pathway1
GeneC Pathway3
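
As an illustration of how such a table maps to gene sets, the following sketch splits each annotation cell on the separator and groups gene IDs by term. It is an assumption about the general approach, not the toolset's own code; `build_gene_sets` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict


def build_gene_sets(handle, id_col, annot_col, sep="|"):
    """Group gene IDs by annotation term from a tab-separated table."""
    gene_sets = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        gene = (row.get(id_col) or "").strip()
        # One cell may hold several annotations joined by the separator.
        for term in (row.get(annot_col) or "").split(sep):
            term = term.strip()
            if gene and term:
                gene_sets[term].add(gene)
    return dict(gene_sets)


# The example table from above, with explicit tab characters.
example = (
    "GeneID\tPathway\n"
    "GeneA\tPathway1|Pathway2\n"
    "GeneB\tPathway1\n"
    "GeneC\tPathway3\n"
)
gene_sets = build_gene_sets(io.StringIO(example), "GeneID", "Pathway")
for term in sorted(gene_sets):
    print(term, sorted(gene_sets[term]))
```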

B. Gene List File

  • Format: TSV (Tab-Separated Values) or TXT.
  • Content:
    • Gene Column (Required): Gene IDs to be analyzed.
    • Rank Column (Required for GSEA): Numerical values used for GSEA ranking (e.g., log2FoldChange, t-statistic).
    • Group Column (Optional): If you need to analyze different groups separately, you can specify a group column.
  • Example (columns are tab-separated):
GeneID Log2FC Cluster
GeneA 2.5 Group1
GeneB -1.2 Group1
GeneC 0.5 Group2
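
The sketch below shows one way a gene list like this could be turned into per-group ranked lists for GSEA. It is illustrative only: `ranked_lists_by_group` is a hypothetical helper, and the fallback group label "all" is an assumption.

```python
import csv
import io
from collections import defaultdict


def ranked_lists_by_group(handle, gene_col, rank_col, group_col=None):
    """Return {group: [(gene, score), ...]} sorted by descending score."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        # Without a group column, all genes go into a single list.
        key = row[group_col] if group_col else "all"
        groups[key].append((row[gene_col], float(row[rank_col])))
    return {
        group: sorted(items, key=lambda item: item[1], reverse=True)
        for group, items in groups.items()
    }


# The example table from above, with explicit tab characters.
example = (
    "GeneID\tLog2FC\tCluster\n"
    "GeneA\t2.5\tGroup1\n"
    "GeneB\t-1.2\tGroup1\n"
    "GeneC\t0.5\tGroup2\n"
)
ranked = ranked_lists_by_group(io.StringIO(example), "GeneID", "Log2FC", "Cluster")
print(ranked)
```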

C. GMT File

  • Standard GMT format file, usually generated by the GMT Generator or downloaded from MSigDB.
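
For reference, a GMT file stores one gene set per line as tab-separated fields: the set name, a description, and then the member genes. A minimal reader might look like the sketch below; `parse_gmt` is a hypothetical helper and not part of gseagui.

```python
def parse_gmt(lines):
    """Parse GMT lines into {set_name: [gene, ...]}."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


gmt_text = [
    "Pathway1\tNA\tGeneA\tGeneB\n",
    "Pathway2\tNA\tGeneA\n",
    "\n",
]
parsed = parse_gmt(gmt_text)
print(parsed)
```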

Steps

  1. In the "Annotation Processing" tab, load the Annotation File and select the corresponding ID column and annotation column.
  2. Click the "Create Gene Sets" button to build the background library for analysis.
  3. Switch to the "Enrichment Analysis" tab.
  4. Select Input Method:
    • From File: Load the gene list file, select the gene column, rank column (if doing GSEA), and group column (optional).
    • Direct Input: Paste the gene list into the text box.
  5. Select Analysis Method: Hypergeometric or GSEA.
    • For GSEA, you can adjust the minimum/maximum gene set size.
  6. Set the output directory and prefix, then click "Run Enrichment Analysis".

Output Results

  • TSV File: Contains detailed statistical results of the enrichment analysis (P-value, FDR, NES, etc.).
  • Pickle (.pkl) File: (Optional) Saves the complete analysis object for subsequent plotting in the visualization tool.
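
The FDR column is an adjusted p-value. This document does not state which correction the toolset applies; the sketch below assumes the widely used Benjamini-Hochberg procedure, purely to show how such a column can be derived from raw p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is
    # capped by the adjusted values of all larger p-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four pathways.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(value, 4) for value in fdr])
```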

2. Visualization

This module is used to visualize the results of enrichment analysis.

Input File Formats

A. TSV Result File

  • Usually generated by the enrichment analysis module; a TSV that follows a general enrichment analysis result format can also be used.
  • Required Columns (automatically identified, can also be manually specified):
    • Term/Pathway Name
    • P-value or Adjusted P-value (used for color mapping)
    • Count/Size (used for dot size)
    • NES (used for GSEA result display)

B. Pickle (.pkl) File

  • GSEA result object file generated by the enrichment analysis module.

Features and Chart Types

1. Dot Plot

  • Usage: Display enriched pathways and their significance.
  • Parameters:
    • Column: Numerical column mapped to dot size or color.
    • X/Group: X-axis grouping.
    • Hue: Color mapping column.
    • Threshold: Filter out non-significant pathways.
    • Style: Adjustable dot size scaling, shape, color map (Colormap).
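
As a rough sketch of what the Threshold and color/size mappings involve, the code below filters result rows by p-value and derives a -log10(p) value that a plot could map to color. The column names ("Term", "P-value", "Count") and the helper name are assumptions, not the toolset's actual column detection logic.

```python
import math


def prepare_dot_plot_rows(rows, p_col="P-value", size_col="Count",
                          threshold=0.05, top_n=10):
    """Keep significant rows, most significant first, with plot-ready values."""
    kept = [row for row in rows if float(row[p_col]) <= threshold]
    kept.sort(key=lambda row: float(row[p_col]))
    prepared = []
    for row in kept[:top_n]:
        # Floor the p-value so that p == 0 does not break the logarithm.
        p_value = max(float(row[p_col]), 1e-300)
        prepared.append({
            "term": row["Term"],
            "neg_log10_p": -math.log10(p_value),
            "size": int(row[size_col]),
        })
    return prepared


# Hypothetical result rows, as they might be read from a TSV file.
rows = [
    {"Term": "Pathway1", "P-value": "0.01", "Count": "12"},
    {"Term": "Pathway2", "P-value": "0.2", "Count": "5"},
    {"Term": "Pathway3", "P-value": "0.001", "Count": "8"},
]
dots = prepare_dot_plot_rows(rows)
for dot in dots:
    print(dot)
```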

2. Bar Plot

  • Usage: Display the top pathways in a simple bar chart.
  • Parameters: Supports custom colors, legend position, etc.

3. GSEA Plot

  • Usage: Only available after loading a PKL file.
  • Function: Draw classic GSEA Running Enrichment Score plots.
  • Interaction: Select one or multiple Terms from the list to plot. Supports showing/hiding Ranking metric, adjusting legend position and font size.
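
For readers unfamiliar with the curve being drawn, the sketch below computes a running enrichment score in the classic weighted form: walking down the ranked list, the score rises at genes in the set (weighted by the absolute ranking metric) and falls at genes outside it. This is a simplified illustration, not the code the toolset uses; the function name and the example data are assumptions.

```python
def running_enrichment_score(ranked, gene_set, weight=1.0):
    """Return (curve, es) for a ranked list of (gene, score) pairs.

    ranked must already be sorted by descending score.
    """
    hits = [gene in gene_set for gene, _ in ranked]
    n_hits = sum(hits)
    n_miss = len(ranked) - n_hits
    hit_total = sum(
        abs(score) ** weight for (_, score), is_hit in zip(ranked, hits) if is_hit
    )
    if n_hits == 0 or n_miss == 0 or hit_total == 0:
        raise ValueError("gene set must partially overlap the ranked list")

    running, curve = 0.0, []
    for (_, score), is_hit in zip(ranked, hits):
        if is_hit:
            running += abs(score) ** weight / hit_total  # step up at a hit
        else:
            running -= 1.0 / n_miss  # step down at a miss
        curve.append(running)
    # The enrichment score is the largest deviation from zero.
    es = max(curve, key=abs)
    return curve, es


# Hypothetical ranked list and gene set.
ranked_genes = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0)]
curve, es = running_enrichment_score(ranked_genes, {"A", "B"})
print([round(value, 3) for value in curve], round(es, 3))
```

Plotting `curve` against the rank position gives the familiar running enrichment score line.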

3. GMT Generator

This module is used to create custom GMT files.

Input File Formats

  • Same as the Annotation File in Enrichment Analysis.

Steps

  1. Load the annotation file (TSV/TXT).
  2. Select ID Column and Annotation Column.
  3. Annotation Processing Settings:
    • Enable Annotation Split: If a cell contains multiple annotations (e.g., GO:001|GO:002), check this and set the separator (e.g., |).
    • Min Gene Count: Filter out pathways with too few genes.
    • Invalid Values: Exclude rows containing specific invalid values (e.g., nan, None).
  4. Set the output directory and click "Generate GMT File".

Output Results

  • .gmt File: Standard format that can be used directly for GSEA in this toolset or in the GSEA software.
    • Format (tab-separated): PathwayName NA Gene1 Gene2 Gene3 ...
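
The conversion described in the steps above can be sketched as follows. This is an approximation under stated assumptions: `to_gmt_lines` is a hypothetical helper, invalid values are matched exactly rather than by substring, and output is sorted by pathway name for reproducibility.

```python
def to_gmt_lines(pairs, sep="|", min_genes=1, invalid=("nan", "None", "")):
    """Convert (gene, annotation) pairs into tab-separated GMT lines."""
    gene_sets = {}
    for gene, annotation in pairs:
        if gene in invalid or annotation in invalid:
            continue  # skip rows with invalid values
        for term in annotation.split(sep):
            term = term.strip()
            if term and term not in invalid:
                gene_sets.setdefault(term, []).append(gene)

    lines = []
    for term in sorted(gene_sets):
        genes = list(dict.fromkeys(gene_sets[term]))  # de-duplicate, keep order
        if len(genes) >= min_genes:
            # "NA" fills the description field, as in the format shown above.
            lines.append("\t".join([term, "NA", *genes]))
    return lines


# Hypothetical annotation rows.
pairs = [
    ("GeneA", "Pathway1|Pathway2"),
    ("GeneB", "Pathway1"),
    ("GeneC", "nan"),
    ("GeneD", "Pathway3"),
]
gmt_lines = to_gmt_lines(pairs, min_genes=2)
print("\n".join(gmt_lines))
```

Raising `min_genes` drops small pathways, which mirrors the Min Gene Count setting.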

License

This project is licensed under the BSD License.
