GUI tools for gene set enrichment analysis
GSEA GUI Toolset
GSEA GUI Toolset is a set of graphical user interfaces (GUIs) for Gene Set Enrichment Analysis (GSEA) and visualization. It covers the complete workflow from data preparation through analysis to result presentation.
Key Features
This toolset includes three core modules:
- Enrichment Analysis: Supports Over-Representation Analysis (ORA) based on the hypergeometric distribution, as well as GSEA.
- Visualization: Supports generating Dot Plots, Bar Plots, and GSEA-specific enrichment score plots.
- GMT Generator: Helps users convert custom annotation tables into standard GMT format gene set files.
Installation
Install via PyPI
pip install gseagui
Quick Start
After installation, you can launch each tool using the following commands:
# Launch the main launcher (contains entry points for all tools)
gseagui
# Launch the enrichment analysis tool separately
gseanrichment
# Launch the visualization tool separately
gseaplotter
# Launch the GMT generator separately
gmtgenerator
Detailed Usage Guide & Input File Formats
1. Enrichment Analysis
This module is used to perform gene enrichment analysis.
Input File Formats
A. Annotation File
- Format: TSV (Tab-Separated Values) or TXT.
- Content: Must contain at least two columns:
- Gene/Protein ID Column: Unique identifier.
- Annotation Column: Corresponding pathway or functional annotation (e.g., GO Term, KEGG Pathway).
- Multiple Annotations: If a gene corresponds to multiple annotations, use a separator (e.g., |) to separate them.
- Example:
| GeneID | Pathway |
|---|---|
| GeneA | Pathway1\|Pathway2 |
| GeneB | Pathway1 |
| GeneC | Pathway3 |
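As a rough illustration of how such a table maps onto gene sets, the sketch below splits each annotation cell on the separator and collects gene IDs per term. It is not the toolset's own code: the function name `build_gene_sets` and the in-memory sample data are hypothetical, and only the Python standard library is used.

```python
import csv
import io
from collections import defaultdict

# Hypothetical in-memory copy of the example annotation table (tab-separated).
ANNOTATION_TSV = (
    "GeneID\tPathway\n"
    "GeneA\tPathway1|Pathway2\n"
    "GeneB\tPathway1\n"
    "GeneC\tPathway3\n"
)


def build_gene_sets(handle, id_col, annot_col, sep="|"):
    """Map each annotation term to the set of gene IDs that carry it."""
    gene_sets = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        gene = row[id_col].strip()
        # A cell may hold several annotations joined by the separator.
        for term in row[annot_col].split(sep):
            term = term.strip()
            if gene and term:
                gene_sets[term].add(gene)
    return dict(gene_sets)


gene_sets = build_gene_sets(io.StringIO(ANNOTATION_TSV), "GeneID", "Pathway")
for term in sorted(gene_sets):
    print(term, sorted(gene_sets[term]))
```

With a real file, the `io.StringIO` object would be replaced by an opened file handle.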
B. Gene List File
- Format: TSV (Tab-Separated Values) or TXT.
- Content:
- Gene Column (Required): Gene IDs to be analyzed.
- Rank Column (Required for GSEA): Numerical values used for GSEA ranking (e.g., log2FoldChange, t-statistic).
- Group Column (Optional): If you need to analyze different groups separately, you can specify a group column.
- Example:
| GeneID | Log2FC | Cluster |
|---|---|---|
| GeneA | 2.5 | Group1 |
| GeneB | -1.2 | Group1 |
| GeneC | 0.5 | Group2 |
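The following sketch shows one way a gene list of this shape could be turned into per-group ranked lists for GSEA. The function name `ranked_lists_by_group` and the fallback group label `"all"` are assumptions made for illustration, not part of the toolset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical in-memory copy of the example gene list (tab-separated).
GENE_LIST_TSV = (
    "GeneID\tLog2FC\tCluster\n"
    "GeneA\t2.5\tGroup1\n"
    "GeneB\t-1.2\tGroup1\n"
    "GeneC\t0.5\tGroup2\n"
)


def ranked_lists_by_group(handle, gene_col, rank_col, group_col=None):
    """Return {group: [(gene, score), ...]} sorted by descending score."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        # Without a group column, every gene falls into a single group.
        group = row[group_col] if group_col else "all"
        groups[group].append((row[gene_col], float(row[rank_col])))
    return {
        group: sorted(items, key=lambda item: item[1], reverse=True)
        for group, items in groups.items()
    }


ranked = ranked_lists_by_group(
    io.StringIO(GENE_LIST_TSV), "GeneID", "Log2FC", "Cluster"
)
print(ranked)
```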
C. GMT File
- Standard GMT format file, usually generated by the GMT Generator or downloaded from MSigDB.
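For readers who want to inspect a GMT file outside the GUI, a minimal reader might look like the sketch below. It assumes the usual GMT layout (set name, description, then gene IDs, all tab-separated); `read_gmt` is a hypothetical helper name and the sample text is made up.

```python
import io

# Hypothetical two-line GMT content: name, description, then gene IDs.
GMT_TEXT = "Pathway1\tNA\tGeneA\tGeneB\nPathway2\tNA\tGeneA\n"


def read_gmt(handle):
    """Parse GMT lines into {set_name: [gene, ...]}."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and sets without any gene
        name, _description, *genes = fields
        gene_sets[name] = [gene for gene in genes if gene]
    return gene_sets


parsed_gmt = read_gmt(io.StringIO(GMT_TEXT))
print(parsed_gmt)
```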
Steps
- In the "Annotation Processing" tab, load the Annotation File and select the corresponding ID column and annotation column.
- Click the "Create Gene Sets" button to build the background library for analysis.
- Switch to the "Enrichment Analysis" tab.
- Select Input Method:
- From File: Load the gene list file, select the gene column, rank column (if doing GSEA), and group column (optional).
- Direct Input: Paste the gene list into the text box.
- Select Analysis Method: Hypergeometric or GSEA.
- For GSEA, you can adjust the minimum/maximum gene set size.
- Set the output directory and prefix, then click "Run Enrichment Analysis".
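To make the Hypergeometric option more concrete, the sketch below computes the usual over-representation p-value: the probability of seeing at least `k` genes from a gene set of size `K` when drawing `n` query genes from a background of `N` genes. This is the textbook upper-tail formula written with the standard library; how the toolset itself computes the value is not shown in this document, and the function name and numbers are illustrative.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail probability P(X >= k) of the hypergeometric distribution.

    k: query genes that fall in the gene set
    n: total query genes
    K: gene set size within the background
    N: background size
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Illustrative numbers: 2 of 4 query genes hit a 5-gene set, background of 20.
p_value = hypergeom_pvalue(k=2, n=4, K=5, N=20)
print(round(p_value, 4))
```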
Output Results
- TSV File: Contains detailed statistical results of the enrichment analysis (P-value, FDR, NES, etc.).
- Pickle (.pkl) File: (Optional) Saves the complete analysis object for subsequent plotting in the visualization tool.
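The FDR column is typically a multiple-testing adjustment of the raw p-values. The document does not say which procedure the toolset applies; the sketch below assumes the widely used Benjamini-Hochberg method, purely as an illustration of what such an adjustment does.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(fdr)
```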
2. Visualization
This module is used to visualize the results of enrichment analysis.
Input File Formats
A. TSV Result File
- Usually generated by the enrichment analysis module, or follows a general enrichment analysis result format.
- Required Columns (automatically identified, can also be manually specified):
- Term/Pathway Name
- P-value or Adjusted P-value (used for color mapping)
- Count/Size (used for dot size)
- NES (used for GSEA result display)
B. Pickle (.pkl) File
- GSEA result object file generated by the enrichment analysis module.
Features and Chart Types
1. Dot Plot
- Usage: Display enriched pathways and their significance.
- Parameters:
- Column: Numerical column mapped to dot size or color.
- X/Group: X-axis grouping.
- Hue: Color mapping column.
- Threshold: Filter out non-significant pathways.
- Style: Adjustable dot size scaling, shape, and colormap.
2. Bar Plot
- Usage: Display the top pathways.
- Parameters: Supports custom colors, legend position, etc.
3. GSEA Plot
- Usage: Only available after loading a PKL file.
- Function: Draw classic GSEA Running Enrichment Score plots.
- Interaction: Select one or more terms from the list to plot. Supports showing or hiding the ranking metric and adjusting the legend position and font size.
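For context, the curve in a classic running enrichment score plot can be computed as a weighted running sum over the ranked gene list: it steps up at genes in the set (in proportion to their ranking metric) and steps down otherwise, and the enrichment score is the largest deviation from zero. The sketch below follows that standard definition; it is not the plotting code of this toolset, and the function name, the `weight` parameter, and the sample data are illustrative.

```python
def running_enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return (curve, es) for genes already sorted by descending score."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_weights = [
        abs(score) ** weight if hit else 0.0
        for score, hit in zip(scores, hits)
    ]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running, curve = 0.0, []
    for hit, w in zip(hits, hit_weights):
        # Step up at set members, step down at all other genes.
        running += w / total_hit_weight if hit else -1.0 / n_miss
        curve.append(running)
    es = max(curve, key=abs)  # largest deviation from zero, sign kept
    return curve, es


genes = ["A", "B", "C", "D", "E"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0]
curve, es = running_enrichment_score(genes, metric, {"A", "B"})
print(curve, es)
```

Because both hits sit at the top of the ranking in this toy input, the curve peaks right after the second gene and then decays back to zero.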
3. GMT Generator
This module is used to create custom GMT files.
Input File Formats
- Same as the Annotation File in Enrichment Analysis.
Steps
- Load the annotation file (TSV/TXT).
- Select ID Column and Annotation Column.
- Annotation Processing Settings:
- Enable Annotation Split: If a cell contains multiple annotations (e.g., GO:001|GO:002), check this and set the separator (e.g., |).
- Min Gene Count: Filter out pathways with too few genes.
- Invalid Values: Exclude rows containing specific characters (e.g., nan, None).
- Set the output directory and click "Generate GMT File".
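The settings above can be pictured with a small sketch that splits annotation cells, drops invalid values, applies a minimum gene count, and emits tab-separated GMT lines with NA as the description field. It is an approximation under stated assumptions: `make_gmt_lines` is a hypothetical name, and invalid values are matched against whole cells here, which may differ from how the GUI matches them.

```python
from collections import defaultdict

# Assumed set of cell values treated as invalid (whole-cell match).
INVALID = frozenset({"", "nan", "None"})


def make_gmt_lines(rows, sep="|", min_genes=1, invalid=INVALID):
    """Build GMT lines from (gene_id, annotation_cell) pairs."""
    gene_sets = defaultdict(list)
    for gene, cell in rows:
        if gene in invalid or cell in invalid:
            continue  # skip rows with an invalid ID or annotation
        for term in cell.split(sep):
            term = term.strip()
            if term and term not in invalid and gene not in gene_sets[term]:
                gene_sets[term].append(gene)
    # One line per term: name, description placeholder, then the genes.
    return [
        "\t".join([term, "NA", *genes])
        for term, genes in gene_sets.items()
        if len(genes) >= min_genes
    ]


rows = [("GeneA", "GO:001|GO:002"), ("GeneB", "GO:001"), ("GeneC", "nan")]
gmt_lines = make_gmt_lines(rows, min_genes=2)
print(gmt_lines)
```

Here GO:002 is dropped because it has only one gene, and the GeneC row is skipped because its annotation is an invalid value.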
Output Results
- .gmt File: Standard format that can be used directly for GSEA in this toolset or in the GSEA software.
- Format (tab-separated fields): PathwayName NA Gene1 Gene2 Gene3 ...
License
This project is licensed under the BSD License.
File details
Details for the file gseagui-0.1.5.tar.gz.
File metadata
- Download URL: gseagui-0.1.5.tar.gz
- Upload date:
- Size: 29.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d0d7c42b5105696a5fbd7c355ef6da3d6c826ad970f3d2f7b9bab82b4c326723 |
| MD5 | b829e4ba68fac76fdc8f5fa48ee78234 |
| BLAKE2b-256 | 58ece464b1fc3ab4e38a21718c71318db73d0a1e819a21b30609c1475b925d13 |
File details
Details for the file gseagui-0.1.5-py3-none-any.whl.
File metadata
- Download URL: gseagui-0.1.5-py3-none-any.whl
- Upload date:
- Size: 30.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ab80c98496d1aa3723fef628df95986bcf65b73dbe679c9bbaf878d7a5d8f8c9 |
| MD5 | 26af227f237048fcfbe9e448b0bcfba8 |
| BLAKE2b-256 | 9967bf722cc9c026b537b3dc0f0cdfe2a111fbfebbc95c6194ece73b10f5e527 |