Single sample pathway analysis tools for omics data
sspa
Single sample pathway analysis toolkit
sspa provides a Python interface for metabolomics pathway analysis. In addition to the conventional methods, over-representation analysis (ORA) and gene/metabolite set enrichment analysis (GSEA), it provides a wide range of single-sample pathway analysis (ssPA) methods.
Features
- Over-representation analysis
- Metabolite set enrichment analysis (based on GSEA)
- Single-sample pathway analysis
- Compound identifier conversion
- Pathway database download (KEGG, Reactome, and MetExplore metabolic networks)
Although this package is designed to provide a user-friendly interface for metabolomics pathway analysis, the methods are also applicable to other datatypes such as normalised RNA-seq data.
Documentation and tutorials
A full walkthrough notebook is available on Google Colab.
Documentation is available on our Read the Docs page.
Quickstart
pip install sspa
Load Reactome pathways
import sspa
reactome_pathways = sspa.process_reactome(organism="Homo sapiens")
Load some example metabolomics data in the form of a pandas DataFrame:
covid_data_processed = sspa.load_example_data(omicstype="metabolomics", processed=True)
Generate pathway scores using the kPCA method
kpca_scores = sspa.sspa_kpca(covid_data_processed, reactome_pathways)
Loading pathways
# Pre-loaded pathways
# Reactome v78
reactome_pathways = sspa.process_reactome(organism="Homo sapiens")
# KEGG v98
kegg_human_pathways = sspa.process_kegg(organism="hsa")
Load a custom GMT file (extension .gmt or .csv)
custom_pathways = sspa.process_gmt("wikipathways-20220310-gmt-Homo_sapiens.gmt")
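For orientation, a GMT file is tab-delimited text in which each row holds a pathway identifier, a description, and then the member identifiers. The sketch below parses that layout using only the standard library. The function `parse_gmt` and the two example rows are hypothetical illustrations and not part of sspa; `sspa.process_gmt` may return a different structure.

```python
import csv
import io


def parse_gmt(handle):
    """Parse GMT rows into {pathway_id: (description, [members])}."""
    pathways = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        pathway_id, description, *members = row
        # drop empty trailing cells left by padded rows
        pathways[pathway_id] = (description, [m for m in members if m])
    return pathways


# Hypothetical two-pathway file, held in memory instead of on disk
example = io.StringIO(
    "PW1\tExample pathway one\tCHEBI:1\tCHEBI:2\tCHEBI:3\n"
    "PW2\tExample pathway two\tCHEBI:2\tCHEBI:4\t\n"
)
gmt = parse_gmt(example)
print(gmt["PW2"])
```

Checking a custom file this way before loading it can help confirm that its identifiers use the same namespace as the columns of your data matrix.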
Download latest version of pathways
# download KEGG latest
kegg_mouse_latest = sspa.process_kegg("mmu", download_latest=True, filepath=".")
# download Reactome latest
reactome_mouse_latest = sspa.process_reactome("Mus musculus", download_latest=True, filepath=".")
Identifier harmonization
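# processed_data: a pandas DataFrame of processed abundances (samples as rows, compounds as columns)
compound_names = processed_data.columns.tolist()
# download the conversion table
conversion_table = sspa.identifier_conversion(input_type="name", compound_list=compound_names)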
# map the identifiers to your dataset
processed_data_mapped = sspa.map_identifiers(conversion_table, output_id_type="ChEBI", matrix=processed_data)
Conventional pathway analysis
ORA
ora = sspa.sspa_ora(processed_data_mapped, covid_data["Group"], reactome_pathways, 0.05, DA_testtype='ttest', custom_background=None)
# perform ORA
ora_res = ora.over_representation_analysis()
# get differential abundance test results (this attribute was named ora.ttest_res before v0.2.1)
ora.DA_test_res
# obtain list of differential molecules input to ORA
ora.DA_test_res
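As a rough illustration of the statistic behind ORA, the sketch below computes a right-tailed hypergeometric p-value for one pathway, which is equivalent to a one-sided Fisher's exact test. It is a minimal standalone example and not the sspa implementation; the function `ora_pvalue` and the compound names are hypothetical.

```python
from math import comb


def ora_pvalue(background, pathway, hits):
    """P(overlap >= observed) when drawing len(hits) compounds from the background."""
    background = set(background)
    pathway = set(pathway) & background  # only count members that were measured
    hits = set(hits) & background
    n_background, n_pathway, n_hits = len(background), len(pathway), len(hits)
    overlap = len(pathway & hits)
    total = comb(n_background, n_hits)
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_hits - i)
        for i in range(overlap, min(n_pathway, n_hits) + 1)
    )
    return tail / total


# Hypothetical example: 20 measured compounds, a 5-compound pathway,
# and 4 differential compounds of which 3 fall in the pathway
background = [f"C{i}" for i in range(20)]
pathway = ["C0", "C1", "C2", "C3", "C4"]
hits = ["C0", "C1", "C2", "C10"]
p_value = ora_pvalue(background, pathway, hits)
print(round(p_value, 4))
```

The choice of background matters here: restricting it to the compounds actually measured, as this sketch does, generally gives more conservative p-values than using every compound in the pathway database.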
GSEA
sspa.sspa_gsea(processed_data_mapped, covid_data['Group'], reactome_pathways)
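To give a feel for what GSEA measures, the sketch below computes the weighted running-sum enrichment score for one set over a ranked list. It is a simplified standalone illustration: it omits the permutation-based p-values and score normalisation that a full GSEA backend performs, and the function `enrichment_score` and the example data are hypothetical rather than part of sspa or gseapy.

```python
def enrichment_score(ranked, member_set, weight=1.0):
    """Running-sum enrichment score.

    ranked: list of (identifier, score) pairs sorted by score, descending.
    member_set: identifiers belonging to the pathway.
    """
    hit_weights = [abs(s) ** weight if m in member_set else 0.0 for m, s in ranked]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for m, _ in ranked if m not in member_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running, best = 0.0, 0.0
    for (identifier, _), w in zip(ranked, hit_weights):
        if identifier in member_set:
            running += w / total_hit  # step up, scaled by the ranking score
        else:
            running -= 1.0 / n_miss  # step down by a constant amount
        if abs(running) > abs(best):
            best = running  # keep the largest deviation from zero
    return best


# Hypothetical ranking of six compounds by a test statistic
ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0), ("F", -3.0)]
es_top = enrichment_score(ranked, {"A", "B"})
es_bottom = enrichment_score(ranked, {"E", "F"})
print(es_top, es_bottom)
```

A set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom yields a negative score.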
Single sample pathway analysis methods
# ssclustPA
ssclustpa_res = sspa.sspa_ssClustPA(processed_data_mapped, reactome_pathways)
# kPCA
kpca_scores = sspa.sspa_kpca(processed_data_mapped, reactome_pathways)
# z-score
zscore_res = sspa.sspa_zscore(processed_data_mapped, reactome_pathways)
# SVD (PLAGE)
svd_res = sspa.sspa_svd(processed_data_mapped, reactome_pathways)
# ssGSEA
ssgsea_res = sspa.sspa_ssGSEA(processed_data_mapped, reactome_pathways)
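Each of these methods turns a samples-by-compounds matrix into a samples-by-pathways matrix of scores. As one concrete illustration, the sketch below implements a common form of the z-score method: standardise each compound across samples, then sum the z-scores of a pathway's members and divide by the square root of the number of members. It is a standalone sketch using plain dictionaries; `zscore_pathway_scores` and the example data are hypothetical, and `sspa.sspa_zscore` may differ in its details.

```python
from math import sqrt
from statistics import mean, stdev


def zscore_pathway_scores(data, pathways):
    """data: {compound: [value per sample]}; pathways: {name: [compounds]}."""
    # standardise each compound across samples
    z = {}
    for compound, values in data.items():
        mu, sd = mean(values), stdev(values)
        if sd == 0:
            continue  # constant compounds carry no information
        z[compound] = [(v - mu) / sd for v in values]

    n_samples = len(next(iter(data.values())))
    scores = {}
    for name, members in pathways.items():
        present = [m for m in members if m in z]
        if not present:
            continue  # no measured members, so no score
        scores[name] = [
            sum(z[m][i] for m in present) / sqrt(len(present))
            for i in range(n_samples)
        ]
    return scores


# Hypothetical data: three samples, three compounds, one pathway
data = {
    "C1": [1.0, 2.0, 3.0],
    "C2": [2.0, 4.0, 6.0],
    "C3": [5.0, 5.0, 8.0],
}
pathways = {"PW1": ["C1", "C2", "C9"]}  # C9 was not measured
scores = zscore_pathway_scores(data, pathways)
print(scores["PW1"])
```

Dividing by the square root of the member count keeps scores comparable between pathways of different sizes.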
License
GNU GPL 3.0
Citing us
If you found this package useful, please consider citing us:
ssPA package
@article{Wieder22a,
author = {Cecilia Wieder and Nathalie Poupin and Clément Frainay and Florence Vinson and Juliette Cooke and Rachel PJ Lai and Jacob G Bundy and Fabien Jourdan and Timothy MD Ebbels},
doi = {10.5281/ZENODO.6959120},
month = {8},
title = {cwieder/py-ssPA: v1.0.4},
url = {https://zenodo.org/record/6959120},
year = {2022},
}
Single-sample pathway analysis in metabolomics
@article{Wieder2022,
author = {Cecilia Wieder and Rachel P J Lai and Timothy M D Ebbels},
doi = {10.1186/s12859-022-05005-1},
issn = {1471-2105},
issue = {1},
journal = {BMC Bioinformatics},
pages = {481},
title = {Single sample pathway analysis in metabolomics: performance evaluation and application},
volume = {23},
url = {https://doi.org/10.1186/s12859-022-05005-1},
year = {2022},
}
Contributing
Read our contributor's guide to get started
News
[v0.2.1] - 05/01/23
- Removal of rpy2 dependency for improved compatibility across systems
- Use GSEApy as backend for GSEA and ssGSEA
- Minor syntax changes
  - ora.ttest_res is now ora.DA_test_res (as we can implement t-test or MWU tests)
  - sspa_fgsea() is now sspa_gsea() and uses gseapy as the backend rather than R fgsea
  - sspa_gsva() is temporarily deprecated because it requires rpy2; use the GSVA R package instead