Perform permutation-based pathway enrichment analysis
Permutation-based pathway enrichment analysis
Python tools for permutation-based pathway enrichment analysis. Currently, only KEGG pathways are supported.
Usage
from pathwayenrichment.representation import ClusterPermutator
from pathwayenrichment.databaseparser import KEGGPathwayParser
from pathwayenrichment.utils import randomPartition
First, let's download the KEGG database for Dokdonia, a marine bacterium. To this end, we employ KEGG's entry code for Dokdonia (dok). We will then parse the database to obtain a list of genes and associated cellular pathways and systems.
KEGGparser = KEGGPathwayParser.fromKEGGidentifier('dok', only_curated_pathways=True)
gene_pathways, gene_systems = KEGGparser.getGenePathways()
system_pathways = KEGGparser.getSystemPathways()
gene_info = KEGGparser.getGeneInfoFromKEGGorthology()
gene_list = list(gene_pathways.keys())
print(f'There are a total of {len(gene_list)} genes')
There are a total of 786 genes
Now, we simulate a set of gene clusters to perform a pathway enrichment analysis on them. To this end, we will randomly partition the set of genes into clusters.
genes_under_study = gene_list[:300]
clusters = dict(zip(
['A', 'B', 'C', 'D'],
randomPartition(genes_under_study, bin_sizes=[75, 25, 150, 50])
))
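For readers who want to see what a random partition into fixed-size clusters involves, here is a minimal sketch using only the standard library. The function `random_partition_sketch` and the toy gene names are hypothetical and are not part of pathwayenrichment; the package's own `randomPartition` may behave differently.

```python
import random

def random_partition_sketch(items, bin_sizes, seed=None):
    """Randomly split `items` into disjoint bins of the given sizes.

    Hypothetical helper for illustration only; not the package's randomPartition.
    """
    if sum(bin_sizes) > len(items):
        raise ValueError("bin sizes exceed the number of items")
    rng = random.Random(seed)
    # Draw without replacement, then slice the draw into consecutive bins
    drawn = rng.sample(list(items), sum(bin_sizes))
    bins, start = [], 0
    for size in bin_sizes:
        bins.append(drawn[start:start + size])
        start += size
    return bins

# Toy gene identifiers standing in for real KEGG gene entries
toy_genes = [f"gene_{i}" for i in range(300)]
toy_clusters = dict(zip(
    ['A', 'B', 'C', 'D'],
    random_partition_sketch(toy_genes, [75, 25, 150, 50], seed=0)
))
```

Each gene ends up in at most one cluster, and the cluster sizes match the requested bin sizes.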
Now we are ready to instantiate a ClusterPermutator to run the enrichment analysis. We will permute the total set of genes to form new random clusters 10,000 times; this is the sample size used to estimate the sample p-value.
permutator = ClusterPermutator(clusters, gene_pathways, system_pathways)
res = permutator.sampleClusterPermutationSpace(sample_size=10000, n_processes=4)
Finished permutation sampling
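The idea behind a sample p-value can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the package's implementation: the function `sample_p_value_sketch` is hypothetical, it draws one random cluster at a time instead of permuting all clusters together, and it uses the number of pathway genes in the cluster as the test statistic, which may differ from what ClusterPermutator computes.

```python
import random

def sample_p_value_sketch(cluster, all_genes, pathway_genes,
                          sample_size=1000, seed=None):
    """Fraction of random clusters at least as rich in pathway genes
    as the observed cluster (hypothetical helper, illustration only)."""
    rng = random.Random(seed)
    pathway = set(pathway_genes)
    all_genes = list(all_genes)
    # Observed statistic: pathway genes present in the cluster
    observed = len(pathway & set(cluster))
    hits = 0
    for _ in range(sample_size):
        # Random cluster of the same size, drawn without replacement
        random_cluster = rng.sample(all_genes, len(cluster))
        if len(pathway.intersection(random_cluster)) >= observed:
            hits += 1
    return hits / sample_size

# Toy data: 100 genes, a 10-gene pathway, and two clusters of 20 genes
toy_genes = [f"g{i}" for i in range(100)]
toy_pathway = toy_genes[:10]
enriched_cluster = toy_genes[:8] + toy_genes[50:62]  # holds 8 pathway genes
depleted_cluster = toy_genes[50:70]                  # holds no pathway genes

p_enriched = sample_p_value_sketch(
    enriched_cluster, toy_genes, toy_pathway, sample_size=2000, seed=1)
p_depleted = sample_p_value_sketch(
    depleted_cluster, toy_genes, toy_pathway, sample_size=2000, seed=1)
```

A cluster that holds most of the pathway's genes gets a small sample p-value, while a cluster with none of them gets a value of 1, because every random cluster is at least as rich. A larger sample size gives a finer resolution for the estimate.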
# Here are the first 10 pathways with lowest sample p-value
{k:v for k,v in list(res['pathway']['A'].items())[:10]}
{'03018 RNA degradation [PATH:dok03018]': (0.2777777777777778, 0.0484),
'00020 Citrate cycle (TCA cycle) [PATH:dok00020]': (0.18181818181818182,
0.0691),
'02020 Two-component system [PATH:dok02020]': (0.2, 0.1527),
'00541 O-Antigen nucleotide sugar biosynthesis [PATH:dok00541]': (0.19047619047619047,
0.1641),
'03060 Protein export [PATH:dok03060]': (0.2, 0.1683),
'02024 Quorum sensing [PATH:dok02024]': (0.14814814814814814, 0.218),
'00520 Amino sugar and nucleotide sugar metabolism [PATH:dok00520]': (0.14285714285714285,
0.2211),
'02010 ABC transporters [PATH:dok02010]': (0.15, 0.2422),
'00040 Pentose and glucuronate interconversions [PATH:dok00040]': (0.3333333333333333,
0.25),
'00053 Ascorbate and aldarate metabolism [PATH:dok00053]': (0.2, 0.25)}
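These are the 10 pathways with the lowest sample p-values within cluster A. In each tuple, the second value is the sample p-value; the first value appears to be the pathway's representation in the cluster.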
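To keep only the pathways below a chosen p-value threshold, a small filter over the result dictionary is enough. The sketch below assumes the structure shown in the output for cluster A, that is, a dictionary mapping each pathway name to a tuple whose second element is the sample p-value. The helper `filter_by_p_value` and the 0.05 threshold are illustrative choices and are not part of pathwayenrichment.

```python
def filter_by_p_value(cluster_results, threshold=0.05):
    """Keep entries whose sample p-value (second tuple element) is below
    `threshold`. Hypothetical helper, not part of the package."""
    return {
        name: (value, p_value)
        for name, (value, p_value) in cluster_results.items()
        if p_value < threshold
    }

# Three entries copied from the output for cluster A
example_results = {
    '03018 RNA degradation [PATH:dok03018]': (0.2777777777777778, 0.0484),
    '00020 Citrate cycle (TCA cycle) [PATH:dok00020]': (0.18181818181818182, 0.0691),
    '02020 Two-component system [PATH:dok02020]': (0.2, 0.1527),
}
significant = filter_by_p_value(example_results)
```

With the real results, the same call would be `filter_by_p_value(res['pathway']['A'])`. Because the clusters in this example are random, an occasional p-value below 0.05 is expected by chance alone.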