Perform permutation-based pathway enrichment analysis

Permutation-based pathway enrichment analysis

Python tools for permutation-based pathway enrichment analysis. Currently, only KEGG pathways are supported.

Usage

from pathwayenrichment.representation import ClusterPermutator
from pathwayenrichment.databaseparser import KEGGPathwayParser
from pathwayenrichment.utils import randomPartition

First, let's download the KEGG database for Dokdonia, a marine bacterium, using its KEGG entry code (dok). We then parse the database to obtain the list of genes and their associated cellular pathways and systems.

KEGGparser = KEGGPathwayParser.fromKEGGidentifier('dok', only_curated_pathways=True)
gene_pathways, gene_systems = KEGGparser.getGenePathways()
system_pathways = KEGGparser.getSystemPathways()
gene_info = KEGGparser.getGeneInfoFromKEGGorthology()
gene_list = list(gene_pathways.keys())
print(f'There are a total of {len(gene_list)} genes')
There are a total of 786 genes

Now, we simulate a set of gene clusters to perform a pathway enrichment analysis on them. To this end, we will randomly partition the set of genes into clusters.

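# Restrict the analysis to the first 300 genes
genes_under_study = gene_list[:300]
# Randomly split those 300 genes into four clusters (75 + 25 + 150 + 50 = 300)
clusters = dict(zip(
    ['A', 'B', 'C', 'D'],
    randomPartition(genes_under_study, bin_sizes=[75, 25, 150, 50])
))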
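
For readers who want to see what a random partition into fixed-size bins involves, here is a minimal standard-library sketch. It is not the package's randomPartition implementation: the function name random_partition_sketch, its seed parameter, and the placeholder gene names are assumptions made for illustration only.

```python
import random


def random_partition_sketch(items, bin_sizes, seed=None):
    """Shuffle items and split them into consecutive bins of the given sizes.

    Hypothetical helper for illustration; not part of pathwayenrichment.
    """
    if sum(bin_sizes) > len(items):
        raise ValueError("bin sizes exceed the number of items")
    rng = random.Random(seed)
    shuffled = list(items)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    bins, start = [], 0
    for size in bin_sizes:
        bins.append(shuffled[start:start + size])
        start += size
    return bins


# Placeholder gene identifiers (assumed, not real KEGG entries)
genes = [f"gene_{i}" for i in range(300)]
parts = random_partition_sketch(genes, [75, 25, 150, 50], seed=0)
clusters_sketch = dict(zip(["A", "B", "C", "D"], parts))
print({name: len(members) for name, members in clusters_sketch.items()})
```

Each gene ends up in exactly one cluster, and the cluster sizes match the requested bin sizes.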

Now we are ready to instantiate a ClusterPermutator and run the enrichment analysis. We will permute the total set of genes to form new random clusters 10000 times; this is the sample size used to compute the sample p-value.

permutator = ClusterPermutator(clusters, gene_pathways, system_pathways)
res = permutator.sampleClusterPermutationSpace(sample_size=10000, n_processes=4)
Finished permutation sampling
# Here are the first 10 pathways with lowest sample p-value
{k:v for k,v in list(res['pathway']['A'].items())[:10]}
{'03018 RNA degradation [PATH:dok03018]': (0.2777777777777778, 0.0484),
 '00020 Citrate cycle (TCA cycle) [PATH:dok00020]': (0.18181818181818182,
  0.0691),
 '02020 Two-component system [PATH:dok02020]': (0.2, 0.1527),
 '00541 O-Antigen nucleotide sugar biosynthesis [PATH:dok00541]': (0.19047619047619047,
  0.1641),
 '03060 Protein export [PATH:dok03060]': (0.2, 0.1683),
 '02024 Quorum sensing [PATH:dok02024]': (0.14814814814814814, 0.218),
 '00520 Amino sugar and nucleotide sugar metabolism [PATH:dok00520]': (0.14285714285714285,
  0.2211),
 '02010 ABC transporters [PATH:dok02010]': (0.15, 0.2422),
 '00040 Pentose and glucuronate interconversions [PATH:dok00040]': (0.3333333333333333,
  0.25),
 '00053 Ascorbate and aldarate metabolism [PATH:dok00053]': (0.2, 0.25)}

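The output above lists the 10 pathways with the lowest sample p-values in cluster A. In each pair, the second number is the sample p-value, by which the entries are sorted in ascending order.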
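
The idea behind a sample p-value from permutations can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the package's algorithm: the statistic used here (the fraction of a pathway's genes found in the cluster), the "greater than or equal" comparison, and the names pathway_fraction and sample_p_value are all hypothetical choices, and the gene names are toy data.

```python
import random


def pathway_fraction(cluster, pathway_genes):
    """Fraction of the pathway's genes that fall in the cluster (assumed statistic)."""
    return len(set(cluster) & pathway_genes) / len(pathway_genes)


def sample_p_value(cluster, all_genes, pathway_genes, sample_size=1000, seed=None):
    """Estimate how often a random cluster of the same size scores at least as high."""
    rng = random.Random(seed)
    pathway_genes = set(pathway_genes)
    observed = pathway_fraction(cluster, pathway_genes)
    hits = 0
    for _ in range(sample_size):
        # Draw a random cluster of the same size from the full gene set
        random_cluster = rng.sample(all_genes, len(cluster))
        if pathway_fraction(random_cluster, pathway_genes) >= observed:
            hits += 1
    return observed, hits / sample_size


# Toy data: 100 genes, a 10-gene pathway, and a 20-gene cluster holding 8 pathway genes
all_genes = [f"g{i}" for i in range(100)]
pathway = all_genes[:10]
cluster = all_genes[:8] + all_genes[50:62]

observed, p_value = sample_p_value(cluster, all_genes, pathway, sample_size=1000, seed=0)
print(observed, p_value)
```

A small p-value means that random clusters of the same size rarely contain as large a share of the pathway as the observed cluster does. Increasing sample_size refines the estimate at the cost of run time.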
