
Tools for munging genomics data

Project description

Tools for munging genomic data such as:
 - Converting between different types of gene identifiers
 - Searching for terms in the Gene Ontology (GO) associated with a keyword
 - Looking up housekeeping genes and transcription factors
 - Getting a list of GO terms associated with a given gene
 - Looking up how a gene is expressed across tissues
 - Normalizing a matrix of gene expression data by converting to TPM
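To make the last item concrete, here is a minimal sketch of a TPM (transcripts per million) conversion for a single sample. It is not genemunge's own implementation or API; the function name `counts_to_tpm` and the toy counts and gene lengths are assumptions made for illustration.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to TPM.

    counts     -- read counts, one per gene (hypothetical input)
    lengths_bp -- gene lengths in base pairs, in the same order
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: divide each count by the gene length in kilobases,
    # so that long genes do not dominate simply because they are long.
    rates = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        raise ValueError("all counts are zero; TPM is undefined")
    # Step 2: rescale so the values sum to one million within the sample.
    return [rate / total * 1e6 for rate in rates]


# Toy data: counts proportional to length give equal TPM values.
tpm = counts_to_tpm([100, 200, 300], [1000, 2000, 3000])
```

Because the length correction is applied before rescaling, TPM values always sum to one million within a sample, which makes relative abundances comparable across samples.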

Unlearn.AI

When we’re not developing super awesome open source packages like genemunge, we help biopharma partners use unsupervised deep learning to extract insights from their omics data. Learn more at unlearn.health.

Install

This library is accompanied by the following data sources:
 - The Gene Ontology. The current version used here is the 2018-03-27 release.
 - recount2 data for GTEx.
 - HGNC gene symbols.
 - A list of transcription factors.
 - A list of housekeeping genes.

Installing this package through pip (pip install genemunge from PyPI, or pip install . from a clone of the GitHub repository) will use the static data that accompanies this repository.

If you wish to use the latest data from the above sources, you may install in “develop” mode from GitHub with pip install -e . (the trailing dot is part of the command). Notably, this will download and process the recount2 GTEx data, which requires R and the recount package from Bioconductor:

source("https://bioconductor.org/biocLite.R")
biocLite("recount")

Citations

Please cite the following papers if you make use of genemunge for a publication.

This package:

Gene Ontology: Ashburner et al. Gene ontology: tool for the unification of biology. Nat Genet, 2000, 25(1):25-9; GO Consortium, Nucleic Acids Res., 2017.

recount2: Collado-Torres L, Nellore A, Kammers K, Ellis SE, Taub MA, Hansen KD, Jaffe AE, Langmead B, Leek JT. Reproducible RNA-seq analysis using recount2. Nature Biotechnology, 2017.

HGNC: Gray KA, Yates B, Seal RL, Wright MW, Bruford EA. genenames.org: the HGNC resources in 2015. Nucleic Acids Res. 2015 Jan;43(Database issue):D1079-85.

Transcription factors: Chawla K, Tripathi S, Thommesen L, Laegreid A, Kuiper M. TFcheckpoint: a curated compendium of specific DNA-binding RNA polymerase II transcription factors. Bioinformatics, 2013.

Housekeeping genes: E. Eisenberg and E.Y. Levanon, Trends in Genetics 29, (2013)

Similar tools

If you know of similar tools that would be helpful references for users, please contribute an attribution to them here.

  1. goatools

  2. goenrich

GO evidence codes

Experiment:
 - Inferred from Experiment (EXP)
 - Inferred from Direct Assay (IDA)
 - Inferred from Physical Interaction (IPI)
 - Inferred from Mutant Phenotype (IMP)
 - Inferred from Genetic Interaction (IGI)
 - Inferred from Expression Pattern (IEP)

Computational:
 - Inferred from Sequence or structural Similarity (ISS)
 - Inferred from Sequence Orthology (ISO)
 - Inferred from Sequence Alignment (ISA)
 - Inferred from Sequence Model (ISM)
 - Inferred from Genomic Context (IGC)
 - Inferred from Biological aspect of Ancestor (IBA)
 - Inferred from Biological aspect of Descendant (IBD)
 - Inferred from Key Residues (IKR)
 - Inferred from Rapid Divergence (IRD)
 - Inferred from Reviewed Computational Analysis (RCA)

Literature:
 - Traceable Author Statement (TAS)
 - Non-traceable Author Statement (NAS)

Other:
 - Inferred by Curator (IC)
 - No biological Data available (ND)
 - Inferred from Electronic Annotation (IEA)
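The groups above are typically used to restrict GO annotations to those with a desired level of support. The following sketch encodes the groups as a dictionary and filters annotation records by category. The `EVIDENCE_CODES` mapping, the `filter_by_evidence` helper, and the `(gene, GO term, evidence code)` tuple layout are assumptions for illustration and are not part of genemunge's API. The gene names are placeholders, and the pairing of GO identifiers with them is made up.

```python
# Evidence codes grouped as in the list above.
EVIDENCE_CODES = {
    "experimental": {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"},
    "computational": {"ISS", "ISO", "ISA", "ISM", "IGC",
                      "IBA", "IBD", "IKR", "IRD", "RCA"},
    "literature": {"TAS", "NAS"},
    "other": {"IC", "ND", "IEA"},
}


def filter_by_evidence(annotations, categories):
    """Keep only annotations whose evidence code is in the given categories.

    annotations -- iterable of (gene, go_id, evidence_code) tuples
    categories  -- iterable of keys from EVIDENCE_CODES
    """
    allowed = set()
    for category in categories:
        allowed |= EVIDENCE_CODES[category]
    return [a for a in annotations if a[2] in allowed]


# Placeholder records, not real annotations.
annotations = [
    ("GENE_A", "GO:0000001", "IDA"),
    ("GENE_A", "GO:0000002", "IEA"),
    ("GENE_B", "GO:0000001", "TAS"),
]

curated = filter_by_evidence(annotations, ["experimental", "literature"])
```

Excluding the "other" group drops electronically inferred (IEA) annotations, which is a common choice when only curated evidence is wanted.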

Common gene id types

['symbol','name','entrez_id','ensembl_gene_id','refseq_accession','uniprot_ids']
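Converting between these identifier types amounts to a lookup in a table keyed by the source type. The sketch below shows the idea on a tiny hand-written table. It does not use genemunge's converter, and `RECORDS` and `build_converter` are hypothetical names introduced here; in practice the table would come from the HGNC data listed above.

```python
ID_TYPES = ['symbol', 'name', 'entrez_id', 'ensembl_gene_id',
            'refseq_accession', 'uniprot_ids']

# A two-gene excerpt written by hand for illustration.
RECORDS = [
    {"symbol": "TP53", "entrez_id": "7157",
     "ensembl_gene_id": "ENSG00000141510"},
    {"symbol": "GAPDH", "entrez_id": "2597",
     "ensembl_gene_id": "ENSG00000111640"},
]


def build_converter(records, source, target):
    """Return a function mapping a list of `source` ids to `target` ids."""
    for id_type in (source, target):
        if id_type not in ID_TYPES:
            raise ValueError("unknown id type: " + id_type)
    # Index the records by the source identifier.
    table = {r[source]: r[target] for r in records
             if source in r and target in r}

    def convert(ids):
        # Unknown identifiers map to None rather than raising.
        return [table.get(i) for i in ids]

    return convert


to_symbol = build_converter(RECORDS, "ensembl_gene_id", "symbol")
symbols = to_symbol(["ENSG00000141510", "ENSG00000000000"])
```

Returning `None` for identifiers missing from the table keeps the output aligned with the input list, so unmapped genes can be found by position.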

