Store and access gene expression datasets and gene definitions.

Project description

genedataset is a package for storing and accessing gene expression datasets and gene definitions. It is built around two main modules, geneset and dataset, which provide the Geneset and Dataset classes used below.

geneset

geneset stores gene information combined from both Ensembl and NCBI/Entrez (mouse and human only), so that you can query it:

$ gs = geneset.Geneset().subset(queryStrings='ccr3')
$ print(gs.geneIds())
 ['ENSG00000183625', 'ENSMUSG00000035448']
$ gs.dataframe()
 | EnsemblId          | Species     | EntrezId | GeneSymbol | Synonyms                     | Description                      | MedianTranscriptLength | Orthologue              | ExonLength |
 |--------------------|-------------|----------|------------|------------------------------|----------------------------------|------------------------|-------------------------|------------|
 | ENSG00000183625    | HomoSapiens | 1232     | CCR3       | CC-CKR-3|CD193|CKR3|CMKBR3   | chemokine (C-C motif),receptor 3 | 1242.5                 | ENSMUSG00000035448:Ccr3 | 3568.0     |
 | ENSMUSG00000035448 | MusMusculus | 12771    | Ccr3       | CC-CKR3|CKR3|Cmkbr1l2|Cmkbr3 | chemokine (C-C motif),receptor 3 | 3273                   | ENSG00000183625:CCR3    | 3273.0     |
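
To make the idea of querying a gene table concrete, here is a minimal, self-contained sketch in plain Python. It does not use or reproduce the genedataset implementation: the names GENES, subset_by_query and parse_orthologue are hypothetical, the rows are copied from the example output above, and the matching rule (case-insensitive exact match on gene symbol or any synonym) is an assumption, since the package's actual rule is not stated here.

```python
# Hypothetical in-memory gene table; rows copied from the example output above.
GENES = [
    {"EnsemblId": "ENSG00000183625", "Species": "HomoSapiens", "GeneSymbol": "CCR3",
     "Synonyms": "CC-CKR-3|CD193|CKR3|CMKBR3", "Orthologue": "ENSMUSG00000035448:Ccr3"},
    {"EnsemblId": "ENSMUSG00000035448", "Species": "MusMusculus", "GeneSymbol": "Ccr3",
     "Synonyms": "CC-CKR3|CKR3|Cmkbr1l2|Cmkbr3", "Orthologue": "ENSG00000183625:CCR3"},
]


def subset_by_query(genes, query):
    """Return rows whose symbol or any synonym equals the query, ignoring case.

    Assumed matching rule for illustration only.
    """
    q = query.lower()
    matches = []
    for row in genes:
        names = [row["GeneSymbol"]] + row["Synonyms"].split("|")
        if q in (name.lower() for name in names):
            matches.append(row)
    return matches


def parse_orthologue(value):
    """Split an 'EnsemblId:GeneSymbol' string into its two parts."""
    ensembl_id, symbol = value.split(":", 1)
    return ensembl_id, symbol


hits = subset_by_query(GENES, "ccr3")
gene_ids = [row["EnsemblId"] for row in hits]
print(gene_ids)
print(parse_orthologue(hits[0]["Orthologue"]))
```

The Orthologue column in the output above pairs an Ensembl ID with a gene symbol, separated by a colon, which is why the sketch splits on the first colon only.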

dataset

dataset stores gene expression data so that it can be queried. The stored data consists of expression values (microarray or RNA-seq) and sample data, packaged together in HDF5 format.

$ ds = dataset.Dataset("genedataset/data/testdataset.h5")
$ ds
 <Dataset name:testdata species:MusMusculus, platform_type:microarray>
$ ds.expressionMatrix()
 | probeId | s01  | s02  | s03  | s04  |
 |---------|------|------|------|------|
 | probe1  | 3.45 | 4.65 | 2.65 | 8.23 |
 | probe2  | 5.54 | 0.00 | 1.43 | 6.43 |
 | probe3  | 0.00 | 0.00 | 4.34 | 5.44 |
$ ds.sampleTable()
 | sampleId | celltype | tissue |
 |----------|----------|--------|
 | s01      | B1       | BM     |
 | s02      | B1       | BM     |
 | s03      | B2       | BM     |
 | s04      | B2       | BM     |
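
One thing the expression matrix and the sample table can be used for together is summarising expression per sample group. The sketch below is plain Python over values copied from the two tables above, and is not part of the genedataset API: EXPRESSION, SAMPLES and mean_by_group are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Values copied from the expressionMatrix() output above: probe -> sample -> value.
EXPRESSION = {
    "probe1": {"s01": 3.45, "s02": 4.65, "s03": 2.65, "s04": 8.23},
    "probe2": {"s01": 5.54, "s02": 0.00, "s03": 1.43, "s04": 6.43},
    "probe3": {"s01": 0.00, "s02": 0.00, "s03": 4.34, "s04": 5.44},
}

# Values copied from the sampleTable() output above: sample -> annotations.
SAMPLES = {
    "s01": {"celltype": "B1", "tissue": "BM"},
    "s02": {"celltype": "B1", "tissue": "BM"},
    "s03": {"celltype": "B2", "tissue": "BM"},
    "s04": {"celltype": "B2", "tissue": "BM"},
}


def mean_by_group(expression, samples, column):
    """Average each probe's values over samples sharing the same annotation."""
    result = {}
    for probe_id, values in expression.items():
        grouped = defaultdict(list)
        for sample_id, value in values.items():
            grouped[samples[sample_id][column]].append(value)
        result[probe_id] = {
            group: sum(vals) / len(vals) for group, vals in grouped.items()
        }
    return result


means = mean_by_group(EXPRESSION, SAMPLES, "celltype")
for probe_id, groups in means.items():
    # Round only for display; the stored means keep full precision.
    print(probe_id, {group: round(mean, 2) for group, mean in groups.items()})
```

Passing "tissue" instead of "celltype" would collapse all four samples into a single BM group, since every sample in the test dataset shares that tissue.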

Contact

Jarny Choi

Changes

  • v0.1.x - Initial release with minor adjustments to test PyPI and GitHub upload/download.

  • v0.6.2 - Added a new column ‘ExonLength’ to the data.

License

MIT License
