manipulate datasets encoded as 2-D matrices with annotation (first) row and (first) column

Project Description

Class for importing and querying expression datasets organized as a column- and row-annotated matrix.

Expression datasets contain the numeric results of one or more samples derived from microarray assays. All assays in a dataset share the same platform (microarray). The dataset can be regarded as a table with rows and columns. Each column represents a single assay, and each row contains the assay results for a specific probe on the assay platform. Thus, the values in any given row are those obtained from the same probe location on the platform. Each such row is referred to as an expression profile.

A dataset can be regarded as a table, such as this one:

probe_id   HSC 1   HSC 2   NK 1    NK 2
45283      10.14   9.31    8.9     8.78
45284      12.52   12.63   12.55   11.96
45285      6.78    6.91    7.83    7.86
45286      5.58    5.06    6.69    6.64
45287      7.85    8.13    8.47    8.56
45288      8.12    7.17    8.71    8.08
45289      6.82    6.15    5.87    5.32
45290      10.55   10.39   10.7    9.93

Expression datasets, with rare exception, are stored in text (i.e. flat) files that have the following format:

  • two or more rows of data, delimited by ASCII newline (\x0a) characters. (Strictly speaking, there needn’t be any data at all, but what’s the point of that?)
  • each line or row consists of two or more columns of data, delimited by ASCII TAB (\x09) characters.
  • the first column contains the key, or probe ID, for each row. It is assumed to be alphanumeric.
  • the first row consists of labels identifying the probe ID and sample columns. These, too, are assumed to be alphanumeric.
  • the second through last rows contain expression values and, aside from the first column, which contains the probe ID, are assumed to be floating point numbers. In microarray parlance, each row is typically referred to as an expression profile.
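
The format described in the list above can be read with a few lines of plain Python. The following is a minimal sketch that uses only the standard library and does not use the Matricks API; the function name `parse_expression_text` and the inline sample (two rows borrowed from the table shown earlier) are illustrative assumptions.

```python
def parse_expression_text(text):
    """Split tab-delimited expression data into (labels, profiles).

    Hypothetical helper for illustration; not part of Matricks.
    """
    # Rows are delimited by ASCII newline (\x0a); skip empty trailing lines.
    rows = [line.split("\t") for line in text.split("\n") if line]
    # The first row holds the labels for the probe ID and sample columns.
    labels = rows[0]
    # Remaining rows: probe ID kept as a string, other columns become floats.
    profiles = [[row[0]] + [float(v) for v in row[1:]] for row in rows[1:]]
    return labels, profiles


sample = (
    "probe_id\tHSC 1\tHSC 2\tNK 1\tNK 2\n"
    "45283\t10.14\t9.31\t8.9\t8.78\n"
    "45284\t12.52\t12.63\t12.55\t11.96\n"
)
labels, profiles = parse_expression_text(sample)
```

Each entry of `profiles` is one expression profile: the probe ID followed by one float per sample column.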

Some datasets may differ from this format. For instance, there may be no (first) row of labels, or the data may be of some format other than floating point. Provision is made for handling these arguably special cases. However, the default settings for instantiating Matricks classes make the foregoing assumptions about the contents of raw source data. It is further assumed that the source dataset is encoded as ASCII text, so all numeric data must be converted to float objects.
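
One way to picture such special cases is a reader that takes them as options. The sketch below is illustrative only: `read_table`, `has_header`, and `convert` are hypothetical names chosen for this example, not the actual parameters of the Matricks constructor, and the sample values are made up.

```python
def read_table(text, has_header=True, convert=float):
    """Hypothetical reader; defaults mirror the standard format."""
    rows = [line.split("\t") for line in text.split("\n") if line]
    # Without a label row, every row is data and there are no labels.
    labels = rows[0] if has_header else None
    body = rows[1:] if has_header else rows
    # Apply the chosen converter to every column except the probe ID.
    profiles = [[row[0]] + [convert(v) for v in row[1:]] for row in body]
    return labels, profiles


# Made-up headerless data holding integer values instead of floats.
headerless = "45283\t3\t7\n45284\t0\t12\n"
no_labels, int_profiles = read_table(headerless, has_header=False, convert=int)
```

With the defaults left alone, the same function treats the first row as labels and converts values with `float`.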

Matricks selection operations generally return Matricks objects. These can be iterated, row-wise, much like lists or tuples, to access individual expression profiles, the contents of which can be retrieved using list / tuple semantics.
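
The behavior described above (a selection yields an object of the same kind, which can then be iterated row by row) can be illustrated with a toy class. `TinyTable` and its `select` method are hypothetical stand-ins written for this example; they are not the Matricks class or its methods.

```python
class TinyTable:
    """Toy stand-in for a row-iterable table; NOT the Matricks class."""

    def __init__(self, labels, profiles):
        self.labels = list(labels)
        self.profiles = [list(p) for p in profiles]

    def __iter__(self):
        # Iterating yields one expression profile (a list) per row.
        return iter(self.profiles)

    def __len__(self):
        return len(self.profiles)

    def select(self, probe_ids):
        # A selection returns another TinyTable, not a bare list.
        wanted = set(probe_ids)
        kept = [p for p in self.profiles if p[0] in wanted]
        return TinyTable(self.labels, kept)


table = TinyTable(
    ["probe_id", "HSC 1", "HSC 2", "NK 1", "NK 2"],
    [
        ["45283", 10.14, 9.31, 8.9, 8.78],
        ["45285", 6.78, 6.91, 7.83, 7.86],
        ["45289", 6.82, 6.15, 5.87, 5.32],
    ],
)
subset = table.select(["45285", "45289"])
# List semantics on each profile: index 1 is the first sample column.
first_sample = [profile[1] for profile in subset]
```

Because `select` returns the same type, selections can be chained before iterating over the result.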

Release History

0.3.20

Download Files

File Name                   Size       Python Version   File Type   Upload Date
matricks-0.3.20-py2.6.egg   136.6 kB   2.6              Egg         Jul 25, 2012
matricks-0.3.20-py2.7.egg   135.5 kB   2.7              Egg         Jul 25, 2012