NumPy arrays with named axes and named indices.

Project Description

Scientists, engineers, mathematicians and statisticians don’t just work with matrices; they often work with structured data, such as you’d find in a table. However, NumPy lacks this functionality, and there are several efforts to fill the void. This is one of those efforts.

Warning

This code is currently experimental, and its API will change! It is meant to be a place for the community to understand and develop the right semantics, and to have a prototype implementation that will ultimately (hopefully) be folded back into NumPy.

Datarray provides a subclass of NumPy ndarrays that supports:

  • individual dimensions (axes) being labeled with meaningful descriptions
  • labeled ‘ticks’ along each axis
  • indexing and slicing by named axis
  • indexing on any axis with the tick labels instead of only integers
  • reduction operations (like .sum, .mean, etc.) that accept named axis arguments instead of only integer indices.
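
The features listed above can be illustrated with plain NumPy. The following is only a conceptual sketch: the `LabeledView` class and its methods are hypothetical names invented for this example and are not datarray's API, which is experimental and may differ. It shows the bookkeeping that named axes and tick labels imply: translating an axis name into an integer axis, and a tick label into an integer position.

```python
import numpy as np


class LabeledView:
    """Hypothetical helper (NOT part of datarray): an ndarray plus axis labels."""

    def __init__(self, data, axes):
        # axes: one (name, ticks) pair per dimension of the data.
        self.data = np.asarray(data)
        if len(axes) != self.data.ndim:
            raise ValueError("need one (name, ticks) pair per dimension")
        self.names = [name for name, _ in axes]
        self.ticks = {name: list(ticks) for name, ticks in axes}

    def axis_index(self, name):
        # Translate an axis name into its integer position.
        return self.names.index(name)

    def select(self, name, tick):
        # Index one axis by tick label; the selected axis is dropped.
        position = self.ticks[name].index(tick)
        return np.take(self.data, position, axis=self.axis_index(name))

    def sum(self, name):
        # Reduce over an axis given by name instead of by integer.
        return self.data.sum(axis=self.axis_index(name))


temps = LabeledView(
    np.array([[10.0, 12.0, 14.0],
              [20.0, 22.0, 24.0]]),
    [("city", ["oslo", "rome"]), ("month", ["jan", "feb", "mar"])],
)

rome = temps.select("city", "rome")   # row labeled "rome"
per_city = temps.sum("month")         # sum over the axis named "month"
print(rome, per_city)
```

Unlike this wrapper, the package described here subclasses ndarray, so labeled arrays can still be passed to code that expects ordinary arrays.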

Prior Art

In no particular order:

  • xray is very close in spirit to this package: it implements named ND array axes and tick labels, and it integrates with (and depends on) pandas.
  • pandas is based around a number of DataFrame-esque datatypes.
  • Tabular implements a spreadsheet-inspired datatype, with rows/columns, csv/etc. IO, and fancy tabular operations.
  • scikits.statsmodels sounded as though it had some features we’d like to eventually see implemented on top of something such as datarray, and Skipper seemed pretty interested in something like this himself.
  • scikits.timeseries also has a time-series-specific object that’s somewhat reminiscent of labeled arrays.
  • pydataframe is supposed to be a clone of R’s data.frame.
  • larry, or “labeled array,” often comes up in discussions alongside pandas.
  • divisi includes labeled sparse and dense arrays.
  • pymvpa provides a Dataset class that encapsulates the data together with sets of attributes, matching in length, for the first two dimensions (samples and features). Dataset is not a subclass of the NumPy array, so that it can wrap other data structures (e.g. sparse matrices).
  • ptsa subclasses ndarray to provide per-dimension attributes, aiming to ease slicing and indexing by the values of the axis attributes.

Project Goals

  1. Get something akin to this in the numpy core;
  2. Stick to basic functionality such that projects like scikits.statsmodels can use it as a base datatype;
  3. Make an interface that allows for simple, pretty manipulation that doesn’t introduce confusion;
  4. Oh, and make sure that the base numpy array is still accessible.
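
Goal 4 can be sketched with NumPy's standard subclassing hooks. The example below is an illustration under assumptions, not datarray's implementation: `NamedAxesArray` and its `axis_names` attribute are hypothetical names. It relies only on documented NumPy behavior, namely `ndarray.view` and `__array_finalize__`.

```python
import numpy as np


class NamedAxesArray(np.ndarray):
    """Hypothetical ndarray subclass carrying axis names (not datarray's class)."""

    def __new__(cls, data, axis_names):
        # View the input as this subclass without copying when possible.
        obj = np.asarray(data).view(cls)
        if len(axis_names) != obj.ndim:
            raise ValueError("need one name per dimension")
        obj.axis_names = tuple(axis_names)
        return obj

    def __array_finalize__(self, obj):
        # NumPy calls this for views and slices; copy the names from the
        # parent if it has any. A real implementation would also have to
        # adjust the names when an operation drops or reorders axes.
        if obj is None:
            return
        self.axis_names = getattr(obj, "axis_names", None)


arr = NamedAxesArray(np.arange(6).reshape(2, 3), ("row", "col"))

# The base array stays accessible: a plain ndarray view shares the memory.
base = arr.view(np.ndarray)
base[0, 0] = 99
print(type(base).__name__, arr[0, 0], arr.axis_names)
```

Because the plain view shares memory with the labeled array, code that only understands ndarrays can read and write the same data without a copy.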

Code

You can find our sources and single-click downloads:

The latest released version is always available from pypi.

Support

Please put up issues on the datarray issue tracker.

Release History

  • 0.1.0 (this version)
  • 0.0.6
  • 0.0.5
  • 0.0.4
  • 0.0.3
  • 0.0.2

Download Files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

File Name                        Size     Version  File Type  Upload Date
datarray-0.1.0-py2-none-any.whl  36.8 kB  py2      Wheel      Jun 10, 2016
datarray-0.1.0-py3-none-any.whl  36.8 kB  py3      Wheel      Jun 10, 2016
datarray-0.1.0.tar.gz            61.6 kB  -        Source     Jun 10, 2016
datarray-0.1.0.zip               75.8 kB  -        Source     Jun 10, 2016
