Memory-mapped numeric arrays, based on a format that is self-explanatory and tool-independent

Project description

Darr is a Python library for scientific computing that lets you store and access disk-based numeric arrays without depending on tool-specific data formats. This makes it easy to access the same data in many different languages and on different analysis platforms. No exporting is required and, because the data is saved in a self-explanatory way, little explanation is needed when you share or archive it. Tool-independent, easy access to data is in line with good scientific practice because it promotes wide and long-term availability, to others but also to yourself. More rationale for this approach is provided here.

Darr supports efficient read/write/append access. It is based on universally readable flat binary files and automatically generated text files that contain a human-readable explanation of precisely how your binary data is stored. It also provides code that reads the data in a variety of current scientific data tools such as Python, R, Julia, IDL, Matlab, Maple, and Mathematica (see example array).
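To make the idea of "flat binary plus a text description" concrete, here is a minimal sketch that uses only the Python standard library. It does not use Darr and is not Darr's implementation; the file names `values.bin` and `description.json`, the description keys, and both helper functions are hypothetical and chosen only for illustration.

```python
import json
import struct
import tempfile
from pathlib import Path


def write_flat_array(dirpath, values):
    """Store floats as raw little-endian float64 bytes plus a text description."""
    dirpath = Path(dirpath)
    dirpath.mkdir(parents=True, exist_ok=True)
    # '<' means little-endian, 'd' means an 8-byte IEEE 754 float.
    raw = struct.pack(f"<{len(values)}d", *values)
    (dirpath / "values.bin").write_bytes(raw)
    # Hypothetical description format: everything a reader needs to decode the bytes.
    description = {"numtype": "float64", "byteorder": "little", "shape": [len(values)]}
    (dirpath / "description.json").write_text(json.dumps(description, indent=2))


def read_flat_array(dirpath):
    """Decode the binary file using nothing but the text description."""
    dirpath = Path(dirpath)
    description = json.loads((dirpath / "description.json").read_text())
    if description["numtype"] != "float64":
        raise ValueError("this sketch only handles float64")
    order = "<" if description["byteorder"] == "little" else ">"
    n = description["shape"][0]
    raw = (dirpath / "values.bin").read_bytes()
    return list(struct.unpack(f"{order}{n}d", raw))


with tempfile.TemporaryDirectory() as tmp:
    arraydir = Path(tmp) / "example.array"
    write_flat_array(arraydir, [1.0, 2.5, -3.0])
    restored = read_flat_array(arraydir)
```

Because the binary file has no header and the description is plain text, any language that can read raw bytes can decode the same data in a similar way.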

Darr currently supports numerical N-dimensional arrays, and experimentally supports numerical ragged arrays, i.e. a series of arrays in which one dimension varies in length.
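One common way to store a ragged series in flat files is to concatenate all subarrays into a single values array and keep a second array of start and end positions. The sketch below illustrates that general technique with plain Python lists and one-dimensional subarrays. It is an assumption for illustration only, not a statement of how Darr lays out ragged arrays on disk, and `pack_ragged` and `get_subarray` are hypothetical names.

```python
def pack_ragged(subarrays):
    """Concatenate variable-length subarrays and record where each one lives."""
    values = []
    bounds = []  # one (start, end) pair per subarray, end exclusive
    for sub in subarrays:
        start = len(values)
        values.extend(sub)
        bounds.append((start, len(values)))
    return values, bounds


def get_subarray(values, bounds, i):
    """Recover subarray i from the flat values using its recorded bounds."""
    start, end = bounds[i]
    return values[start:end]


values, bounds = pack_ragged([[1, 2], [3, 4, 5], [6]])
second = get_subarray(values, bounds, 1)
```

Both `values` and `bounds` are ordinary rectangular arrays of numbers, so each could be stored as a flat binary file in its own right.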

See this tutorial for a brief introduction, or the documentation for more info.

Darr is currently pre-1.0 and still undergoing significant development. However, we have been using it in practice in our lab for more than a year on both Linux and Windows machines. It is open source and freely available under the New BSD License terms.

Features

Pros:

  • Data storage is based purely on flat binary and text files, which makes it tool-independent.

  • A human-readable explanation of how the binary data is stored is saved in a README text file.

  • README includes examples of how to read the particular array in popular analysis environments such as Python (without Darr), R, Julia, Octave/Matlab, GDL/IDL, and Mathematica.

  • Supports very large data arrays, larger than RAM, through memory-mapping.

  • Data read/write access is simple and powerful through NumPy indexing (see here).

  • Data is easily appendable.

  • Many numeric types are supported: (u)int8-(u)int64, float16-float64, complex64, complex128.

  • Easy use of metadata, stored in a separate JSON text file.

  • Minimal dependencies, only NumPy.

  • Integrates easily with the Dask library for numeric computation on very large arrays.

  • Supports ragged arrays (still experimental).
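Several of the points above (appending, memory-mapped access to part of a file, and metadata in a separate JSON file) follow naturally from the flat-file design. The following is a minimal standard-library sketch of those three ideas under stated assumptions: it does not use Darr or NumPy, it handles only one-dimensional little-endian float64 data, and the file names, metadata keys, and helper functions are hypothetical.

```python
import json
import mmap
import struct
import tempfile
from pathlib import Path

ITEM = struct.Struct("<d")  # one little-endian float64, 8 bytes


def append_values(binpath, values):
    """Appending to a headerless binary file is just writing more bytes at the end."""
    with open(binpath, "ab") as f:
        f.write(struct.pack(f"<{len(values)}d", *values))


def read_slice(binpath, start, stop):
    """Read items [start, stop) through a memory map, without loading the whole file."""
    with open(binpath, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # Only the requested byte range is copied out of the mapping.
        raw = mm[start * ITEM.size : stop * ITEM.size]
    return list(struct.unpack(f"<{stop - start}d", raw))


with tempfile.TemporaryDirectory() as tmp:
    binpath = Path(tmp) / "values.bin"
    append_values(binpath, [0.0, 1.0, 2.0, 3.0])
    append_values(binpath, [4.0, 5.0])
    # The number of items follows from the file size alone.
    n_items = binpath.stat().st_size // ITEM.size
    middle = read_slice(binpath, 2, 5)

    # Metadata lives next to the data as ordinary JSON text.
    metapath = Path(tmp) / "metadata.json"
    metapath.write_text(json.dumps({"samplingrate": 1000, "unit": "mV"}, indent=2))
    metadata = json.loads(metapath.read_text())
```

In practice, NumPy's `numpy.memmap` provides the same kind of access with full array indexing, which is closer to how a NumPy-based library would expose the data.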

See the documentation for more information.

Download files

Source Distribution

darr-0.3.3.tar.gz (58.9 kB view details)

Uploaded Source

Built Distribution

darr-0.3.3-py3-none-any.whl (45.9 kB view details)

Uploaded Python 3

File details

Details for the file darr-0.3.3.tar.gz.

File metadata

  • Download URL: darr-0.3.3.tar.gz
  • Size: 58.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.8.10

File hashes

Hashes for darr-0.3.3.tar.gz
Algorithm Hash digest
SHA256 127e2f54b352a2cc40ae83619eac9508d1f35ef67d915e4c0c31e368447a15f1
MD5 ab435c2096f9d2dccaf269a61733105a
BLAKE2b-256 9e29fa9d3c6a5bb180d93a50c2742f743f2a4ea48fa5e76a0f031a76e3507302

See more details on using hashes here.

File details

Details for the file darr-0.3.3-py3-none-any.whl.

File metadata

  • Download URL: darr-0.3.3-py3-none-any.whl
  • Size: 45.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.8.10

File hashes

Hashes for darr-0.3.3-py3-none-any.whl
Algorithm Hash digest
SHA256 f8f525c1d4f856f911634598003dab41d706830df6ea8db878b89e8b9ecc010f
MD5 c2c01196a9cd8dfca09f114773e20181
BLAKE2b-256 1edfd98b26288d3f1272f0377e2ec2844a7c7c750343ed6797a43eeba602d08a
