Memory-mapped numeric arrays, based on a format that is self-explanatory and tool-independent

Project description

Darr is a Python science library for disk-based NumPy arrays that persist in a format that is simple, self-documented and tool-independent. It enables you to work efficiently with potentially very large arrays, while keeping your data easily accessible from a wide range of computing environments. Keeping data universally readable and documented is a pillar of good scientific practice. More rationale for this approach is provided here.

Flat binary files and (JSON) text files are accompanied by a README text file that explains how the array and metadata are stored. The README also provides code for reading the array in a variety of current scientific data tools such as Python, R, Julia, IDL, Matlab, Maple, and Mathematica. Sharing your data with others, or with yourself when working in different computing environments, is trivially easy because the data always comes with clear documentation, including code to read it. There is no need to export anything or to provide elaborate explanation, and no dependence on complicated formats or specialized tools. Self-documentation and code examples are automatically updated as you change your arrays when working with them.
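To illustrate why a flat binary file plus a JSON description is readable without special tooling, here is a minimal sketch that uses only the Python standard library. The file names `values.bin` and `description.json`, the JSON keys, and both helper functions are hypothetical and chosen for illustration; they are not Darr's actual on-disk layout or API.

```python
import json
import sys
import tempfile
from array import array
from pathlib import Path


def write_flat_array(directory, values, typecode="d"):
    """Store numbers as raw binary plus a JSON description (hypothetical layout)."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    data = array(typecode, values)
    # Raw values, nothing else: no header, no compression.
    (directory / "values.bin").write_bytes(data.tobytes())
    # Everything a reader needs to interpret the bytes goes in plain text.
    description = {
        "typecode": typecode,
        "length": len(data),
        "byteorder": sys.byteorder,
    }
    (directory / "description.json").write_text(json.dumps(description, indent=2))


def read_flat_array(directory):
    """Read the array back using only the JSON description and the raw bytes."""
    directory = Path(directory)
    description = json.loads((directory / "description.json").read_text())
    data = array(description["typecode"])
    data.frombytes((directory / "values.bin").read_bytes())
    if description["byteorder"] != sys.byteorder:
        data.byteswap()  # file was written on a machine with the other byte order
    if len(data) != description["length"]:
        raise ValueError("binary file size does not match the description")
    return data.tolist()


with tempfile.TemporaryDirectory() as tmp:
    write_flat_array(tmp, [1.0, 2.5, -3.0])
    restored = read_flat_array(tmp)
```

Because the description is plain text and the values are raw numbers, the same two files could be read with a few lines of code in most other environments.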

Darr uses NumPy memory-mapped arrays under the hood, which you can access directly for full NumPy compatibility and efficient out-of-core read/write access to potentially very large arrays. In addition, Darr supports appending to and truncating arrays, as well as ragged arrays (still experimental).
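The idea of memory-mapped access, appending, and truncating can be sketched with plain NumPy and the standard library. This is not Darr's API or its implementation; the file name `values.bin` is hypothetical, and the sketch only shows why a flat binary file makes these operations cheap: appending adds bytes to the end of the file and truncating shortens it.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "values.bin")  # hypothetical file name
    dtype = np.dtype("float64")

    # Create a 4-element array on disk and write to it through the memory map.
    mm = np.memmap(path, dtype=dtype, mode="w+", shape=(4,))
    mm[:] = [0.0, 1.0, 2.0, 3.0]
    mm.flush()
    del mm  # release the map before changing the file size

    # Append: add raw bytes of the same dtype to the end of the file.
    with open(path, "ab") as f:
        f.write(np.array([4.0, 5.0], dtype=dtype).tobytes())

    # Re-map the file; the new length follows from the file size.
    n = os.path.getsize(path) // dtype.itemsize
    mm = np.memmap(path, dtype=dtype, mode="r", shape=(n,))
    appended = mm.tolist()
    del mm

    # Truncate: shrink the file so that only the first three values remain.
    os.truncate(path, 3 * dtype.itemsize)
    mm = np.memmap(path, dtype=dtype, mode="r")  # shape inferred from file size
    truncated = mm.tolist()
    del mm
```

Only the slices that are actually indexed are read from disk, which is what makes this approach workable for arrays larger than RAM.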

See this tutorial for a brief introduction, or the documentation for more info.

Darr is currently pre-1.0 and still undergoing significant development. It is open source and freely available under the New BSD License terms.

Features

  • Disk-persistent array data is directly accessible through NumPy indexing.

  • Works with data arrays larger than RAM.

  • Data is stored purely in flat binary and text files, maximizing tool independence.

  • Data is automatically documented and includes a README text file with a human-readable explanation of how the data is stored.

  • README includes examples of how to read the array in a number of popular data analysis environments, such as Python (without Darr), R, Julia, Octave/Matlab, GDL/IDL, and Mathematica (see example array).

  • Data is easily appendable.

  • Many numeric types are supported: (u)int8-(u)int64, float16-float64, complex64, complex128.

  • Easy use of metadata, stored in a separate JSON text file.

  • Minimal dependencies, only NumPy.

  • Integrates easily with the Dask library for out-of-core computation on very large arrays.

  • Supports ragged arrays (still experimental).
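For readers unfamiliar with ragged arrays (sequences of subarrays with differing lengths), the sketch below shows one common way to represent them with two regular arrays: a flat array of values and an array of start/end indices. It is an assumption for illustration only and does not claim to mirror how Darr stores ragged arrays; `pack_ragged` and `get_subarray` are hypothetical helper names.

```python
import numpy as np


def pack_ragged(subarrays):
    """Pack a non-empty list of 1-D sequences into flat values plus index pairs."""
    values = np.concatenate([np.asarray(s, dtype="float64") for s in subarrays])
    lengths = np.array([len(s) for s in subarrays])
    ends = np.cumsum(lengths)
    starts = ends - lengths
    # One (start, end) row per subarray, pointing into the flat values array.
    indices = np.stack([starts, ends], axis=1)
    return values, indices


def get_subarray(values, indices, i):
    """Return the i-th subarray as a view on the flat values array."""
    start, end = indices[i]
    return values[start:end]


values, indices = pack_ragged([[1, 2, 3], [4], [5, 6]])
second = get_subarray(values, indices, 1)
```

Both `values` and `indices` are ordinary rectangular arrays, so each could be stored as a flat binary file in the same way as any other numeric array.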

See the documentation for more information.

