darr · PyPI

Stores NumPy arrays in a way that is self-documented and tool-independent.

Project description

Darr is a Python science library that allows you to work efficiently with potentially very large, disk-based NumPy arrays that are widely readable and self-documented. The documentation includes copy-paste ready code to read arrays in many popular data science languages such as R, Julia, Scilab, IDL, Matlab, Maple, and Mathematica, or in Python/NumPy without Darr, all without exporting them and with minimal effort.

Universal readability of data is a pillar of good scientific practice. It is also generally a good idea for anyone who wants to flexibly move between analysis environments, who wants to save data for the longer term, or who wants to share data with others without spending much time on figuring out and/or explaining how the receiver can read it. No idea how to read your 7-dimensional uint32 NumPy array in Matlab to quickly try out an algorithm your colleague wrote? No worries, a quick copy-paste of code from the array documentation is all that is needed to read your data in, e.g., R or Matlab (see example). As you work with your array, its documentation is automatically kept up to date. No need to export anything, make notes, or provide elaborate explanations. No looking things up. No dependence on complicated formats or specialized libraries for reading your data elsewhere later.

In essence, Darr makes it trivially easy to share your numerical arrays with others or with yourself when working in different computing environments, and stores them in a future-proof way.

More rationale for a tool-independent approach to numeric array storage is provided here.

Under the hood, Darr uses NumPy memory-mapped arrays, a widely established and trusted way of working with disk-based numerical data, which makes Darr fully NumPy compatible. This enables efficient out-of-core read/write access to potentially very large arrays. In addition to automatic documentation, Darr adds other functionality to NumPy’s memmap, such as easy appending and truncating of data, support for ragged arrays, the ability to create arrays from iterators, and easy use of metadata. Flat binary files and (JSON) text files are accompanied by a README text file that explains how the array and metadata are stored (see example arrays).
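To make the memory-mapping idea concrete, here is a minimal sketch that uses plain `numpy.memmap` rather than Darr's own API. The file name and array shape are hypothetical, chosen only for illustration. It shows ordinary NumPy indexing reading and writing a disk-based file, and why a raw binary file alone is not self-describing.

```python
import os
import tempfile

import numpy as np

# Hypothetical file name in a temporary directory, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "values.bin")

# Create a disk-backed array; data lives in the file, not in a RAM copy.
mm = np.memmap(path, dtype="float64", mode="w+", shape=(1000, 4))
mm[:] = 0.0
mm[10, :] = [1.0, 2.0, 3.0, 4.0]  # ordinary NumPy indexing writes to the file
mm.flush()                         # make sure changes reach the disk
del mm                             # release the memory map

# Reopen read-only. The raw file stores no dtype or shape, so both must be
# supplied again by the reader.
ro = np.memmap(path, dtype="float64", mode="r", shape=(1000, 4))
row_sum = float(ro[10].sum())
file_size = os.path.getsize(path)  # 1000 * 4 values * 8 bytes each
```

The need to repeat `dtype` and `shape` when reopening the file is the kind of gap that an accompanying description file and README are meant to fill.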

See this tutorial for a brief introduction, or the documentation for more info.

Darr is currently pre-1.0, and still undergoing development. It is open source and freely available under the New BSD License terms.

Features

Data is stored purely based on flat binary and text files, maximizing universal readability.
Automatic self-documentation, including copy-paste ready code snippets for reading the array in a number of popular data analysis environments, such as Python (without Darr), R, Julia, Scilab, Octave/Matlab, GDL/IDL, and Mathematica (see example array).
Disk-persistent array data is directly accessible through NumPy indexing and may be larger than RAM.
Easy and efficient appending of data (see example).
Supports ragged arrays.
Easy use of metadata, stored in a widely readable separate JSON text file (see example).
Many numeric types are supported: (u)int8-(u)int64, float16-float64, complex64, complex128.
Integrates easily with the Dask library for out-of-core computation on very large arrays.
Minimal dependencies, only NumPy.
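The first feature, storage based purely on flat binary and text files, can be illustrated with a short sketch that uses only the Python standard library. The file names and JSON keys below are hypothetical and are not Darr's actual on-disk layout. The point is only that raw values plus a small plain-text description are enough to read an array back without any specialized library.

```python
import json
import os
import struct
import tempfile

# Hypothetical layout, NOT Darr's actual file names or JSON keys.
folder = tempfile.mkdtemp()
values_path = os.path.join(folder, "values.bin")
descr_path = os.path.join(folder, "description.json")

values = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
description = {"dtype": "float64", "byteorder": "little", "shape": [2, 3]}

# Write raw little-endian float64 values and a plain-text description.
with open(values_path, "wb") as f:
    f.write(struct.pack("<6d", *values))
with open(descr_path, "w") as f:
    json.dump(description, f)


def read_flat_array(values_path, descr_path):
    """Read a 2-D float64 array back as nested lists, guided by the description."""
    with open(descr_path) as f:
        d = json.load(f)
    if d["dtype"] != "float64":
        raise ValueError("this sketch only handles float64")
    endian = "<" if d["byteorder"] == "little" else ">"
    nrows, ncols = d["shape"]
    with open(values_path, "rb") as f:
        flat = struct.unpack(f"{endian}{nrows * ncols}d", f.read())
    # Row-major (C order) reshape into a list of rows.
    return [list(flat[r * ncols:(r + 1) * ncols]) for r in range(nrows)]


restored = read_flat_array(values_path, descr_path)
```

Any environment that can read raw bytes and parse a few lines of text could do the same, which is the rationale for avoiding a specialized container format.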

Limitations

No structured (record) arrays supported yet, just ndarrays.
No string data, just numeric.
No compression, although compression for archiving purposes is supported.
Uses multiple files per array, as binary data is separated from text documentation and metadata. This can be a disadvantage in terms of storage space if you have very many very small arrays.

Installation

Darr officially requires Python 3.9 or higher. Older versions may work (probably >= 3.6) but are not tested.

Install Darr from PyPI:

$ pip install darr

Or, install Darr via conda:

$ conda install -c conda-forge darr

To install the latest development version, use pip with the latest GitHub master:

$ pip install git+https://github.com/gbeckers/darr@master

Documentation

See the documentation for more information.

Contributing

Any help / suggestions / ideas / contributions are welcome and very much appreciated. For any comment, question, or error, please open an issue or propose a pull request.

Other interesting projects

If Darr is not exactly what you are looking for, have a look at these projects:

Release history

0.6.3 (May 26, 2025), this version
0.6.1 (May 26, 2025)
0.6.0 (Nov 7, 2024)
0.5.5 (Sep 20, 2023)
0.5.4 (Jul 12, 2022)
0.5.3 (Jun 13, 2022)
0.5.2 (Jun 13, 2022)
0.5.1 (May 16, 2022)
0.5.0 (Mar 7, 2022)
0.4.1 (Feb 19, 2022)
0.4.0 (Dec 6, 2021)
0.3.3 (Jul 20, 2021)
0.3.2 (Jul 14, 2021)
0.3.1 (Mar 8, 2021)
0.2.2 (May 11, 2020)
0.2.1 (Apr 28, 2020)
0.2.0 (Oct 28, 2019)
0.1.11 (Jan 20, 2019)
0.1.10 (Dec 1, 2018)
0.1.9 (Nov 13, 2018)
0.1.8 (Nov 7, 2018)
0.1.7 (Oct 28, 2018)
0.1.6 (Oct 28, 2018)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

darr-0.6.3.tar.gz (299.0 kB)

Uploaded May 26, 2025 Source

Built Distribution

darr-0.6.3-py3-none-any.whl (43.3 kB)

Uploaded May 26, 2025 Python 3

File details

Details for the file darr-0.6.3.tar.gz.

File metadata

Download URL: darr-0.6.3.tar.gz
Upload date: May 26, 2025
Size: 299.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for darr-0.6.3.tar.gz
Algorithm	Hash digest
SHA256	`9daf1b15737352815e5e55ac27b60b5eabb81867a1f8424510924af7657d31af`
MD5	`d92524ccb30cb65971db38188ae77dd8`
BLAKE2b-256	`ce47fef78a9712e75050e54805a85a8ba6a4a37fd3abfd19d3f348cb553ec7c0`

See more details on using hashes here.

File details

Details for the file darr-0.6.3-py3-none-any.whl.

File metadata

Download URL: darr-0.6.3-py3-none-any.whl
Upload date: May 26, 2025
Size: 43.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for darr-0.6.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`efbdf7d5fb37777ce009ccf7181777df3e74e5f68512c1e586df27638499d0d5`
MD5	`b472ab5eeec76a2b26bf2280e9e3cc09`
BLAKE2b-256	`35550f143871475cc2333d239a91024aec2f45b272a8ef11e01144b5c92739fc`

See more details on using hashes here.
