Interface for using cubed with xarray for parallel computation.
Note: this is a proof-of-concept, and many things are incomplete, untested, or don't work.
cubed-xarray
Interface for using cubed with xarray.
Requirements
- Cubed version >=0.23.0
- Xarray version >=2024.09.0
Installation
Install via pip
pip install cubed-xarray
or conda
conda install -c conda-forge cubed-xarray
Importing
You don't need to import this package in user code. Once properly installed, xarray should automatically become aware of this package via the magic of entrypoints.
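If you want to confirm that the registration happened, one option is to list the entry points that xarray scans for chunk managers. The sketch below uses only the standard library. It assumes the entry-point group is named xarray.chunkmanagers and that this package registers itself there under the name cubed; neither name is stated in this README, so treat the result as a diagnostic hint rather than a guarantee.

```python
from importlib.metadata import entry_points

# Assumption: the entry-point group xarray reads chunk managers from.
CHUNKMANAGER_GROUP = "xarray.chunkmanagers"


def registered_chunkmanagers(group=CHUNKMANAGER_GROUP):
    """Return {name: "module:object"} for every entry point in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a .select() method.
        selected = eps.select(group=group)
    else:
        # Older Pythons return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


managers = registered_chunkmanagers()
# Assumption: cubed-xarray registers under the name "cubed".
print("cubed chunk manager registered:", "cubed" in managers)
```

If the printed value is False after installation, the package metadata is probably not visible to the Python environment you are running in.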
Usage
Xarray objects backed by cubed arrays can be created either by:
1. Passing existing cubed.Array objects to the data argument of xarray constructors,
2. Calling .chunk on xarray objects,
3. Passing a chunks argument to xarray.open_dataset.
In (2) and (3) the choice to use cubed.Array instead of dask.array.Array is made by passing the keyword argument chunked_array_type='cubed'.
To pass arguments to the constructor of cubed.Array you should pass them via the dictionary from_array_kwargs, e.g. from_array_kwargs={'spec': cubed.Spec(allowed_mem='2GB')}.
If cubed and cubed-xarray are installed but dask is not, then specifying chunked_array_type is not necessary,
as the entrypoints system will then default to the only chunked parallel backend available (i.e. cubed).
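The three routes can be sketched as follows. This is a minimal illustration, not a tested recipe: the helper names (cubed_chunk_kwargs, build_examples, open_with_cubed), the array shapes, the chunk sizes and the path argument are all hypothetical, and the imports are kept inside the functions so that the module loads even when the optional packages are missing.

```python
def cubed_chunk_kwargs(spec):
    """Keyword arguments asking xarray to chunk with cubed, configured by `spec`."""
    return {
        "chunked_array_type": "cubed",
        "from_array_kwargs": {"spec": spec},
    }


def build_examples():
    """Routes (1) and (2); needs cubed, xarray and cubed-xarray installed."""
    import cubed
    import cubed.array_api as xp
    import numpy as np
    import xarray as xr

    spec = cubed.Spec(allowed_mem="2GB")

    # (1) Pass an existing cubed array to an xarray constructor.
    arr = xp.ones((100, 100), chunks=(50, 50), spec=spec)
    from_constructor = xr.DataArray(arr, dims=("x", "y"))

    # (2) Call .chunk on an in-memory xarray object.
    in_memory = xr.DataArray(np.ones((100, 100)), dims=("x", "y"))
    from_chunk = in_memory.chunk({"x": 50, "y": 50}, **cubed_chunk_kwargs(spec))

    return from_constructor, from_chunk


def open_with_cubed(path, spec):
    """Route (3); `path` is a hypothetical dataset location supplied by the caller."""
    import xarray as xr

    # chunks={} asks xarray to use the chunking the backend prefers.
    return xr.open_dataset(path, chunks={}, **cubed_chunk_kwargs(spec))
```

Keeping the two keyword arguments in one helper makes it harder to pass a Spec without also selecting cubed, which matters when dask is installed alongside it.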
Sharp Edges 🔪
Some things almost certainly won't work yet:
- Certain operations called in xarray but not implemented in cubed, for instance pad (see https://github.com/tomwhite/cubed/issues/193)
- Array operations involving NaNs - for now use skipna=True to avoid eager loading (see https://github.com/pydata/xarray/issues/7243)
- Using parallel=True with xr.open_mfdataset won't work because cubed doesn't implement a version of dask.Delayed (see https://github.com/pydata/xarray/issues/7810)
- Groupby (see https://github.com/tomwhite/cubed/issues/223 and https://github.com/xarray-contrib/flox/issues/224)
- xarray.map_blocks does not actually dispatch to cubed.map_blocks yet, and will always use Dask.
- Certain operations using cumreduction (e.g. ffill and bfill) are not hooked up to the ChunkManager yet, so will attempt to call dask.
and some other things might work but have not yet been tried:
- Saving to formats other than zarr
In general a bug could take the form of an error, or of a silent attempt to coerce the array type to numpy by immediately computing the underlying array.
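Because the silent failure mode leaves no error to catch, it can help to check what type of array an object is holding after each step. The sketch below is one possible guard, with hypothetical helper names. It relies on the .data attribute of a DataArray or Variable returning the wrapped array without computing it, and it assumes cubed's array class lives somewhere under the cubed package. A stand-in object is used here so the snippet runs without xarray installed.

```python
from types import SimpleNamespace


def backing_array_package(obj):
    """Top-level package of the array held in obj.data (not computed)."""
    return type(obj.data).__module__.split(".")[0]


def assert_backed_by(obj, package="cubed"):
    """Raise if obj.data is not an array type from `package`."""
    found = backing_array_package(obj)
    if found != package:
        raise TypeError(f"expected a {package}-backed array, found {found}")


# Stand-in for an xarray object whose data was eagerly coerced to a
# plain Python list; a real check would pass a DataArray instead.
coerced = SimpleNamespace(data=[1.0, 2.0, 3.0])
print(backing_array_package(coerced))  # prints "builtins"

try:
    assert_backed_by(coerced)
except TypeError as err:
    print("coercion detected:", err)
```

With a real DataArray, a result of numpy where cubed was expected would suggest that an operation computed the array eagerly.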
Tests
Integration tests for wrapping cubed with xarray also live in this repository.