
Hierarchical tree-like data structures for xarray


datatree


Datatree is a prototype implementation of a tree-like hierarchical data structure for xarray.

Datatree was born after the xarray team recognised a need for a new hierarchical data structure that was more flexible than a single xarray.Dataset object. The initial motivation was to represent netCDF files / Zarr stores with multiple nested groups in a single in-memory object, but datatree.DataTree objects have many other uses.

NO LONGER MAINTAINED

This repository has been archived and the code is no longer maintained!

Datatree has been merged upstream into pydata/xarray, and released as of xarray version 2024.10.0.

There will be no further bugfixes or feature additions to this repository.

Users of this repository should migrate to using xarray.DataTree instead, following the Migration Guide.

The information below is all outdated, and is left only for historical interest.

Installation

You can install datatree via pip:

pip install xarray-datatree

or via conda-forge:

conda install -c conda-forge xarray-datatree

Why Datatree?

You might want to use datatree for:

  • Organising many related datasets, e.g. results of the same experiment with different parameters, or simulations of the same system using different models,
  • Analysing similar data at multiple resolutions simultaneously, such as when doing a convergence study,
  • Comparing heterogeneous but related data, such as experimental and theoretical data,
  • I/O with nested data formats such as netCDF / Zarr groups.

Talk slides on Datatree from AMS-python 2023

Features

The approach used here is based on benbovy's DatasetNode example. The basic idea is that each tree node wraps at most one xarray.Dataset. The differences are that this effort:

  • Uses a node structure inspired by anytree for the tree,
  • Implements path-like getting and setting,
  • Has functions for mapping user-supplied functions over every node in the tree,
  • Automatically dispatches some of xarray.Dataset's API over every node in the tree (such as .isel),
  • Has a bunch of tests,
  • Has a printable representation of the tree.
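
To make the first three features concrete, here is a minimal pure-Python sketch of a node that holds at most one payload, supports path-like getting and setting, and maps a user-supplied function over every node. ToyNode and map_over_nodes are hypothetical names invented for this illustration. This is not datatree's implementation, and a plain list stands in for an xarray.Dataset.

```python
from typing import Any, Callable, Dict, Optional


class ToyNode:
    """Hypothetical stand-in for a tree node; NOT datatree's actual class."""

    def __init__(self, data: Optional[Any] = None) -> None:
        # Each node wraps at most one payload (a Dataset in datatree).
        self.data = data
        self.children: Dict[str, "ToyNode"] = {}

    def __getitem__(self, path: str) -> "ToyNode":
        # Walk down the tree one path component at a time.
        node = self
        for part in path.strip("/").split("/"):
            node = node.children[part]
        return node

    def __setitem__(self, path: str, child: "ToyNode") -> None:
        # Create empty intermediate nodes as needed, then attach the child.
        *parents, name = path.strip("/").split("/")
        node = self
        for part in parents:
            node = node.children.setdefault(part, ToyNode())
        node.children[name] = child

    def map_over_nodes(self, func: Callable[[Any], Any]) -> "ToyNode":
        # Apply func to every non-empty payload and return a new tree
        # with the same shape.
        new = ToyNode(None if self.data is None else func(self.data))
        for name, child in self.children.items():
            new.children[name] = child.map_over_nodes(func)
        return new


root = ToyNode()
root["coarse/fine"] = ToyNode(data=[1, 2, 3])
doubled = root.map_over_nodes(lambda values: [2 * v for v in values])
print(doubled["coarse/fine"].data)
```

Returning a new tree from the mapping step, rather than mutating in place, is a choice made for this sketch only.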

Get Started

You can create a DataTree object in 3 ways:

  1. Load from a netCDF file (or Zarr store) that has groups via open_datatree().
  2. Using the init method of DataTree, which creates an individual node. You can then specify the nodes' relationships to one another, either by setting the .parent and .children attributes, or through __get/setitem__ access, e.g. dt['path/to/node'] = DataTree().
  3. Create a tree from a dictionary of paths to datasets using DataTree.from_dict().
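
The third option relies on a flat mapping of paths to datasets being expanded into a hierarchy. The following pure-Python sketch illustrates that idea only. nest_paths is a hypothetical helper, nested dicts stand in for tree nodes, and strings stand in for datasets. It does not call or reproduce DataTree.from_dict().

```python
from typing import Any, Dict


def nest_paths(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Expand {"/a/b": payload} into nested {"data": ..., "children": {...}} nodes.

    Hypothetical helper for illustration; not part of datatree or xarray.
    """
    root: Dict[str, Any] = {"data": None, "children": {}}
    for path, payload in flat.items():
        node = root
        # Skip empty components so that "/" refers to the root itself.
        for part in (p for p in path.split("/") if p):
            node = node["children"].setdefault(
                part, {"data": None, "children": {}}
            )
        node["data"] = payload
    return root


# Placeholder strings stand in for xarray.Dataset objects.
tree = nest_paths({"/": "root-ds", "/run1": "ds-a", "/run1/highres": "ds-b"})
print(sorted(tree["children"]))
```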

Development Roadmap

Datatree currently lives in a separate repository to the main xarray package. This allows the datatree developers to make changes to it, experiment, and improve it faster.

Eventually we plan to fully integrate datatree upstream into xarray's main codebase, at which point the github.com/xarray-contrib/datatree repository will be archived. This should not cause much disruption to code that depends on datatree: you will likely only have to change the import line, i.e. replace "from datatree import DataTree" with "from xarray import DataTree".
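
For code that has to run against either package during a transition, one possible pattern is a guarded import that prefers the xarray location and falls back to the archived package. This is a sketch, not a recommendation from the datatree authors. DATATREE_AVAILABLE is a name invented here, and the two classes may not behave identically, so the Migration Guide remains the reference for API differences.

```python
try:
    # Preferred location: xarray releases that ship DataTree.
    from xarray import DataTree
except ImportError:
    try:
        # Fallback: the archived xarray-datatree package.
        from datatree import DataTree
    except ImportError:
        # Neither package provides DataTree in this environment.
        DataTree = None

# Hypothetical flag for downstream code to check before using DataTree.
DATATREE_AVAILABLE = DataTree is not None
```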

However, until this full integration occurs, datatree's API should not be considered to have the same level of stability as xarray's.

User Feedback

We really really really want to hear your opinions on datatree! At this point in development, user feedback is critical to help us create something that will suit everyone's needs. Please raise any thoughts, issues, suggestions or bugs, no matter how small or large, on the GitHub issue tracker.

