An intake plugin for parsing an ESM (Earth System Model) Collection/catalog and loading assets (netCDF files and/or Zarr stores) into xarray datasets.
Intake-esm
Motivation
Project efforts such as the Coupled Model Intercomparison Project (CMIP) and the Community Earth System Model (CESM) Large Ensemble Project produce a huge amount of climate data persisted on tape, disk storage, and object storage components across many (on the order of 300,000) data assets. These data assets are stored in netCDF and, more recently, Zarr formats. Finding, investigating, and loading these assets into data array containers such as xarray can be a daunting task due to the large number of files a user may be interested in. Intake-esm aims to address these issues by providing the necessary functionality for searching, discovery, and data access/loading.
Overview
intake-esm is a data cataloging utility built on top of intake, pandas, and xarray, and it’s pretty awesome!
Opening an ESM collection definition file: An ESM (Earth System Model) collection file is a JSON file that conforms to the ESM Collection Specification. When provided a link/path to an ESM collection file, intake-esm establishes a link to a database (CSV file) that contains data asset locations and associated metadata (e.g., which experiment and model they come from). The collection JSON file can be stored on a local filesystem or hosted on a remote server.
>>> import intake
>>> col_url = "https://raw.githubusercontent.com/NCAR/intake-esm-datastore/master/catalogs/pangeo-cmip6.json"
>>> col = intake.open_esm_datastore(col_url)
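To make the relationship between the collection JSON file and its CSV database concrete, here is a minimal sketch using only the Python standard library. The field names (`catalog_file`, `attributes`, `assets`, `column_name`) reflect the ESM Collection Specification as we understand it and should be treated as assumptions; the collection id, column names, and paths are hypothetical. This is not how intake-esm itself reads a collection, only an illustration of the layout.

```python
import csv
import io
import json

# Hypothetical collection definition. Field names are assumed to follow the
# ESM Collection Specification; all values are made up for illustration.
collection_json = """
{
  "esmcat_version": "0.1.0",
  "id": "example-collection",
  "description": "Toy collection for illustration only",
  "catalog_file": "example-catalog.csv",
  "attributes": [
    {"column_name": "experiment_id"},
    {"column_name": "variable_id"}
  ],
  "assets": {"column_name": "path", "format": "zarr"}
}
"""

# Hypothetical CSV database: one row per data asset, with metadata columns
# plus a column holding the asset location.
catalog_csv = """experiment_id,variable_id,path
historical,o2,/data/historical/o2.zarr
ssp585,o2,/data/ssp585/o2.zarr
"""

collection = json.loads(collection_json)
rows = list(csv.DictReader(io.StringIO(catalog_csv)))

# The JSON tells us which CSV column holds asset locations and which
# columns carry descriptive metadata.
asset_column = collection["assets"]["column_name"]
metadata_columns = [attr["column_name"] for attr in collection["attributes"]]

for row in rows:
    metadata = {name: row[name] for name in metadata_columns}
    print(row[asset_column], metadata)
```

In a real collection the `catalog_file` entry would point to a CSV file on disk or at a URL rather than an inline string.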
Search and Discovery: intake-esm provides functionality to execute queries against the database:
>>> cat = col.search(experiment_id=['historical', 'ssp585'], table_id='Oyr',
...                  variable_id='o2', grid_label='gn')
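Conceptually, a query like the one above keeps the catalog rows whose columns match every keyword, where a list of values means "any of these". The sketch below illustrates that idea on plain dictionaries; the `search` helper and the sample rows are hypothetical and are not intake-esm's actual implementation.

```python
def search(rows, **query):
    """Return rows where every queried column matches one of the wanted values."""
    def matches(row):
        for column, wanted in query.items():
            # A single string is treated as a one-element list of values.
            values = [wanted] if isinstance(wanted, str) else list(wanted)
            if row.get(column) not in values:
                return False
        return True

    return [row for row in rows if matches(row)]


# Hypothetical catalog rows (metadata only).
rows = [
    {"experiment_id": "historical", "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"experiment_id": "ssp585", "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"experiment_id": "ssp585", "table_id": "Oyr", "variable_id": "o2", "grid_label": "gr"},
    {"experiment_id": "piControl", "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"experiment_id": "historical", "table_id": "Amon", "variable_id": "tas", "grid_label": "gn"},
]

result = search(
    rows,
    experiment_id=["historical", "ssp585"],
    table_id="Oyr",
    variable_id="o2",
    grid_label="gn",
)

for row in result:
    print(row)
```

Only the first two rows survive: the others fail on `grid_label`, `experiment_id`, or `table_id`/`variable_id` respectively.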
Access: when the user is satisfied with the results of their query, they can ask intake-esm to load data assets (netCDF/HDF files and/or Zarr stores) into xarray datasets:
>>> dset_dict = cat.to_dataset_dict(zarr_kwargs={'consolidated': True, 'decode_times': False},
...                                 cdf_kwargs={'chunks': {}, 'decode_times': False})
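The result is a dictionary of datasets, so several assets can end up behind a single key. As a rough illustration of that grouping idea, the sketch below builds dictionary keys by joining selected metadata columns with a dot. The grouping columns, model names, and paths are hypothetical, and the key format is an assumption; in practice the collection file determines how assets are grouped, and intake-esm combines each group into an xarray dataset rather than a list of paths.

```python
from collections import defaultdict

# Hypothetical catalog rows returned by a search.
assets = [
    {"source_id": "ModelA", "experiment_id": "historical", "path": "a_hist_r1.zarr"},
    {"source_id": "ModelA", "experiment_id": "historical", "path": "a_hist_r2.zarr"},
    {"source_id": "ModelA", "experiment_id": "ssp585", "path": "a_ssp585_r1.zarr"},
    {"source_id": "ModelB", "experiment_id": "historical", "path": "b_hist_r1.zarr"},
]

# Assumed grouping columns; a real collection defines its own.
group_columns = ["source_id", "experiment_id"]

# Collect asset paths under a key built from the grouping columns.
groups = defaultdict(list)
for asset in assets:
    key = ".".join(asset[column] for column in group_columns)
    groups[key].append(asset["path"])

for key, paths in groups.items():
    print(key, paths)
```

Each key here stands in for one entry of the dataset dictionary, and each list of paths for the assets that would be combined into that entry.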
See documentation for more information.
Installation
Intake-esm can be installed from PyPI with pip:
pip install intake-esm
It is also available from conda-forge for conda installations:
conda install -c conda-forge intake-esm
Hashes for intake_esm-2020.5.21-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | f962caceec93a370671e4d31ac461a1bbed73a20d91bce5709dc1a113da684c5 |
MD5 | a7c09df10c982efbc5095ac30003a089 |
BLAKE2b-256 | adafc3e828933d10fb06383ed6a8fac77fb615f25feb41c5bfdc519c2e4a9e10 |