
Python hdf5 tools

Project description

This repository contains a Python package with an H5 class to load and combine one or more HDF5 data files (or xarray Datasets), with optional filters. The class can then export the combined data to an HDF5 file, a file object, or an xr.Dataset. The class is designed to be fast and memory-safe, which means that files of any size can be combined and saved even on a PC with little memory (unlike xarray).

Installation

Using pip:

pip install hdf5tools

Or using conda/mamba from conda-forge:

conda install -c conda-forge hdf5tools

Usage

Currently, only the Combine class is recommended for external use.

First, instantiate the class with one or more of: paths to netcdf3, netcdf4, or hdf5 files; xr.Dataset objects (opened via xr.open_dataset); or h5py.File objects.

from hdf5tools import Combine

###############################
### Parameters

path1 = '/path/to/file1.nc'
path2 = '/path/to/file2.nc'

##############################
### Combine files

c1 = Combine([path1, path2])

If you want to make a selection along the coordinates, or keep only some of the data variables/coordinates, use the .sel method (as in xarray). Be sure to read the docstrings for additional info about the input parameters.

c2 = c1.sel({'time': slice('2001-01-01', '2020-01-01'), 'latitude': slice(0, 10)}, include_data_vars=['precipitation'])

And finally, save the combined data to a single hdf5/netcdf4 file using the .to_hdf5 method. The two most important parameters are output, which should be a path or an io.BytesIO object, and compression. If you plan on using the file outside of the Python environment, use gzip for compression; otherwise use lzf. The docstrings have more details.

output = '/path/to/output.nc'

c2.to_hdf5(output, compression='gzip')
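The output argument does not have to be a path on disk; an in-memory io.BytesIO buffer works as well. A minimal sketch of that pattern, written with h5py directly so it is self-contained (c2.to_hdf5 would accept the buffer in the same way):

```python
import io
import h5py
import numpy as np

# Write a small HDF5 file into an in-memory buffer instead of to disk.
buf = io.BytesIO()
with h5py.File(buf, 'w') as f:
    f.create_dataset('precipitation', data=np.arange(10.0), compression='gzip')

# The buffer now holds a complete HDF5 file and can be read back.
buf.seek(0)
with h5py.File(buf, 'r') as f:
    data = f['precipitation'][:]
```

This is handy when the result is immediately uploaded or streamed and never needs to touch the filesystem.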

Passing xr.Dataset objects to Combine is slower than passing paths to files on disk. Only pass xr.Dataset objects to Combine if you don't want to write an intermediate file to disk before reading it back in.
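For illustration, here is a small in-memory dataset of the kind you might pass to Combine; the variable and coordinate names are made up for the example:

```python
import numpy as np
import xarray as xr

# A tiny in-memory dataset; an object like this can be passed directly
# to Combine([...]) in place of a file path.
ds = xr.Dataset(
    {'precipitation': (('time', 'latitude'), np.ones((3, 4)))},
    coords={'time': np.arange(3), 'latitude': np.arange(4.0)},
)
# Hypothetical usage (requires hdf5tools): c1 = Combine([ds])
```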

The package also comes with a bonus function called xr_to_hdf5. It is a convenience function to convert a single xr.Dataset to an hdf5/netcdf4 file.

from hdf5tools import xr_to_hdf5

xr_to_hdf5(xr_dataset, output, compression='gzip')
