Python client for the hdfstream service.

Python client module for the hdfstream HDF5 streaming service

This module provides access to HDF5 files stored on a remote server, which streams their contents in MessagePack format. It replicates part of the h5py high-level interface.

The source code and issue tracker are hosted on GitHub: https://github.com/jchelly/hdfstream-python

Releases are hosted on PyPI: https://pypi.org/project/hdfstream/

For documentation see: https://hdfstream-python.readthedocs.io/en/latest

Installation

The module can be installed using pip:

pip install hdfstream

Quick start

Connecting to the server

You can connect to the server as follows:

import hdfstream
root = hdfstream.open("https://localhost:8443/hdfstream", "/")

Here, the first parameter is the server URL and the second is the name of the directory to open. This returns a RemoteDirectory object.
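To avoid hard-coding the server URL in every script, one option is to read it from the environment. The sketch below is not part of hdfstream: the HDFSTREAM_URL variable and the server_url and open_root helpers are hypothetical names, and the fallback is simply the example URL used above. Only the hdfstream.open(url, directory) call comes from this document.

```python
import os

# Example URL from the text above; replace it with your own server.
DEFAULT_URL = "https://localhost:8443/hdfstream"


def server_url(environ=os.environ):
    """Return the server URL, preferring the (hypothetical) HDFSTREAM_URL variable."""
    return environ.get("HDFSTREAM_URL", DEFAULT_URL)


def open_root(directory="/"):
    """Open a directory on the configured server and return the RemoteDirectory."""
    import hdfstream  # imported lazily so server_url() works without the package

    return hdfstream.open(server_url(), directory)
```

With this in place, root = open_root() would be equivalent to the call shown above whenever HDFSTREAM_URL is unset.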

Opening a file

The RemoteDirectory behaves like a Python dictionary whose keys are the names of the files and subdirectories within the directory. A file or subdirectory can be opened like this:

# Open an HDF5 file
snap_file = root["EAGLE/Fiducial_models/RefL0012N0188/snapshot_028_z000p000/snap_028_z000p000.0.hdf5"]

which opens the specified file and returns a RemoteFile object.
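Because directories behave like dictionaries, a directory tree can in principle be walked recursively. The following is a minimal sketch, assuming that iterating over a directory yields entry names and that every entry without a .hdf5 suffix is a subdirectory. Neither assumption is confirmed here, and find_hdf5_files is a hypothetical helper, so the example runs against nested plain dicts standing in for a RemoteDirectory.

```python
def find_hdf5_files(directory, prefix=""):
    """Yield the paths of entries whose names end in .hdf5, depth first.

    Assumes iterating over `directory` yields entry names and that
    `directory[name]` returns the entry, as with a Python dict.
    """
    for name in directory:
        path = prefix + name
        if name.endswith(".hdf5"):
            yield path
        else:
            # Assumption: anything without the .hdf5 suffix is a subdirectory.
            yield from find_hdf5_files(directory[name], path + "/")


# Stand-in for a RemoteDirectory, built from plain dicts with made-up names.
fake_root = {
    "EAGLE": {"snap_028.0.hdf5": {}, "snap_028.1.hdf5": {}},
    "notes": {},
}
files = list(find_hdf5_files(fake_root))
```

Against a real server, each subdirectory visited may trigger a separate request, so it is probably wise to start the walk from the deepest directory you need rather than from "/".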

Reading datasets

The file object acts like a dictionary containing HDF5 groups and datasets, so we can read a dataset as follows:

# Read all dark matter particle positions in the file
dm_pos = snap_file["PartType1/Coordinates"][...]

or if we only want to download part of the dataset:

# Read the first 100 dark matter particle positions
dm_pos = snap_file["PartType1/Coordinates"][:100,:]
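For datasets too large to download at once, the same slicing syntax can be used to fetch rows block by block. This is a sketch rather than a feature of the module: chunk_slices and read_in_chunks are hypothetical helpers, and the number of rows is passed in explicitly because this document does not say how a dataset reports its size. A plain list stands in for the remote dataset.

```python
def chunk_slices(n_rows, chunk_size):
    """Yield slice objects that cover range(n_rows) in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_rows, chunk_size):
        yield slice(start, min(start + chunk_size, n_rows))


def read_in_chunks(dataset, n_rows, chunk_size):
    """Yield dataset[start:stop] for successive blocks of rows."""
    for rows in chunk_slices(n_rows, chunk_size):
        yield dataset[rows]


# Stand-in for a remote dataset: a list of made-up (x, y, z) positions.
fake_positions = [(float(i), 0.0, 0.0) for i in range(250)]
block_sizes = [len(block) for block in read_in_chunks(fake_positions, 250, 100)]
```

With a real file, dataset would be snap_file["PartType1/Coordinates"], and each block would be downloaded only when the generator reaches it.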

HDF5 attributes can be accessed using the attrs field of group and dataset objects:

print(snap_file["Header"].attrs)
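If the attributes are needed as an ordinary dictionary, for example to log them or pass them to other code, they can be copied out. The sketch below assumes that attrs is a mapping from attribute names to values, as it is in h5py; whether hdfstream matches that exactly is not confirmed here. attrs_to_dict is a hypothetical helper, and the stand-in header uses made-up attribute names and values.

```python
from types import SimpleNamespace


def attrs_to_dict(obj):
    """Copy an object's HDF5 attributes into a plain dict.

    Assumes `obj.attrs` is a mapping from attribute names to values.
    """
    return {name: obj.attrs[name] for name in obj.attrs}


# Stand-in for snap_file["Header"]; the names and values are made up.
fake_header = SimpleNamespace(attrs={"BoxSize": 8.0, "Redshift": 0.0})
header = attrs_to_dict(fake_header)
```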

Building the documentation

To build a local copy of the documentation in HTML format:

pip install sphinx sphinx-rtd-theme sphinx-autodoc-typehints
cd docs
make html

Testing

The basic unit tests can be run without access to a server. The repository includes a few pre-recorded server responses and a small amount of simulation data, which are used to check that the module decodes responses correctly. To run the tests, run pytest in the source directory.

To regenerate the stored responses, assuming the server is available:

rm -r ./tests/cassettes/
pytest --record-mode=rewrite

Other pytest command line flags which might be useful:

  • --disable-recording: run a "live" test that ignores the stored responses and makes real HTTP requests
  • --server: specify the server URL to use in tests
  • --no-verify-cert: don't verify certificates (e.g. when testing against a local development server)
