Python client for the hdfstream service.

Python client module for the hdfstream HDF5 streaming service

This module provides access to HDF5 files stored on a remote server, which streams their contents in MessagePack format. It replicates part of the h5py high-level interface.

The source code and issue tracker are hosted on GitHub: https://github.com/jchelly/hdfstream-python

Releases are hosted on PyPI: https://pypi.org/project/hdfstream/

For documentation see: https://hdfstream-python.readthedocs.io/en/latest

Installation

The module can be installed using pip:

pip install hdfstream

Quick start

Connecting to the server

You can connect to the server as follows:

import hdfstream
root = hdfstream.open("https://localhost:8443/hdfstream", "/")

Here, the first parameter is the server URL and the second is the name of the directory to open. This returns a RemoteDirectory object.

Opening a file

The RemoteDirectory behaves like a Python dictionary whose keys are the names of the files and subdirectories within the directory. A file or subdirectory can be opened like this:

# Open an HDF5 file
snap_file = root["EAGLE/Fiducial_models/RefL0012N0188/snapshot_028_z000p000/snap_028_z000p000.0.hdf5"]

which opens the specified file and returns a RemoteFile object.
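
Since the directory is described as dictionary-like, its entries can presumably be enumerated with ordinary mapping operations. The following is a minimal sketch under that assumption: `list_entries` is a hypothetical helper, not part of hdfstream, and a plain dict with made-up contents stands in for the RemoteDirectory so that the example runs without a server.

```python
def list_entries(directory):
    """Return the sorted entry names of a dictionary-like directory.

    Assumption: iterating over the object yields its keys, as with a dict.
    """
    return sorted(directory)


# A plain dict stands in for a RemoteDirectory (hypothetical contents).
fake_root = {"EAGLE": {}, "README.txt": None}
for name in list_entries(fake_root):
    print(name)
```

With a live connection, `root` would take the place of `fake_root`, provided RemoteDirectory supports iteration over its keys.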

Reading datasets

The file object acts like a dictionary containing HDF5 groups and datasets, so we can read a dataset as follows:

# Read all dark matter particle positions in the file
dm_pos = snap_file["PartType1/Coordinates"][...]

or if we only want to download part of the dataset:

# Read the first 100 dark matter particle positions
dm_pos = snap_file["PartType1/Coordinates"][:100,:]
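
Partial reads like the one above could also be used to process a large dataset in pieces rather than downloading it in one request. The sketch below is illustrative only: `iter_chunks` and `FakeDataset` are hypothetical names, and the code assumes the remote dataset exposes an h5py-style `shape` attribute and accepts `dataset[start:stop]` slices. A small in-memory stand-in replaces the remote dataset so the example is self-contained.

```python
def iter_chunks(dataset, chunk_size):
    """Yield successive slices of at most chunk_size rows along the first axis.

    Assumption: the dataset has an h5py-style .shape tuple and supports
    slicing with dataset[start:stop].
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    n_rows = dataset.shape[0]
    for start in range(0, n_rows, chunk_size):
        stop = min(start + chunk_size, n_rows)
        yield dataset[start:stop]


class FakeDataset:
    """Minimal stand-in for a remote dataset, backed by a list of rows."""

    def __init__(self, rows):
        self._rows = rows
        self.shape = (len(rows), len(rows[0]) if rows else 0)

    def __getitem__(self, key):
        return self._rows[key]


# 250 made-up positions, read in chunks of up to 100 rows each.
coords = FakeDataset([[float(i), 0.0, 0.0] for i in range(250)])
sizes = [len(chunk) for chunk in iter_chunks(coords, 100)]
print(sizes)
```

Each iteration issues one slice request, so the chunk size trades the number of requests against the memory needed per chunk.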

HDF5 attributes can be accessed using the attrs field of group and dataset objects:

print(snap_file["Header"].attrs)
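
If the `attrs` field behaves as a mapping, as it does in h5py, the attributes can be copied into an ordinary dict for later use. This is a hedged sketch: `attrs_to_dict` is a hypothetical helper, and the stand-in object and its attribute values are invented for illustration.

```python
from types import SimpleNamespace


def attrs_to_dict(obj):
    """Copy an object's HDF5 attributes into an ordinary dict.

    Assumption: obj.attrs is a mapping (keys() plus item lookup), as in h5py.
    """
    return dict(obj.attrs)


# Stand-in for snap_file["Header"]; names and values are placeholders.
fake_header = SimpleNamespace(attrs={"BoxSize": 1.0, "Redshift": 0.0})
header = attrs_to_dict(fake_header)
print(header["BoxSize"])
```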

Building the documentation

To build a local copy of the documentation in HTML format:

pip install sphinx sphinx-rtd-theme sphinx-autodoc-typehints
cd docs
make html

Testing

Basic unit tests are included and can be run without access to a server. The repository contains a few pre-recorded server responses and a small amount of simulation data, which are used to check that the module decodes responses correctly. To run the tests, invoke pytest in the source directory.

To regenerate the stored responses, assuming the server is available:

rm -r ./tests/cassettes/
pytest --record-mode=rewrite

Other pytest command line flags which might be useful:

  • --disable-recording: run a "live" test that ignores the stored responses and makes real HTTP requests
  • --server: specify the server URL to use in tests
  • --no-verify-cert: don't verify certificates (e.g. when testing against a local development server)
