hdfuse is a tool for quick inspection of HDF5 files
Project description
What is hdfuse?
hdfuse is a filesystem in userspace (FUSE https://github.com/libfuse/libfuse or MacFUSE https://osxfuse.github.io/) to explore data stored in hierarchical hdf5 files (see https://www.hdfgroup.org/solutions/hdf5/).
All files in the directory given as the hdfuse5 root are mirrored into the mountpoint with all hdf5 files being presented as a directories. HDF5 Groups within the files are mapped into further directories, data arrays are files, and attributes are shown as extended filesystem attributes. Non-hdf5 files and directories are just mirrored unchanged.
The main purpose is to enable a quick access to the contents of hdf5 files with normal system tools (grep, find, cat, file browser, etc.).
What do I need to run hdfuse?
- python (https://www.python.org/)
- h5py (https://www.h5py.org/)
- fusepy (https://github.com/fusepy/fusepy)
- FUSE or MacFUSE support in you operating system and the required permissions to use it. That rules out any Windows variants.
Most distributions have a 'fuse' user group you need to be a member of in order to be allowed to use fuse (and therefore hdfuse5).
What do I need to do?
$ pip install hdfuse
# This will put you inside the editor (by default Vim), browsing
# the directory that was mounted with `-d`
$ hdfuse -d directory_containing_nxs_file
To go back to a shell from Vim you can do :sh
. The directory path should be
visible just above where the new shell starts. If you cd
to it then you can
browse it like a normal directory:
$ # what's in the mountpoint now? a directory for the file!
$ ls -l tmppath
total 0
drwxr-xr-x 1 user user 14704 2011-06-17 16:55 NXtest.h5
$ # explore the structure
$ find tmppath -ls
1 0 drwxr-xr-x 2 user user 4096 Jun 17 16:55 tmppath
2 0 drwxr-xr-x 1 user user 14704 Jun 17 16:55 tmppath/NXtest.h5
3 0 drwxr-xr-x 1 user user 0 Jun 17 16:55 tmppath/NXtest.h5/entry
4 0 -rw-r--r-- 1 user user 10 Jun 17 16:55 tmppath/NXtest.h5/entry/ch_data
5 0 drwxr-xr-x 1 user user 0 Jun 17 16:55 tmppath/NXtest.h5/entry/data
6 0 -rw-r--r-- 1 user user 8000 Jun 17 16:55 tmppath/NXtest.h5/entry/data/comp_data
7 0 -rw-r--r-- 1 user user 28 Jun 17 16:55 tmppath/NXtest.h5/entry/data/flush_data
8 0 -rw-r--r-- 1 user user 160 Jun 17 16:55 tmppath/NXtest.h5/entry/data/r8_data
9 0 -rw-r--r-- 1 user user 4 Jun 17 16:55 tmppath/NXtest.h5/entry/i1_data
10 0 -rw-r--r-- 1 user user 8 Jun 17 16:55 tmppath/NXtest.h5/entry/i2_data
11 0 -rw-r--r-- 1 user user 16 Jun 17 16:55 tmppath/NXtest.h5/entry/i4_data
12 0 -rw-r--r-- 1 user user 32 Jun 17 16:55 tmppath/NXtest.h5/entry/i8_data
13 0 -rw-r--r-- 1 user user 80 Jun 17 16:55 tmppath/NXtest.h5/entry/r4_data
14 0 -rw-r--r-- 1 user user 160 Jun 17 16:55 tmppath/NXtest.h5/entry/r8_data
15 0 drwxr-xr-x 1 user user 0 Jun 17 16:55 tmppath/NXtest.h5/entry/sample
16 0 -rw-r--r-- 1 user user 20 Jun 17 16:55 tmppath/NXtest.h5/entry/sample/ch_data
17 0 drwxr-xr-x 1 user user 0 Jun 17 16:55 tmppath/NXtest.h5/link
18 0 -rw-r--r-- 1 user user 160 Jun 17 16:55 tmppath/NXtest.h5/link/renLinkData
19 0 drwxr-xr-x 1 user user 0 Jun 17 16:55 tmppath/NXtest.h5/link/renLinkGroup
20 0 -rw-r--r-- 1 user user 20 Jun 17 16:55 tmppath/NXtest.h5/link/renLinkGroup/ch_data
21 0 drwxr-xr-x 1 user user 0 Jun 17 16:55 tmppath/NXtest.h5/link/sample
22 0 -rw-r--r-- 1 user user 20 Jun 17 16:55 tmppath/NXtest.h5/link/sample/ch_data
$ # find out the attributes of a dataset
$ getfattr -d tmppath/NXtest.h5/entry/data/comp_data
# file: tmppath/NXtest.h5/entry/data/comp_data
user.dtype="int32"
user.dtype.itemsize="4"
user.itemsize="4"
user.ndim="2"
user.shape="(100, 20)"
user.size="2000"
$ # explore the data (see "man od")
$ od -t dI -w4 -v tmppath/NXtest.h5/entry/data/comp_data | head -30
0000000 0
0000004 0
0000010 0
0000014 0
0000020 0
0000024 0
0000030 0
0000034 0
0000040 0
0000044 0
0000050 0
0000054 0
0000060 0
0000064 0
0000070 0
0000074 0
0000100 0
0000104 0
0000110 0
0000114 0
0000120 1
0000124 1
0000130 1
0000134 1
0000140 1
0000144 1
0000150 1
0000154 1
0000160 1
0000164 1
$ grep -ar sample tmppath # find some sample information
mnt/NXtest.h5/entry/sample/ch_data:NeXus sample
mnt/NXtest.h5/link/renLinkGroup/ch_data:NeXus sample
mnt/NXtest.h5/link/sample/ch_data:NeXus sample
After you exit the shell and Vim the file will be automatically closed.
Random other questions
What kind of performance can I expect?
None. hdfuse was coded as a proof of principle. It holds no state, there is no caching, no optimisations whatsoever. This left the code quite simple and concise but still proved to be good enough for the main application: quick inspection of the file structure of a limited number of files.
It is not recommended to run this on a directory with hundreds of HDF5 files.
Why do I have to mount a directory? Why can't I mount a single file?
This would be simple to support. The motivation at the time was to be able explore to multiple files easily with "grep" or similar tools.
Can I have write support?
It is best to use h5py
directly for this, that way you can be more
confident that the data is correct.
Any drawbacks?
hdf5 files cannot be opened "normally" by HDF5 aware tools in the tree mounted via hdfuse5, because to the operating system they look like directories.
Any future plans?
Links in hdf5 could be presented as symbolic link in the mounted directory. Pull requests welcome.
Another idea would get to present data that could be interpreted as image data in the hdf5 in a format that could be read by standard image file readers, e.g. as tiff.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file hdfuse-1.0.2.tar.gz
.
File metadata
- Download URL: hdfuse-1.0.2.tar.gz
- Upload date:
- Size: 6.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3.post20200325 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 33a5c5d95014c89d8dd4df04f22a6184c53da92b5f4b022c4c20b8dc85eae267 |
|
MD5 | a34aa8816903ca85f9f4a9ffc21d2469 |
|
BLAKE2b-256 | 77f2e38edcbb18867632e94081eb70fd5c129cd3af88624b14a6ce5179793618 |
File details
Details for the file hdfuse-1.0.2-py3-none-any.whl
.
File metadata
- Download URL: hdfuse-1.0.2-py3-none-any.whl
- Upload date:
- Size: 7.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3.post20200325 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5ff315711c6176053183ff1e05b75c26fa5e8a023900bc9ae960672240e3d554 |
|
MD5 | 5aeb690c96f97a3b68013acbbfd5682e |
|
BLAKE2b-256 | 05cfa74b857599fcda83c6c49421562d3ae7ef90720e299a03c6ae19a7c2d6b3 |