A command-line tool wrapping the blosc2-grok JPEG2K compressor for HDF5 tomographic data.

Tomocompress

Tomocompress is a simple-to-use compression tool for tomographic HDF5 files (developed with Python 3.12). It installs a command-line executable named tomocompress, a wrapper around the Blosc2 Grok JPEG2K (JPEG 2000) codec that compresses raw tomography NeXus/HDF5 data in a lossy fashion.

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 101004728.

LEAPS-INNOV project

Install Tomocompress

# Optionally, create a dedicated conda env with Python 3.12
$ conda create -n tomocompress python=3.12
$ conda activate tomocompress

# Install the latest wheel from PyPI
$ pip install tomocompress

# It installs a command-line tool called tomocompress

Run Tomocompress

# Activate your conda env if needed
$ conda activate tomocompress

# Run it!
$ tomocompress myfile.h5

# More options
$ tomocompress --help

# Result
A file called compressed_grok_myfile.h5 is created next to the input HDF5 file.

# Examples

## If your dataset is not called 'data' but 'something' in your HDF5 hierarchy
$ tomocompress myfile.h5 -d something

## Specify more than one dataset to compress (comma-separated)
## note: the program searches the HDF5 hierarchy for these dataset names,
## so you don't have to enter their full paths
$ tomocompress myfile.h5 -d "data,dark,flat"

## Specify a target compression ratio of 10 (default 4)
$ tomocompress myfile.h5 -c 10
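
When several files need the same options, the flags shown above can be assembled from Python. The following is a minimal sketch, not part of tomocompress: build_command and compress_all are hypothetical helpers that use only the -d, -c and -o options documented in this README, and they assume the tomocompress executable is on the PATH.

```python
import subprocess
from pathlib import Path


def build_command(input_file, datasets=None, ratio=None, output=None):
    """Assemble a tomocompress command line (hypothetical helper)."""
    cmd = ["tomocompress", str(input_file)]
    if datasets:
        # -d takes a comma-separated list of dataset names
        cmd += ["-d", ",".join(datasets)]
    if ratio is not None:
        # -c sets the target compression ratio
        cmd += ["-c", str(ratio)]
    if output is not None:
        # -o accepts a directory or a full file path
        cmd += ["-o", str(output)]
    return cmd


def compress_all(folder, **options):
    """Run tomocompress on every .h5 file of a folder, one at a time."""
    for h5_file in sorted(Path(folder).glob("*.h5")):
        # check=True raises CalledProcessError if the tool exits with an error
        subprocess.run(build_command(h5_file, **options), check=True)
```

For instance, compress_all("/some/folder", datasets=["data", "dark", "flat"], ratio=10) would run the tool on each .h5 file found in that folder with the same settings.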

Output file

By default, a compressed file bearing a suffix is created in the same directory as the original file. You can change this behaviour with the -o option by specifying either a path to a directory or a full file path:

$ tomocompress myfile.h5 -o /some/other/path/compressed.h5

# If only a directory is specified, a suffix is appended to the name of the original file
$ tomocompress myfile.h5 -o /some/other/path
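
The rule described above can be sketched in a few lines. This is an illustration of the documented behaviour, not the tool's actual code: resolve_output_path is a hypothetical function, and the "compressed_grok_" naming is an assumption taken from the default output name shown in the Result section.

```python
from pathlib import Path

# Assumption: taken from the default output name "compressed_grok_myfile.h5";
# the real tool may derive file names differently.
ASSUMED_MARKER = "compressed_grok_"


def resolve_output_path(input_file, output=None):
    """Return where the compressed file would be written (hypothetical sketch)."""
    input_path = Path(input_file)
    derived_name = ASSUMED_MARKER + input_path.name
    if output is None:
        # Default: next to the input file
        return input_path.with_name(derived_name)
    output_path = Path(output)
    if output_path.is_dir():
        # An existing directory was given: keep a name derived from the input
        return output_path / derived_name
    # Otherwise treat the argument as the full output file path
    return output_path
```

Note that this sketch only recognises a directory that already exists; how tomocompress itself handles a missing directory is not stated here.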

Reading compressed files (Python)

Provided that the hdf5plugin and blosc2-grok Python packages are installed, it is possible to read back the written data with h5py.

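import blosc2_grok
import h5py
import hdf5plugin  # importing hdf5plugin registers its compression filters with h5py

with h5py.File("my_compressed_file.h5", "r") as h5f:
    # [()] reads the whole dataset into memory as a NumPy array
    read_data = h5f["data"][()]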
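
Because the compression is lossy, the values read back will generally differ slightly from the original ones. Below is a minimal sketch for quantifying that difference, assuming both versions are available as NumPy arrays of the same shape. reconstruction_error is a hypothetical helper, not part of tomocompress, and taking the value range of the original data as the PSNR peak is a choice made for this example.

```python
import numpy as np


def reconstruction_error(original, decompressed):
    """Return (max absolute error, PSNR in dB) between two arrays."""
    # Work in float64 so unsigned integer data cannot wrap around
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(decompressed, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("arrays must have the same shape")
    diff = a - b
    max_abs = float(np.max(np.abs(diff)))
    mse = float(np.mean(diff ** 2))
    # Assumption: the value range of the original data is used as the peak
    peak = float(a.max() - a.min())
    if mse == 0.0:
        psnr = float("inf")  # identical arrays
    elif peak == 0.0:
        psnr = float("-inf")  # constant original, non-zero error
    else:
        psnr = float(10.0 * np.log10(peak ** 2 / mse))
    return max_abs, psnr
```

It could be called as reconstruction_error(original_array, read_data), where original_array holds the same dataset read from the uncompressed file.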

See the doc and scripts folders for more resources.

Programmatic usage (Python):

import sys

from tomocompress.compressor import Blosc2GrokCompressor

# Input HDF5 tomo file to compress (first command-line argument)
input_tomo_file = sys.argv[1]

# Name(s) of the dataset(s) to compress inside the input HDF5 file (default: "data")
# If not specified, a dataset called "data" is searched for automatically in the file hierarchy
dataset_names = "data,dark,flat"

# Desired compression ratio
CR = 20

# The compressed file is written according to output_file_path
grok_compressor = Blosc2GrokCompressor(input_hdf5=input_tomo_file,
                                       compression_ratio=CR,
                                       dataset_name=dataset_names,
                                       output_file_path="/some/path")
grok_compressor.compress()
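
Once compress() has returned, one rough way to see what was achieved is to compare file sizes on disk. The sketch below is a simple check rather than a tomocompress feature: file_size_ratio is a hypothetical helper, and the whole-file ratio it reports may differ from the requested ratio, since metadata and any datasets left uncompressed are also counted.

```python
import os


def file_size_ratio(original_file, compressed_file):
    """Return the original size divided by the compressed size (both in bytes)."""
    original_size = os.path.getsize(original_file)
    compressed_size = os.path.getsize(compressed_file)
    if compressed_size == 0:
        raise ValueError("compressed file is empty")
    return original_size / compressed_size
```

A value close to the requested compression ratio suggests the target was reached for the bulk of the data.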

Compressing a 3D NumPy array of images instead of working with an input HDF5 file

import numpy as np

from tomocompress.compressor import Blosc2GrokCompressor

# Create a NumPy array of 100 images (512 x 512 pixels, 8-bit) to feed in
data_array = np.random.randint(0, 256, size=(100, 512, 512), dtype=np.uint8)

# Compress it into a new HDF5 file with Blosc2 and Grok
compressor = Blosc2GrokCompressor(input_np_array=data_array,
                                  dataset_name="data",
                                  compression_ratio=2,
                                  output_file_path="2X_compressed.h5")
compressor.compress()

Recommended Python version

3.12

Authors and acknowledgment

Nicolas Soler (SDM) Alba Synchrotron

License

See LICENSE.rst

Gitlab page

https://gitlab.com/alba-synchrotron/sdm/tomocompress

PyPI page

https://pypi.org/project/tomocompress/

Project status

stable

Download files

Built Distribution

tomocompress-0.2.5-py3-none-any.whl (15.4 kB)

File metadata

  • Download URL: tomocompress-0.2.5-py3-none-any.whl
  • Upload date:
  • Size: 15.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.9.2 Linux/5.10.0-36-amd64

File hashes

Hashes for tomocompress-0.2.5-py3-none-any.whl

Algorithm    Hash digest
SHA256       68ac82669ca9d214b9b455f85ec6d99d51191700a3fcdb20813a3fd7ccde9198
MD5          252c32bdc7a72f36a86d737fc6e14561
BLAKE2b-256  570629eb4f334b28a4be98dfe82dc2ee84ba8c289e11776a853f611ee3cdb44b
