
A command-line tool wrapping the blosc2-grok JPEG2K compressor for HDF5 tomographic data.


Tomocompress

A simple-to-use tool for compressing tomographic HDF5 files (developed with Python 3.12).

Installing the package provides a command-line executable named tomocompress. It wraps the Blosc2 Grok JPEG2K compressor to apply lossy compression to raw tomography NeXus/HDF5 files. This work was financed by the LEAPS-INNOV project.

Install Tomocompress

# Optionally, create a dedicated conda env with Python 3.12
$ conda create -n tomocompress python=3.12
$ conda activate tomocompress

# Install the latest wheel from PyPI
$ pip install tomocompress

# It installs a command-line tool called tomocompress

Run Tomocompress

# Activate your conda env if needed
$ conda activate tomocompress

# Run it!
$ tomocompress myfile.h5

# More options
$ tomocompress --help

# Result
A file called compressed_grok_myfile.h5 is created next to the input HDF5 file.

# Examples

## If your dataset is called 'something' rather than 'data' in your HDF5 hierarchy
$ tomocompress myfile.h5 -d something

## Specify more than one dataset to compress (comma-separated)
## note: the program searches the HDF5 hierarchy for these dataset names,
## so you don't have to enter their full paths
$ tomocompress myfile.h5 -d "data,dark,flat"

## Specify a target compression ratio of 10 (default 4)
$ tomocompress myfile.h5 -c 10
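The name-based lookup mentioned in the examples above can be pictured with a small sketch. It is only an illustration, not the tool's actual implementation: a nested dict stands in for the HDF5 group hierarchy, and both `find_datasets` and the sample layout are hypothetical. On a real file, a similar traversal could be written with h5py's `Group.visititems`.

```python
from typing import Any, Dict, List


def find_datasets(tree: Dict[str, Any], name: str, prefix: str = "") -> List[str]:
    """Return the full paths of every leaf called `name` in a nested mapping.

    Dict values play the role of HDF5 groups; any other value is a dataset.
    """
    matches: List[str] = []
    for key, value in tree.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            # Recurse into the "group"
            matches.extend(find_datasets(value, name, path))
        elif key == name:
            matches.append(path)
    return matches


# Hypothetical layout, loosely resembling a NeXus tomography file
layout = {
    "entry": {
        "instrument": {
            "detector": {"data": "placeholder", "dark": "placeholder", "flat": "placeholder"},
        },
        "title": "scan_001",
    }
}

# Same comma-separated form as the -d option
requested = "data,dark,flat".split(",")
resolved = {name: find_datasets(layout, name) for name in requested}
```

A name that occurs in several groups yields several paths in this sketch; how tomocompress itself handles that case is not covered here.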

Output file

By default, the compressed file is created in the same directory as the original file, under a modified name (compressed_grok_myfile.h5 for myfile.h5). You can change this behaviour by specifying either a path to a directory or a full file path:

$ tomocompress myfile.h5 -o /some/other/path/compressed.h5

# When only a directory is specified, the output file name is derived from the name of the original file
$ tomocompress myfile.h5 -o /some/other/path
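When the tool is driven from a script, the options shown so far (-d, -c, -o) can be assembled into an argument list and passed to `subprocess`. The sketch below is one possible way to do that. The helpers `build_command` and `run` are hypothetical names, and using the three options together in one call is an assumption, since the examples above only show them separately.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(input_file: str,
                  datasets: Optional[Sequence[str]] = None,
                  ratio: Optional[int] = None,
                  output: Optional[str] = None) -> List[str]:
    """Build a tomocompress argument list from the options documented above."""
    cmd = ["tomocompress", input_file]
    if datasets:
        # -d takes a comma-separated list of dataset names
        cmd += ["-d", ",".join(datasets)]
    if ratio is not None:
        # -c sets the target compression ratio
        cmd += ["-c", str(ratio)]
    if output is not None:
        # -o accepts either a directory or a full file path
        cmd += ["-o", output]
    return cmd


def run(cmd: List[str]) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


cmd = build_command("myfile.h5",
                    datasets=["data", "dark", "flat"],
                    ratio=10,
                    output="/some/other/path")
print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.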

Compressing a 3D NumPy array of images

import numpy as np

from tomocompress.compressor import Blosc2GrokCompressor

# Create a NumPy array of 100 random 512x512 8-bit images to feed in
data_array = np.random.randint(0, 256, size=(100, 512, 512), dtype=np.uint8)

# Compress it into a new HDF5 file with Blosc2 and Grok
compressor = Blosc2GrokCompressor(input_np_array=data_array,
                                  dataset_name="data",
                                  compression_ratio=2,
                                  output_file_path="2X_compressed.h5"
                                  )
compressor.compress()
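To get a feel for what a given compression ratio means for the array above, the raw size can be divided by the requested ratio. This is a back-of-the-envelope sketch only: `expected_compressed_bytes` is a hypothetical helper, the ratio is a target rather than a guarantee, and the real file also contains HDF5 metadata.

```python
import math
from typing import Sequence, Tuple


def expected_compressed_bytes(shape: Sequence[int],
                              itemsize: int,
                              compression_ratio: float) -> Tuple[int, float]:
    """Return (raw size, approximate target size) in bytes."""
    raw = math.prod(shape) * itemsize
    return raw, raw / compression_ratio


# 100 images of 512x512 pixels, 1 byte per pixel (uint8), target ratio of 2
raw, target = expected_compressed_bytes((100, 512, 512), itemsize=1, compression_ratio=2)
print(f"raw: {raw / 2**20:.1f} MiB, target: {target / 2**20:.1f} MiB")
```

Comparing this estimate with the size of the written file is a quick way to check how closely the requested ratio was met.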

Reading compressed files (Python)

Provided that the hdf5plugin and blosc2-grok Python packages are installed, it is possible to read back the written data with h5py.

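# blosc2_grok and hdf5plugin are imported for their side effects:
# they make the compression filters available when reading the file
import blosc2_grok
import h5py
import hdf5plugin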

with h5py.File("my_compressed_file.h5", "r") as h5f:
    read_data = h5f["data"][()]

See the doc and scripts folders for more resources.

Programmatic usage (Python)

import sys

from tomocompress.compressor import Blosc2GrokCompressor

# Input HDF5 tomo file to compress, taken from the command line
input_tomo_file = sys.argv[1]

# Names of the datasets to compress inside the input HDF5 file (default: data).
# If not specified, a dataset called "data" is searched for automatically in the file hierarchy
dataset_names = "data,dark,flat"

# Desired compression ratio
CR = 20

# The compressed file is written to the location given by output_file_path
grok_compressor = Blosc2GrokCompressor(input_hdf5=input_tomo_file,
                                       compression_ratio=CR,
                                       dataset_names=dataset_names,
                                       output_file_path="/some/path")
grok_compressor.compress()
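The same constructor call can be repeated over several files. The sketch below separates planning from execution so that the job list can be inspected before anything is compressed. `plan_jobs` and `run_jobs` are hypothetical helpers, the keyword arguments are the ones used in the example above, and passing a directory as output_file_path is assumed to behave like the -o option of the command-line tool.

```python
from typing import Dict, Iterable, List, Union


def plan_jobs(input_files: Iterable[str],
              output_dir: str,
              compression_ratio: int = 4,
              dataset_names: str = "data") -> List[Dict[str, Union[str, int]]]:
    """Build one set of constructor keyword arguments per input file."""
    return [
        {
            "input_hdf5": str(path),
            "compression_ratio": compression_ratio,
            "dataset_names": dataset_names,
            "output_file_path": str(output_dir),
        }
        for path in input_files
    ]


def run_jobs(jobs: List[Dict[str, Union[str, int]]]) -> None:
    """Compress every planned file in turn."""
    # Imported lazily so that planning works without tomocompress installed
    from tomocompress.compressor import Blosc2GrokCompressor

    for kwargs in jobs:
        Blosc2GrokCompressor(**kwargs).compress()


jobs = plan_jobs(["scan_001.h5", "scan_002.h5"],
                 output_dir="/some/path",
                 compression_ratio=20,
                 dataset_names="data,dark,flat")
```

Calling `run_jobs(jobs)` would then process the files one after another.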

Recommended Python version

3.12

Authors and acknowledgment

Nicolas Soler (SDM), Alba Synchrotron

License

See LICENSE.rst

Gitlab page

https://gitlab.com/alba-synchrotron/sdm/tomocompress

PyPI page

https://pypi.org/project/tomocompress/

Project status

stable


