

xbitinfo: Retrieve bitwise information content and compress accordingly


Xbitinfo analyses datasets based on their bitwise real information content and applies lossy compression accordingly. Because it is built on xarray, it integrates seamlessly into common research workflows. Additional convenient functions help users visualize the bitwise information content and make informed decisions on the real information threshold that is subsequently used as the preserved precision during compression.

Xbitinfo works in four steps:

  1. Analyse the bitwise information content of a dataset
  2. Decide on a threshold of real information to preserve (e.g. 99%)
  3. Reduce the precision of the dataset accordingly (bitrounding)
  4. Apply lossless compression (e.g. zlib, blosc, zstd) and store the dataset
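Steps 1 and 2 can be illustrated without any dependencies. The following is a minimal sketch, assuming that the information of a bit is measured as the mutual information between the same bit position in neighbouring float32 values. The helper names (`float32_words`, `bit_information`, `mantissa_bits_to_keep`) and the synthetic sine series are hypothetical and not part of xbitinfo, and the sketch omits any statistical filtering of insignificant information.

```python
import math
import struct


def float32_words(values):
    """Return the IEEE 754 float32 bit patterns of `values` as unsigned ints."""
    return [struct.unpack("<I", struct.pack("<f", v))[0] for v in values]


def bit_information(values, bit):
    """Mutual information (in bits) between bit `bit` of neighbouring values.

    Bit 31 is the sign, bits 30-23 the exponent, bits 22-0 the mantissa.
    """
    words = float32_words(values)
    # Joint histogram of (bit in value i, bit in value i + 1).
    counts = [[0, 0], [0, 0]]
    for left, right in zip(words, words[1:]):
        counts[(left >> bit) & 1][(right >> bit) & 1] += 1
    total = len(words) - 1
    info = 0.0
    for a in (0, 1):
        for b in (0, 1):
            joint = counts[a][b] / total
            if joint > 0.0:
                p_a = (counts[a][0] + counts[a][1]) / total
                p_b = (counts[0][b] + counts[1][b]) / total
                info += joint * math.log2(joint / (p_a * p_b))
    return info


def mantissa_bits_to_keep(values, inflevel=0.99):
    """Fewest mantissa bits whose cumulative information reaches `inflevel`."""
    # Order bits from most to least significant: sign, exponent, mantissa.
    per_bit = [bit_information(values, bit) for bit in range(31, -1, -1)]
    target = inflevel * sum(per_bit)
    running = 0.0
    for position, info in enumerate(per_bit):
        running += info
        if running >= target:
            # Discount the sign bit and the 8 exponent bits.
            return max(0, position + 1 - 9)
    return 23


# Hypothetical smooth test series standing in for one row of a gridded field.
values = [300.0 * math.sin(0.01 * i) for i in range(10_000)]
sign_info = bit_information(values, 31)
last_mantissa_info = bit_information(values, 0)
keepbits = mantissa_bits_to_keep(values, inflevel=0.99)
print(sign_info, last_mantissa_info, keepbits)
```

For smooth data, the sign and leading bits carry most of the information, while the trailing mantissa bits behave like noise and contribute almost nothing. Those trailing bits are the ones that step 3 discards.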

To fulfill these steps, Xbitinfo relies on:

  • xarray for handling multi-dimensional arrays and file formats (e.g. netcdf, zarr, hdf5, grib)
  • dask for scaling to large datasets
  • BitInformation.jl (optional) for computing the bitwise information content with the original Julia implementation. Continuous integration tests ensure, however, that the Python implementation shipped with xbitinfo produces identical results.
  • numcodecs for a wide range of lossless compression algorithms

Overall, the package presents a pipeline to compress (climate) datasets based on the real information content.
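To show why bitrounding helps the lossless compressor that follows it, here is a dependency-free sketch of steps 3 and 4 on a single float32 series. It assumes round-to-nearest (ties to even) on the float32 bit pattern and uses the standard-library `zlib` as a stand-in for the codecs listed above. The helpers `bitround_float32` and `compressed_size`, the synthetic data, and the choice of 7 mantissa bits are illustrative assumptions, not xbitinfo's API.

```python
import math
import struct
import zlib


def bitround_float32(value, keepbits):
    """Round a finite float32 to `keepbits` mantissa bits (nearest, ties to even)."""
    if not 0 <= keepbits <= 23:
        raise ValueError("float32 has 23 explicit mantissa bits")
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    drop = 23 - keepbits  # number of trailing mantissa bits to zero out
    if drop > 0:
        last_kept = (bits >> drop) & 1
        # Add just under half of the last kept bit, plus that bit for ties.
        bits += (1 << (drop - 1)) - 1 + last_kept
        # Clear the dropped bits.
        bits &= 0xFFFFFFFF & ~((1 << drop) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def compressed_size(values):
    """Size in bytes of the zlib-compressed float32 representation of `values`."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return len(zlib.compress(raw))


# Hypothetical smooth test series standing in for one row of a gridded field.
values = [300.0 * math.sin(0.01 * i) for i in range(10_000)]
rounded = [bitround_float32(v, 7) for v in values]

raw_size = compressed_size(values)
rounded_size = compressed_size(rounded)
print(f"compressed size: {raw_size} bytes -> {rounded_size} bytes")
```

The rounded values end in long runs of zero bits, which a general-purpose compressor can encode compactly. The relative error introduced by keeping 7 mantissa bits stays below 2^-7.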

How to install

Xbitinfo is packaged and distributed both via PyPI and conda-forge and can be installed via pip or conda respectively.

Depending on whether one wants to use the Julia implementation of the bitinformation algorithm (BitInformation.jl) or the native python implementation shipped with xbitinfo, one might choose one installation option over the other.

Pure-python installation (recommended)

pip install xbitinfo

or

conda install -c conda-forge xbitinfo-python

Installation including optional Julia backend

conda install -c conda-forge xbitinfo

or

pip install xbitinfo  # julia needs to be installed manually

How to use

import xarray as xr
import xbitinfo as xb

# Load example dataset
# (requires pooch to be installed via e.g. `pip install pooch`)
example_dataset = "eraint_uvz"
ds = xr.tutorial.load_dataset(example_dataset)

# Step 1: analyze bitwise information content
bitinfo = xb.get_bitinformation(ds, dim="longitude")

# Step 2: decide on a threshold of real information to preserve (e.g. 99%)
keepbits = xb.get_keepbits(
    bitinfo, inflevel=0.99
)  # get number of mantissa bits to keep for 99% real information

# Step 3: reduce the precision of the dataset accordingly (bitrounding)
ds_bitrounded = xb.xr_bitround(
    ds, keepbits
)  # bitrounding keeping only keepbits mantissa bits

# Step 4: apply lossless compression (e.g. zlib, blosc, zstd) and store the dataset
outpath = "eraint_uvz_bitrounded_compressed.nc"  # example output path; adjust as needed
ds_bitrounded.to_compressed_netcdf(outpath)

How the science works

Paper

Klöwer, M., Razinger, M., Dominguez, J. J., Düben, P. D., & Palmer, T. N. (2021). Compressing atmospheric data into its real information content. Nature Computational Science, 1(11), 713–724. doi: 10/gnm4jj

Videos

Julia Repository

BitInformation.jl

Credits
