A lightweight scientific data management system using HDF5

BAMBOOST

Bamboost is a Python library built for data management using the HDF5 file format. bamboost stands for a lightweight shelf which will boost your efficiency and which will totally break if you load it heavily. Just kidding, bamboo can fully carry pandas.
🐼🐼🐼🐼

Explore the docs »

Terminal User Interface (TUI) · Doc site repo

Report Bug · Request Feature

[!important] Starting from version 0.10.0, bamboost breaks compatibility with previous versions. For previous versions, check out the legacy branch.

About The Project

bamboost is a Python data framework designed for managing scientific simulation data. It provides an organized model for storing, indexing, and retrieving simulation results, making it easier to work with large-scale computational studies. At its core, it is a filesystem storage model in which each simulation has its own directory and simulations are bundled in collections.

Principles

  • Independence: Any dataset must be complete and understandable on its own. You can copy or extract any of your data and distribute it without external dependencies.
  • Path redundancy: Data must be referenceable without knowledge of its path. This serves several purposes: you can share your data easily (e.g. as supplementary material for papers), and renaming directories, moving files, switching computers, etc. will not break your data references.

This leads to the following requirements:

  • Simulation parameters must be stored locally, inside the simulation directory. Crucially, not exclusively in a global database of any kind.
  • Collections must have unique identifiers that are independent of their paths.
  • Simulations must have unique identifiers that are independent of their paths.

Concept

We organize simulations in collections within structured directories. Let's consider the following directory:

test_data/
├── simulation_1/
│   ├── data.h5
│   ├── data.xdmf
│   ├── additional_file_1.txt
│   └── additional_file_2.csv
├── simulation_2/
│   ├── data.h5
│   └── additional_file_3.txt
└── .bamboost-collection-ABCD1234

This is a valid bamboost collection at the path ./test_data. It contains an identifier file, .bamboost-collection-ABCD1234, which defines the unique ID of the collection; in this case, ABCD1234.
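To illustrate how an identifier file makes a collection referenceable without knowing its path, here is a minimal sketch using only the Python standard library. It is not bamboost's own implementation: the helper names `read_collection_id` and `index_collections` are hypothetical, and the sketch assumes that the ID is the part of the file name that follows `.bamboost-collection-`, as the listing above suggests.

```python
from pathlib import Path
from typing import Dict, Optional

# Assumption: the ID is the suffix of the identifier file's name,
# as in ".bamboost-collection-ABCD1234" from the listing above.
MARKER_PREFIX = ".bamboost-collection-"


def read_collection_id(directory: Path) -> Optional[str]:
    """Return the collection ID found in `directory`, or None if there is no identifier file."""
    for entry in sorted(directory.iterdir()):
        if entry.is_file() and entry.name.startswith(MARKER_PREFIX):
            return entry.name[len(MARKER_PREFIX):]
    return None


def index_collections(root: Path) -> Dict[str, Path]:
    """Scan `root` recursively and map each collection ID to its current directory."""
    index: Dict[str, Path] = {}
    for marker in root.rglob(MARKER_PREFIX + "*"):
        if marker.is_file():
            index[marker.name[len(MARKER_PREFIX):]] = marker.parent
    return index
```

Because the ID travels with the directory, rebuilding the index after a rename or a move yields the same ID pointing at the new location, which is the point of the path-redundancy principle.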

The collection contains two entries: simulation_1 and simulation_2. As you can see, each simulation owns a directory inside a collection. Each directory name serves as both the name and the ID of its simulation. The unique identifier of a single simulation is the combination of the ID of the collection it belongs to and the simulation ID. That means the full identifier of simulation_1 is ABCD1234:simulation_1.
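The composition of the full identifier can be sketched in a few lines of plain Python. The function names below are hypothetical and not part of bamboost's API. The sketch assumes the `collection_id:simulation_id` form shown above and, for simplicity, treats every subdirectory of a collection as a simulation.

```python
from pathlib import Path
from typing import List, Tuple

SEPARATOR = ":"  # separator used in the example ABCD1234:simulation_1


def full_id(collection_id: str, simulation_name: str) -> str:
    """Combine a collection ID and a simulation name into a full identifier."""
    return f"{collection_id}{SEPARATOR}{simulation_name}"


def split_full_id(identifier: str) -> Tuple[str, str]:
    """Split a full identifier into (collection ID, simulation name)."""
    collection_id, sep, simulation_name = identifier.partition(SEPARATOR)
    if not sep or not collection_id or not simulation_name:
        raise ValueError(f"not a full simulation identifier: {identifier!r}")
    return collection_id, simulation_name


def list_full_ids(collection_dir: Path, collection_id: str) -> List[str]:
    """List full identifiers, assuming each subdirectory is one simulation."""
    return [
        full_id(collection_id, entry.name)
        for entry in sorted(collection_dir.iterdir())
        if entry.is_dir()
    ]
```

For the example collection, `list_full_ids(Path("test_data"), "ABCD1234")` would list one identifier per simulation directory, and `split_full_id` recovers both parts from any of them.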

Each simulation contains a central HDF5 file named data.h5. This file stores the parameters as well as the generated data. The simulation API of bamboost provides extensive functionality to store and retrieve data from this file. However, users are not limited to this file, or to Python in general. Simulations are directories rather than single HDF5 files so that you can place any file that belongs to a simulation in its directory. This can be output from third-party software (think LAMMPS), additional input files such as images, or scripts to reproduce the generated data.

Getting Started

Prerequisites

To use bamboost with MPI, you need a working MPI installation. Additionally, you need

Installation

bamboost is available from the Python Package Index (PyPI) and can be installed using pip (or uv of course):

pip install bamboost

To install the latest version from this repository, you can use:

pip install git+https://github.com/smec-ethz/bamboost.git
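After installing, you may want to confirm which version ended up in your environment, since versions from 0.10.0 onward are not compatible with earlier ones. The following is a small sketch using only the standard library's `importlib.metadata`. The helper name `installed_version` is hypothetical, and the sketch assumes the distribution is registered under the name `bamboost`, as on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "bamboost") -> Optional[str]:
    """Return the installed version of `distribution`, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("bamboost is not installed in this environment")
    else:
        print(f"bamboost {found} is installed")
```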

Usage

For a getting started guide, please see here: Getting started

Roadmap

  • Clear MPI handling

See the open issues for a full list of proposed features (and known issues).

Contributing

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT license. See LICENSE for more information.

Contact

zrlf - forez@ethz.ch

Project Link: https://github.com/smec-ethz/bamboost

Acknowledgments
