A database for results collected from the SHIELD permeation rig

SHIELD-Data

A repository for storing and managing raw experimental data produced by the SHIELD permeation rig.

Overview

This repository provides an automated data management system for SHIELD experimental runs. It includes:

  • Automated Data Upload: Watchdog-based monitoring system that detects new experimental data and automatically creates GitHub pull requests
  • Data Cataloging: Automatic generation of a searchable catalogue (CSV + README) containing metadata for all experimental runs
  • Structured Storage: Organized folder structure with run metadata, pressure gauge data, and backups
  • PR-based Workflow: All data additions are tracked through GitHub pull requests with detailed metadata

Repository Structure

SHIELD-Data/
├── run_data/                          # Main data storage folder
│   ├── YY.MM.DD_run_X_HHhMM/         # Individual run folders
│   │   ├── pressure_gauge_data.csv   # Experimental measurements
│   │   ├── run_metadata.json         # Run configuration and metadata
│   │   └── backup/                   # Backup data files
│   ├── runs_catalogue.csv            # Auto-generated catalogue
│   └── README.md                     # Auto-generated table view of catalogue
└── src/shield_data/                  # Python package
    ├── data_upload_handler.py        # Watchdog monitoring and PR creation
    ├── build_catalogue.py            # Catalogue generation
    └── pr_template.md                # PR body template
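
The run folder naming scheme shown above (YY.MM.DD_run_X_HHhMM) can be checked and decoded with a short helper. This is a minimal sketch, not part of the shield_data package: the function name parse_run_folder_name is hypothetical, and reading the HHhMM part as the run's start time is an assumption.

```python
import re
from datetime import datetime

# Matches folder names such as "25.10.06_run_1_10h41" (YY.MM.DD_run_X_HHhMM).
RUN_FOLDER_PATTERN = re.compile(
    r"^(?P<date>\d{2}\.\d{2}\.\d{2})_run_(?P<number>\d+)_(?P<time>\d{2}h\d{2})$"
)


def parse_run_folder_name(name):
    """Return (run_number, timestamp) for a run folder name, or None if it does not fit.

    Hypothetical helper: the timestamp is assumed to be the run start time.
    """
    match = RUN_FOLDER_PATTERN.match(name)
    if match is None:
        return None
    try:
        timestamp = datetime.strptime(
            f"{match['date']} {match['time']}", "%y.%m.%d %Hh%M"
        )
    except ValueError:
        # Digits were present but do not form a real date or time (e.g. month 13).
        return None
    return int(match["number"]), timestamp
```

A helper like this could be used to skip folders such as backup/ that do not follow the run naming scheme.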

Features

Automated Data Upload

The upload_data_from_folder() function monitors a specified folder for new experimental data and automatically:

  1. Detects new or modified run data
  2. Validates folder structure and metadata
  3. Creates a git branch and commits changes
  4. Regenerates the data catalogue
  5. Opens a pull request with detailed run information
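
As an illustration of step 2, the check below verifies that a run folder contains the two files listed in the repository structure and that the metadata parses as JSON. It is a sketch under stated assumptions: find_run_folder_problems is a hypothetical name, and the package's actual validation rules may be stricter or different.

```python
import json
from pathlib import Path

# File names taken from the repository structure; treating both as mandatory is an assumption.
REQUIRED_FILES = ("pressure_gauge_data.csv", "run_metadata.json")


def find_run_folder_problems(run_folder):
    """Return a list of problems found in a run folder; an empty list means none were found."""
    run_folder = Path(run_folder)
    problems = []
    for file_name in REQUIRED_FILES:
        if not (run_folder / file_name).is_file():
            problems.append(f"missing {file_name}")

    metadata_path = run_folder / "run_metadata.json"
    if metadata_path.is_file():
        try:
            json.loads(metadata_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append("run_metadata.json is not valid JSON")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report everything that is wrong with a folder in a single message.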

Data Catalogue

Every time data is added, the catalogue is automatically updated with:

  • Run ID (folder name)
  • Relative path to data
  • Run type (e.g., permeation_exp)
  • Date
  • Furnace setpoint
  • Material (if available)
  • Coating (if available)
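
To make the catalogue fields concrete, the sketch below builds one catalogue row from a metadata dictionary and writes rows as CSV. The function names, the exact column names (in particular "path" and "coating"), and the assumption that material and coating live under run_info are illustrative guesses, not the behaviour of build_catalogue.

```python
import csv
import io

# Assumed column names, mirroring the fields listed above.
CATALOGUE_COLUMNS = [
    "run_id", "path", "run_type", "date", "furnace_setpoint", "material", "coating",
]


def catalogue_row(run_id, metadata, data_root="run_data"):
    """Build one catalogue row; optional fields default to an empty string."""
    run_info = metadata.get("run_info", {})
    return {
        "run_id": run_id,
        "path": f"{data_root}/{run_id}",
        "run_type": run_info.get("run_type", ""),
        "date": run_info.get("date", ""),
        "furnace_setpoint": run_info.get("furnace_setpoint", ""),
        "material": run_info.get("material", ""),
        "coating": run_info.get("coating", ""),
    }


def rows_to_csv(rows):
    """Serialise catalogue rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CATALOGUE_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Leaving missing material and coating values empty keeps every row the same width, which matches the "if available" wording in the list above.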

Run Metadata

Each experimental run includes a run_metadata.json file containing:

  • Run information (type, date, furnace setpoint, etc.)
  • Gauge configurations
  • Valve timing information
  • Recording parameters

Usage

Installing the Package

pip install -e .

Monitoring for New Data

from shield_data import upload_data_from_folder

# Monitor the run_data folder with default settings
upload_data_from_folder("run_data")

# Custom monitoring intervals
upload_data_from_folder(
    "run_data",
    check_interval=5,    # Check every 5 seconds
    batch_delay=2        # Wait 2 seconds after last change before processing
)
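
The batch_delay parameter describes a quiet period: processing starts only once no further change has been seen for that many seconds, so a run whose files are still being copied is not uploaded half-written. The sketch below shows one way such batching could work. ChangeBatcher is a hypothetical class written for illustration and is not taken from the package.

```python
import time


class ChangeBatcher:
    """Collect changed paths and release them after batch_delay seconds without new changes."""

    def __init__(self, batch_delay=2.0, clock=time.monotonic):
        self.batch_delay = batch_delay
        self._clock = clock          # injectable so the logic can be tested without sleeping
        self._pending = set()
        self._last_change = None

    def record(self, path):
        """Remember a changed path and restart the quiet-period timer."""
        self._pending.add(path)
        self._last_change = self._clock()

    def pop_ready(self):
        """Return the sorted batch if the quiet period has elapsed, otherwise an empty list."""
        if not self._pending:
            return []
        if self._clock() - self._last_change < self.batch_delay:
            return []
        batch = sorted(self._pending)
        self._pending.clear()
        return batch
```

In this sketch, a monitoring loop would call record() for each file system event and pop_ready() once per check_interval.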

Building the Catalogue

from shield_data import build_catalogue

# Regenerate the catalogue manually
build_catalogue("run_data")

Loading and Analyzing Data

The package provides simple functions to load and filter experimental data:

View the Catalogue

from shield_data import catalogue

# Load the catalogue as a pandas DataFrame
cat = catalogue()
print(cat)

Load a Specific Run

from shield_data import load

# Load pressure gauge data for a specific run
df = load("25.10.06_run_1_10h41")

# The DataFrame includes all measurement data plus a 'run_id' column
print(df.head())

Load Run Metadata

from shield_data import load_metadata

# Load the metadata JSON as a dictionary
metadata = load_metadata("25.10.06_run_1_10h41")

# Access specific metadata fields
run_info = metadata["run_info"]
print(f"Run type: {run_info['run_type']}")
print(f"Furnace setpoint: {run_info['furnace_setpoint']} K")
print(f"Start time: {run_info['start_time']}")

Filter and Load Multiple Runs

from shield_data import load_filtered

# Load all runs at a specific temperature
df_500k = load_filtered(furnace_setpoint=500)

# Load runs by type and date
df_oct6 = load_filtered(run_type="permeation_exp", date="2025-10-06")

# Filter by material (when available)
df_material = load_filtered(material="stainless_steel")

# The result is a combined DataFrame with data from all matching runs
print(f"Loaded {len(df_500k)} data points from {df_500k['run_id'].nunique()} runs")
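
Conceptually, keyword filtering of this kind amounts to keeping the catalogue rows that match every given criterion and then loading the data for each remaining run. The standard-library sketch below illustrates only the matching step. filter_catalogue is a hypothetical function, and comparing values as strings is an assumption; load_filtered itself may match values differently.

```python
def filter_catalogue(rows, **criteria):
    """Return the rows whose fields equal every criterion.

    Values are compared as strings, so furnace_setpoint=500 matches the CSV text "500"
    (but would not match "500.0").
    """
    return [
        row
        for row in rows
        if all(str(row.get(field, "")) == str(value) for field, value in criteria.items())
    ]


# Illustrative rows only; the second run ID is made up for this example.
example_rows = [
    {"run_id": "25.10.06_run_1_10h41", "run_type": "permeation_exp", "furnace_setpoint": "500"},
    {"run_id": "example_run_b", "run_type": "permeation_exp", "furnace_setpoint": "600"},
]
matching = filter_catalogue(example_rows, furnace_setpoint=500)
```

Rows that lack a requested field, such as material in the example rows, never match, which is consistent with the "when available" note above.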

Example Analysis Workflow

from shield_data import catalogue, load_filtered
import matplotlib.pyplot as plt

# View available runs
cat = catalogue()
print(cat[["run_id", "date", "furnace_setpoint"]])

# Load all runs with a 500 K furnace setpoint
df = load_filtered(furnace_setpoint=500)

# Group by run and plot each run as its own line
for run_id, run_data in df.groupby("run_id", sort=False):
    plt.plot(run_data["time"], run_data["pressure"], label=run_id)

plt.xlabel("Time (s)")
plt.ylabel("Pressure")
plt.legend()
plt.show()

Requirements

  • Python >= 3.9
  • watchdog
  • jinja2
  • pandas
  • Git
  • GitHub CLI (gh) configured with authentication

File details

Details for the file shield_data-0.1.tar.gz.

File metadata

  • Download URL: shield_data-0.1.tar.gz
  • Upload date:
  • Size: 16.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for shield_data-0.1.tar.gz
Algorithm Hash digest
SHA256 c1872d8146b0e824c7dbdf69b025f7fcfd3dc54430bf343638d3e3cdf52b43be
MD5 478c17a6f5674bab78656fe5fa09cd88
BLAKE2b-256 d9d2ad2e82c721414c5de2dfad8abc0052b5478e48d361b00059e19a72ad9652

Provenance

The following attestation bundles were made for shield_data-0.1.tar.gz:

Publisher: python-publish.yml on PTTEPxMIT/SHIELD-Data

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file shield_data-0.1-py3-none-any.whl.

File metadata

  • Download URL: shield_data-0.1-py3-none-any.whl
  • Upload date:
  • Size: 11.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for shield_data-0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 78d7a5819f014dd3c1b2a36db9f8b6ddddc922d8a15aef2c61490040a5132a18
MD5 17c91334341a8b5ffd23ea1d391906ba
BLAKE2b-256 19d49152878e5272f12a8730575ebe6a5a6aa19a5628c1c746c0ce069b19ebfd

Provenance

The following attestation bundles were made for shield_data-0.1-py3-none-any.whl:

Publisher: python-publish.yml on PTTEPxMIT/SHIELD-Data
