Plugin-based file loader framework that attaches locked subtrees into a LockableTree

vcti-fileloader

A protocol-based framework for loading file content into a shared tree. Loaders attach a locked subtree under a caller-supplied parent handle in any LockableTree backing — they do not own the tree.

Install with pip install vcti-fileloader; import from vcti.fileloader.core. The vcti.fileloader package is a namespace shared with the loader plugin packages (vcti.fileloader.hdf5, vcti.fileloader.json, vcti.fileloader.numpy, …), each of which is its own PyPI distribution.

This package is fully typed (py.typed) and safe for strict type checkers (mypy --strict, pyright).

Overview

Applications that work with simulation and CAE data need to load many file formats: HDF5, VTK, OpenFOAM, JSON, CSV, proprietary binary, and others. The data in each file is hierarchical (groups, datasets, attributes), and the per-file trees often have to be combined: a workflow may load several files into a single browsable structure.

vcti-fileloader defines a uniform protocol that every loader plugin implements: populate(handle, tree, parent) builds the file's content as a new subtree under parent and locks it before returning. Loaders pass through the file's native attributes verbatim into a read-only side of the node payload (file_attributes); a separate mutable side (enricher_attributes) is reserved for post-load enrichment by the caller. The framework codes against the LockableTree protocol from vcti-tree, so the caller picks the backing — DictTree for simple cases, ArrayTree (from vcti-nptree) for file-structure-scale workloads, or any other conforming implementation.

┌─────────────────────────────────────────────────────┐
│                  Application Code                   │
│        (owns LockableTree, uses Loader protocol)    │
└──────────────┬──────────────────────┬───────────────┘
               │                      │
       ┌───────▼───────┐      ┌───────▼───────┐
       │  HDF5 Loader  │      │  JSON Loader  │  ...
       │  (plugin pkg) │      │  (plugin pkg) │
       └───────────────┘      └───────────────┘
                  attach subtrees into the tree

Key Concepts

Loader (Protocol)

Any class that implements these four methods plus validator and setup attributes satisfies the protocol — no base class inheritance required (PEP 544 structural subtyping):

| Method | Purpose |
|---|---|
| can_load(path) | Lightweight check: can this loader handle the file? |
| load(path, **options) | Open a file and return an opaque handle |
| populate(handle, tree, parent, *, before_lock=None, **options) | Build the file's subtree under parent; lock and return its handle |
| unload(handle) | Release file handles and memory (idempotent) |

populate is generic in the tree handle type H, so the same loader works against any LockableTree[DataNode, H] backing.

Each loader also carries optional validator and setup hooks:

  • LoaderValidator.validate() — returns True if all runtime dependencies (e.g., h5py) are available.
  • LoaderSetup.setup() — configures paths, environment variables, or component versions before first use.
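
As a rough illustration of the structural typing described above, the sketch below defines a stand-in protocol and a toy loader that satisfies it without inheriting from anything. LoaderSketch, AlwaysReady, TextLoader, and the use of a plain dict as the tree are assumptions made for this example; the real definitions live in vcti.fileloader.core and may differ.

```python
from pathlib import Path
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class LoaderSketch(Protocol):
    """Stand-in for the Loader protocol (hypothetical, simplified)."""

    validator: Any
    setup: Any

    def can_load(self, path: Path) -> bool: ...
    def load(self, path: Path, **options: Any) -> Any: ...
    def populate(self, handle: Any, tree: Any, parent: Any, *,
                 before_lock: Any = None, **options: Any) -> Any: ...
    def unload(self, handle: Any) -> None: ...


class AlwaysReady:
    """Hypothetical validator/setup object with no real work to do."""

    def validate(self) -> bool:
        return True

    def setup(self) -> None:
        pass


class TextLoader:
    """Toy loader: no base class, it only matches the protocol's shape."""

    def __init__(self) -> None:
        self.validator = AlwaysReady()
        self.setup = AlwaysReady()

    def can_load(self, path: Path) -> bool:
        return path.suffix == ".txt"

    def load(self, path: Path, **options: Any) -> Any:
        return {"path": path, "open": True}

    def populate(self, handle, tree, parent, *, before_lock=None, **options):
        # A real loader would build and lock a subtree here; this toy only
        # records the path under the parent key of a plain dict.
        tree.setdefault(parent, []).append(str(handle["path"]))
        return parent

    def unload(self, handle: Any) -> None:
        handle["open"] = False  # safe to call more than once


loader = TextLoader()
is_loader = isinstance(loader, LoaderSketch)  # structural check, no inheritance
```

The isinstance check only verifies that the named members exist, not that their signatures match; a static type checker is what enforces the full contract.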

before_lock hook

populate accepts an optional callback (tree, subtree_root) -> None that fires inside the transaction after the loader attaches the file's content and before the locks are applied. Use it for attribute enrichment, computing derived payload state, or validation. Any exception raised by the hook triggers rollback of the partial subtree, so failures are atomic.

The hook does not know about, and is not coupled to, any particular enrichment library — it is just a callable. The package vcti-attribute-enricher provides one (rule-driven) enricher you can wire in via this hook; your own callbacks work equally well.

SubtreeBuilder

A transactional helper for implementing populate. It owns a single subtree under a caller-supplied parent and guarantees:

  • Scope enforcement. Writes are rejected if their parent is not inside this builder's subtree — loaders cannot accidentally mutate the rest of the tree.
  • Pre-commit hook. A before_commit callable (the implementation side of the loader's before_lock) runs after content is built and before locks fire.
  • Commit-on-success. Normal exit from the with block locks the subtree (structure + payload).
  • Rollback-on-failure. An exception during the build or during the pre-commit hook removes the partial subtree before propagating.
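
The four guarantees above can be sketched with a small context manager. This is a minimal illustration over a toy tree, not the real SubtreeBuilder: ToyTree, ToySubtreeBuilder, and their method names are hypothetical, and integer node ids stand in for tree handles.

```python
class ToyTree:
    """Hypothetical minimal tree: node ids map to parent ids."""

    def __init__(self) -> None:
        self.parent_of = {0: None}  # node 0 is the root
        self.locked = set()
        self._next_id = 1

    def add(self, parent: int) -> int:
        node = self._next_id
        self._next_id += 1
        self.parent_of[node] = parent
        return node

    def remove(self, node: int) -> None:
        del self.parent_of[node]


class ToySubtreeBuilder:
    """Stand-in for the transactional helper described above."""

    def __init__(self, tree, parent, before_commit=None):
        self.tree = tree
        self.before_commit = before_commit
        self.root = tree.add(parent)
        self.owned = [self.root]

    def add(self, parent: int) -> int:
        # Scope enforcement: only nodes created by this builder may be parents.
        if parent not in self.owned:
            raise ValueError(f"node {parent} is outside this builder's subtree")
        node = self.tree.add(parent)
        self.owned.append(node)
        return node

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            self._rollback()  # rollback-on-failure (build raised)
            return False      # let the original exception propagate
        try:
            if self.before_commit is not None:
                self.before_commit(self.tree, self.root)  # pre-commit hook
        except Exception:
            self._rollback()  # rollback-on-failure (hook raised)
            raise
        self.tree.locked.update(self.owned)  # commit-on-success
        return False

    def _rollback(self) -> None:
        for node in reversed(self.owned):
            self.tree.remove(node)
        self.owned.clear()


tree = ToyTree()
with ToySubtreeBuilder(tree, parent=0) as builder:
    builder.add(builder.root)
committed = sorted(tree.locked)


def failing_hook(tree, root):
    raise RuntimeError("enrichment failed")


size_before = len(tree.parent_of)
try:
    with ToySubtreeBuilder(tree, parent=0, before_commit=failing_hook) as builder:
        builder.add(builder.root)
except RuntimeError:
    pass
size_after = len(tree.parent_of)  # unchanged: the partial subtree was removed
```

Running the hook inside the exit path, before any lock is applied, is what lets a failed enrichment leave the tree exactly as it was.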

DataNode and LazyDataNode

Tree payloads. A DataNode carries four pieces of state, plus a merged read view (attributes):

| Field / property | Description |
|---|---|
| data | Primary payload: NumPy array, parsed dict, None, anything. |
| name | File-internal identifier (HDF5 basename, NPZ archive key). None when not applicable. |
| file_attributes | Read-only Mapping view of the file's native attributes (loader-set, verbatim). |
| enricher_attributes | Mutable dict where post-load enrichers (or before_lock hooks) write. |
| attributes | ChainMap merged view, enricher first. Read here for portable rules; writes go to enricher_attributes. |

A LazyDataNode adds an on-demand loader callback plus pre-load shape and dtype fields, so consumers can filter or display a dataset without materialising it.
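
The layered attribute model in the table above can be approximated with the standard library alone. The class below is a hypothetical stand-in (NodeSketch is not the real DataNode, and the constructor signature is assumed); it only shows how a read-only file layer and a mutable enricher layer combine in a ChainMap.

```python
from collections import ChainMap
from types import MappingProxyType


class NodeSketch:
    """Hypothetical stand-in for DataNode's attribute layering."""

    def __init__(self, data=None, name=None, file_attributes=None):
        self.data = data
        self.name = name
        # Read-only view over a private copy of the file's native attributes.
        self.file_attributes = MappingProxyType(dict(file_attributes or {}))
        self.enricher_attributes = {}

    @property
    def attributes(self):
        # Enricher layer first: it shadows file attributes on reads and
        # receives every write made through the merged view.
        return ChainMap(self.enricher_attributes, self.file_attributes)


node = NodeSketch(name="stress", file_attributes={"units": "Pa"})
node.attributes["category"] = "mechanical"  # lands in enricher_attributes
node.enricher_attributes["units"] = "MPa"   # shadows the file value on reads
```

Because ChainMap writes always go to its first mapping, the file layer stays verbatim even when an enricher overrides a key.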

LoaderDescriptor and LoaderRegistry

LoaderDescriptor wraps a Loader instance with metadata — a unique id, a human-readable name, and filterable attributes (typically {"supported_formats": ["hdf5-file"]} pointing at descriptor IDs from vcti-path-format-descriptors).

LoaderRegistry is a typed registry of LoaderDescriptor entries. Register loaders at startup, then look them up by id or query by attributes at runtime.
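
To make the lookup-by-id and query-by-attributes idea concrete, here is a small self-contained sketch. DescriptorSketch, RegistrySketch, and find_by_format are hypothetical names for this example, as are the "json-loader" and "json-file" ids; the real classes build on vcti-plugin-catalog and their query API may look different.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DescriptorSketch:
    """Stand-in for LoaderDescriptor: a loader plus metadata."""

    id: str
    name: str
    loader: Any
    attributes: dict = field(default_factory=dict)


class RegistrySketch:
    """Stand-in for LoaderRegistry, keyed by descriptor id."""

    def __init__(self) -> None:
        self._by_id = {}

    def register(self, descriptor: DescriptorSketch) -> None:
        if descriptor.id in self._by_id:
            raise ValueError(f"duplicate loader id: {descriptor.id}")
        self._by_id[descriptor.id] = descriptor

    def get(self, loader_id: str) -> DescriptorSketch:
        return self._by_id[loader_id]

    def find_by_format(self, format_id: str) -> list:
        # Hypothetical query helper: match on the supported_formats attribute.
        return [
            d for d in self._by_id.values()
            if format_id in d.attributes.get("supported_formats", [])
        ]


registry = RegistrySketch()
registry.register(DescriptorSketch(
    id="hdf5-h5py-loader",
    name="HDF5 Loader (h5py)",
    loader=object(),  # placeholder for a real loader instance
    attributes={"supported_formats": ["hdf5-file"]},
))
registry.register(DescriptorSketch(
    id="json-loader",
    name="JSON Loader",
    loader=object(),
    attributes={"supported_formats": ["json-file"]},
))

hdf5_ids = [d.id for d in registry.find_by_format("hdf5-file")]
```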

Lifecycle Contracts

  1. Validate / setup. Call validator.validate() and setup.setup() once before the first load().
  2. Check. Call can_load(path) before load() to avoid an UnsupportedFormatError.
  3. Load. loader.load(path) opens the file and returns a handle.
  4. Populate. loader.populate(handle, tree, parent, before_lock=...) grafts the file's subtree under parent, optionally runs the hook, then locks the subtree. It returns the subtree root handle.
  5. Unload. loader.unload(handle) releases resources and is idempotent. If the loader attached LazyDataNodes, their closures may hold the handle, so call materialise_subtree(tree, root) first if the tree must remain usable after unload.
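
The ordering constraint in the unload step is easiest to see with a toy model. In the sketch below, FileHandleSketch, LazyNodeSketch, and make_lazy are hypothetical stand-ins (not the package's LazyDataNode or materialise_subtree); they only show why a lazy node whose closure captured the handle must be materialised before the handle is released.

```python
class FileHandleSketch:
    """Hypothetical opaque handle, as returned by a loader's load()."""

    def __init__(self, contents: dict) -> None:
        self._contents = contents
        self.closed = False

    def read(self, key: str):
        if self.closed:
            raise RuntimeError("handle was unloaded")
        return self._contents[key]

    def close(self) -> None:
        self.closed = True  # idempotent


class LazyNodeSketch:
    """Toy lazy payload: data is fetched on first materialise()."""

    def __init__(self, name: str, fetch) -> None:
        self.name = name
        self.data = None
        self._fetch = fetch  # closure that captures the file handle

    def materialise(self):
        if self._fetch is not None:
            self.data = self._fetch()
            self._fetch = None  # drop the reference to the handle
        return self.data


def make_lazy(handle: FileHandleSketch, key: str) -> LazyNodeSketch:
    return LazyNodeSketch(key, lambda: handle.read(key))


handle = FileHandleSketch({"stress": [1.0, 2.0], "strain": [0.1, 0.2]})
early = make_lazy(handle, "stress")
late = make_lazy(handle, "strain")

early.materialise()  # read while the handle is still open
handle.close()       # plays the role of loader.unload(handle)

try:
    late.materialise()
    late_error = None
except RuntimeError as exc:
    late_error = str(exc)
```

The node materialised before the close keeps its data; the one left lazy fails on first access, which is the situation the lifecycle contract warns about.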

Installation

pip install "vcti-fileloader>=5.1.1"

Or declare it as a dependency in pyproject.toml:

dependencies = [
    "vcti-fileloader>=5.1.1",
]

Quick Start

from pathlib import Path

from vcti.tree import DictTree, descendants
from vcti.fileloader.core import DataNode, LoaderDescriptor, LoaderRegistry

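# At startup (my_h5py_loader is any object that satisfies the Loader protocol,
# typically supplied by a loader plugin package)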
registry = LoaderRegistry()
registry.register(LoaderDescriptor(
    id="hdf5-h5py-loader",
    name="HDF5 Loader (h5py)",
    loader=my_h5py_loader,
    attributes={"supported_formats": ["hdf5-file"]},
))

# At runtime
desc = registry.get("hdf5-h5py-loader")
desc.loader.validator.validate()
desc.loader.setup.setup()

# Application owns the tree
tree: DictTree[DataNode] = DictTree(DataNode())

handle = desc.loader.load(Path("simulation.h5"))
try:
    subtree_root = desc.loader.populate(handle, tree, tree.root_handle)
    # subtree is structure-locked and payload-locked
    for h in descendants(tree, subtree_root):
        node = tree.payload(h)
        if node.name == "stress":
            ...
finally:
    desc.loader.unload(handle)

Quick Start — with a before_lock hook

from pathlib import Path

from vcti.attribute_enricher import EnrichRule, apply_rules
from vcti.lookup import Rule
from vcti.tree import descendants

# Continues from the Quick Start above: desc and tree are already set up.
path = Path("simulation.h5")

def enrich(tree, root):
    # Runs after the subtree is attached and before it is locked.
    apply_rules(
        descendants(tree, root, include_self=True),
        rules=[
            EnrichRule(set={"file_path": str(path)}),
            EnrichRule(set={"category": "mechanical"},
                       when=(Rule("name", "^=", "stress"),)),
        ],
    )

handle = desc.loader.load(path)
try:
    root = desc.loader.populate(handle, tree, tree.root_handle, before_lock=enrich)
finally:
    desc.loader.unload(handle)

vcti-attribute-enricher is an optional package — the framework itself has no dependency on it. The before_lock argument accepts any callable (tree, root) -> None; your own callback works just as well.

Error Handling

All exceptions inherit from LoaderError:

| Exception | When to raise / catch |
|---|---|
| LoaderError | Base class; catches any loader failure |
| LoadError | File cannot be opened or parsed |
| UnloadError | Resource cleanup failed |
| UnsupportedFormatError | Loader does not recognise the file format |
| ValidationError | validator.validate() detected missing dependencies |
| SetupError | setup.setup() could not configure the environment |
| TreeAttachmentError | populate() cannot attach: parent is missing, deleted, or structure-locked |

TreeAttachmentError translates the named tree exceptions from vcti-tree (HandleError, InactiveNodeError, StructureLockedError) into a single fileloader-domain failure type. The underlying tree exception is preserved on __cause__.
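
The translation pattern described here is ordinary Python exception chaining. The sketch below uses locally defined stand-in classes that borrow the names from the text (they are not imported from vcti-tree or vcti.fileloader.core), and attach is a hypothetical helper operating on a plain dict.

```python
class HandleError(Exception):
    """Local stand-in for the vcti-tree exception of the same name."""


class LoaderError(Exception):
    """Local stand-in for the fileloader base exception."""


class TreeAttachmentError(LoaderError):
    """Local stand-in for the fileloader-domain attachment failure."""


def attach(nodes: dict, parent: str, child: str) -> None:
    """Hypothetical helper: append child under parent, translating errors."""
    try:
        if parent not in nodes:
            raise HandleError(f"unknown handle: {parent!r}")
        nodes[parent].append(child)
    except HandleError as exc:
        # "from exc" keeps the tree-level exception on __cause__.
        raise TreeAttachmentError(f"cannot attach under {parent!r}") from exc


nodes = {"root": []}
attach(nodes, "root", "file-a")

caught = None
try:
    attach(nodes, "missing", "file-b")
except LoaderError as err:  # one except clause covers every loader failure
    caught = err
```

Callers can catch the base class for uniform handling and still inspect caught.__cause__ when they need the tree-level detail.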

What this package does NOT do

  • No concrete loaders. Actual file reading (HDF5, JSON, NPY, etc.) lives in separate loader plugin packages.
  • No tree implementation. Backings come from vcti-tree (DictTree), vcti-nptree (ArrayTree), or third parties.
  • No attribute enrichment. Enrichment is run via the optional before_lock hook by the caller, using whatever callable they like (e.g., vcti-attribute-enricher).
  • No data transformation. Data is returned as-is from the loader.
  • No caching. Caching strategies belong at the application level.

Further Reading

  • Common Patterns — Loader implementation, the SubtreeBuilder, the before_lock hook, validator/setup patterns, error handling.
  • Design & Concepts — Architecture, protocol rationale, layered attribute model, locking model.

Dependencies

  • numpy (>=1.24) — DataNode.__eq__
  • vcti-plugin-catalog (>=1.0.0) — Descriptor and Registry base classes
  • vcti-tree (>=1.0.0) — LockableTree protocol, generic algorithms, named exceptions

Versioning

This package follows Semantic Versioning. Breaking changes to the Loader protocol or DataNode shape will only occur in major version bumps. Downstream loader plugins should pin to a compatible major version (e.g., vcti-fileloader>=5.0,<6).
