Data Scope
Data-scope abstraction for grouping related data sources under one managed lifecycle, with format-aware loader resolution.
Overview
vcti-data-scope provides a small framework for tying related data sources
together — files, folders, and (in future) other source kinds — under one
named, lifecycle-managed object. Each source is added with an explicit
format identifier; the scope resolves a loader for it via a
LoaderRegistry (from vcti-fileloader), and the user opens and closes the
whole collection together.
The framework is pluggable: a DataScope is the base type, with one
concrete subclass shipping today (PathsGroup for file and folder sources)
and additional types possible later (positional file arrays, parameter
sweeps, streaming sources, etc.). v1 is read-only after open and uses a
strict load policy for required sources; optional sources can fail
without aborting the scope.
Installation
pip install "vcti-data-scope>=1.0.0"
In requirements.txt
vcti-data-scope>=1.0.0
In pyproject.toml dependencies
dependencies = [
    "vcti-data-scope>=1.0.0",
]
Preparing a registry
PathsGroup resolves loaders from a LoaderRegistry (from
vcti-fileloader). The registry must be populated with descriptors before
the scope is used. Each descriptor's attributes["supported_formats"]
list is what the scope matches against the format_id you pass when
adding a source.
from vcti.fileloader import LoaderRegistry, LoaderDescriptor

registry = LoaderRegistry()
registry.register(
    LoaderDescriptor(
        id="hdf5-h5py",
        name="HDF5 (h5py)",
        loader=H5pyLoader(),  # your loader implementing the Loader protocol
        attributes={"supported_formats": ["hdf5-file"]},
    )
)
# ...register one descriptor per loader you need
Most callers wrap this setup in a helper module so application code receives a ready-to-use registry.
Quick Start (context manager)
When scope usage is confined to a single block, the context manager is the most concise form. The scope closes automatically on exit, even if an exception is raised inside the block.
from pathlib import Path
from vcti.datascope import PathsGroup

with PathsGroup("brake-squeal", registry=registry) as scope:
    scope.add_path_source(
        name="solver_input",
        path=Path("model.inp"),
        format_id="abaqus-inp",
    )
    scope.add_path_source(
        name="solver_output",
        path=Path("sol103.h5"),
        format_id="hdf5-file",
    )
    scope.add_path_source(
        name="solver_log",
        path=Path("run.log"),
        format_id="text-log",
        required=False,
    )
    scope.load()
    assert scope.is_valid

    # Reach into per-source loaders for typed access:
    h5_loader = scope.sources["solver_output"].loader
    # ... use h5_loader's typed API ...
Usage without the context manager
When the scope's lifetime spans function boundaries — for example, a scope owned by a long-lived service, an interactive session, or a class attribute — open and close it explicitly. The contract is the same as the context-manager form; only the syntax differs.
Plain open / close
from pathlib import Path
from vcti.datascope import PathsGroup

scope = PathsGroup("brake-squeal", registry=registry)
scope.add_path_source("solver_input", Path("model.inp"), format_id="abaqus-inp")
scope.add_path_source("solver_output", Path("sol103.h5"), format_id="hdf5-file")

# Pre-flight check: fail early if a required source is unavailable.
if not scope.is_valid:
    raise RuntimeError("scope not loadable: some required source is unavailable")

scope.load()
try:
    # ... use scope.sources["..."].loader ...
    ...
finally:
    scope.close()
scope.close() is idempotent and best-effort: it walks every
source, closes the ones that are loaded, and logs (rather than raises) on
per-source close failures. It is always safe to call — including before
load() and after a failed load().
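The close contract described above can be illustrated with a small stand-alone sketch. It does not use or reproduce the vcti-data-scope implementation; `ToyCloseSource` and `close_all` are hypothetical names introduced here only to show the "close what is loaded, log failures, never raise" pattern.

```python
import logging

log = logging.getLogger("scope_sketch")


class ToyCloseSource:
    """Hypothetical stand-in for a source; not the library's class."""

    def __init__(self, name, fail_on_close=False):
        self.name = name
        self.is_loaded = False
        self._fail_on_close = fail_on_close

    def close(self):
        if self._fail_on_close:
            raise OSError(f"cannot close {self.name}")


def close_all(sources):
    """Close every loaded source; log failures instead of raising."""
    failed = []
    for src in sources:
        if not src.is_loaded:
            continue  # never loaded or already closed: nothing to do
        try:
            src.close()
        except Exception:
            log.exception("failed to close source %r", src.name)
            failed.append(src.name)
        finally:
            src.is_loaded = False  # treated as closed either way
    return failed
```

Because a source is marked as closed even when its `close()` raises, a second call finds nothing left to do, which is what makes the operation safe to repeat in this sketch.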
As an attribute of a long-lived object
from vcti.datascope import PathsGroup


class AnalysisSession:
    def __init__(self, registry):
        self._scope = PathsGroup("session", registry=registry)

    def open(self, model_path, output_path):
        self._scope.add_path_source("input", model_path, format_id="abaqus-inp")
        self._scope.add_path_source("output", output_path, format_id="hdf5-file")
        self._scope.load()

    def close(self):
        self._scope.close()

    @property
    def output_loader(self):
        return self._scope.sources["output"].loader
Reopening after close
After close(), the scope may be reopened. Optional sources that
failed previously have their last_error cleared and are retried on the
next load(). Sources cannot be added or removed while the scope is
open (DataScopeStateError); add or remove before calling load() again.
scope.load()
# ... use ...
scope.close()
# ... later ...
scope.load() # re-opens; failed optionals get another chance
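The rule that sources can only change while the scope is closed can be pictured with a toy state guard. This is a minimal sketch under stated assumptions, not the library's code: `ToyStateError` stands in for the DataScopeStateError mentioned above, and `ToyLifecycleScope` with its `add` method is a hypothetical simplification of `add_path_source`.

```python
class ToyStateError(RuntimeError):
    """Hypothetical stand-in for DataScopeStateError."""


class ToyLifecycleScope:
    """Toy scope that only tracks sources and an open/closed flag."""

    def __init__(self):
        self.sources = {}
        self.is_loaded = False

    def add(self, name, path):
        # Mutation is rejected while the scope is open.
        if self.is_loaded:
            raise ToyStateError("cannot add a source while the scope is open")
        self.sources[name] = path

    def load(self):
        self.is_loaded = True

    def close(self):
        self.is_loaded = False
```

In this sketch, calling `add` between `close()` and the next `load()` succeeds, which mirrors the reopen sequence shown above.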
Working with optional sources
Sources added with required=False do not abort load() on failure;
their failure is recorded and the scope continues:
import logging

log = logging.getLogger(__name__)

scope.load()
if not scope.is_valid:
    raise RuntimeError("scope is not in a usable state")
for src in scope.failed_optional_sources.values():
    log.warning("optional source %r unavailable: %s", src.name, src.last_error)
is_valid is a pre-flight check (scope.is_valid, no parens):
"could this scope be loaded right now?" Specifically:
- Empty scope (no sources): invalid.
- Every required source's own is_valid is True: scope is valid.
- A loaded scope short-circuits to True without re-checking: load() would have raised on any required failure, so reaching is_loaded already proves validity. While unloaded, the check is re-run on every call (no caching), so a moved or deleted file is detected immediately.
is_loaded answers a different question — "has load() actually
completed?" Use is_valid before opening to confirm readiness; use
is_loaded after opening to confirm the lifecycle finished.
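The is_valid rules listed above can be modelled in a few lines. The sketch below is an illustration only, not the library's implementation: `ToyPathSource` and `ToyValidScope` are hypothetical names, and the assumption that a path source is valid when its path exists is made here purely for the example.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ToyPathSource:
    path: Path
    required: bool = True

    @property
    def is_valid(self):
        # Assumption for this sketch: valid means the path exists.
        return self.path.exists()


@dataclass
class ToyValidScope:
    sources: dict = field(default_factory=dict)
    is_loaded: bool = False

    @property
    def is_valid(self):
        if self.is_loaded:
            return True  # a completed load already proved validity
        if not self.sources:
            return False  # an empty scope is never valid
        # Recomputed on every access; optional sources are not consulted.
        return all(s.is_valid for s in self.sources.values() if s.required)
```

Since nothing is cached while the toy scope is unloaded, deleting a required file between two reads of `is_valid` flips the result on the second read.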
Disambiguating between loaders that share a format
When several registered loaders declare the same format_id (e.g. two
HDF5 readers for different solvers), pass extra_rules to narrow the
selection. Rules are vcti.lookup.Rule instances applied alongside the
implicit supported_formats contains <format_id> rule:
from vcti.lookup import Rule

scope.add_path_source(
    name="solver_output",
    path=Path("sol103.h5"),
    format_id="hdf5-file",
    extra_rules=[Rule("solver", "==", "nastran")],
)
If no descriptor matches, add_path_source raises ValueError at the
point of registration — not later at load() time.
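To make the selection logic concrete, here is a self-contained sketch of format matching narrowed by extra equality conditions. It does not call vcti-fileloader or vcti-lookup: `ToyDescriptor` and `resolve` are hypothetical, the `extra_equals` mapping is a much-simplified stand-in for Rule objects, and returning the first match is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDescriptor:
    id: str
    attributes: dict = field(default_factory=dict)


def resolve(descriptors, format_id, extra_equals=None):
    """Return the first descriptor matching the format and every extra condition.

    extra_equals maps an attribute name to the value it must equal.
    """
    extra_equals = extra_equals or {}
    for d in descriptors:
        # Implicit rule: supported_formats must contain the format_id.
        if format_id not in d.attributes.get("supported_formats", []):
            continue
        if all(d.attributes.get(k) == v for k, v in extra_equals.items()):
            return d
    # Fail at registration time rather than later, at load time.
    raise ValueError(f"no loader registered for format {format_id!r}")


descriptors = [
    ToyDescriptor("h5-abaqus", {"supported_formats": ["hdf5-file"], "solver": "abaqus"}),
    ToyDescriptor("h5-nastran", {"supported_formats": ["hdf5-file"], "solver": "nastran"}),
]
chosen = resolve(descriptors, "hdf5-file", {"solver": "nastran"})
```

Here `chosen.id` is `"h5-nastran"`; without the extra condition the sketch would return the first HDF5 descriptor instead, and an unregistered format raises ValueError.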
See docs/design.md for the conceptual model and docs/api.md for the API reference.
Dependencies
- vcti-fileloader (>=1.0.0) — Loader, LoaderRegistry, LoaderDescriptor
- vcti-lookup (>=1.0.0) — Rule (format-based loader filtering)
- vcti-logging (>=1.0.0) — logger
- vcti-error (>=1.0.0) — error codes