# Data Space

Data space for organizing, loading, and working with hierarchical data from multiple file sources.
## Installation

```shell
pip install "vcti-dataspace>=1.0.0"
```

With HDF5 export support (quotes keep the extras bracket and version pin from being interpreted by the shell):

```shell
pip install "vcti-dataspace[hdf5-export]>=1.0.0"
```
## Quick Start

```python
import numpy as np
from pathlib import Path

from vcti.dataspace import FilesProject

# `registry` is a file-loader registry (see vcti-fileloader), configured elsewhere.
# The context manager ensures clean resource cleanup.
with FilesProject("Analysis", registry=registry) as project:
    # Add file sources (auto-loads, creates immutable source DataSpaces)
    project.add_path_source("sim", Path("results.h5"))

    # Access source DataSpace (immutable, mirrors file)
    sim = project.get_path_source("sim")
    tree = sim.dataspace.tree
    info = sim.dataspace.get_node_info_all()

    # Open workspace and create user DataSpace (mutable)
    ws = project.open_workspace()
    user_space = ws.create_dataspace("output")

    # Copy data from source, add custom groups and datasets
    user_space.copy_node_from(sim.dataspace, node_id=5)
    grp = user_space.create_group("custom")
    user_space.create_dataset("values", np.array([1.0, 2.0]), parent_id=grp)

    # Delete nodes when no longer needed
    user_space.delete_node(grp, recursive=True)

    # Export to HDF5
    user_space.export_hdf5(Path("output.h5"))
```
## Error Handling

All exceptions inherit from `DataSpaceError`, so you can catch broadly or narrowly:

```python
from vcti.dataspace import DataSpaceError, ImmutableDataSpaceError, NodeNotFoundError

try:
    space.delete_node(999)
except NodeNotFoundError:
    print("Node does not exist")
except DataSpaceError:
    print("Catch-all for any DataSpace error")
```
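To picture how such a hierarchy behaves, here is a small stand-alone sketch. The class names mirror the imports above, but the internal structure is an assumption for illustration; the real classes live in `vcti.dataspace`.

```python
# Stand-in hierarchy to illustrate broad vs. narrow catching;
# the real classes are provided by vcti.dataspace.
class DataSpaceError(Exception):
    """Base class for all DataSpace errors."""

class ImmutableDataSpaceError(DataSpaceError):
    """Raised when mutating a source (immutable) DataSpace."""

class NodeNotFoundError(DataSpaceError):
    """Raised when a node id does not exist."""

def classify(exc: DataSpaceError) -> str:
    # The narrower except clause is tried first and wins;
    # the base-class clause catches everything else.
    try:
        raise exc
    except NodeNotFoundError:
        return "missing node"
    except DataSpaceError:
        return "generic dataspace error"

print(classify(NodeNotFoundError(999)))         # missing node
print(classify(ImmutableDataSpaceError("ro")))  # generic dataspace error
```

Because every error subclasses `DataSpaceError`, a single broad `except DataSpaceError` clause is always a safe fallback after any narrow handlers.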
## Core API

### DataSpace

| Method | Description |
|---|---|
| `build_from_loader(loader, handle)` | Build from a file loader (source only) |
| `create_group(name, parent_id)` | Create a group node (user only) |
| `create_dataset(name, data, parent_id)` | Create a dataset node (user only) |
| `delete_node(node_id, recursive)` | Delete a node and optionally its children |
| `copy_node_from(source, node_id)` | Copy a node from another DataSpace |
| `get_dataset(node_id)` | Get a dataset (lazy-loads for sources) |
| `get_node_info(node_id)` | Get node metadata |
| `export_hdf5(path)` | Export to an HDF5 file |
| `clear_dataset_cache()` | Free cached lazy datasets |
| `node_count` | Total number of nodes (property) |
| `dataset_count` | Number of datasets (property) |
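The lazy-loading contract of `get_dataset` and `clear_dataset_cache` can be pictured with a small stand-alone sketch. The `LazyCache` class and its dict-based cache are assumptions for illustration, not the library's actual implementation:

```python
import numpy as np

# Toy lazy-dataset cache illustrating the get_dataset /
# clear_dataset_cache contract; vcti.dataspace's internals may differ.
class LazyCache:
    def __init__(self, loader):
        self._loader = loader  # callable: node_id -> np.ndarray
        self._cache = {}

    def get_dataset(self, node_id):
        # Load from the backing file only on first access, then reuse
        if node_id not in self._cache:
            self._cache[node_id] = self._loader(node_id)
        return self._cache[node_id]

    def clear_dataset_cache(self):
        # Drop cached arrays so memory can be reclaimed
        self._cache.clear()

loads = []

def loader(node_id):
    loads.append(node_id)  # record every real load
    return np.full(3, node_id, dtype=float)

cache = LazyCache(loader)
cache.get_dataset(5)
cache.get_dataset(5)        # served from cache, no reload
print(loads)                # [5]
cache.clear_dataset_cache()
cache.get_dataset(5)        # reloaded after the cache is cleared
print(loads)                # [5, 5]
```

This is why calling `clear_dataset_cache()` between large source accesses keeps memory bounded: subsequent `get_dataset` calls transparently reload from the file.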
### PathSource, Workspace, FilesProject

| Class | Purpose |
|---|---|
| `PathSource` | File + loader + source DataSpace |
| `Workspace` | Session with user DataSpaces |
| `FilesProject` | Orchestrates sources + workspaces |
## Dependencies

- `numpy` (>=1.24)
- `vcti-fileloader` (>=1.0.0)
- `vcti-array-tree` (>=1.0.0), provides `DataNode`
- `h5py` (optional, for HDF5 export)