Versioned Parquet dataset management with snapshots and multi-cache support
Project description
ionbus_parquet_cache
Python tools for managing versioned Parquet datasets with date partitioning, snapshot versioning, multi-cache lookup, YAML-driven dataset creation, and CLI workflows for update, cleanup, and synchronization.
Installation
pip install ionbus-parquet-cache
Or install from source:
pip install -e .
Includes
- CacheRegistry for reading from one or more cache locations
- DatedParquetDataset for date-partitioned, incrementally updated datasets
- NonDatedParquetDataset for full-refresh reference datasets
- DataSource and DataCleaner extension points
- YAML configuration helpers for declarative dataset setup
- Snapshot lineage, cache history, YAML annotations, and optional external provenance sidecars
- CLI modules for dataset creation, updating, cleanup, cache sync, and post-sync hooks
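This description does not show the package's own API, so the following is only a standard-library sketch of the general idea behind multi-cache lookup: search an ordered list of cache locations and use the first one that holds the dataset. The function name `find_dataset` and the one-directory-per-dataset layout are assumptions made for illustration. They are not part of ionbus_parquet_cache, and CacheRegistry may work differently.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_dataset(name: str, cache_roots: Iterable[Path]) -> Optional[Path]:
    """Return the first directory named `name` found under the given roots.

    Hypothetical helper: roots are checked in the order given, so earlier
    roots (for example a fast local cache) take priority over later ones
    (for example a shared network cache).
    """
    for root in cache_roots:
        candidate = Path(root) / name
        if candidate.is_dir():
            return candidate
    return None  # dataset is not present in any cache location
```

Ordering the roots from fastest to slowest lets a local copy shadow a shared one, which is one common reason to support several cache locations.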
Full documentation on GitHub.
Requirements
- Python >= 3.9
- See requirements.txt for runtime dependencies
License
MIT License
Download files
Source Distribution
Built Distribution
File details
Details for the file ionbus_parquet_cache-1.4.0.0.tar.gz.
File metadata
- Download URL: ionbus_parquet_cache-1.4.0.0.tar.gz
- Upload date:
- Size: 240.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1d3b51a81875968a907164869273c4184948808c20f84e6591fc77e0c3483c79 |
| MD5 | 35fee52fd192e4e22713edaf87cd2b74 |
| BLAKE2b-256 | 8e68cdd16cd5d6ed226cc90cd660f0098fe8c696af8f599d6d2d1482ae9f57b2 |
File details
Details for the file ionbus_parquet_cache-1.4.0.0-py3-none-any.whl.
File metadata
- Download URL: ionbus_parquet_cache-1.4.0.0-py3-none-any.whl
- Upload date:
- Size: 193.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2a928431ea2b3c84b74545121a90cd8dd54fe748f4587ca63d8192fba5fd4a59 |
| MD5 | 2194db760eaa41408e31657d5c3b5421 |
| BLAKE2b-256 | 887d63445b3f08938f8b6b14ad2f43c1482df7a6f927a8a939f5360fa7ee0d42 |
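The published digests can be used to check a downloaded file before installing it. The sketch below uses only Python's standard `hashlib` module. The expected value is the SHA256 digest listed for ionbus_parquet_cache-1.4.0.0.tar.gz, while the helper names and the local file path are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for ionbus_parquet_cache-1.4.0.0.tar.gz
EXPECTED_SHA256 = "1d3b51a81875968a907164869273c4184948808c20f84e6591fc77e0c3483c79"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded archive; adjust as needed.
    archive = Path("ionbus_parquet_cache-1.4.0.0.tar.gz")
    if archive.exists():
        ok = matches_published_digest(archive)
        print("digest OK" if ok else "digest MISMATCH")
```

To check the wheel instead, pass its SHA256 digest from the second table as `expected`.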