Versioned persistence for Python dataclasses with hash validation, declarative migrations, and pluggable backends
versionable
Future-proof your data files. Save structured Python objects with versioning, declarative migrations, and open file formats.
Why versionable?
Your data lives in files. Your code keeps changing. Without versioning, old files silently load with missing fields, wrong types, or stale values.
versionable fixes that. Like database migrations, but for files. Every file is stamped with a version number and a
fingerprint of its structure. Files written by v1 of your code load cleanly into v5, automatically migrated, never
silently broken. Save to standard formats out of the box: JSON, HDF5, YAML, and TOML.
What you get:
- Zero boilerplate: no schema files, no code generation, no build step. Just inherit from Versionable
- Simple versioning with declarative migrations: rename, add, remove, or transform fields across versions
- Rich type support: datetime, Path, UUID, Enum, numpy arrays, and more, and easy to extend with your own
- Nested objects with independent versioning: compose complex dataclasses from smaller Versionable pieces
- Incremental HDF5 writes: append rows as data arrives, no need to hold everything in memory
- Random access for huge files: read slices directly from disk without loading the whole file
- JSON, HDF5, YAML, TOML, or bring your own backend
- Import-time safety: schema hash mismatches are caught when your module loads, not in production
- Modern, type-safe Python: fully typed and compatible with mypy, pyright, and other static analyzers
How does it compare?
| Versionable Features | pickle | dc libs¹ | protobuf | raw JSON | sidecars |
|---|---|---|---|---|---|
| ✅ Zero boilerplate | ✅ | ✅ | 🛠️ | - | - |
| ✅ Versioning with declarative migrations | 🛠️ | - | - | 🛠️ | - |
| ✅ Rich type support | ✅ | ✅ | 🛠️ | 🛠️ | 🟠 |
| ✅ Nested objects, versioned independently | 🟠 | 🛠️ | 🟠 | 🛠️ | - |
| ✅ Incremental HDF5 writes | - | - | - | - | 🛠️ |
| ✅ Random access for huge files | - | - | - | - | 🛠️ |
| ✅ Custom Backends | - | 🟠 | 🟠 | 🟠 | - |
| ✅ Import-time validation | - | - | 🛠️ | - | - |
| ✅ Modern, type-safe Python | - | ✅ | ✅ | - | - |
¹ pydantic, dataclasses-json, etc.
- 🛠️ = requires manual effort / build step
- 🟠 = partial
Installation
The base install includes the JSON backend with zero heavy dependencies:
pip install versionable
Install the optional dependencies for the other backends as needed:
pip install pyyaml # YAML backend (.yaml, .yml)
pip install tomlkit # TOML backend (.toml)
pip install h5py hdf5plugin # HDF5 backend (.h5, .hdf5)
Or install the latest main from source:
pip install git+https://github.com/hendrickmelo/versionable.git
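The install notes above tie each backend to a set of file extensions. As a rough illustration of how extension-based dispatch with pluggable backends can work, here is a minimal standard-library sketch. The names `BACKENDS`, `register`, `backend_for`, `save`, and `load` are hypothetical and are not versionable's API; only a JSON backend is registered, and other formats would be added the same way.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical registry: file suffix -> (writer, reader).
Writer = Callable[[Path, dict], None]
Reader = Callable[[Path], dict]
BACKENDS: dict[str, tuple[Writer, Reader]] = {}


def register(suffixes: list[str], writer: Writer, reader: Reader) -> None:
    """Make a backend available for each of the given file suffixes."""
    for suffix in suffixes:
        BACKENDS[suffix.lower()] = (writer, reader)


def _json_write(path: Path, data: dict) -> None:
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")


def _json_read(path: Path) -> dict:
    return json.loads(path.read_text(encoding="utf-8"))


register([".json"], _json_write, _json_read)


def backend_for(path) -> tuple[Writer, Reader]:
    """Pick a backend from the file suffix, failing loudly if none matches."""
    suffix = Path(path).suffix.lower()
    if suffix not in BACKENDS:
        raise ValueError(f"no backend registered for {suffix!r}")
    return BACKENDS[suffix]


def save(data: dict, path) -> None:
    writer, _ = backend_for(path)
    writer(Path(path), data)


def load(path) -> dict:
    _, reader = backend_for(path)
    return reader(Path(path))


# Round-trip a small record through the JSON backend.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "config.json"
    save({"name": "experiment-A", "value": 9.81}, target)
    round_tripped = load(target)
```

In this sketch, asking for an unregistered suffix such as `.h5` raises `ValueError` instead of silently falling back to another format.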
Quick Start
Simple files
You save a config file today:
from dataclasses import dataclass
import versionable
from versionable import Versionable
@dataclass
class SensorConfig(Versionable, version=1, hash="4b7866"):
    name: str
    value: float
config = SensorConfig(name="experiment-A", value=9.81)
versionable.save(config, "config.json")
A few weeks later you rename value to magnitude. Without versionable, old files silently load with missing data.
With it, you bump the version and declare a migration — old files upgrade automatically:
from versionable import Migration
@dataclass
class SensorConfig(Versionable, version=2, hash="a70249"):
    name: str
    magnitude: float  # renamed from "value"
    class Migrate:
        v1 = Migration().rename("value", "magnitude")
# Old v1 file loads and the old field is automatically migrated
loaded = versionable.load(SensorConfig, "config.json")
assert loaded.magnitude == 9.81
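To make the mechanics concrete, here is a minimal, library-independent sketch of what a declarative rename migration does to stored data. The `Rename` class, the `MIGRATIONS` table, and `upgrade` are hypothetical names used for illustration; they are not versionable's API or implementation. The sketch assumes each stored record is a plain dict that carries its own version number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rename:
    """Hypothetical migration step: move a value from one key to another."""

    old: str
    new: str

    def apply(self, fields: dict) -> dict:
        out = dict(fields)  # copy so the caller's data is never mutated
        if self.old in out:
            out[self.new] = out.pop(self.old)
        return out


# Steps keyed by the version they upgrade *from* (an assumed layout).
MIGRATIONS = {1: [Rename("value", "magnitude")]}
CURRENT_VERSION = 2


def upgrade(record: dict) -> dict:
    """Apply every migration between the stored version and the current one."""
    version = record["version"]
    fields = record["fields"]
    while version < CURRENT_VERSION:
        for step in MIGRATIONS.get(version, []):
            fields = step.apply(fields)
        version += 1
    return {"version": version, "fields": fields}


old_file = {"version": 1, "fields": {"name": "experiment-A", "value": 9.81}}
new_file = upgrade(old_file)
print(new_file)
# {'version': 2, 'fields': {'name': 'experiment-A', 'magnitude': 9.81}}
```

Because the steps are keyed by source version, a file written several versions ago passes through each intermediate step in order, and a file that is already current is returned unchanged.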
The Schema Hash — Friction as a Feature
The hash parameter is optional — everything works without it. But when present, it acts as a tripwire.
Without it, here's what happens: you rename a field, forget to add a migration, and old files load with a missing field that silently defaults to zero. Your experiment runs with wrong calibration data for a week before anyone notices.
The hash prevents that. It's a fingerprint of your fields and their types, validated at import time — not at runtime, not in production. Change a field and forget to update the version? Python won't even import:
@dataclass
class SensorConfig(Versionable, version=2, hash="4b7866"): # ⬅ old hash
    name: str
    magnitude: float  # changed, but hash wasn't updated
# HashMismatchError: SensorConfig: hash mismatch — declared '4b7866',
# computed 'a70249'. Update the hash parameter to 'a70249'.
That error is the point. It means you can't accidentally ship a schema change without a migration. The hash makes breaking changes visible during development, in CI, at deploy time — never in production. Think of it like a type checker for your data format: optional, zero runtime cost, catches mistakes before they matter.
Working with Large Data
For scientific and engineering workflows, fields map to native HDF5 chunked datasets. You can append rows incrementally and read slices from disk without loading the whole file into memory:
from dataclasses import dataclass, field
import numpy as np
from numpy.typing import NDArray
import versionable
from versionable import Versionable
@dataclass
class Experiment(Versionable, version=1, hash="536849"):
    name: str
    traces: NDArray[np.float64] = field(default_factory=lambda: np.empty((0, 1024)))
# Append to a chunked, resizable dataset as data arrives
session = versionable.hdf5.open(Experiment, "run.h5")
with session as obj:
    obj.name = "long-running-acquisition"
    for batch in data_source:  # data_source: placeholder for your own iterable of row batches
        obj.traces.append(batch)  # extends the dataset on disk
        session.flush()  # flush HDF5 buffers to OS
# Read slices directly from disk without loading the whole file
with versionable.hdf5.open(Experiment, "run.h5", mode="read") as obj:
    print(obj.traces[1000])  # reads only row 1000
    print(obj.traces[50:100])  # reads only this slice
Learn More
Want to see how old files get upgraded automatically when your schema changes?
- See migrations in action
- Explore the available backends
For AI Agents
If you're an AI agent working with versionable, see AGENT.md for a condensed API reference.
Complete Documentation
For custom type converters, HDF5 support, and more, see the full documentation.
Background
The pattern behind versionable has been used in production C++ systems for over 15 years — from CArchive-based
serialization to modern C++11 variadic macros. Some version of this pattern has been a part of every project the authors
have worked on. This is our second Python implementation of a proven approach, built with modern type-safe Python. The
test suite has a ~1:1 ratio of test code to source code, with cross-backend round-trip coverage and edge-case validation
across all four backends.
Have questions? See the FAQ. Want to contribute? See the contributing guide.
Acknowledgements
The idea started with Steve Araiza, who first taught me this approach. Over the years it evolved through many C++ iterations, and every project I've worked on since has used some version of this pattern. Emma Powers brought great fresh ideas to this Python implementation.
A big thank you to both of them! 🥓🥞🍳
License
MIT - Copyright ©️ 2026 Hendrick Melo
Download files
File details
Details for the file versionable-0.2.0.tar.gz.
File metadata
- Download URL: versionable-0.2.0.tar.gz
- Upload date:
- Size: 454.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fb36c429a8f078c2b495347b6786ca7c6e6121ff891cec2e526159f84d42d9af |
| MD5 | 5e81a4f05710868a0ddcf4b53607fe6b |
| BLAKE2b-256 | 38d173adaf90e3ba880245cfdb12f81d4d9ef3b399d20203faad4a0a2718fe6b |
Provenance
The following attestation bundles were made for versionable-0.2.0.tar.gz:
Publisher: publish.yml on hendrickmelo/versionable
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: versionable-0.2.0.tar.gz
- Subject digest: fb36c429a8f078c2b495347b6786ca7c6e6121ff891cec2e526159f84d42d9af
- Sigstore transparency entry: 1444197027
- Sigstore integration time:
- Permalink: hendrickmelo/versionable@c6b94f9aec38e14d1bf74717b907c77910cf49a3
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/hendrickmelo
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c6b94f9aec38e14d1bf74717b907c77910cf49a3
- Trigger Event: release
File details
Details for the file versionable-0.2.0-py3-none-any.whl.
File metadata
- Download URL: versionable-0.2.0-py3-none-any.whl
- Upload date:
- Size: 61.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c8cbebd39e96e4b5faf58d4d456ff335075173e611216b621136727985f88d1 |
| MD5 | 681f5c61f067528a01462805c7e97b07 |
| BLAKE2b-256 | c39b68d0928051d33abf6964094d87589e1b3eebc4f320fdba76d5e26e0e0476 |
Provenance
The following attestation bundles were made for versionable-0.2.0-py3-none-any.whl:
Publisher: publish.yml on hendrickmelo/versionable
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: versionable-0.2.0-py3-none-any.whl
- Subject digest: 8c8cbebd39e96e4b5faf58d4d456ff335075173e611216b621136727985f88d1
- Sigstore transparency entry: 1444197118
- Sigstore integration time:
- Permalink: hendrickmelo/versionable@c6b94f9aec38e14d1bf74717b907c77910cf49a3
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/hendrickmelo
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c6b94f9aec38e14d1bf74717b907c77910cf49a3
- Trigger Event: release