CLI for capturing and restoring ML training artifacts with encrypted cloud storage
mlvault
Immutable, encrypted snapshots of ML training runs.
mlvault captures your model checkpoints, metrics, and logs after every training run — encrypted and stored in the cloud. Restore any run on any machine with a single command.
pip install mlvault mlvault-mlflow
mlvault init
Quick start
1. Install
pip install mlvault mlvault-mlflow
2. Configure your API key
Get a key at obsideo.io/mlvault-beta, then:
mlvault init
# Paste your mlvault API key: mlv1_...
# Validating key... ok
# You're all set.
3. Run and archive a training job
mlvault run --collect ./outputs python train.py
mlvault runs your script, then encrypts and uploads everything in ./outputs to cloud storage.
Run ID : myproject_20240307_143022_abc123
Collect: /home/user/project/outputs
Command: python train.py
-- Training started --
...
-- Collecting artifacts --
3 artifact(s) -> myproject_20240307_143022_abc123.tar.gz
-- Encrypting --
myproject_20240307_143022_abc123.tar.gz.enc (258.4 KB)
-- Uploading --
Done. Run archived: myproject_20240307_143022_abc123
Remote : myproject_20240307_143022_abc123_bundle.enc
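The "Collecting artifacts" step in the log above can be pictured with a short sketch. This is not mlvault's actual implementation: the function name `bundle_outputs` and the demo file names are hypothetical, and only the overall idea (pack everything under the collect directory into one `.tar.gz` before encryption) comes from the output shown.

```python
import tarfile
import tempfile
from pathlib import Path


def bundle_outputs(collect_dir: Path, archive_path: Path) -> int:
    """Pack every regular file under collect_dir into a gzip-compressed tarball.

    Returns the number of files added. Hypothetical helper, for illustration only.
    """
    files = sorted(p for p in collect_dir.rglob("*") if p.is_file())
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # Store paths relative to the collect directory so the archive
            # unpacks the same way on any machine.
            tar.add(path, arcname=path.relative_to(collect_dir).as_posix())
    return len(files)


# Demo in a throwaway directory with three placeholder artifacts.
with tempfile.TemporaryDirectory() as tmp:
    outputs = Path(tmp) / "outputs"
    outputs.mkdir()
    for name in ("checkpoint.pt", "metrics.json", "train.log"):
        (outputs / name).write_text("placeholder")
    archive = Path(tmp) / "run.tar.gz"
    count = bundle_outputs(outputs, archive)
    with tarfile.open(archive, "r:gz") as tar:
        members = sorted(tar.getnames())
    print(f"{count} artifact(s) -> {archive.name}")
```

The archive is written outside the collect directory so it never ends up inside itself.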
4. Restore a run
mlvault restore myproject_20240307_143022_abc123
Works on any machine where mlvault init has been run with the same API key.
5. List runs
mlvault list
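If you drive training from a Python script or scheduler, the commands above can be assembled as argument lists and passed to `subprocess`. The sketch below uses only the invocations shown in this README; the helper names are hypothetical, and it assumes mlvault exits non-zero on failure, which is not stated here.

```python
import shlex
import subprocess
from typing import List, Sequence


def run_command(collect_dir: str, train_cmd: Sequence[str]) -> List[str]:
    """Build the argv for `mlvault run --collect <dir> <training command>`."""
    return ["mlvault", "run", "--collect", collect_dir, *train_cmd]


def restore_command(run_id: str) -> List[str]:
    """Build the argv for `mlvault restore <run_id>`."""
    return ["mlvault", "restore", run_id]


def invoke(argv: Sequence[str]) -> subprocess.CompletedProcess:
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(list(argv), check=True)


# Building the argv is side-effect free, so it can be inspected before running.
argv = run_command("./outputs", ["python", "train.py"])
print(shlex.join(argv))
```

Passing a list rather than a single shell string avoids quoting problems when paths or arguments contain spaces.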
MLflow integration
mlvault integrates with MLflow as a custom artifact store. Artifacts logged via mlflow.log_artifact() are staged locally, then pushed to cloud storage with mlvault commit.
Setup
import mlflow
# Set the artifact store once when creating an experiment
mlflow.create_experiment("my-experiment", artifact_location="mlvault://my-project")
mlflow.set_experiment("my-experiment")
Or set it via environment variable before training:
export MLFLOW_ARTIFACT_URI=mlvault://my-project
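MLflow picks an artifact store implementation from the URI scheme, so `mlvault://my-project` is routed to the mlvault plugin. How the plugin interprets the rest of the URI is not documented here. The sketch below simply assumes the host part is the project name, and `split_artifact_uri` is a hypothetical helper.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_artifact_uri(uri: str) -> Tuple[str, str]:
    """Split an mlvault:// URI into (project, sub-path).

    Assumption: the host component names the project. This mirrors the
    example URI in this README, not a documented mlvault contract.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "mlvault":
        raise ValueError(f"expected an mlvault:// URI, got {uri!r}")
    if not parsed.netloc:
        raise ValueError(f"missing project name in {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


project, subpath = split_artifact_uri("mlvault://my-project")
print(project)
```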
Training loop
import mlflow

with mlflow.start_run() as run:
    mlflow.log_param("lr", 0.001)
    mlflow.log_metric("loss", 0.42)
    mlflow.log_artifact("checkpoint.pt")
    run_id = run.info.run_id

# After training, push artifacts to cloud storage:
#   mlvault commit <run_id>
print(f"Run ID: {run_id}")
Commit artifacts
mlvault commit <mlflow_run_id>
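When the commit step is scripted, it can help to sanity-check the run ID before calling the CLI. The sketch below assumes the usual MLflow run ID shape of 32 lowercase hex characters; `commit_command` is a hypothetical helper and whether mlvault accepts other ID forms is not stated here.

```python
import re
from typing import List

# MLflow's default stores generate run IDs as 32 lowercase hex characters.
_RUN_ID = re.compile(r"[0-9a-f]{32}")


def commit_command(mlflow_run_id: str) -> List[str]:
    """Build the argv for `mlvault commit <mlflow_run_id>` after a basic check."""
    if not _RUN_ID.fullmatch(mlflow_run_id):
        raise ValueError(f"not a plausible MLflow run ID: {mlflow_run_id!r}")
    return ["mlvault", "commit", mlflow_run_id]


commit_argv = commit_command("0123456789abcdef0123456789abcdef")
print(" ".join(commit_argv))
```

The resulting list can be handed to `subprocess.run(commit_argv, check=True)` at the end of a training script.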
Restore MLflow artifacts
mlvault restore <run_id>
Commands
| Command | Description |
|---|---|
| mlvault init | Configure API key |
| mlvault run [cmd] | Run training command and archive artifacts |
| mlvault commit <mlflow_run_id> | Push staged MLflow artifacts to storage |
| mlvault restore <run_id> | Download and decrypt a stored run |
| mlvault list | List locally tracked runs |
| mlvault log <run_id> | Show run details |
Security
Artifacts are encrypted with AES-256-GCM before leaving your machine. Encryption keys are derived from your API key, so only someone holding that key can decrypt your artifacts.
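To make "derived from your API key" concrete, here is a minimal sketch of one common way to turn a secret into a 256-bit key: HKDF with SHA-256 (RFC 5869), written with the standard library only. The README does not say which derivation function mlvault uses, so the choice of HKDF, the salt, and the `info` label are all assumptions for illustration. The AES-256-GCM step itself is omitted because it needs a third-party library.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """Derive `length` bytes from input key material using HKDF-SHA256."""
    if length > 255 * hashlib.sha256().digest_size:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: condense the input secret into a fixed-size pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output bytes are available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Placeholder values: not a real key, salt, or label used by mlvault.
api_key = b"mlv1_example-not-a-real-key"
key = hkdf_sha256(api_key, salt=b"hypothetical-salt", info=b"artifact-encryption")
print(len(key) * 8, "bit key")
```

Because the derivation is deterministic, running it with the same API key on another machine yields the same encryption key, which is what makes restoring on a different machine possible.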
License
MIT