Skip to main content

CLI for capturing and restoring ML training artifacts with encrypted cloud storage

Project description

mlvault

Immutable, encrypted snapshots of ML training runs.

mlvault captures your model checkpoints, metrics, and logs after every training run — encrypted and stored in the cloud. Restore any run on any machine with a single command.

pip install mlvault mlvault-mlflow
mlvault init

Quick start

1. Install

pip install mlvault mlvault-mlflow

2. Configure your API key

Get a key at obsideo.io/mlvault-beta, then:

mlvault init
# Paste your mlvault API key: mlv1_...
# Validating key... ok
# You're all set.

3. Run and archive a training job

mlvault run --collect ./outputs python train.py

mlvault runs your script, then encrypts and uploads everything in ./outputs to cloud storage.

Run ID : myproject_20240307_143022_abc123
Collect: /home/user/project/outputs
Command: python train.py

-- Training started --
...
-- Collecting artifacts --
  3 artifact(s) -> myproject_20240307_143022_abc123.tar.gz
-- Encrypting --
  myproject_20240307_143022_abc123.tar.gz.enc (258.4 KB)
-- Uploading --

Done. Run archived: myproject_20240307_143022_abc123
Remote : myproject_20240307_143022_abc123_bundle.enc

4. Restore a run

mlvault restore myproject_20240307_143022_abc123

Works on any machine where mlvault init has been run with the same API key.

5. List runs

mlvault list

MLflow integration

mlvault integrates with MLflow as a custom artifact store. Artifacts logged via mlf.log_artifact() are staged locally, then pushed to cloud storage with mlvault commit.

Setup

import mlflow

# Set the artifact store once when creating an experiment
mlflow.create_experiment("my-experiment", artifact_location="mlvault://my-project")
mlflow.set_experiment("my-experiment")

Or set it via environment variable before training:

export MLFLOW_ARTIFACT_URI=mlvault://my-project

Training loop

import mlflow

with mlflow.start_run() as run:
    mlflow.log_param("lr", 0.001)
    mlflow.log_metric("loss", 0.42)
    mlflow.log_artifact("checkpoint.pt")

    run_id = run.info.run_id

# After training, push artifacts to cloud storage:
# mlvault commit <run_id>
print(f"Run ID: {run_id}")

Commit artifacts

mlvault commit <mlflow_run_id>

Restore MLflow artifacts

mlvault restore <run_id>

Commands

Command Description
mlvault init Configure API key
mlvault run [cmd] Run training command and archive artifacts
mlvault commit <mlflow_run_id> Push staged MLflow artifacts to storage
mlvault restore <run_id> Download and decrypt a stored run
mlvault list List locally tracked runs
mlvault log <run_id> Show run details

Security

Artifacts are encrypted with AES-256-GCM before leaving your machine. Encryption keys are derived from your API key — only you can decrypt your artifacts.


License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

obsideo_cloud-0.1.0b1-py3-none-any.whl (17.3 kB view details)

Uploaded Python 3

File details

Details for the file obsideo_cloud-0.1.0b1-py3-none-any.whl.

File metadata

File hashes

Hashes for obsideo_cloud-0.1.0b1-py3-none-any.whl
Algorithm Hash digest
SHA256 b73b50dc4f9690fb632e3279b7611d60f564389cfabdcf979a3e0a97196a2d7e
MD5 da56276a89479c96b2e6e0a7af19e7f1
BLAKE2b-256 f7fc8cc80d760eef0ebd0b98c97f51dfea95e6889e343e311b525ea640372a22

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page