CLI for capturing and restoring ML training artifacts with encrypted cloud storage

mlvault

Immutable, encrypted snapshots of ML training runs.

mlvault captures your model checkpoints, metrics, and logs after every training run — encrypted and stored in the cloud. Restore any run on any machine with a single command.

pip install mlvault mlvault-mlflow
mlvault init

Quick start

1. Install

pip install mlvault mlvault-mlflow

2. Configure your API key

Get a key at obsideo.io/mlvault-beta, then:

mlvault init
# Paste your mlvault API key: mlv1_...
# Validating key... ok
# You're all set.

3. Run and archive a training job

mlvault run --collect ./outputs python train.py

mlvault runs your script, then encrypts and uploads everything in ./outputs to cloud storage.

Run ID : myproject_20240307_143022_abc123
Collect: /home/user/project/outputs
Command: python train.py

-- Training started --
...
-- Collecting artifacts --
  3 artifact(s) -> myproject_20240307_143022_abc123.tar.gz
-- Encrypting --
  myproject_20240307_143022_abc123.tar.gz.enc (258.4 KB)
-- Uploading --

Done. Run archived: myproject_20240307_143022_abc123
Remote : myproject_20240307_143022_abc123_bundle.enc
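
If you need to sort or filter runs in your own scripts, the run ID can be split into its parts. The following is a minimal sketch, assuming the layout <project>_<YYYYMMDD>_<HHMMSS>_<suffix>; that layout is inferred from the single example above and is not a documented format. The names RunId and parse_run_id are hypothetical helpers, not part of mlvault.

```python
import re
from datetime import datetime
from typing import NamedTuple


class RunId(NamedTuple):
    """Parts of a run ID (hypothetical helper type, not part of mlvault)."""
    project: str
    started: datetime
    suffix: str


# Assumed layout, inferred from the example output only:
# <project>_<YYYYMMDD>_<HHMMSS>_<suffix>
_RUN_ID = re.compile(
    r"^(?P<project>.+)_(?P<date>\d{8})_(?P<time>\d{6})_(?P<suffix>[^_]+)$"
)


def parse_run_id(run_id: str) -> RunId:
    """Split a run ID into project name, start timestamp, and suffix."""
    match = _RUN_ID.match(run_id)
    if match is None:
        raise ValueError(f"unrecognized run ID: {run_id!r}")
    started = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return RunId(match["project"], started, match["suffix"])


parsed = parse_run_id("myproject_20240307_143022_abc123")
print(parsed.project, parsed.started.isoformat(), parsed.suffix)
```

Because the project part is matched greedily, project names that themselves contain underscores are still handled under this assumed layout.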

4. Restore a run

mlvault restore myproject_20240307_143022_abc123

Works on any machine where mlvault init has been run with the same API key.

5. List runs

mlvault list

MLflow integration

mlvault integrates with MLflow as a custom artifact store. Artifacts logged via mlflow.log_artifact() are staged locally, then pushed to cloud storage with mlvault commit.

Setup

import mlflow

# Set the artifact store once when creating an experiment
mlflow.create_experiment("my-experiment", artifact_location="mlvault://my-project")
mlflow.set_experiment("my-experiment")

Or set it via environment variable before training:

export MLFLOW_ARTIFACT_URI=mlvault://my-project

Training loop

import mlflow

with mlflow.start_run() as run:
    mlflow.log_param("lr", 0.001)
    mlflow.log_metric("loss", 0.42)
    mlflow.log_artifact("checkpoint.pt")

    run_id = run.info.run_id

# After training, push artifacts to cloud storage:
# mlvault commit <run_id>
print(f"Run ID: {run_id}")

Commit artifacts

mlvault commit <mlflow_run_id>
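
To commit straight from a training script instead of typing the command by hand, one option is to shell out to the CLI. This is a sketch only: it uses nothing beyond the mlvault commit <mlflow_run_id> command shown above, and commit_command and commit_run are hypothetical helper names, not an mlvault Python API.

```python
import shutil
import subprocess


def commit_command(run_id: str) -> list[str]:
    """Build the argv for `mlvault commit <run_id>`."""
    # Reject empty IDs and values that the CLI could mistake for an option.
    if not run_id or run_id.startswith("-"):
        raise ValueError(f"invalid run ID: {run_id!r}")
    return ["mlvault", "commit", run_id]


def commit_run(run_id: str) -> None:
    """Run `mlvault commit` and raise if the CLI is missing or exits non-zero."""
    cmd = commit_command(run_id)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("mlvault executable not found on PATH")
    subprocess.run(cmd, check=True)
```

Calling commit_run(run.info.run_id) after the with mlflow.start_run() block exits would push the staged artifacts for that run.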

Restore MLflow artifacts

mlvault restore <run_id>

Commands

Command                          Description
mlvault init                     Configure API key
mlvault run [cmd]                Run training command and archive artifacts
mlvault commit <mlflow_run_id>   Push staged MLflow artifacts to storage
mlvault restore <run_id>         Download and decrypt a stored run
mlvault list                     List locally tracked runs
mlvault log <run_id>             Show run details

Security

Artifacts are encrypted with AES-256-GCM before leaving your machine. Encryption keys are derived from your API key, so a run can be decrypted only with the key that archived it.
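
To illustrate what deriving a key from an API key can look like, here is a minimal sketch using PBKDF2-HMAC-SHA256 from the Python standard library to produce a 32-byte key, the size AES-256 requires. This is an assumption for illustration: the derivation function, salt handling, and iteration count that mlvault actually uses are not documented here, and the key string below is a placeholder. The AES-256-GCM step itself is omitted because the standard library has no AES implementation.

```python
import hashlib


def derive_key(api_key: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte (256-bit) key from an API key with PBKDF2-HMAC-SHA256.

    Illustrative only; not necessarily the scheme mlvault uses.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", api_key.encode("utf-8"), salt, iterations, dklen=32
    )


# Placeholder key and salt; a low iteration count keeps the demo fast.
key = derive_key("mlv1_example", b"example-salt-16b", iterations=1_000)
print(len(key))  # 32
```

The derivation is deterministic: the same API key and salt always yield the same encryption key, which is the property that lets a second machine configured with the same API key restore a run.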


License

MIT
