Snapshot the machine learning environment and diff two snapshots.

mlenv

Snapshot the machine learning environment to a single file, and diff two snapshots to see exactly what changed.

"It trained fine last week" usually comes down to something in the stack moving: a different CUDA build behind PyTorch, a minor Python bump, a driver upgrade, a GPU swapped on a shared box. mlenv captures all of that into one JSON file you can commit, share, and compare, with the changes most likely to affect results ranked first.

$ mlenv snapshot -o good.json     # when results were correct
$ mlenv snapshot -o now.json      # after something changed
$ mlenv diff good.json now.json
risk  section   key         change
high  cuda      torch_cuda  12.1 -> 11.8
high  python    version     3.11.9 -> 3.12.0
low   packages  rich        13.7.0 -> 13.8.0
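The comparison behind a table like this can be sketched in a few lines of Python. This is an illustration only, not mlenv's implementation: the snapshot layout (a dict of sections, each mapping keys to values) and the name `diff_snapshots` are assumptions made for the example, and the numpy entry is made-up sample data.

```python
from typing import Any

Snapshot = dict[str, dict[str, Any]]


def diff_snapshots(old: Snapshot, new: Snapshot) -> list[tuple[str, str, Any, Any]]:
    """Return (section, key, old_value, new_value) for every key that differs.

    Assumed layout: {section: {key: value}}. A key present on only one
    side is reported with None for the missing value.
    """
    changes = []
    for section in sorted(old.keys() | new.keys()):
        before = old.get(section, {})
        after = new.get(section, {})
        for key in sorted(before.keys() | after.keys()):
            if before.get(key) != after.get(key):
                changes.append((section, key, before.get(key), after.get(key)))
    return changes


# Hypothetical snapshots that mirror the example output above.
good = {
    "cuda": {"torch_cuda": "12.1"},
    "python": {"version": "3.11.9"},
    "packages": {"rich": "13.7.0", "numpy": "1.26.4"},
}
now = {
    "cuda": {"torch_cuda": "11.8"},
    "python": {"version": "3.12.0"},
    "packages": {"rich": "13.8.0", "numpy": "1.26.4"},
}

for section, key, a, b in diff_snapshots(good, now):
    print(f"{section:<9} {key:<11} {a} -> {b}")
```

The sketch lists changes alphabetically by section; mlenv additionally orders them by risk, which this example does not attempt.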

Install

# core: Python, platform, packages, CUDA, env vars
$ pip install mlenv-cli                                # from PyPI, once released
$ pip install git+https://github.com/jmweb-org/mlenv   # latest, available now
$ pip install "mlenv-cli[gpu]"                         # adds GPU model and driver capture via NVML

mlenv has no heavy dependencies and runs anywhere. GPU details are read through NVML only when the gpu extra is installed and a driver is present; if either is missing, every other section is still captured.

What it captures

Section   Contents
python    Version, implementation, interpreter path
platform  OS, release, architecture, libc
packages  Every installed distribution and its version
cuda      PyTorch build, its CUDA and cuDNN versions, availability
gpus      GPU count, model per index, driver version
env       Training-relevant variables (CUDA_*, NCCL_*, OMP_NUM_THREADS, ...)
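To give a sense of where this information comes from, the following sketch collects the python, platform, packages, and env sections using only the standard library. It is not mlenv's code: the function name `collect_snapshot`, the key names inside each section, and the exact variable filter are assumptions for illustration. The cuda and gpus sections are left out because they depend on PyTorch and NVML.

```python
import json
import os
import platform
import sys
from importlib import metadata

# Assumed filter for "training-relevant" variables, based on the table above.
ENV_PREFIXES = ("CUDA_", "NCCL_")
ENV_NAMES = ("OMP_NUM_THREADS",)


def collect_snapshot() -> dict:
    """Gather a JSON-serializable description of the current environment."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken metadata
            packages[name] = dist.version

    env = {
        key: value
        for key, value in sorted(os.environ.items())
        if key.startswith(ENV_PREFIXES) or key in ENV_NAMES
    }

    return {
        "python": {
            "version": platform.python_version(),
            "implementation": platform.python_implementation(),
            "executable": sys.executable,
        },
        "platform": {
            "system": platform.system(),
            "release": platform.release(),
            "machine": platform.machine(),
            # libc_ver() returns ("", "") where it cannot detect a libc.
            "libc": " ".join(platform.libc_ver()).strip(),
        },
        "packages": dict(sorted(packages.items())),
        "env": env,
    }


snapshot_text = json.dumps(collect_snapshot(), indent=2)
```

Writing `snapshot_text` to a file gives something comparable in spirit to the output of mlenv snapshot, though the real file format may differ.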

Commands

$ mlenv snapshot -o env.json   # write a snapshot (omit -o to print to stdout)
$ mlenv show                   # pretty-print the current environment
$ mlenv diff a.json b.json     # show changes, highest risk first
$ mlenv diff a.json b.json --json    # machine-readable changes
$ mlenv diff a.json b.json --check   # exit non-zero on any high-risk change

In CI

--check turns the diff into a gate. Commit a baseline snapshot and fail the build when the runner drifts from it in a way that could change results:

- run: mlenv snapshot -o current.json
- run: mlenv diff baseline.json current.json --check

Risk levels

A change is high risk when it routinely moves numerical results: anything under cuda; a change in GPU model, count, or driver; or a Python minor-version bump. For a sensitive package (torch, numpy, tensorflow, jax, transformers), a major-version bump is high and a smaller bump is medium. Everything else is low. Only high-risk changes trip --check.
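The rules above can be written down as a small classifier. This is a sketch of the stated policy, not mlenv's actual code: the function name `classify_risk`, the section and key names, and the assumption that versions are plain dotted strings are all illustrative. It treats a Python major-version change like a minor one and does not handle added or removed packages, which the rules above do not cover.

```python
SENSITIVE_PACKAGES = {"torch", "numpy", "tensorflow", "jax", "transformers"}
RISK_ORDER = {"high": 0, "medium": 1, "low": 2}


def classify_risk(section: str, key: str, old: str, new: str) -> str:
    """Map one change to 'high', 'medium', or 'low' per the rules above."""
    # Anything under cuda, and any GPU model, count, or driver change.
    if section in ("cuda", "gpus"):
        return "high"
    # Python: a change in major.minor is high; a patch bump falls through to low.
    if section == "python" and key == "version":
        if old.split(".")[:2] != new.split(".")[:2]:
            return "high"
        return "low"
    # Sensitive packages: major bump is high, any smaller bump is medium.
    if section == "packages" and key in SENSITIVE_PACKAGES:
        if old.split(".")[0] != new.split(".")[0]:
            return "high"
        return "medium"
    return "low"


# Hypothetical changes, sorted so the highest risk comes first.
changes = [
    ("packages", "rich", "13.7.0", "13.8.0"),
    ("packages", "numpy", "1.26.3", "1.26.4"),
    ("cuda", "torch_cuda", "12.1", "11.8"),
]
ranked = sorted(changes, key=lambda change: RISK_ORDER[classify_risk(*change)])
```

Sorting by `RISK_ORDER` is what puts the cuda change ahead of the numpy and rich changes in `ranked`.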

Exit codes

Code  Meaning
0     Ran; no high-risk change (or --check not set)
1     --check found a high-risk change
2     A snapshot file was missing or invalid
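A script that wraps mlenv can branch on these codes. The sketch below is one possible wrapper, assuming mlenv is on PATH; the helper names `run_check` and `describe_exit_code` are made up for this example, and only the documented diff subcommand and --check flag are used.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "no high-risk change",
    1: "--check found a high-risk change",
    2: "a snapshot file was missing or invalid",
}


def describe_exit_code(code: int) -> str:
    """Translate an mlenv exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_check(baseline: str, current: str) -> int:
    """Run `mlenv diff <baseline> <current> --check` and return its exit code.

    Requires mlenv to be installed; raises FileNotFoundError otherwise.
    """
    result = subprocess.run(
        ["mlenv", "diff", baseline, current, "--check"],
        check=False,  # a non-zero exit is an expected outcome, not an error
    )
    return result.returncode


# Example use (not executed here):
#   code = run_check("baseline.json", "current.json")
#   print(describe_exit_code(code))
```

Keeping code 2 separate from code 1 lets a pipeline tell "the environment drifted" apart from "the baseline file is missing".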

License

MIT. See LICENSE.
