Utilities for MLflow

title: README
author: Jan-Michael Rye

Synopsis

Provide several commands for working with MLflow output directories, such as merging experiments from separate mlruns directories into a single one or fixing artifact paths after moving an mlruns directory.

Installation

Install the MLflow Extra package from the Python Package Index using any standard Python package manager, for example:

# Uncomment the following 2 lines to create and activate a virtual environment.
# python -m venv venv
# source venv/bin/activate
pip3 install --upgrade mlflow-extra

It can also be installed from source with any standard Python package manager that supports pyproject.toml files. For example, to install it with pip, either locally or in a virtual environment, run the following commands:

git clone https://gitlab.inria.fr/jrye/mlflow-extra
cd mlflow-extra
# Uncomment the following 2 lines to create and activate a virtual environment.
# python -m venv venv
# source venv/bin/activate
pip install --upgrade .

Commands

These commands provide functionality complementary to the mlflow command-line interface.

mlflow-filter_runs

A command-line tool to filter runs in an experiment using either metric threshold values or the total number of runs to keep.

$ mlflow-filter_runs --help
usage: mlflow-filter_runs [-h] [-a] [-c] [-l] [-m METRICS [METRICS ...]] [-n NUMBER] [-t THRESHOLDS [THRESHOLDS ...]] experiment_id

Delete runs from experiment based on thresholds.

positional arguments:
  experiment_id         The MLflow experiment ID (see mlflow experiments list).

options:
  -h, --help            show this help message and exit
  -a, --ascending       Keep the first n runs in ascending order instead of descending.
  -c, --confirm         Confirm the deletion. Without this only a dryrun is performed.
  -l, --list            List metrics and their statistics.
  -m METRICS [METRICS ...], --metrics METRICS [METRICS ...]
                        The metrics by which to filter runs.
  -n NUMBER, --number NUMBER
                        The number of runs to keep.
  -t THRESHOLDS [THRESHOLDS ...], --thresholds THRESHOLDS [THRESHOLDS ...]
                        The threshold values for the selected metrics. In descending order the threshold values are a lower limit. In ascending order they are an upper limit.
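
The selection rule described in the help text can be illustrated with a short Python sketch. This is not the package's implementation: `select_runs` and its data layout are hypothetical, and the sketch assumes that a run must satisfy every threshold, that thresholds are inclusive, and that runs are ordered by the first listed metric.

```python
def select_runs(runs, metrics, thresholds=None, number=None, ascending=False):
    """Return the IDs of the runs to keep (hypothetical helper).

    runs maps a run ID to a dict of {metric name: value}.
    """
    kept = list(runs)
    if thresholds is not None:
        def passes(values):
            for metric, limit in zip(metrics, thresholds):
                value = values[metric]
                # Ascending order: the threshold is an upper limit.
                if ascending and value > limit:
                    return False
                # Descending order: the threshold is a lower limit.
                if not ascending and value < limit:
                    return False
            return True
        kept = [run_id for run_id in kept if passes(runs[run_id])]
    # Assumption: order the runs by the first listed metric.
    kept.sort(key=lambda run_id: runs[run_id][metrics[0]], reverse=not ascending)
    if number is not None:
        kept = kept[:number]
    return kept


runs = {"a": {"acc": 0.9}, "b": {"acc": 0.7}, "c": {"acc": 0.8}}
print(select_runs(runs, ["acc"], thresholds=[0.75]))
print(select_runs(runs, ["acc"], number=1, ascending=True))
```

In this sketch, any run not returned would be a candidate for deletion, which mirrors the dry-run listing that the command produces when --confirm is omitted.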

mlflow-fix_artifacts

A command-line tool for fixing artifact URIs in experiment and run metadata files. It can be used to repair paths after an mlruns directory has been moved, either on the same system or to another one.

$ mlflow-fix_artifacts --help
usage: mlflow-fix_artifacts [-h] [-m MAP] path

Attempt to fix broken artifact URIs in experiments and runs.

positional arguments:
  path               A path to a directory with experiments and runs.

options:
  -h, --help         show this help message and exit
  -m MAP, --map MAP  A path to a YAML file that maps old paths to new paths.
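
The idea of mapping old paths to new paths can be sketched as a prefix substitution in Python. The layout of the YAML map file is not documented here, so the sketch uses a plain dictionary as a stand-in; `remap_artifact_uri` and the example paths are hypothetical and do not reflect the package's actual code.

```python
def remap_artifact_uri(uri, path_map):
    """Replace a known old path prefix in uri with its new path (hypothetical helper)."""
    # Try longer prefixes first so that the most specific mapping wins.
    for old in sorted(path_map, key=len, reverse=True):
        if uri.startswith(old):
            rest = uri[len(old):]
            # Only match whole path components, not partial directory names.
            if rest == "" or rest.startswith("/"):
                return path_map[old] + rest
    # No mapping applies: leave the URI unchanged.
    return uri


# Assumed example mapping; the paths are purely illustrative.
path_map = {"file:///home/alice/mlruns": "file:///data/mlruns"}
print(remap_artifact_uri("file:///home/alice/mlruns/0/abc/artifacts", path_map))
```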

mlflow-fix_experiment_ids

A command-line tool for fixing experiment IDs. If the experiment's directory name is a non-negative integer (nni), the experiment ID is set to that name. Otherwise, the directory is renamed to the experiment's current ID if that ID is an nni, or else to the first available nni in the parent directory. The experiment ID is then updated in the experiment and in all of its runs.

$ mlflow-fix_experiment_ids --help
usage: mlflow-fix_experiment_ids [-h] paths [paths ...]

Attempt to fix experiment IDs so that the experiment's directory and all of its runs match its ID.

positional arguments:
  paths       Experiment directory paths.

options:
  -h, --help  show this help message and exit
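
The renaming rule described for this command can be expressed as a small decision function. The following Python sketch is illustrative only: `is_nni` and `resolve_experiment_id` are hypothetical names, the function only computes the outcome without touching the file system, and it does not handle the case where the chosen name collides with an existing directory.

```python
def is_nni(name):
    """Return True if name is a string of ASCII digits, i.e. a non-negative integer."""
    return name.isascii() and name.isdigit()


def resolve_experiment_id(dir_name, current_id, sibling_names):
    """Return (new directory name, new experiment ID) for one experiment."""
    # Case 1: the directory name is already an nni, so the ID follows it.
    if is_nni(dir_name):
        return dir_name, dir_name
    # Case 2: the current ID is an nni, so the directory is renamed to it.
    if is_nni(current_id):
        return current_id, current_id
    # Case 3: pick the first nni not used by any directory in the parent.
    taken = {int(name) for name in sibling_names if is_nni(name)}
    candidate = 0
    while candidate in taken:
        candidate += 1
    return str(candidate), str(candidate)


print(resolve_experiment_id("exp", "abc", ["0", "1", "exp"]))
```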

mlflow-merge

A command-line tool for merging experiments from multiple mlruns directories into a common directory. It will merge experiments with the same name and update experiment IDs to ensure consistency.

$ mlflow-merge --help
usage: mlflow-merge [-h] target dirs [dirs ...]

Copy experiments into a common MLflow directory. Runs from experiments with the same name will be merged.

positional arguments:
  target      The directory into which to merge the experiments. Default: None
  dirs        The directories with the experiments to merge.

options:
  -h, --help  show this help message and exit
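
Merging by experiment name while keeping IDs consistent amounts to building a per-directory map from old IDs to new IDs. The sketch below shows one way this could work; `plan_merge` is a hypothetical function, and the policy of assigning the lowest unused integer to a new experiment is an assumption rather than the tool's documented behavior.

```python
def plan_merge(target_experiments, source_dirs):
    """Return one {old ID: new ID} map per source directory (hypothetical helper).

    target_experiments maps experiment names to IDs already in the target.
    source_dirs is a list of {experiment name: ID} dicts, one per source.
    """
    name_to_id = dict(target_experiments)
    used = set(name_to_id.values())
    plans = []
    for experiments in source_dirs:
        plan = {}
        for name, old_id in experiments.items():
            if name not in name_to_id:
                # Assumption: a new experiment gets the lowest unused ID.
                new_id = 0
                while new_id in used:
                    new_id += 1
                name_to_id[name] = new_id
                used.add(new_id)
            # Experiments with the same name share one ID in the target.
            plan[old_id] = name_to_id[name]
        plans.append(plan)
    return plans


target = {"Default": 0}
sources = [{"Default": 0, "tuning": 1}, {"tuning": 0, "ablation": 1}]
print(plan_merge(target, sources))
```

Here the "tuning" experiment has ID 1 in the first source and ID 0 in the second, yet both map to the same ID in the target, so their runs end up in a single merged experiment.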

Python Module

See the online documentation for details.

Utility Scripts

Several utility scripts are provided for convenience.

install.sh

install.sh will optionally set up a virtual environment and then install MLflow Extra from source with pip. See install.sh -h for details.

install_and_run.sh

install_and_run.sh will run any of the commands in the MLflow Extra package after ensuring that they are available, installing the package from source if necessary. It is useful for quickly fixing artifact paths when transferring mlruns directories. See install_and_run.sh -h for details.
