Hydra Optuna sweeper with MLflow parent-run logging

Hydra Optuna MLflow Sweeper

Hydra Optuna MLflow Sweeper is a general-purpose Hydra sweeper plugin for hyperparameter optimization with Optuna.

This project is based on the original Hydra Optuna Sweeper plugin by Toshihiko Yanase: https://github.com/toshihikoyanase/hydra-optuna-sweeper/tree/main

What This Package Adds

In addition to Optuna-based sweeping, this package adds:

  • MLflow study and trial hierarchy logging, including parent run propagation to trial jobs.
  • Restart behavior for persistent studies with restart_mode:
    • resume: continue an existing Optuna study and reuse the matching MLflow study run.
    • fresh: create a new timestamped study name while keeping the same storage backend.
  • Support for persistent SQLite Optuna storage (for example, sqlite:///logs/optuna/mlp_search.db).
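The two restart modes differ only in how the study name is resolved before the study is opened. A minimal sketch of that logic (the timestamp suffix format is an assumption for illustration, not the plugin's exact code):

```python
from datetime import datetime


def resolve_study_name(base_name: str, restart_mode: str) -> str:
    """Resolve the Optuna study name for a sweep restart.

    resume keeps the configured name so the existing study is continued;
    fresh appends a timestamp so a new study is created in the same storage.
    The exact suffix format here is illustrative.
    """
    if restart_mode == "fresh":
        return f"{base_name}_{datetime.now():%Y-%m-%d_%H-%M-%S}"
    return base_name
```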

Installation

Using pip:

pip install -e .

Using uv:

uv sync

Quick Usage

Set the sweeper in your Hydra config:

defaults:
  - override /hydra/sweeper: mlflow_optuna
  - override /hydra/launcher: joblib

The sweeper injects these runtime overrides for each trial:

  • +mlflow_parent_run_id
  • +optuna_trial_number

Your training code can use these values to attach nested runs to the study parent run.
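For example, the injected trial number can be folded into a per-trial run name. The naming convention below is hypothetical, not something the sweeper enforces:

```python
def trial_run_name(study_name: str, trial_number: int) -> str:
    # Hypothetical convention: derive a readable MLflow run name for each
    # nested trial run from the injected optuna_trial_number override.
    return f"{study_name}_trial_{trial_number:03d}"
```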

mlflow_study_run_name controls the top-level study run name created by the sweeper. When set, that explicit value is used instead of the resolved study name. In restart_mode: resume, the sweeper reuses the latest matching MLflow study run for the resolved Optuna study name instead of creating a new one.

Parallel trial execution can be controlled through Hydra's joblib launcher by setting the number of worker processes:

hydra:
  launcher:
    n_jobs: 4

You can also force a dedicated file-only logger for each trial subjob:

hydra:
  sweeper:
    optuna_config:
      subjob_job_logging: file_only

This injects hydra/job_logging=file_only into each trial job.
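Conceptually, the injection amounts to appending one more override to each trial job's override list. A sketch of that idea (not the plugin's actual code, which goes through Hydra's override machinery):

```python
def inject_job_logging(overrides: list[str], mode: str = "file_only") -> list[str]:
    """Illustrative sketch: append a hydra/job_logging override to a trial
    job's override list, unless the user already set one explicitly."""
    key = "hydra/job_logging="
    if any(o.startswith(key) for o in overrides):
        return list(overrides)
    return [*overrides, f"{key}{mode}"]
```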

Example logging config file:

The file_only job-logging config implements a simple file logger that writes to the Hydra run output directory:

version: 1
formatters:
  simple:
    format: '[%(asctime)s][%(name)s][%(levelname)s] - %(message)s'
handlers:
  file:
    class: logging.FileHandler
    formatter: simple
    # written to the Hydra run output directory alongside other run artifacts
    filename: ${hydra.runtime.output_dir}/${hydra.job.name}.log
root:
  level: INFO
  handlers: [file]
disable_existing_loggers: false
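Outside of Hydra, the same dictionary can be applied with the standard library's logging.config.dictConfig. In this standalone sketch the filename is hard-coded, since the ${hydra.runtime.output_dir} interpolation only resolves inside a Hydra run:

```python
import logging
import logging.config
import os
import tempfile

# Stand-in for ${hydra.runtime.output_dir}/${hydra.job.name}.log,
# which is only available when running under Hydra.
log_path = os.path.join(tempfile.mkdtemp(), "train.log")

config = {
    "version": 1,
    "formatters": {
        "simple": {"format": "[%(asctime)s][%(name)s][%(levelname)s] - %(message)s"}
    },
    "handlers": {
        "file": {
            "class": "logging.FileHandler",
            "formatter": "simple",
            "filename": log_path,
        }
    },
    "root": {"level": "INFO", "handlers": ["file"]},
    "disable_existing_loggers": False,
}

logging.config.dictConfig(config)
logging.getLogger("demo").info("hello")
```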

Recommended Config Example

Below is a production-style example adapted from a real project config:

# @package _global_
defaults:
  - override /hydra/sweeper: mlflow_optuna

# Metric returned by train() (unused by CV)
optimized_metric: "val/loss"

# Vary the CV split seed across trials
split_seed: ${hydra:job.num}

log_system_metrics: false
save_checkpoints: false

hydra:
  mode: "MULTIRUN"
  sweeper:
    optuna_config:
      # Persistent study DB. Re-running the same command with resume
      # continues the same study.
      storage: sqlite:///logs/optuna/mlp_search.db
      study_name: mlp_search
      load_if_exists: true

      # resume: keep same study_name
      # fresh: append timestamp suffix to create a new study in same DB
      restart_mode: resume

      # Top-level MLflow run name (defaults to study_name when null)
      mlflow_study_run_name: null
      direction: minimize
      n_trials: 50

      sampler:
        _target_: optuna.samplers.TPESampler
        seed: 42

      params:
        # Architecture
        model.model.hidden_size: choice(12, 16, 20, 24, 28, 32)
        model.model.num_layers: choice(2, 3, 4, 5)
        model.model.activation: choice("relu", "softplus", "silu")
        model.model.dropout: choice(0.0, 0.1, 0.2, 0.3, 0.4, 0.5)

        # Optimization
        model.weight_decay: choice(0, 1e-5, 1e-4, 1e-3)

        # Batch size affects throughput and generalization
        datamodule.batch_size: choice(1024, 2048, 4096)

  # Keep sweep directory simple to avoid unresolved interpolation issues
  sweep:
    dir: logs/multirun/${now:%Y-%m-%d_%H-%M-%S}
    subdir: ${hydra.job.num}
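The values under params use Hydra's sweep syntax; each choice(...) expression defines a categorical search space for Optuna. The toy parser below only illustrates what such an expression denotes (the real parsing is done by Hydra's override grammar, not by code like this):

```python
import ast


def parse_choice(expr: str) -> list:
    """Illustrative parser for a `choice(...)` sweep expression.

    Returns the list of categorical values the sweeper would sample from.
    """
    expr = expr.strip()
    if not (expr.startswith("choice(") and expr.endswith(")")):
        raise ValueError(f"not a choice expression: {expr!r}")
    # Wrap in a tuple literal so single-element choices also parse.
    return list(ast.literal_eval(f"({expr[len('choice('):-1]},)"))
```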

Minimal Example App

A minimal runnable example is provided in example/.

python example/quadratic.py -m 'x=interval(-5.0, 5.0)' 'y=interval(0.0, 10.0)'
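Here interval(-5.0, 5.0) sweeps x as a float in [-5, 5]. Assuming the example minimizes a simple quadratic in x and y (the actual objective in example/quadratic.py may differ), its core might look like:

```python
def quadratic(x: float, y: float) -> float:
    # Hypothetical objective for illustration only; the sweeper would
    # minimize the value returned from the Hydra-decorated main function.
    return x**2 + y**2
```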

Train-Side MLflow Run Setup

In trial jobs (for example train.py), consume mlflow_parent_run_id injected by the sweeper to attach each training run under the study run:

from omegaconf import DictConfig
import mlflow


def _start_mlflow_run(cfg: DictConfig):
    """Start an MLflow run, nesting it under the study parent run when present."""
    logger_cfg = cfg.trainer.logger
    tracking_uri = logger_cfg.tracking_uri
    experiment_name = cfg.experiment_path
    run_name = cfg.get("run_name")
    parent_run_id = cfg.get("mlflow_parent_run_id")

    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment(experiment_name)

    start_run_kwargs = {"run_name": run_name}
    if parent_run_id:
        start_run_kwargs["parent_run_id"] = parent_run_id
    return mlflow.start_run(**start_run_kwargs)

With restart_mode: resume, rerunning the same sweep command with the same study_name and storage backend continues the existing Optuna study and reuses the matching MLflow study run.

Contributing

We welcome contributions! To get started:

  1. Set up the development environment:

    uv sync
    source .venv/bin/activate
    
  2. Install pre-commit hooks:

    uv run pre-commit install
    
  3. Make your changes and run linting/tests:

    uv run pre-commit run --all-files
    uv run pytest
    
  4. Submit a pull request with a clear description of your changes.

Please ensure your code follows the project's style guidelines (enforced by ruff and pre-commit).

License

This project is licensed under the MIT License. See LICENSE for details.
