Hydra launcher plugin that runs jobs on Modal (modal.com), with config-driven sandbox customisation.

hydra-modal-launcher

A Hydra launcher plugin that ships multirun jobs to Modal. Inspired by hydra-submitit-launcher and hydra-ray-launcher.

Each Hydra job runs as one invocation of a Modal function. The image and function spec (GPU, CPU, memory, secrets, volumes, timeout, parallelism) are configured from YAML. An image_builder escape hatch lets you produce a fully custom modal.Image in Python when YAML isn't enough.

Install
Quick start
Common recipes
- Path resolution
Configuration reference
Per-job outputs
How the user's code reaches the container
How it works
Project structure
Gotchas
Limitations
Troubleshooting
Testing

Install

pip install hydra-modal-launcher
# or, from a checkout:
pip install -e ".[dev]"

Requires python>=3.10, hydra-core>=1.3, and modal>=1.4. The host that runs the sweep also needs Modal credentials configured, for example by running modal token new once.

Quick start

# my_app.py
import hydra
from omegaconf import DictConfig

@hydra.main(version_base=None, config_path="conf", config_name="config")
def main(cfg: DictConfig) -> float:
    return float(cfg.lr) * float(cfg.epochs)

if __name__ == "__main__":
    main()

# conf/config.yaml
defaults:
  - _self_
  - override hydra/launcher: modal

lr: 0.01
epochs: 10

hydra:
  launcher:
    parallelism: 3        # 1 = serial, N = cap, -1 = unlimited
    image:
      pip_packages: [hydra-core, omegaconf]
    function:
      cpu: 1
      memory: 1024
      timeout: 300

Launch a sweep:

# Dry-run: log resolved spec without calling Modal
uv run my_app.py --multirun hydra.launcher.dry_run=true lr=0.001,0.01,0.1

# Real run (Modal credentials in env)
uv run my_app.py --multirun lr=0.001,0.01,0.1
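
For orientation, the comma-separated override above expands into one job per value, and several swept keys combine as a Cartesian product (the behaviour of Hydra's default basic sweeper). The sketch below only illustrates that expansion: expand_overrides is a hypothetical helper, not part of Hydra or this plugin, and it ignores Hydra's richer sweep syntax such as range() and choice().

```python
from itertools import product


def expand_overrides(overrides: list[str]) -> list[list[str]]:
    """Expand comma-separated sweep overrides into one override list per job."""
    per_key = []
    for item in overrides:
        key, _, values = item.partition("=")
        # Each comma-separated value becomes its own "key=value" candidate.
        per_key.append([f"{key}={value}" for value in values.split(",")])
    # Cartesian product across keys: one combination per job.
    return [list(combo) for combo in product(*per_key)]


jobs = expand_overrides(["lr=0.001,0.01,0.1", "epochs=10"])
for job_num, job in enumerate(jobs):
    print(job_num, job)
```

Each resulting override list corresponds to one invocation of the Modal function.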

Common recipes

These snippets go under hydra.launcher in your config; the same keys can also be set as command-line overrides on a --multirun invocation.

GPU job

hydra:
  launcher:
    parallelism: 4
    function:
      gpu: "L40S"             # or "A100", "H100", "L40S:2" for 2x
      memory: 16384
      timeout: 3600
    image:
      pip_packages: [torch]

Use a Modal Secret

hydra:
  launcher:
    function:
      secrets:
        - my-wandb-key        # resolved via modal.Secret.from_name("my-wandb-key")

Mount a Modal Volume at the sweep dir

hydra:
  launcher:
    function:
      volumes:
        ${hydra.sweep.dir}: my-sweeps    # mount-path → volume-name

Custom image builder (full Python control)

# custom_image.py at the project root (or anywhere on your sys.path)
import modal

def build_image(image_cfg) -> modal.Image:
    return (
        modal.Image.from_registry("ghcr.io/myorg/base:latest")
        .pip_install("torch==2.5.0", "lightning")
        .run_commands("git clone https://github.com/myorg/data.git /data")
    )

hydra:
  launcher:
    image:
      image_builder: custom_image.build_image    # every other image.* field is ignored

Install deps from a requirements file or pyproject

For projects with more than a handful of pins, point the launcher at your existing dependency manifest instead of duplicating entries in pip_packages. pip_requirements and pip_pyproject can be used together, and both are additive with pip_packages (which still wins on a name collision with the auto-pinned runtime deps).

hydra:
  launcher:
    image:
      pip_requirements: requirements.txt          # passed to Image.pip_install_from_requirements

hydra:
  launcher:
    image:
      pip_pyproject: pyproject.toml               # passed to Image.pip_install_from_pyproject
      pip_pyproject_extras: [training]            # → optional_dependencies=[...]
      pip_packages: [extra-debug-tool]            # still merged on top

Install layers are emitted heavy-first (pip_pyproject → pip_requirements → pip_packages), so editing pip_packages between runs doesn't invalidate the large transitive-dep layer.
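
The heavy-first ordering can be pictured as a small planning step. This is a sketch under assumptions, not the plugin's code: plan_install_layers is a hypothetical helper, the config is modelled as a plain dict, and each step is just the name of the modal.Image method mentioned above paired with its argument.

```python
def plan_install_layers(image_cfg: dict) -> list[tuple[str, object]]:
    """Return install steps in heavy-first order; unset fields are skipped."""
    steps: list[tuple[str, object]] = []
    if image_cfg.get("pip_pyproject"):
        steps.append(("pip_install_from_pyproject", image_cfg["pip_pyproject"]))
    if image_cfg.get("pip_requirements"):
        steps.append(("pip_install_from_requirements", image_cfg["pip_requirements"]))
    if image_cfg.get("pip_packages"):
        # Sorted so that reordering the YAML list does not change the layer.
        steps.append(("pip_install", sorted(image_cfg["pip_packages"])))
    return steps


plan = plan_install_layers(
    {
        "pip_packages": ["torch", "extra-debug-tool"],
        "pip_pyproject": "pyproject.toml",
    }
)
for method, argument in plan:
    print(method, argument)
```

Because the frequently edited pip_packages step comes last, a change to it leaves the earlier, larger layers eligible for Modal's build cache.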

Path resolution

pip_pyproject and pip_requirements accept both absolute and relative paths. A path is resolved in this order:

1. Absolute paths are used as-is.
2. Relative paths that exist relative to CWD are passed through unchanged (Modal's default behaviour).
3. Otherwise, the launcher walks up from CWD looking for the nearest directory containing pyproject.toml / setup.py / setup.cfg / .git; if the file exists there, that absolute path is used and the resolution is logged.
4. If there is no match anywhere, the path is handed to Modal unchanged so the resulting FileNotFoundError surfaces at build time.

This means you can invoke uv run scripts/train.py from any subdirectory and pip_pyproject: pyproject.toml will still find the project's root pyproject.toml, using the same project-root detection the launcher already applies to source mounting.
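
The resolution order described in this section can be sketched as follows. The function names find_project_root and resolve_manifest_path are hypothetical, and the plugin's real implementation may differ in details such as logging.

```python
from pathlib import Path
from typing import Optional

ROOT_MARKERS = ("pyproject.toml", "setup.py", "setup.cfg", ".git")


def find_project_root(start: Path) -> Optional[Path]:
    """Walk up from start and return the nearest directory holding a marker."""
    for directory in (start, *start.parents):
        if any((directory / marker).exists() for marker in ROOT_MARKERS):
            return directory
    return None


def resolve_manifest_path(path: str, cwd: Path) -> str:
    """Resolve a pip_requirements / pip_pyproject path using the order above."""
    candidate = Path(path)
    if candidate.is_absolute():
        return path  # absolute paths are used as-is
    if (cwd / candidate).exists():
        return path  # exists relative to CWD: passed through unchanged
    root = find_project_root(cwd)
    if root is not None and (root / candidate).exists():
        return str(root / candidate)  # found at the nearest project root
    return path  # no match: left unchanged so the error surfaces at build time
```

For example, resolve_manifest_path("pyproject.toml", Path.cwd()) called from a scripts/ subdirectory would return the absolute path of the root pyproject.toml, provided no pyproject.toml exists in scripts/ itself.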

Pin extra deps without losing the auto-pinned runtime deps

The plugin auto-adds hydra-core==<host_version> and cloudpickle==<host_version> to every built image. Your pip_packages entries are merged with these; on a name collision, your pin wins.

hydra:
  launcher:
    image:
      pip_packages:
        - "torch==2.5.0"
        - "transformers>=4.50"
        - "hydra-core>=1.3.0,<2"   # overrides the auto-pin
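
A minimal sketch of the merge rule, assuming collisions are detected by normalised distribution name. The helpers package_name and merge_pip_packages are hypothetical, and the auto-pin versions used in the example are placeholders rather than versions the plugin actually selects.

```python
import re


def package_name(requirement: str) -> str:
    """Extract a normalised distribution name from a requirement string."""
    # Cut at the first extras bracket, version operator, marker, or URL sign.
    name = re.split(r"[\s\[<>=!~;@]", requirement.strip(), maxsplit=1)[0]
    # Treat runs of "-", "_" and "." as equivalent, case-insensitively.
    return re.sub(r"[-_.]+", "-", name).lower()


def merge_pip_packages(user: list[str], auto_pins: list[str]) -> list[str]:
    """Merge user pins over auto pins; the user's entry wins on collision."""
    merged = {package_name(req): req for req in auto_pins}
    merged.update({package_name(req): req for req in user})
    # Sorted for a stable install layer regardless of input order.
    return sorted(merged.values())


merged = merge_pip_packages(
    user=["torch==2.5.0", "hydra-core>=1.3.0,<2"],
    auto_pins=["hydra-core==1.3.2", "cloudpickle==3.0.0"],  # placeholder versions
)
print(merged)
```

Here the user's hydra-core range replaces the auto-pinned entry, while the cloudpickle pin is kept because nothing overrides it.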

Configuration reference

`hydra.launcher.image` (`ModalImageConf`)

| Field | Default | Notes |
| --- | --- | --- |
| `python_version` | `null` | If unset, matches the host's `major.minor` at launch time. Cross-version cloudpickle of `__main__` functions can SIGSEGV the container; keep these aligned. |
| `base` | `"debian_slim"` | or `"from_registry"` |
| `base_image` | `null` | required when `base="from_registry"` |
| `pip_packages` | `[]` | sorted before install for cache stability; merged with auto-pinned `hydra-core` + `cloudpickle` |
| `pip_requirements` | `null` | path to a requirements file; passed to `Image.pip_install_from_requirements`. Relative paths not found under CWD are resolved against the nearest project root (see Path resolution). |
| `pip_pyproject` | `null` | path to a `pyproject.toml`; passed to `Image.pip_install_from_pyproject`. Same resolution rules as `pip_requirements`. |
| `pip_pyproject_extras` | `[]` | extras keys for `pip_pyproject`, forwarded as `optional_dependencies=[...]` |
| `apt_packages` | `[]` | |
| `run_commands` | `[]` | extra `RUN` lines |
| `env` | `{}` | env vars baked into the image |
| `local_python_modules` | `[]` | importable module names; passed to `Image.add_local_python_source` |
| `local_dirs` | `[]` | list of `{local_path, remote_path, ignore}` for `Image.add_local_dir` |
| `image_builder` | `null` | dotted path to `(image_cfg) -> modal.Image`. Overrides every other field in `image`. |

`hydra.launcher.function` (`ModalFunctionConf`)

| Field | Default | Notes |
| --- | --- | --- |
| `gpu` | `null` | e.g. `"L40S"`, `"A100:2"` |
| `cpu` | `null` | float, fractional cores |
| `memory` | `null` | MB |
| `timeout` | `3600` | seconds |
| `secrets` | `[]` | names resolved via `modal.Secret.from_name` |
| `volumes` | `{}` | `mount_path -> volume_name`, resolved via `modal.Volume.from_name` |
| `retries` | `0` | |
| `region` | `null` | |
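
To make the table concrete, the sketch below turns a function config into keyword arguments, resolving secret and volume names through injected callables so that it runs without Modal installed. build_function_kwargs is a hypothetical helper; how the plugin actually forwards these fields is an assumption here.

```python
from typing import Any, Callable


def build_function_kwargs(
    fn_cfg: dict[str, Any],
    secret_from_name: Callable[[str], Any],
    volume_from_name: Callable[[str], Any],
) -> dict[str, Any]:
    """Translate the function config into keyword arguments."""
    kwargs: dict[str, Any] = {}
    # Optional fields are only forwarded when set (null means "Modal default").
    for key in ("gpu", "cpu", "memory", "region"):
        if fn_cfg.get(key) is not None:
            kwargs[key] = fn_cfg[key]
    kwargs["timeout"] = fn_cfg.get("timeout", 3600)
    kwargs["retries"] = fn_cfg.get("retries", 0)
    # Names become objects through the supplied resolvers.
    kwargs["secrets"] = [secret_from_name(name) for name in fn_cfg.get("secrets", [])]
    kwargs["volumes"] = {
        mount_path: volume_from_name(name)
        for mount_path, name in fn_cfg.get("volumes", {}).items()
    }
    return kwargs


# Stand-in resolvers; with Modal installed these would be
# modal.Secret.from_name and modal.Volume.from_name.
kwargs = build_function_kwargs(
    {"gpu": "L40S:2", "secrets": ["my-wandb-key"], "volumes": {"/sweeps": "my-sweeps"}},
    secret_from_name=lambda name: ("secret", name),
    volume_from_name=lambda name: ("volume", name),
)
print(kwargs)
```

Injecting the resolvers keeps the translation step a pure function that can be unit-tested without a Modal account.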

`hydra.launcher` (top-level)

| Field | Default | Notes |
| --- | --- | --- |
| `app_name` | `"hydra-modal-launcher"` | passed to `modal.App(...)` |
| `parallelism` | `-1` | `1` = serial, `N` caps concurrent containers via `max_containers=N`, `-1` = unbounded |
| `dry_run` | `false` | log resolved spec and skip `app.run()` |
| `env_passthrough` | `[]` | Host env vars to snapshot at launch time and inject into every worker container. Shipped as an ephemeral `modal.Secret.from_dict`, so values are present before user code starts. Use for per-launch runtime values (e.g. a tracking run ID set by a parent-side callback) that can't live in a static named secret. Missing keys log a warning and are skipped. |
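
The env_passthrough behaviour amounts to a snapshot of selected host variables. A minimal sketch, assuming a hypothetical helper named snapshot_env; the dict it returns is the kind of value that would then be handed to modal.Secret.from_dict.

```python
import logging
import os
from typing import Mapping, Optional

log = logging.getLogger(__name__)


def snapshot_env(
    keys: list[str], environ: Optional[Mapping[str, str]] = None
) -> dict[str, str]:
    """Copy the requested host env vars; warn about and skip missing ones."""
    environ = os.environ if environ is None else environ
    snapshot: dict[str, str] = {}
    for key in keys:
        if key in environ:
            snapshot[key] = environ[key]
        else:
            log.warning("env_passthrough: %s is not set on the host; skipping", key)
    return snapshot


# RUN_ID and MISSING_KEY are illustrative variable names.
values = snapshot_env(["RUN_ID", "MISSING_KEY"], environ={"RUN_ID": "abc123"})
print(values)
```

Taking the snapshot at launch time means later changes to the host environment do not affect jobs that are already queued.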

Per-job outputs

Jobs run remotely on ephemeral Modal containers; Hydra's per-job working directory written by run_job lives on that container, not on your laptop. The launcher:

- Always writes minimal local .hydra/{config,hydra,overrides}.yaml stubs into ${hydra.sweep.dir}/<job_num>/ from the parent process so downstream tooling and humans see the expected layout.
- Optionally mounts a Modal Volume on the remote container via hydra.launcher.function.volumes. If you want real artifact persistence, point a volume at the sweep dir and pull it down after the run.
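
The local stub layout can be illustrated with a short sketch. write_job_stub is a hypothetical helper; it writes real content only for overrides.yaml and leaves the other two files as empty placeholders, whereas the plugin writes the resolved configs.

```python
from pathlib import Path


def write_job_stub(sweep_dir: Path, job_num: int, overrides: list[str]) -> Path:
    """Create <sweep_dir>/<job_num>/.hydra/ with the three expected files."""
    hydra_dir = sweep_dir / str(job_num) / ".hydra"
    hydra_dir.mkdir(parents=True, exist_ok=True)
    # overrides.yaml is a plain YAML list of the job's override strings.
    (hydra_dir / "overrides.yaml").write_text(
        "".join(f"- {override}\n" for override in overrides)
    )
    # Placeholders only: a real launcher would dump the resolved configs here.
    for name in ("config.yaml", "hydra.yaml"):
        (hydra_dir / name).touch()
    return hydra_dir
```

Calling write_job_stub(Path("multirun/example"), 0, ["lr=0.001"]) would create multirun/example/0/.hydra/ containing config.yaml, hydra.yaml, and overrides.yaml; the sweep directory name is illustrative.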

Each job's Python return value is captured in JobReturn._return_value. Failures are mapped to JobReturn(status=FAILED, _return_value=<exception>).
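
As a rough picture of that mapping, the sketch below classifies a list of per-job results in which failed jobs appear as exception objects. JobOutcome and Status are hypothetical stand-ins for Hydra's JobReturn and JobStatus, used so the example runs without Hydra installed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class Status(Enum):
    COMPLETED = 1
    FAILED = 2


@dataclass
class JobOutcome:
    """Stand-in for hydra.core.utils.JobReturn (status plus return value)."""

    status: Status
    value: Any


def to_outcomes(results: list[Any]) -> list[JobOutcome]:
    """Mark a result as FAILED when it is an exception, COMPLETED otherwise."""
    return [
        JobOutcome(
            Status.FAILED if isinstance(result, BaseException) else Status.COMPLETED,
            result,
        )
        for result in results
    ]


outcomes = to_outcomes([0.1, ValueError("bad lr"), 1.0])
for job_num, outcome in enumerate(outcomes):
    print(job_num, outcome.status.name, outcome.value)
```

Keeping the exception object as the value lets the sweeper or a callback inspect why a particular job failed without aborting the rest of the sweep.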

How the user's code reaches the container

Modal does not auto-mount your CWD. The launcher inspects your task_function's module and:

Importable package (__module__ == "myproject.scripts.train"): adds the top-level package via Image.add_local_python_source("myproject").
__main__ (e.g. python scripts/train.py): walks up from the script's directory looking for pyproject.toml / setup.py / setup.cfg / .git. If found, mounts the whole project root via Image.add_local_dir(<root>, "/root") with default ignores (.venv/, .git/, __pycache__/, node_modules/, multirun/, outputs/, etc.). This handles the common research-repo layout where scripts/ is a sibling of the package:
```
myproject/
├── pyproject.toml        ← project root marker
├── myproject/            ← package
│   └── lib.py
├── scripts/
│   └── train.py          ← @hydra.main entrypoint
└── conf/
    └── config.yaml
```
__main__ with no project markers anywhere up-tree: mounts only the script's directory and warns that sibling packages will be unreachable.

Override either path by setting image.local_python_modules, image.local_dirs (with custom ignore globs per mount), or by taking full control with image.image_builder.
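
The three cases above can be sketched as a single decision function. plan_source_mount is a hypothetical name, and the sketch returns only which modal.Image method would be used and with what argument; default ignore patterns and the remote path are left out.

```python
from pathlib import Path
from typing import Optional

ROOT_MARKERS = ("pyproject.toml", "setup.py", "setup.cfg", ".git")


def plan_source_mount(
    module_name: str, script_path: Optional[Path] = None
) -> tuple[str, str]:
    """Decide how the task function's code would reach the container."""
    if module_name != "__main__":
        # Importable package: ship the top-level package by name.
        return ("add_local_python_source", module_name.split(".")[0])
    if script_path is None:
        raise ValueError("script_path is required when the module is __main__")
    script_dir = script_path.resolve().parent
    # Walk up from the script's directory to the nearest project marker.
    for directory in (script_dir, *script_dir.parents):
        if any((directory / marker).exists() for marker in ROOT_MARKERS):
            return ("add_local_dir", str(directory))
    # No markers anywhere: only the script's directory is mounted.
    return ("add_local_dir", str(script_dir))


print(plan_source_mount("myproject.scripts.train"))
```

For the layout shown above, running scripts/train.py directly would select the myproject/ root because that is where pyproject.toml lives.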

How it works

parent process                              modal cloud
──────────────                              ───────────
@hydra.main(main)
   │
   ▼
ModalLauncher.launch(overrides, idx0)
   │
   │  1. configure_log + Singleton.get_state()
   │  2. _resolve_sweep_configs(overrides)        ┐  done on parent —
   │  3. _write_local_job_stubs(sweep_configs)    │  the user's conf/ dir
   │  4. cloudpickle.dumps(launcher)              │  is local-only and
   │  5. build_modal_app(launcher_cfg)            ┘  doesn't exist remotely
   │
   ▼
with modal.enable_output(), app.run():
    fn.starmap(payloads, return_exceptions=True) ────►  spawns N containers
                                                              │
                                                              ▼
                                            _worker.modal_entry(sweep_config, num, state, launcher_pickled)
                                                              │
                                                              │  cloudpickle.loads(launcher_pickled)
                                                              │  Singleton.set_state + setup_globals
                                                              │  HydraConfig.instance().set_config(sweep_config)
                                                              │  open_dict: hydra.job.id = modal call id
                                                              │  run_job(task_function, sweep_config, ...)
                                                              │
                                                              ▼
                                                         returns JobReturn
   │
   ▼
[JobReturn, JobReturn, ...] ────► back to Hydra's sweeper

Sweep configs are pre-resolved on the parent so the worker never needs to read the local conf/ dir from inside a Modal container. The cloudpickled launcher carries task_function and hydra_context.callbacks; the singleton snapshot is shipped separately and restored on the worker so HydraConfig.instance() resolves correctly.
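
The payload shape in the diagram can be sketched with the standard library. This is an illustration under assumptions: stdlib pickle stands in for cloudpickle, the launcher is modelled as a plain dict, math.prod stands in for the user's task function, and build_payloads / worker_entry are hypothetical names.

```python
import math
import pickle
from typing import Any


def build_payloads(
    sweep_configs: list[dict], initial_job_idx: int, state: dict, launcher: Any
) -> list[tuple]:
    """Serialize the launcher once and pair it with each pre-resolved config."""
    launcher_pickled = pickle.dumps(launcher)  # the plugin uses cloudpickle here
    return [
        (cfg, initial_job_idx + offset, state, launcher_pickled)
        for offset, cfg in enumerate(sweep_configs)
    ]


def worker_entry(sweep_config: dict, job_num: int, state: dict, launcher_pickled: bytes):
    """Restore the launcher and run its task on the already-resolved config."""
    launcher = pickle.loads(launcher_pickled)
    return job_num, launcher["task_function"](sweep_config.values())


payloads = build_payloads(
    [{"lr": 0.5, "epochs": 4}, {"lr": 0.25, "epochs": 4}],
    initial_job_idx=0,
    state={},
    launcher={"task_function": math.prod},
)
# Locally this is a plain loop; on Modal each tuple would be one starmap input.
results = [worker_entry(*payload) for payload in payloads]
print(results)
```

Serializing once and reusing the same bytes for every job keeps the per-job payload small, since only the config and job number vary.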

Project structure

hydra-modal-launcher/
├── hydra_plugins/hydra_modal_launcher/   # the plugin (PEP 420 namespace — no __init__ on hydra_plugins/)
│   ├── config.py                          # dataclasses + ConfigStore registration
│   ├── modal_launcher.py                  # ModalLauncher(Launcher)
│   ├── _modal_app.py                      # pure + impure builders for modal.App / Image / Function
│   └── _worker.py                         # ships to the Modal container
├── example/                               # Layout-A demo (entry: `uv run example/my_app.py`)
├── tests/                                 # pure unit tests, no Modal account required
├── AGENTS.md                              # ← read this if you're an AI agent
└── CHANGELOG.md

For deeper conventions and invariants — what's pure vs impure, where Modal can be imported, how to add a config field — see AGENTS.md.

Gotchas

Host/container Python version must match. Cloudpickle ships __main__-scoped functions by value (bytecode + cells); deserializing across Python minor versions can segfault the container. The default python_version=null auto-detects the host's major.minor and uses that, so you generally don't need to set it.
hydra-core and cloudpickle are added to every built image automatically, pinned to your host's installed versions. User-supplied version pins for the same package win on name collision.
Modal logs stream to your terminal during a sweep via modal.enable_output(). Local Hydra logs and remote container stdout are interleaved.

Limitations

No checkpoint / preemption support — Modal has no equivalent of SLURM's signal protocol.
No automatic sync of remote working dirs back to your laptop. Use volumes if you need it.
Ephemeral apps only (with app.run():). Pre-deployed apps via Function.from_name are out of scope.
Image is rebuilt once per launch() call. Modal caches build layers so subsequent runs are fast.

Troubleshooting

`Runner segmentation fault (SIGSEGV)` on container startup

Host and container Python versions don't match. Cloudpickle ships __main__ functions by value (bytecode); deserializing across minor versions crashes the container. Verify hydra.launcher.image.python_version is null (the default — it auto-matches your host) or set explicitly to your host's major.minor.
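
A quick way to reason about the check, as a sketch: resolve the configured value the way the default is described (null means the host's major.minor) and compare it with the running interpreter. Both helper names are hypothetical.

```python
import sys
from typing import Optional


def host_major_minor() -> str:
    """Return the running interpreter's version as 'major.minor'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


def resolve_python_version(configured: Optional[str] = None) -> str:
    """None (YAML null) falls back to the host's major.minor."""
    return configured if configured is not None else host_major_minor()


def matches_host(configured: Optional[str]) -> bool:
    """Compare only major.minor, so a configured '3.11.4' matches a 3.11 host."""
    resolved = resolve_python_version(configured)
    return ".".join(resolved.split(".")[:2]) == host_major_minor()


print(resolve_python_version(None), matches_host(None))
```

Running a check like this on the host before launching a sweep is one way to catch an explicit python_version that has drifted from the local interpreter.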

`ModuleNotFoundError: No module named 'mypkg'` on the remote

The auto-mount didn't pick up your package. If you ran the script directly (python scripts/train.py), the launcher looks for pyproject.toml / setup.py / setup.cfg / .git to mount the whole project root. If none exist, only the script's directory is mounted. Fix by either:

adding the missing markers (an empty pyproject.toml is fine), or
explicitly setting image.local_python_modules: ["mypkg"] or image.local_dirs: in your config.

`Primary config directory not found` on the remote

You're on a stale build of the plugin. v0.1.0+ pre-resolves sweep configs on the parent — the worker should never call load_sweep_config. Upgrade.

`Input aborted - exceeded limit of 8 retries`

Container is crashing during input deserialization. Usual causes:

Python-version mismatch (SIGSEGV — see above).
Out-of-memory at import time. Bump function.memory. Importing hydra-core and omegaconf alone takes roughly 150 MB, so 256 MB is too tight.
Cloudpickle version drift. Should be auto-pinned to your host — verify with hydra.launcher.dry_run=true and check cloudpickle==X.Y.Z is in pip_packages.

Modal container logs aren't showing up

You're probably running an old build. v0.1.0+ wraps the sweep in with modal.enable_output(). Upgrade.

Dry-run for everything

Add hydra.launcher.dry_run=true to any sweep. The launcher logs the resolved image spec + function kwargs and returns without calling Modal. Useful for validating config and image deps before paying for a build.

Testing

uv sync --extra dev
uv run pytest tests/

uv.lock is committed, so the sync is reproducible. Unit tests don't require a Modal account; the orchestration is written as pure functions where possible.

Live tests against real Modal are marked @pytest.mark.live and skipped by default. To run them (requires Modal credentials configured locally):

uv run pytest tests/ --live

Project details

Maintainers

joncarter1

Release history

0.3.0 (this version): May 13, 2026

0.2.1: May 13, 2026

Download files

Source Distribution

hydra_modal_launcher-0.3.0.tar.gz (22.5 kB)

Uploaded May 13, 2026 Source

Built Distribution

hydra_modal_launcher-0.3.0-py3-none-any.whl (20.1 kB)

Uploaded May 13, 2026 Python 3

File details

Details for the file hydra_modal_launcher-0.3.0.tar.gz.

File metadata

Download URL: hydra_modal_launcher-0.3.0.tar.gz
Upload date: May 13, 2026
Size: 22.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for hydra_modal_launcher-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`4a0a0852b676246b3131257071e5684bc86197d305af0473e37fb87dbdd2a23a`
MD5	`e10906aec66de3d2c849ab71a2324276`
BLAKE2b-256	`a6d6cb03e41d13d629d125126e8ce3ef54872682609fa9ad60abcdef56feada4`

Provenance

The following attestation bundles were made for hydra_modal_launcher-0.3.0.tar.gz:

Publisher: publish.yml on joncarter1/hydra-modal-launcher

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hydra_modal_launcher-0.3.0.tar.gz
- Subject digest: 4a0a0852b676246b3131257071e5684bc86197d305af0473e37fb87dbdd2a23a
- Sigstore transparency entry: 1523730922
- Sigstore integration time: May 13, 2026
Source repository:
- Permalink: joncarter1/hydra-modal-launcher@dbe1ffb3c5922d7468727770e57df1f1320d13f0
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/joncarter1
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@dbe1ffb3c5922d7468727770e57df1f1320d13f0
- Trigger Event: push

File details

Details for the file hydra_modal_launcher-0.3.0-py3-none-any.whl.

File metadata

Download URL: hydra_modal_launcher-0.3.0-py3-none-any.whl
Upload date: May 13, 2026
Size: 20.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for hydra_modal_launcher-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a302bf3828b994e80e2aefb37ca47217fd13c7c19a9c417b1860ce1836111b78`
MD5	`52433f1e063f2aff1d3f06e5784195a9`
BLAKE2b-256	`54a484c78b14033d1f823de3a9587be53ae7d8e7a18fd94db3145083d29943ec`

Provenance

The following attestation bundles were made for hydra_modal_launcher-0.3.0-py3-none-any.whl:

Publisher: publish.yml on joncarter1/hydra-modal-launcher

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hydra_modal_launcher-0.3.0-py3-none-any.whl
- Subject digest: a302bf3828b994e80e2aefb37ca47217fd13c7c19a9c417b1860ce1836111b78
- Sigstore transparency entry: 1523730929
- Sigstore integration time: May 13, 2026
Source repository:
- Permalink: joncarter1/hydra-modal-launcher@dbe1ffb3c5922d7468727770e57df1f1320d13f0
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/joncarter1
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@dbe1ffb3c5922d7468727770e57df1f1320d13f0
- Trigger Event: push
