dagster-multihost-launcher
A composite Dagster run launcher that routes runs to multiple Docker daemons across different hosts, or falls back to the DefaultRunLauncher for non-Docker code locations.
Architecture
```
┌─────────────────────────────────────────────────────────┐
│ Host A (control plane)                                  │
│                                                         │
│   ┌───────────┐   ┌────────┐   ┌──────────┐             │
│   │ webserver │   │ daemon │   │ postgres │             │
│   └───────────┘   └────────┘   └──────────┘             │
│                       │             ▲                   │
│  ┌──────────────────┐ │             │ (event storage)   │
│  │ admin code loc   │ │             │                   │
│  │ (cleanup/status) │ │             │                   │
│  └──────────────────┘ │             │                   │
└───────────────────────┼─────────────┼───────────────────┘
                        │             │
          ┌─────────────┼─────────────┼─────────────┐
          │             ▼             │             │
          │ Host B (Docker)           │             │
          │                           │             │
          │  ┌──────────────────┐     │             │
          │  │ code loc gRPC    │     │             │
          │  │ (etl_pipelines)  │     │             │
          │  └──────────────────┘     │             │
          │                           │             │
          │  ┌──────────────────┐     │             │
          │  │ run container ◄──┼─────┘             │
          │  │ (created by      │                   │
          │  │  daemon via TCP) │                   │
          │  └──────────────────┘                   │
          │                                         │
          └─────────────────────────────────────────┘

          ┌─────────────────────────────────────────┐
          │ Host D (non-Docker)                     │
          │                                         │
          │  ┌──────────────────┐                   │
          │  │ code loc gRPC    │  (bare process)   │
          │  │ (reporting)      │                   │
          │  └──────────────────┘                   │
          │            ▲                            │
          │            │ DefaultRunLauncher sends   │
          │            │ start_run gRPC; run        │
          │            │ executes here in the       │
          │            │ gRPC server process        │
          └─────────────────────────────────────────┘
```
How It Works
The launcher inspects each run's code location name and routes it:
- Code locations listed under `docker_hosts` → creates a Docker container on the mapped remote daemon via the Docker TCP API
- All other code locations → delegates to Dagster's `DefaultRunLauncher`, which sends a `start_run` gRPC call to the code location's gRPC server; the run executes as a subprocess on whichever host runs that gRPC server
This means you can mix Docker-based and non-Docker code locations freely. Any location not explicitly mapped to a Docker host automatically falls back to the DefaultRunLauncher.
Installation
```shell
pip install dagster-multihost-launcher
```
The package must be installed in the Docker images for the webserver and daemon containers on Host A.
Configuration
dagster.yaml
```yaml
run_launcher:
  module: dagster_multihost_launcher
  class: MultiHostDockerRunLauncher
  config:
    default_env_vars:
      - DAGSTER_POSTGRES_USER
      - DAGSTER_POSTGRES_PASSWORD
      - DAGSTER_POSTGRES_DB
      - DAGSTER_POSTGRES_HOST
    docker_hosts:
      # Host A: containerized code locations on the control plane host.
      # Omitting docker_url connects to the local Docker daemon.
      - host_name: "host-a"
        location_names:
          - "local_pipelines"
        network: "dagster_network"
      # Host B: remote Docker host with TLS
      - host_name: "host-b"
        docker_url: "tcp://10.0.1.2:2376"
        tls:
          ca_cert: "/certs/ca.pem"
          client_cert: "/certs/client-cert.pem"
          client_key: "/certs/client-key.pem"
        location_names:
          - "etl_pipelines"
        network: "host_b_dagster_network"
        env_vars:
          - "WAREHOUSE_HOST=10.0.1.50"

# Host D's "reporting" code location is NOT listed above, so it
# uses DefaultRunLauncher — the run executes on Host D's gRPC server.
```
Configuration Reference
Top-level options:
| Key | Type | Description |
|---|---|---|
| `docker_hosts` | list | Remote Docker host configurations |
| `default_env_vars` | list[str] | Env vars passed to ALL run containers |
| `default_container_kwargs` | dict | Default `containers.create()` kwargs |
| `container_label_prefix` | str | Label prefix (default: `dagster`) |
Per docker_host:
| Key | Type | Required | Description |
|---|---|---|---|
| `host_name` | str | yes | Friendly name (used in tags/logs) |
| `docker_url` | str | no | Docker daemon URL (`tcp://`, `ssh://`, `unix://`). Omit to use the local daemon. |
| `location_names` | list[str] | yes | Code locations that run here |
| `tls` | dict | no | TLS config (`ca_cert`, `client_cert`, `client_key`, `verify`) |
| `network` | str | no | Docker network to attach run containers to |
| `networks` | list[str] | no | Multiple networks to attach to |
| `env_vars` | list[str] | no | Host-specific env vars |
| `container_kwargs` | dict | no | Host-specific `containers.create()` kwargs |
| `registry` | dict | no | Registry credentials (`url`, `username`, `password`) |
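A `docker_hosts` entry maps fairly directly onto the Docker SDK's client arguments. A hedged sketch of that translation, returning a plain dict (the real launcher presumably constructs a `docker.DockerClient`, with the `tls` section becoming a `docker.tls.TLSConfig`):

```python
def client_kwargs_from_host(host: dict) -> dict:
    """Translate a docker_hosts entry into DockerClient-style kwargs.

    Illustrative only: the tls value is left as a plain dict rather
    than a docker.tls.TLSConfig to keep the sketch dependency-free.
    """
    kwargs = {}
    if "docker_url" in host:
        kwargs["base_url"] = host["docker_url"]
    tls = host.get("tls")
    if tls:
        kwargs["tls"] = {
            "ca_cert": tls["ca_cert"],
            "client_cert": (tls["client_cert"], tls["client_key"]),
            "verify": tls.get("verify", True),
        }
    return kwargs

host = {
    "host_name": "host-b",
    "docker_url": "tcp://10.0.1.2:2376",
    "tls": {
        "ca_cert": "/certs/ca.pem",
        "client_cert": "/certs/client-cert.pem",
        "client_key": "/certs/client-key.pem",
    },
}
kw = client_kwargs_from_host(host)
assert kw["base_url"] == "tcp://10.0.1.2:2376"
```

Omitting `docker_url` leaves `base_url` unset, which is how the "local daemon" default in the table above falls out naturally.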
Setting Up Remote Docker Daemons
Each remote host needs its Docker daemon exposed over TCP with TLS.
1. Generate TLS certificates
Follow Docker's TLS guide or use a tool like cfssl. You need:
- A CA cert (`ca.pem`)
- A server cert + key for each Docker host
- A client cert + key for the Dagster daemon on Host A
2. Configure the Docker daemon
On each remote host, edit /etc/docker/daemon.json:
```json
{
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2376"],
  "tls": true,
  "tlscacert": "/etc/docker/tls/ca.pem",
  "tlscert": "/etc/docker/tls/server-cert.pem",
  "tlskey": "/etc/docker/tls/server-key.pem",
  "tlsverify": true
}
```
If using systemd, you may also need to override the ExecStart:
```shell
sudo systemctl edit docker.service
```

```ini
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd
```

Then reload and restart:

```shell
sudo systemctl daemon-reload
sudo systemctl restart docker
```
3. Alternative: SSH-based access
Instead of TLS over TCP, you can use SSH:
```yaml
docker_hosts:
  - host_name: "host-b"
    docker_url: "ssh://deploy@10.0.1.2"
    location_names: ["etl_pipelines"]
```
This requires the Dagster daemon container to have an SSH client installed and the appropriate key mounted.
Admin Code Location
The package includes pre-built assets for container cleanup and monitoring. On Host A, create an admin code location:
```python
# admin_location/__init__.py
from dagster_multihost_launcher import build_admin_definitions

defs = build_admin_definitions()  # every 5 min, cleanup after 1 min
```
This creates a single multihost_admin_job that first checks container status across all hosts, then cleans up old exited containers. The two assets (multihost_container_status → multihost_container_cleanup) run in sequence within the same job.
Since this code location is NOT listed under any docker_hosts entry, it runs via DefaultRunLauncher on Host A — which is where the Docker clients and TLS certs are configured.
Important: The admin container needs the dagster.yaml and TLS certs mounted, because the admin assets rehydrate the MultiHostDockerRunLauncher to talk to remote Docker daemons. See the example docker-compose.yml for the required volume mounts.
The cleanup max age is configurable per-run via the multihost/cleanup_max_age_hours run tag.
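A sketch of how a cleanup asset could read that tag, assuming a 24-hour fallback (the package's actual default and parsing behavior are not documented here):

```python
def cleanup_max_age_hours(run_tags: dict, default: float = 24.0) -> float:
    """Read multihost/cleanup_max_age_hours from run tags, falling back
    to the default when the tag is absent or malformed."""
    raw = run_tags.get("multihost/cleanup_max_age_hours")
    if raw is None:
        return default
    try:
        return float(raw)
    except ValueError:
        return default

assert cleanup_max_age_hours({"multihost/cleanup_max_age_hours": "6"}) == 6.0
assert cleanup_max_age_hours({}) == 24.0
```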
Integrating into an existing code location
If you already have a code location with its own Definitions, you can import the individual assets instead of using build_admin_definitions:
```python
from dagster_multihost_launcher import multihost_cleanup_asset, multihost_status_asset
from dagster import Definitions, ScheduleDefinition, define_asset_job

admin_job = define_asset_job(
    "admin_job",
    selection=[multihost_status_asset, multihost_cleanup_asset],
)
admin_schedule = ScheduleDefinition(job=admin_job, cron_schedule="0 */6 * * *")

defs = Definitions(
    assets=[my_asset_1, my_asset_2, multihost_status_asset, multihost_cleanup_asset],
    jobs=[my_job, admin_job],
    schedules=[my_schedule, admin_schedule],
)
```
Networking Considerations
Run containers → Postgres
Run containers on remote hosts need to reach Postgres on Host A. Use Host A's real IP/hostname (not a docker-compose service name) for DAGSTER_POSTGRES_HOST. Make sure:
- Postgres port is published on Host A (on all interfaces, not just localhost)
- Firewall rules allow traffic from remote hosts
DefaultRunLauncher and instance_ref
When DefaultRunLauncher sends a run to a remote gRPC server, it includes the daemon's instance_ref — the serialized storage config from dagster.yaml. If your storage config uses env: references (e.g., env: DAGSTER_POSTGRES_HOST), those env vars must also be set on the remote gRPC server's host. Use the same env var names but with values appropriate for that host (e.g., Host A's external IP instead of a Docker service name).
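A small startup check next to the remote gRPC server can catch a missing variable early. The variable names follow the dagster.yaml example above; this helper is illustrative, not part of the package:

```python
import os

# Env vars the instance_ref's storage config resolves via env: references.
REQUIRED = [
    "DAGSTER_POSTGRES_USER",
    "DAGSTER_POSTGRES_PASSWORD",
    "DAGSTER_POSTGRES_DB",
    "DAGSTER_POSTGRES_HOST",
]

def missing_storage_env(environ=os.environ) -> list[str]:
    """Return the storage env vars that are not set in this environment."""
    return [name for name in REQUIRED if name not in environ]

# Example: only the user is set, so the other three are reported missing.
assert missing_storage_env({"DAGSTER_POSTGRES_USER": "u"}) == REQUIRED[1:]
```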
Run containers → other services
If run containers need to talk to services in the same docker-compose stack on their host (e.g., a local Redis), attach them to the same network via the network config.
Important: Docker Compose prefixes network names with the project name. Either:
- use an explicit `name:` in your docker-compose network definition, or
- use the full prefixed name in your `dagster.yaml`
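For example, a compose fragment on Host B can pin the network name so `dagster.yaml` can reference it without the project prefix (the network name matches the Host B example above):

```yaml
# host_b docker-compose.yml (fragment)
networks:
  host_b_dagster_network:
    name: host_b_dagster_network  # explicit name, no project prefix
```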
Code location gRPC → Host A
The gRPC servers on remote hosts need their ports accessible from Host A (for the webserver and daemon to load definitions).
Run Tags
The launcher tags each run with:

| Tag | Description |
|---|---|
| `multihost_docker/container_id` | Docker container ID |
| `multihost_docker/host_name` | Which Docker host the container is on |
| `multihost_docker/launcher_type` | `docker` or `default` |
These are used by terminate(), check_run_worker_health(), and the admin cleanup assets.
Image Resolution
For Docker-launched runs, the image is resolved in this order:
1. `dagster/image` tag on the run
2. `DAGSTER_CURRENT_IMAGE` from the code location
3. Container image from the job code origin
Set DAGSTER_CURRENT_IMAGE in your code location's environment (see Host B docker-compose example).
Logging
Dagster's run worker process (dagster api execute_run) writes structured events directly to Postgres. As long as the run container can reach Postgres, logs appear in the Dagster UI automatically — no log forwarding required.
For container-level failures (OOM, image pull errors, crashes), the check_run_worker_health method captures the last 25 lines of Docker logs and reports them as engine events.
Examples
The root-level dagster.yaml, workspace.yaml, docker-compose.yml, and host_b_docker-compose.yml provide example configurations for a typical multi-host setup.
The integration_test/ directory contains a complete working test setup across three hosts:
- `integration_test/host_a/` — Control plane (webserver, daemon, postgres, admin code location)
- `integration_test/host_b/` — Remote Docker host with TLS, running a dockerized test code location
- `integration_test/host_d/` — Bare-process host running a non-Docker gRPC server