Dagster connector for Floe CLI (config-driven ingestion)
Floe + Dagster
This folder contains the Dagster connector for Floe.
For local setup of both Dagster and Airflow with isolated virtual environments, see:
orchestrators/LOCAL_DEV.md
Current model:
- Parse-time reads Floe manifests (floe.manifest.v1), not Floe YAML.
- Supports single manifest and multi-manifest directory loading.
- Builds one asset per entity and one job per manifest.
- Runs Floe with manifest execution contract and JSON logs.
- Publishes run/entity metadata from NDJSON + summary.
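To illustrate the last point, here is a minimal sketch of reading NDJSON output (one JSON object per line) and grouping events by entity. The field names "event", "entity" and "rows", and the "_run" bucket, are assumptions made for this example; they are not the documented Floe log schema, and this is not the connector's own parser.

```python
import json
from collections import defaultdict


def group_events_by_entity(ndjson_text: str) -> dict[str, list[dict]]:
    """Parse NDJSON text and group the decoded events by entity.

    The "entity" key is a hypothetical field name used for illustration.
    Events without it are collected under "_run".
    """
    grouped: dict[str, list[dict]] = defaultdict(list)
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON output interleaved with the logs
        if not isinstance(event, dict):
            continue  # only JSON objects are treated as events
        grouped[event.get("entity", "_run")].append(event)
    return dict(grouped)


# Hypothetical sample log, including one line that is not JSON.
sample = "\n".join([
    '{"event": "run_started"}',
    '{"event": "entity_finished", "entity": "customers", "rows": 120}',
    "not json",
    '{"event": "entity_finished", "entity": "orders", "rows": 75}',
])
events = group_events_by_entity(sample)
print(sorted(events))
```

Skipping undecodable lines keeps the sketch tolerant of stray process output; a stricter reader could raise instead.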
What Is Implemented
- Manifest-first orchestration (floe.manifest.v1).
- Strict schema validation at manifest load time.
- Asset generation from entities[] (asset_key, group_name respected).
- One Dagster job per manifest (stable job names).
- Multi-manifest loading (*.manifest.json) with collision checks.
- Local runner (local_process) support.
- execution.defaults.env and execution.defaults.workdir support.
- Floe quality outcomes exposed as Dagster native Asset Checks (cast_error, not_null, unique, schema_mismatch, file_status) from Floe reports.
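The multi-manifest loading with collision checks can be pictured roughly as follows. This is a sketch under stated assumptions, not the connector's implementation: load_manifests is a hypothetical helper, asset_key is assumed to be a plain string, and the real floe.manifest.v1 schema validation is omitted.

```python
import json
import tempfile
from pathlib import Path


def load_manifests(manifest_dir: Path) -> dict[str, dict]:
    """Load every *.manifest.json in a directory, keyed by file name.

    Raises ValueError when two manifests declare the same asset_key.
    """
    manifests: dict[str, dict] = {}
    seen: dict[str, str] = {}  # asset_key -> manifest file that declared it
    for path in sorted(manifest_dir.glob("*.manifest.json")):
        manifest = json.loads(path.read_text(encoding="utf-8"))
        for entity in manifest.get("entities", []):
            key = entity["asset_key"]  # assumed to be a plain string here
            if key in seen:
                raise ValueError(
                    f"asset_key {key!r} in {path.name} collides with {seen[key]}"
                )
            seen[key] = path.name
        manifests[path.name] = manifest
    return manifests


# Two throwaway manifests with made-up entities, written to a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "hr.manifest.json").write_text(
        json.dumps({"entities": [{"asset_key": "employees", "group_name": "hr"}]}),
        encoding="utf-8",
    )
    (root / "sales.manifest.json").write_text(
        json.dumps({"entities": [{"asset_key": "orders", "group_name": "sales"}]}),
        encoding="utf-8",
    )
    loaded = load_manifests(root)
    print(sorted(loaded))
```

Failing at load time, rather than when a job runs, matches the spirit of the strict validation described above: a duplicate key is reported before any asset is defined.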
Install
Prereqs:
- Floe installed (either the floe CLI binary or Docker with a Floe image).
- Python 3.10+.
pip install dagster-floe
Development install (from this repo)
python3 -m venv orchestrators/dagster-floe/.venv
source orchestrators/dagster-floe/.venv/bin/activate
pip install -e orchestrators/dagster-floe[dev]
Generate a manifest
floe manifest generate \
-c orchestrators/dagster-floe/example/config.yml \
--output orchestrators/dagster-floe/example/manifest.dagster.json
Run the example (repo-only)
cd orchestrators/dagster-floe
FLOE_MANIFEST_DIR=./example/manifests dagster dev
The example workspace loads example/definitions.py, which wires local example files/manifest to the reusable connector APIs.
The repository example includes two manifests by domain:
- example/manifests/hr.manifest.json
- example/manifests/sales.manifest.json
Notes
- This connector does not parse YAML directly; it consumes floe.manifest.v1.
- Connector logic lives under src/floe_dagster/; local wiring for demo lives in example/definitions.py.
- For local development without an installed floe binary, you can point LocalRunner to a custom command, e.g.: LocalRunner("cargo run -p floe-cli --")
- Manifest runner support in connector is currently local_process only.
- For local setup commands, use orchestrators/LOCAL_DEV.md.
- Design notes and future work: orchestrators/dagster-floe/INTEGRATION_SPEC.md
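For intuition on how a custom command string such as the one in the LocalRunner note can stand in for the floe binary, the sketch below splits a command into an argument vector and merges environment overrides. build_invocation is a hypothetical helper and EXAMPLE_VAR is a made-up variable; how LocalRunner actually handles its command is not described here and may differ.

```python
import os
import shlex


def build_invocation(
    command: str, args: list[str], extra_env: dict[str, str]
) -> tuple[list[str], dict[str, str]]:
    """Split a runner command string into argv and merge env overrides."""
    # shlex.split follows POSIX shell quoting rules, so quoted paths survive.
    argv = shlex.split(command) + list(args)
    # Overrides win over the inherited process environment.
    env = {**os.environ, **extra_env}
    return argv, env


argv, env = build_invocation(
    "cargo run -p floe-cli --",
    ["manifest", "generate", "-c", "config.yml"],
    {"EXAMPLE_VAR": "1"},  # hypothetical override, for illustration only
)
print(argv)
```

The resulting argv and env could then be handed to subprocess.run(argv, env=env, cwd=...), which is one plausible way to honor env and workdir defaults without invoking a shell.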
What Is Not Implemented Yet
- Kubernetes/ECS runner adapters.
- Cloud summary loading (s3://, gs://, abfs://).
- Single-process multi-entity fan-out execution mode.
Releasing
This repo is a monorepo. Floe and this connector are versioned and tagged independently:
- Floe CLI release tags: vX.Y.Z
- Dagster connector release tags: dagster-floe-vX.Y.Z (triggers the PyPI publish workflow)
Example:
git checkout main
git pull
git tag dagster-floe-v0.1.0
git push origin dagster-floe-v0.1.0