Dagster connector for Floe CLI (config-driven ingestion)

Floe + Dagster

This folder contains the Dagster connector for Floe.

For local setup of both Dagster and Airflow with isolated virtual environments, see:

  • orchestrators/LOCAL_DEV.md

Current model:

  • At parse time, the connector reads Floe manifests (floe.manifest.v1), not Floe YAML.
  • Supports single manifest and multi-manifest directory loading.
  • Builds one asset per entity and one job per manifest.
  • Runs Floe using the manifest's execution contract, with JSON logs enabled.
  • Publishes run and entity metadata from the NDJSON log output and the run summary.
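
To make the "one asset per entity, one job per manifest" model concrete, here is a minimal sketch in plain Python. It is not the connector's code and does not import Dagster. Only entities[], asset_key and group_name come from this README; the other manifest fields (schema, manifest_id, name), the floe_ job prefix, the "default" group and the function name are assumptions made for illustration.

```python
import json
import re

# Hypothetical, trimmed-down manifest. Only `entities`, `asset_key` and
# `group_name` are named in this README; every other field is an assumption.
MANIFEST_JSON = """
{
  "schema": "floe.manifest.v1",
  "manifest_id": "sales",
  "entities": [
    {"name": "orders", "asset_key": ["sales", "orders"], "group_name": "sales"},
    {"name": "customers"}
  ]
}
"""


def plan_from_manifest(manifest: dict) -> dict:
    """Return one asset spec per entity and a single job name for the manifest."""
    assets = []
    for entity in manifest["entities"]:
        assets.append({
            # Respect an explicit asset_key; otherwise fall back to the entity name.
            "asset_key": entity.get("asset_key", [entity["name"]]),
            # "default" is an assumed fallback group, not a documented one.
            "group_name": entity.get("group_name", "default"),
        })
    # Derive the job name only from the manifest id so it stays stable across reloads.
    job_name = "floe_" + re.sub(r"[^A-Za-z0-9_]", "_", manifest["manifest_id"])
    return {"job_name": job_name, "assets": assets}


plan = plan_from_manifest(json.loads(MANIFEST_JSON))
```

Deriving the job name from the manifest alone, rather than from load order or file position, is what keeps it stable between reloads in this sketch.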

What Is Implemented

  • Manifest-first orchestration (floe.manifest.v1).
  • Strict schema validation at manifest load time.
  • Asset generation from entities[] (asset_key, group_name respected).
  • One Dagster job per manifest (stable job names).
  • Multi-manifest loading (*.manifest.json) with collision checks.
  • Local runner (local_process) support.
  • execution.defaults.env and execution.defaults.workdir support.
  • Floe quality outcomes exposed as Dagster native Asset Checks (cast_error, not_null, unique, schema_mismatch, file_status) from Floe reports.
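
Multi-manifest loading with collision checks can be pictured roughly as follows. This is a sketch using only the standard library, not the connector's implementation: the function name is hypothetical, and it assumes every entity carries an asset_key given as a list of strings.

```python
import json
from pathlib import Path


def load_manifests(manifest_dir: str) -> dict:
    """Load every *.manifest.json in a directory, failing on duplicate asset keys."""
    owners = {}     # asset key (as a tuple) -> file that first declared it
    manifests = {}  # file name -> parsed manifest
    # Sorting makes the load order, and therefore the error message, deterministic.
    for path in sorted(Path(manifest_dir).glob("*.manifest.json")):
        manifest = json.loads(path.read_text(encoding="utf-8"))
        for entity in manifest.get("entities", []):
            key = tuple(entity["asset_key"])
            if key in owners:
                raise ValueError(
                    f"asset key {key} is declared in both {owners[key]} and {path.name}"
                )
            owners[key] = path.name
        manifests[path.name] = manifest
    return manifests
```

Failing at load time means a collision surfaces as soon as the definitions are loaded rather than in the middle of a run.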

Install

Prereqs:

  • Floe installed (either the floe CLI binary or Docker with a Floe image).
  • Python 3.10+.

pip install dagster-floe

Development install (from this repo)

python3 -m venv orchestrators/dagster-floe/.venv
source orchestrators/dagster-floe/.venv/bin/activate
pip install -e "orchestrators/dagster-floe[dev]"

Generate a manifest

floe manifest generate \
  -c orchestrators/dagster-floe/example/config.yml \
  --output orchestrators/dagster-floe/example/manifest.dagster.json

Run the example (repo-only)

cd orchestrators/dagster-floe
FLOE_MANIFEST_DIR=./example/manifests dagster dev

The example workspace loads example/definitions.py, which wires the local example files and manifests to the reusable connector APIs. The repository example includes two manifests, one per domain:

  • example/manifests/hr.manifest.json
  • example/manifests/sales.manifest.json

Notes

  • This connector does not parse YAML directly; it consumes floe.manifest.v1.
  • Connector logic lives under src/floe_dagster/; local wiring for demo lives in example/definitions.py.
  • For local development without an installed floe binary, you can point LocalRunner to a custom command, e.g.:
    • LocalRunner("cargo run -p floe-cli --")
  • The connector currently supports only the local_process manifest runner.
  • For local setup commands, use orchestrators/LOCAL_DEV.md.
  • Design notes and future work: orchestrators/dagster-floe/INTEGRATION_SPEC.md
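
As a rough illustration of what a local-process runner does with a custom command string, environment defaults and a working directory, the sketch below splits the command into argv, layers the extra environment over the parent one, and parses stdout as NDJSON. It is not LocalRunner's implementation: run_command, the event fields and the stand-in CLI are all hypothetical, and a Python one-liner replaces the Floe binary so the example runs anywhere.

```python
import json
import os
import shlex
import subprocess
import sys


def run_command(command: str, extra_args: list, env: dict, workdir: str) -> list:
    """Run a command string and parse each non-empty stdout line as a JSON event."""
    # "cargo run -p floe-cli --" style strings become an argv list here.
    argv = shlex.split(command) + list(extra_args)
    result = subprocess.run(
        argv,
        cwd=workdir,
        env={**os.environ, **env},  # defaults are layered over the parent environment
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit raises CalledProcessError
    )
    return [json.loads(line) for line in result.stdout.splitlines() if line.strip()]


# Stand-in for the Floe CLI: prints one JSON line that echoes an env variable.
# The event name and fields are invented for this example.
fake_cli = shlex.join([
    sys.executable,
    "-c",
    "import json, os; print(json.dumps({'event': 'run_finished', 'team': os.environ['TEAM']}))",
])
events = run_command(fake_cli, [], env={"TEAM": "hr"}, workdir=".")
```

Passing the command as a string and splitting it with shlex keeps multi-word commands usable without a shell.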

What Is Not Implemented Yet

  • Kubernetes/ECS runner adapters.
  • Cloud summary loading (s3://, gs://, abfs://).
  • Single-process multi-entity fan-out execution mode.

Releasing

This repo is a monorepo. Floe and this connector are versioned and tagged independently:

  • Floe CLI release tags: vX.Y.Z
  • Dagster connector release tags: dagster-floe-vX.Y.Z (triggers the PyPI publish workflow)

Example:

git checkout main
git pull
git tag dagster-floe-v0.1.0
git push origin dagster-floe-v0.1.0
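
The two tag schemes can be told apart mechanically. The following sketch is not part of the release workflow; the patterns simply mirror the vX.Y.Z and dagster-floe-vX.Y.Z formats listed above, and the component labels and function name are assumptions.

```python
import re

# Patterns mirror the two tag formats described in this section.
TAG_PATTERNS = {
    "floe-cli": re.compile(r"^v(\d+)\.(\d+)\.(\d+)$"),
    "dagster-floe": re.compile(r"^dagster-floe-v(\d+)\.(\d+)\.(\d+)$"),
}


def classify_tag(tag: str):
    """Return (component, (major, minor, patch)) or None for an unrecognized tag."""
    for component, pattern in TAG_PATTERNS.items():
        match = pattern.match(tag)
        if match:
            return component, tuple(int(part) for part in match.groups())
    return None
```

Because both patterns are anchored at the start, a connector tag never matches the Floe CLI pattern, so the two release trains cannot be confused.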
