Dagster connector for Floe CLI (config-driven ingestion)

Floe + Dagster

This folder contains the Dagster connector for Floe.

For local setup of both Dagster and Airflow with isolated virtual environments, see:

  • orchestrators/LOCAL_DEV.md

Current model:

  • At parse time, the connector reads Floe manifests (floe.manifest.v1), not Floe YAML.
  • Supports single manifest and multi-manifest directory loading.
  • Builds one asset per entity and one job per manifest.
  • Runs Floe according to the manifest's execution contract, with JSON logs enabled.
  • Publishes run and entity metadata from the NDJSON logs and the run summary.
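
The "one asset per entity, one job per manifest" mapping can be pictured with a small, Dagster-free sketch. Only entities[], asset_key, and group_name come from this README; the "schema" and "name" fields, the job-name format, the fallback values, and the helper names (AssetPlan, ManifestPlan, plan_manifest) are hypothetical, and the real connector may shape things differently.

```python
from dataclasses import dataclass, field


@dataclass
class AssetPlan:
    asset_key: str
    group_name: str


@dataclass
class ManifestPlan:
    job_name: str
    assets: list = field(default_factory=list)


def plan_manifest(manifest: dict) -> ManifestPlan:
    """Plan one asset per entity and a single job for the whole manifest."""
    # Assumption: the manifest has a top-level "name" and the job name derives from it.
    plan = ManifestPlan(job_name=f"floe_{manifest['name']}_job")
    for entity in manifest.get("entities", []):
        plan.assets.append(
            AssetPlan(
                # Respect asset_key / group_name when present (fallbacks are assumptions).
                asset_key=entity.get("asset_key", entity["name"]),
                group_name=entity.get("group_name", "default"),
            )
        )
    return plan


example = {
    "schema": "floe.manifest.v1",  # assumed field name for the schema id
    "name": "sales",
    "entities": [
        {"name": "orders", "asset_key": "sales/orders", "group_name": "sales"},
        {"name": "customers"},
    ],
}
plan = plan_manifest(example)
```

Deriving the job name only from the manifest is what would keep job names stable across reloads in this sketch.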

What Is Implemented

  • Manifest-first orchestration (floe.manifest.v1).
  • Strict schema validation at manifest load time.
  • Asset generation from entities[] (asset_key, group_name respected).
  • One Dagster job per manifest (stable job names).
  • Multi-manifest loading (*.manifest.json) with collision checks.
  • Local runner (local_process) support.
  • execution.defaults.env and execution.defaults.workdir support.
  • Floe quality outcomes from Floe reports are exposed as native Dagster Asset Checks (cast_error, not_null, unique, schema_mismatch, file_status).
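
As a rough illustration of multi-manifest loading with collision checks, the sketch below globs *.manifest.json in a directory and rejects a second manifest that declares an already-seen asset_key. It is not the connector's code: the "schema" field name, the load_manifests helper, and the choice to collide on asset_key alone are assumptions, and the real schema validation is certainly stricter than the single check shown here.

```python
import json
import tempfile
from pathlib import Path


def load_manifests(directory: Path) -> dict:
    """Load every *.manifest.json in a directory, rejecting duplicate asset keys."""
    manifests = {}
    owners = {}  # asset_key -> file name that first declared it
    for path in sorted(directory.glob("*.manifest.json")):
        manifest = json.loads(path.read_text(encoding="utf-8"))
        # Stand-in for strict validation: accept only the schema id named in this README.
        if manifest.get("schema") != "floe.manifest.v1":
            raise ValueError(f"{path.name}: unsupported schema")
        for entity in manifest.get("entities", []):
            key = entity["asset_key"]
            if key in owners:
                raise ValueError(
                    f"asset key {key!r} in {path.name} collides with {owners[key]}"
                )
            owners[key] = path.name
        manifests[path.name] = manifest
    return manifests


def _write(directory: Path, filename: str, asset_keys: list) -> None:
    # Helper for the demo only: writes a minimal manifest file.
    body = {"schema": "floe.manifest.v1", "entities": [{"asset_key": k} for k in asset_keys]}
    (directory / filename).write_text(json.dumps(body), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    _write(root, "hr.manifest.json", ["hr/employees"])
    _write(root, "sales.manifest.json", ["sales/orders"])
    loaded = sorted(load_manifests(root))

    # A third manifest reusing an existing asset key should be rejected.
    _write(root, "dup.manifest.json", ["hr/employees"])
    try:
        load_manifests(root)
        collision = None
    except ValueError as exc:
        collision = str(exc)
```

Failing at load time, rather than at run time, means a duplicate key surfaces before any job is launched.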

Install

Prereqs:

  • Floe installed (either the floe CLI binary or Docker with a Floe image).
  • Python 3.10+.

pip install dagster-floe

Development install (from this repo)

python3 -m venv orchestrators/dagster-floe/.venv
source orchestrators/dagster-floe/.venv/bin/activate
pip install -e "orchestrators/dagster-floe[dev]"

Generate a manifest

floe manifest generate \
  -c orchestrators/dagster-floe/example/config.yml \
  --output orchestrators/dagster-floe/example/manifest.dagster.json

Run the example (repo-only)

cd orchestrators/dagster-floe
FLOE_MANIFEST_DIR=./example/manifests dagster dev

The example workspace loads example/definitions.py, which wires the local example files and manifests to the reusable connector APIs. The repository example includes two manifests, one per domain:

  • example/manifests/hr.manifest.json
  • example/manifests/sales.manifest.json

Notes

  • This connector does not parse YAML directly; it consumes floe.manifest.v1.
  • Connector logic lives under src/floe_dagster/; local wiring for demo lives in example/definitions.py.
  • For local development without an installed floe binary, you can point LocalRunner to a custom command, e.g.:
    • LocalRunner("cargo run -p floe-cli --")
  • The connector currently supports only the local_process manifest runner.
  • For local setup commands, use orchestrators/LOCAL_DEV.md.
  • Design notes and future work: orchestrators/dagster-floe/INTEGRATION_SPEC.md
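
To make the custom-command note concrete, here is a minimal sketch of how a command string such as "cargo run -p floe-cli --" could be turned into a process invocation that also honors execution.defaults.env and execution.defaults.workdir. This is not LocalRunner's actual implementation: build_invocation, the shape of the defaults dict, and EXAMPLE_VAR are hypothetical, and POSIX shell quoting is assumed.

```python
import os
import shlex
import subprocess
import sys
from typing import Optional


def build_invocation(command: str, args: list, defaults: dict):
    """Return (argv, env, cwd) for a runner command plus manifest defaults."""
    # Split the command string shell-style, then append the Floe arguments.
    argv = shlex.split(command) + list(args)
    # Manifest-provided variables override the inherited environment.
    env = {**os.environ, **defaults.get("env", {})}
    cwd: Optional[str] = defaults.get("workdir")
    return argv, env, cwd


# The custom command from the note above, combined with the subcommand shown earlier.
argv, env, cwd = build_invocation(
    "cargo run -p floe-cli --",
    ["manifest", "generate", "-c", "config.yml"],
    {"env": {"EXAMPLE_VAR": "1"}},
)

# Stand-in process so the sketch runs without cargo or floe installed.
stand_in = f"{shlex.quote(sys.executable)} -c"
argv2, env2, cwd2 = build_invocation(
    stand_in,
    ["import os; print(os.environ['EXAMPLE_VAR'])"],
    {"env": {"EXAMPLE_VAR": "1"}},
)
result = subprocess.run(argv2, env=env2, cwd=cwd2, capture_output=True, text=True, check=True)
```

Passing an argument list to subprocess.run, instead of a shell string, avoids re-quoting problems when paths contain spaces.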

What Is Not Implemented Yet

  • Kubernetes/ECS runner adapters.
  • Cloud summary loading (s3://, gs://, abfs://).
  • Single-process multi-entity fan-out execution mode.

Releasing

This repo is a monorepo. Floe and this connector are versioned and tagged independently:

  • Floe CLI release tags: vX.Y.Z
  • Dagster connector release tags: dagster-floe-vX.Y.Z (triggers the PyPI publish workflow)

Example:

git checkout main
git pull
git tag dagster-floe-v0.1.0
git push origin dagster-floe-v0.1.0
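
The two tag schemes can be told apart mechanically. The sketch below assumes strict three-part numeric versions; the actual trigger pattern in the publish workflow is not shown in this README, and classify_tag is a hypothetical helper.

```python
import re

# Assumed patterns: vX.Y.Z for the Floe CLI, dagster-floe-vX.Y.Z for this connector.
FLOE_TAG = re.compile(r"^v\d+\.\d+\.\d+$")
CONNECTOR_TAG = re.compile(r"^dagster-floe-v\d+\.\d+\.\d+$")


def classify_tag(tag: str) -> str:
    """Return which component a release tag belongs to."""
    if CONNECTOR_TAG.match(tag):
        return "dagster-floe"
    if FLOE_TAG.match(tag):
        return "floe-cli"
    return "unknown"


kinds = [classify_tag(t) for t in ("dagster-floe-v0.1.0", "v1.2.3", "dagster-floe-0.1.0")]
```

A check like this can be handy before pushing, since a tag missing the "v" would not match either scheme.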
