Dagster connector for Floe CLI (config-driven ingestion)

Floe + Dagster

This folder contains the Dagster connector for Floe.

For local setup of both Dagster and Airflow with isolated virtual environments, see:

  • orchestrators/LOCAL_DEV.md

Current model:

  • At parse time, reads Floe manifests (floe.manifest.v1), not Floe YAML.
  • Supports single manifest and multi-manifest directory loading.
  • Builds one asset per entity and one job per manifest.
  • Runs Floe using the manifest's execution contract, with JSON logs.
  • Publishes run and entity metadata from the NDJSON log and the run summary.
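
As an illustration of the last point, here is a minimal sketch of folding NDJSON log lines into per-entity metadata. It is not the connector's implementation: the function name `collect_entity_metadata` and the event fields (`entity`, `event`, `rows`) are assumptions made for the example, since the actual Floe log schema is not described here.

```python
import json
from collections import defaultdict


def collect_entity_metadata(ndjson_text: str) -> dict[str, dict]:
    """Fold NDJSON log lines into one metadata dict per entity.

    Field names are hypothetical; adapt them to the real Floe log schema.
    """
    per_entity: dict[str, dict] = defaultdict(dict)
    for raw in ndjson_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines between events
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip non-JSON output mixed into the stream
        if not isinstance(event, dict):
            continue
        entity = event.get("entity")  # assumed field name
        if entity is None:
            continue  # run-level events are ignored in this sketch
        # Later events overwrite earlier values for the same key.
        per_entity[entity].update(
            {key: value for key, value in event.items() if key != "entity"}
        )
    return dict(per_entity)


sample = "\n".join(
    [
        '{"entity": "employees", "event": "entity_started"}',
        "not json",
        '{"entity": "employees", "event": "entity_finished", "rows": 120}',
        '{"entity": "orders", "event": "entity_finished", "rows": 45}',
    ]
)
metadata = collect_entity_metadata(sample)
```

Keeping only the latest value per key is a simplification; a real integration would likely also read the summary file for totals.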

What Is Implemented

  • Manifest-first orchestration (floe.manifest.v1).
  • Strict schema validation at manifest load time.
  • Asset generation from entities[] (asset_key, group_name respected).
  • One Dagster job per manifest (stable job names).
  • Multi-manifest loading (*.manifest.json) with collision checks.
  • Local runner (local_process) support.
  • execution.defaults.env and execution.defaults.workdir support.
  • Floe quality outcomes exposed as Dagster native Asset Checks (cast_error, not_null, unique, schema_mismatch, file_status) from Floe reports.
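
The collision checks for multi-manifest loading can be pictured roughly as follows. This is a sketch under assumptions, not the connector's code: `check_manifests` is a hypothetical helper, the top-level `schema` field name is assumed, and `asset_key` is treated as a plain string.

```python
SCHEMA_ID = "floe.manifest.v1"


def check_manifests(manifests: dict[str, dict]) -> list[str]:
    """Return human-readable problems; an empty list means no collisions.

    `manifests` maps a manifest name to its parsed JSON content.
    """
    problems: list[str] = []
    key_owner: dict[str, str] = {}
    for name, manifest in manifests.items():
        # "schema" is an assumed field name for the manifest version marker.
        if manifest.get("schema") != SCHEMA_ID:
            problems.append(f"{name}: unexpected schema {manifest.get('schema')!r}")
            continue
        for entity in manifest.get("entities", []):
            key = entity["asset_key"]  # assumed to be a string here
            if key in key_owner:
                problems.append(
                    f"{name}: asset_key {key!r} already defined in {key_owner[key]}"
                )
            else:
                key_owner[key] = name
    return problems


hr = {"schema": SCHEMA_ID, "entities": [{"asset_key": "employees"}]}
sales = {
    "schema": SCHEMA_ID,
    "entities": [{"asset_key": "orders"}, {"asset_key": "employees"}],
}
problems = check_manifests({"hr": hr, "sales": sales})
clean = check_manifests({"hr": hr})
```

Reporting every problem at once, rather than failing on the first, makes a misconfigured manifest directory quicker to fix.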

Install

Prereqs:

  • Floe installed (either the floe CLI binary or Docker with a Floe image).
  • Python 3.10+.

pip install dagster-floe

Development install (from this repo)

python3 -m venv orchestrators/dagster-floe/.venv
source orchestrators/dagster-floe/.venv/bin/activate
pip install -e "orchestrators/dagster-floe[dev]"

Generate a manifest

floe manifest generate \
  -c orchestrators/dagster-floe/example/config.yml \
  --output orchestrators/dagster-floe/example/manifest.dagster.json

Run the example (repo-only)

cd orchestrators/dagster-floe
FLOE_MANIFEST_DIR=./example/manifests dagster dev

The example workspace loads example/definitions.py, which wires the local example files and manifests to the reusable connector APIs. The repository example includes two manifests, one per domain:

  • example/manifests/hr.manifest.json
  • example/manifests/sales.manifest.json
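
For orientation, a definitions module could discover these files from FLOE_MANIFEST_DIR along the following lines. This is a stdlib-only sketch: `find_manifest_files` is a hypothetical helper rather than a connector API, and falling back to the current directory when the variable is unset is an assumption.

```python
import os
import tempfile
from pathlib import Path


def find_manifest_files(env=None) -> list[Path]:
    """List *.manifest.json files in the directory named by FLOE_MANIFEST_DIR."""
    env = os.environ if env is None else env
    # Assumption: fall back to the current directory if the variable is unset.
    root = Path(env.get("FLOE_MANIFEST_DIR", "."))
    # Sorting keeps the load order stable across runs.
    return sorted(root.glob("*.manifest.json"))


# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sales.manifest.json", "hr.manifest.json", "notes.txt"):
        Path(tmp, name).write_text("{}")
    found = [path.name for path in find_manifest_files({"FLOE_MANIFEST_DIR": tmp})]
```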

Notes

  • This connector does not parse YAML directly; it consumes floe.manifest.v1.
  • Connector logic lives under src/floe_dagster/; local wiring for demo lives in example/definitions.py.
  • For local development without an installed floe binary, you can point LocalRunner to a custom command, e.g.:
    • LocalRunner("cargo run -p floe-cli --")
  • The connector currently supports only the local_process manifest runner.
  • For local setup commands, use orchestrators/LOCAL_DEV.md.
  • Design notes and future work: orchestrators/dagster-floe/INTEGRATION_SPEC.md
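
To make the custom-command note concrete, the sketch below shows one way a command string such as "cargo run -p floe-cli --" could be turned into an argument list and combined with execution.defaults.env. It does not reproduce LocalRunner's internals: `build_command` and `build_env` are hypothetical names, and letting manifest defaults override the process environment is an assumption.

```python
import os
import shlex


def build_command(base_command: str, floe_args: list[str]) -> list[str]:
    """Split the base command like a shell would, then append Floe arguments."""
    return shlex.split(base_command) + list(floe_args)


def build_env(defaults_env: dict[str, str], base_env=None) -> dict[str, str]:
    """Merge manifest env defaults over the process environment (assumed order)."""
    merged = dict(os.environ if base_env is None else base_env)
    merged.update(defaults_env)
    return merged


argv = build_command(
    "cargo run -p floe-cli --",
    ["manifest", "generate", "-c", "config.yml", "--output", "manifest.json"],
)
env = build_env({"RUST_LOG": "info"}, base_env={"PATH": "/usr/bin", "RUST_LOG": "warn"})
# The pair could then be passed to subprocess.run(argv, env=env, cwd=workdir).
```

Passing a list to subprocess instead of a shell string avoids quoting problems when paths contain spaces.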

What Is Not Implemented Yet

  • Kubernetes/ECS runner adapters.
  • Cloud summary loading (s3://, gs://, abfs://).
  • Single-process multi-entity fan-out execution mode.

Releasing

This repo is a monorepo. Floe and this connector are versioned and tagged independently:

  • Floe CLI release tags: vX.Y.Z
  • Dagster connector release tags: dagster-floe-vX.Y.Z (triggers the PyPI publish workflow)

Example:

git checkout main
git pull
git tag dagster-floe-v0.1.0
git push origin dagster-floe-v0.1.0
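
The two tag formats are easy to tell apart mechanically. The sketch below classifies a tag name by pattern; it only illustrates the naming convention and is not taken from the release workflow, whose actual trigger configuration is not shown here. `classify_tag` and the component labels are hypothetical.

```python
import re

# Patterns follow the tag formats listed above.
CONNECTOR_TAG = re.compile(r"^dagster-floe-v(\d+)\.(\d+)\.(\d+)$")
CLI_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def classify_tag(tag: str):
    """Return (component, (major, minor, patch)) or None for unrelated tags."""
    for component, pattern in (("dagster-floe", CONNECTOR_TAG), ("floe-cli", CLI_TAG)):
        match = pattern.match(tag)
        if match:
            return component, tuple(int(part) for part in match.groups())
    return None


connector = classify_tag("dagster-floe-v0.1.0")
cli = classify_tag("v1.2.3")
other = classify_tag("release-1")
```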
