Dagster connector for Floe CLI (config-driven ingestion)

Floe + Dagster

This folder contains the Dagster connector for Floe.

For local setup of both Dagster and Airflow with isolated virtual environments, see:

  • orchestrators/LOCAL_DEV.md

Current model:

  • At parse time, the connector reads Floe manifests (floe.manifest.v1), not Floe YAML.
  • Supports single manifest and multi-manifest directory loading.
  • Builds one asset per entity and one job per manifest.
  • Runs Floe according to the manifest's execution contract, with JSON logs.
  • Publishes run and entity metadata taken from the NDJSON log and the run summary.
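
The last point, turning an NDJSON log into per-entity metadata, can be illustrated with a minimal sketch. The field names used here ("event", "entity", "rows") are assumptions made for the example, not Floe's documented log schema, and the connector's own parsing may differ.

```python
import json
from collections import defaultdict

# Hypothetical NDJSON log lines; the field names ("event", "entity", "rows")
# are assumptions for illustration, not Floe's documented log schema.
SAMPLE_LOG = """\
{"event": "entity_started", "entity": "employees"}
{"event": "entity_finished", "entity": "employees", "rows": 120}
{"event": "entity_finished", "entity": "orders", "rows": 45}
"""


def collect_entity_metadata(ndjson_text: str) -> dict[str, dict]:
    """Fold NDJSON events into one metadata dict per entity."""
    metadata: dict[str, dict] = defaultdict(dict)
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        entity = record.get("entity")
        if entity is None:
            continue  # skip records that carry no entity name
        # Keep every field except the entity name; later events overwrite earlier ones.
        metadata[entity].update({k: v for k, v in record.items() if k != "entity"})
    return dict(metadata)


entity_metadata = collect_entity_metadata(SAMPLE_LOG)
print(entity_metadata)
```

Reading the log one line at a time keeps memory use flat for long runs, which is the usual reason to prefer NDJSON over a single JSON document.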

What Is Implemented

  • Manifest-first orchestration (floe.manifest.v1).
  • Strict schema validation at manifest load time.
  • Asset generation from entities[] (asset_key, group_name respected).
  • One Dagster job per manifest (stable job names).
  • Multi-manifest loading (*.manifest.json) with collision checks.
  • Local runner (local_process) support.
  • execution.defaults.env and execution.defaults.workdir support.
  • Floe quality outcomes exposed as Dagster native Asset Checks (cast_error, not_null, unique, schema_mismatch, file_status) from Floe reports.
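
As a rough illustration of multi-manifest loading with collision checks, the sketch below scans a directory for *.manifest.json files and rejects duplicate asset keys. It is not the connector's implementation: the "schema" field name, the string-valued asset_key, and the helpers load_manifests and write_manifest are all assumptions introduced for the example.

```python
import json
import tempfile
from pathlib import Path


def load_manifests(manifest_dir: Path) -> dict[str, str]:
    """Map each asset key to the manifest file that declares it.

    Raises ValueError if a file has an unexpected schema or if two
    manifests declare the same asset key.
    """
    owners: dict[str, str] = {}
    for path in sorted(manifest_dir.glob("*.manifest.json")):
        manifest = json.loads(path.read_text())
        # "schema" as the name of the version field is an assumption.
        if manifest.get("schema") != "floe.manifest.v1":
            raise ValueError(f"{path.name}: unsupported manifest schema")
        for entity in manifest.get("entities", []):
            key = entity["asset_key"]
            if key in owners:
                raise ValueError(
                    f"asset key {key!r} declared in both {owners[key]} and {path.name}"
                )
            owners[key] = path.name
    return owners


def write_manifest(directory: Path, name: str, asset_keys: list[str]) -> None:
    """Write a tiny illustrative manifest (not the full real schema)."""
    body = {
        "schema": "floe.manifest.v1",
        "entities": [{"asset_key": k, "group_name": name} for k in asset_keys],
    }
    (directory / f"{name}.manifest.json").write_text(json.dumps(body))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_manifest(root, "hr", ["employees"])
    write_manifest(root, "sales", ["orders", "customers"])
    owners = load_manifests(root)

print(owners)
```

Failing at load time, rather than when a job runs, surfaces a duplicate key before Dagster builds any definitions from the manifests.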

Install

Prereqs:

  • Floe installed (either the floe CLI binary or Docker with a Floe image).
  • Python 3.10+.

pip install dagster-floe

Development install (from this repo)

python3 -m venv orchestrators/dagster-floe/.venv
source orchestrators/dagster-floe/.venv/bin/activate
pip install -e "orchestrators/dagster-floe[dev]"

Generate a manifest

floe manifest generate \
  -c orchestrators/dagster-floe/example/config.yml \
  --output orchestrators/dagster-floe/example/manifest.dagster.json

Run the example (repo-only)

cd orchestrators/dagster-floe
FLOE_MANIFEST_DIR=./example/manifests dagster dev

The example workspace loads example/definitions.py, which wires the local example files and manifests to the reusable connector APIs. The repository example includes two manifests, one per domain:

  • example/manifests/hr.manifest.json
  • example/manifests/sales.manifest.json

Notes

  • This connector does not parse YAML directly; it consumes floe.manifest.v1.
  • Connector logic lives under src/floe_dagster/; local wiring for demo lives in example/definitions.py.
  • For local development without an installed floe binary, you can point LocalRunner to a custom command, e.g.:
    • LocalRunner("cargo run -p floe-cli --")
  • The connector currently supports only the local_process manifest runner.
  • For local setup commands, use orchestrators/LOCAL_DEV.md.
  • Design notes and future work: orchestrators/dagster-floe/INTEGRATION_SPEC.md
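
To make the custom-command note more concrete, here is a minimal sketch of how a command string and the execution.defaults values (env, workdir) could be combined into a subprocess invocation. It does not show how LocalRunner works internally: build_invocation, the EXAMPLE_FLAG variable, and the shape of the defaults dict are assumptions for illustration.

```python
import os
import shlex


def build_invocation(command: str, floe_args: list[str], defaults: dict):
    """Return (argv, env, cwd) suitable for passing to subprocess.run."""
    # Split the command string the way a POSIX shell would, then append arguments.
    argv = shlex.split(command) + list(floe_args)
    # Manifest-provided variables override the inherited environment.
    env = {**os.environ, **defaults.get("env", {})}
    # None means "inherit the current working directory".
    cwd = defaults.get("workdir")
    return argv, env, cwd


argv, env, cwd = build_invocation(
    "cargo run -p floe-cli --",
    ["manifest", "generate", "-c", "config.yml"],
    {"env": {"EXAMPLE_FLAG": "1"}, "workdir": "."},  # hypothetical defaults
)
print(argv)
```

Building an argument list instead of a single shell string avoids quoting problems when paths contain spaces, and the trailing `--` keeps cargo from interpreting the Floe arguments itself.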

What Is Not Implemented Yet

  • Kubernetes/ECS runner adapters.
  • Cloud summary loading (s3://, gs://, abfs://).
  • Single-process multi-entity fan-out execution mode.

Releasing

This repo is a monorepo. Floe and this connector are versioned and tagged independently:

  • Floe CLI release tags: vX.Y.Z
  • Dagster connector release tags: dagster-floe-vX.Y.Z (triggers the PyPI publish workflow)

Example:

git checkout main
git pull
git tag dagster-floe-v0.1.0
git push origin dagster-floe-v0.1.0
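
Because the two tag schemes share the vX.Y.Z suffix, it can help to check a tag name before pushing it. The following is a small sketch of such a check; classify_tag and the component labels are hypothetical and are not part of the release workflow.

```python
import re

# One pattern per tag scheme described above; fullmatch keeps them from overlapping.
TAG_PATTERNS = {
    "floe-cli": re.compile(r"v(\d+)\.(\d+)\.(\d+)"),
    "dagster-floe": re.compile(r"dagster-floe-v(\d+)\.(\d+)\.(\d+)"),
}


def classify_tag(tag: str):
    """Return (component, (major, minor, patch)), or None if no scheme matches."""
    for component, pattern in TAG_PATTERNS.items():
        match = pattern.fullmatch(tag)
        if match:
            return component, tuple(int(part) for part in match.groups())
    return None


print(classify_tag("dagster-floe-v0.1.0"))
print(classify_tag("v1.2.3"))
print(classify_tag("release-2024"))
```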
