Project combining flowfile_core (backend), flowfile_worker (compute offloader), and flowfile_frame (API)
Flowfile
Main Repository: Edwardvaneechoud/Flowfile
Documentation: Website - Core - Worker - Frontend - Technical Architecture
Flowfile is an open-source data platform that combines a visual pipeline builder, a data catalog with Delta Lake storage, scheduling, Kafka ingestion, sandboxed Python execution, and a Polars-compatible Python API — all in a single pip install.
Quick Start
pip install Flowfile
flowfile run ui
This starts the backend services and opens the visual ETL interface in your browser.
What You Get
- Visual pipeline builder with 30+ nodes for joins, filters, aggregations, fuzzy matching, pivots, and more
- Data catalog with Delta Lake storage, version history, and lineage tracking
- Scheduling: interval-based or triggered by catalog table updates
- Kafka/Redpanda ingestion as a canvas node with automatic schema inference
- Sandboxed Python execution in isolated Docker containers
- Code generation: export visual flows as standalone Python/Polars scripts
- Flow parameters: ${variable} substitution, configurable via UI or CLI
- Cloud storage: S3, Azure Data Lake Storage, Google Cloud Storage
- Database connectivity: PostgreSQL, MySQL, SQL Server, Oracle, DuckDB, and more
- Python API with Polars-like syntax and visual flow graph generation
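The ${variable} placeholders used by flow parameters behave like ordinary template substitution. As a rough illustration only (this is not Flowfile's implementation, and the setting string and parameter names below are hypothetical), Python's standard `string.Template` resolves the same `${name}` syntax:

```python
from string import Template


def substitute_params(text: str, params: dict) -> str:
    """Replace ${name} placeholders in text with values from params.

    Illustrative only: Flowfile's own substitution logic may differ.
    Raises KeyError if a placeholder has no matching parameter.
    """
    return Template(text).substitute(params)


# Hypothetical node setting that references two flow parameters.
setting = "s3://${bucket}/exports/${run_date}.parquet"
resolved = substitute_params(setting, {"bucket": "analytics", "run_date": "2024-01-31"})
print(resolved)  # s3://analytics/exports/2024-01-31.parquet
```

Failing loudly on a missing parameter, as `substitute` does here, is usually safer for pipelines than silently leaving a placeholder in a path or query.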
Python API
import flowfile as ff
from flowfile import col, open_graph_in_editor
# Build a small in-memory frame
df = ff.from_dict({
    "id": [1, 2, 3, 4, 5],
    "category": ["A", "B", "A", "C", "B"],
    "value": [100, 200, 150, 300, 250],
})
# Keep rows with value above 150 and add a doubled column
result = df.filter(col("value") > 150).with_columns([
    (col("value") * 2).alias("double_value")
])
# Open the pipeline on the visual canvas
open_graph_in_editor(result.flow_graph)
Common Operations
import flowfile as ff
from flowfile import col, when, lit
# Read from various sources
df = ff.read_csv("data.csv")
df_pq = ff.read_parquet("data.parquet")
# Transform
filtered = df.filter(col("value") > 150)
with_status = df.with_columns([
    when(col("value") > 200).then(lit("High")).otherwise(lit("Low")).alias("status")
])
# Aggregate
by_category = df.group_by("category").agg([
    col("value").sum().alias("total"),
    col("value").mean().alias("average"),
])
# Join (other_df stands for a second frame that has a product_id column)
joined = df.join(other_df, left_on="id", right_on="product_id")
# Visualize any pipeline
ff.open_graph_in_editor(joined.flow_graph)
Code Generation
Visual flows can be exported as standalone Python/Polars scripts.
Package Components
- Core Service (flowfile_core): ETL engine, catalog, scheduler, auth
- Worker Service (flowfile_worker): CPU-intensive data processing
- Web UI: Browser-based visual pipeline builder
- FlowFrame API (flowfile_frame): Polars-compatible Python library
- Scheduler (flowfile_scheduler): Interval and table-trigger scheduling
CLI
flowfile run ui # Start web UI
flowfile run core --host 0.0.0.0 # Start core service
flowfile run worker --host 0.0.0.0 # Start worker service
flowfile run flow pipeline.json # Run a flow
flowfile run flow pipeline.json --param key=value # Run with parameters
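When flows are launched from another Python program, the command line above can be assembled from a dictionary. This is a minimal sketch: the helper name `build_run_command` is hypothetical, and it assumes that several parameters are passed by repeating the `--param` flag, which the examples above do not confirm.

```python
import shlex
from typing import Optional


def build_run_command(flow_path: str, params: Optional[dict] = None) -> list:
    """Build the argv list for `flowfile run flow <path> [--param key=value ...]`."""
    argv = ["flowfile", "run", "flow", flow_path]
    for key, value in (params or {}).items():
        # Assumption: each parameter gets its own --param flag.
        argv += ["--param", f"{key}={value}"]
    return argv


cmd = build_run_command("pipeline.json", {"key": "value"})
print(shlex.join(cmd))  # flowfile run flow pipeline.json --param key=value
# To execute it: subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single shell string avoids quoting problems when parameter values contain spaces.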
More Options
- Desktop App: Download from GitHub Releases
- Docker: docker compose up -d for self-hosted deployments
- Browser Demo: demo.flowfile.org (WASM, no server)
Resources
- Documentation: Comprehensive guides
- Main Repository: Latest code and examples
- Technical Architecture: Design overview
File details
Details for the file flowfile-0.8.2.tar.gz.
File metadata
- Download URL: flowfile-0.8.2.tar.gz
- Upload date:
- Size: 5.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 93a677d1ed3c6a72277c26bd055eb628a6676c38830df128c6da5640e03555d6 |
| MD5 | f056c088f614ca198faac408288727cc |
| BLAKE2b-256 | 244c134c5adf2799fd6cd8ff4e7725f7871d83f05810b4179b4bcacbd0f57e3c |
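A published digest is only useful if it is checked after download. The following sketch recomputes the SHA256 with Python's standard `hashlib` and compares it with the value listed above for flowfile-0.8.2.tar.gz. The helper names are hypothetical, and the archive is assumed to have been saved to the current directory under its original name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 listed for flowfile-0.8.2.tar.gz
EXPECTED = "93a677d1ed3c6a72277c26bd055eb628a6676c38830df128c6da5640e03555d6"
archive = Path("flowfile-0.8.2.tar.gz")  # assumed local download location
if archive.exists():
    print("OK" if matches_published(archive, EXPECTED) else "MISMATCH")
```

pip can perform the same check automatically when a requirements file pins hashes and is installed with `--require-hashes`.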
Provenance
The following attestation bundles were made for flowfile-0.8.2.tar.gz:
Publisher: pypi-release.yml on Edwardvaneechoud/Flowfile
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: flowfile-0.8.2.tar.gz
  - Subject digest: 93a677d1ed3c6a72277c26bd055eb628a6676c38830df128c6da5640e03555d6
  - Sigstore transparency entry: 1279911831
  - Sigstore integration time:
- Permalink: Edwardvaneechoud/Flowfile@ccb4a55efe4b0b0e34f03c346ffdc1d764327fcb
- Branch / Tag: refs/tags/v0.8.2
- Owner: https://github.com/Edwardvaneechoud
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-release.yml@ccb4a55efe4b0b0e34f03c346ffdc1d764327fcb
- Trigger Event: push
File details
Details for the file flowfile-0.8.2-py3-none-any.whl.
File metadata
- Download URL: flowfile-0.8.2-py3-none-any.whl
- Upload date:
- Size: 5.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 815f5614cc203c7c6c32ff8696831301be55f6858febc7b9e4db0293d137d849 |
| MD5 | be2ce646545b8b786e7f3e7884447276 |
| BLAKE2b-256 | e52faedda51ed7e2a20d9b9073ce434b37772a608e414fb7bd97654a6f77093e |
Provenance
The following attestation bundles were made for flowfile-0.8.2-py3-none-any.whl:
Publisher: pypi-release.yml on Edwardvaneechoud/Flowfile
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: flowfile-0.8.2-py3-none-any.whl
  - Subject digest: 815f5614cc203c7c6c32ff8696831301be55f6858febc7b9e4db0293d137d849
  - Sigstore transparency entry: 1279911890
  - Sigstore integration time:
- Permalink: Edwardvaneechoud/Flowfile@ccb4a55efe4b0b0e34f03c346ffdc1d764327fcb
- Branch / Tag: refs/tags/v0.8.2
- Owner: https://github.com/Edwardvaneechoud
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-release.yml@ccb4a55efe4b0b0e34f03c346ffdc1d764327fcb
- Trigger Event: push