Visual data pipeline debugger for Polars — stop print-debugging your pipelines
flowview
Visual data pipeline debugger for Polars. Stop print-debugging your pipelines.
Install
pip install flowview
or with uv:
uv add flowview
Quick Start
Add @fv.trace to any function that transforms a Polars DataFrame. flowview traces every method call and renders a visual flow in your terminal.
import polars as pl
import flowview as fv

@fv.trace
def process(df: pl.DataFrame) -> pl.DataFrame:
    return (
        df.filter(pl.col("status") == "active")
        .with_columns((pl.col("price") * pl.col("quantity")).alias("revenue"))
        .group_by("category")
        .agg(pl.col("revenue").sum().alias("total_revenue"))
        .sort("total_revenue", descending=True)
    )

df = pl.DataFrame({
    "status": ["active", "inactive", "active"],
    "category": ["Books", "Books", "Electronics"],
    "price": [14.99, 599.99, 299.99],
    "quantity": [5, 1, 2],
})
result = process(df)
.pipe() chains work too:
@fv.trace
def process(df: pl.DataFrame) -> pl.DataFrame:
    return df.pipe(clean).pipe(filter_active).pipe(add_revenue)
What You See
Each step in your pipeline is displayed as a box showing:
- Row count with diff from the previous step (e.g., 700 rows x 4 cols (-300 rows))
- Schema changes: columns added or removed (e.g., +cols: revenue -cols: status)
- Sample data: first N rows at each transformation
- Execution time per step
Steps are connected with arrows to show the flow. A summary footer shows the total step count and wall-clock time.
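The row and schema diffs described above can be derived from two consecutive snapshots. The following is a minimal sketch of that idea, not flowview's actual code: `Snapshot` and `describe_step` are hypothetical names, and the output format only imitates the examples in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical per-step record: row count and column names."""
    rows: int
    columns: tuple[str, ...]


def describe_step(prev: Snapshot, curr: Snapshot) -> str:
    """Summarize how `curr` differs from `prev` on one line."""
    parts = [f"{curr.rows} rows x {len(curr.columns)} cols"]
    delta = curr.rows - prev.rows
    if delta:
        parts.append(f"({delta:+d} rows)")  # explicit sign, e.g. -300 or +12
    added = [c for c in curr.columns if c not in prev.columns]
    removed = [c for c in prev.columns if c not in curr.columns]
    if added:
        parts.append("+cols: " + ", ".join(added))
    if removed:
        parts.append("-cols: " + ", ".join(removed))
    return " ".join(parts)


before = Snapshot(1000, ("status", "category", "price", "quantity"))
after = Snapshot(700, ("category", "price", "quantity", "revenue"))
print(describe_step(before, after))
# 700 rows x 4 cols (-300 rows) +cols: revenue -cols: status
```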
Supported Operations
flowview traces any DataFrame method that returns a new DataFrame. These methods get human-readable step names:
| Method | Step name example |
|---|---|
| filter(expr) | filter((col("status")) == ("active")) |
| with_columns(exprs) | with_columns(revenue, tax) |
| select(cols) | select(status, price) |
| drop(cols) | drop(status, category) |
| rename(mapping) | rename(price->unit_price) |
| sort(cols) | sort(price, quantity) |
| head(n) / tail(n) | head(10) / tail(5) |
| unique(subset) | unique(id) |
| join(other, ...) | join(on=id, how=left) |
| group_by(cols).agg(exprs) | group_by(category).agg(total_revenue) |
| pipe(fn) | uses the function name, e.g. clean_data |
Other methods (e.g., explode, melt, unpivot) are traced with a fallback name like explode('tags').
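One plausible way to produce names like these is a small registry of per-method formatters with a generic fallback built from the arguments' repr. This is a sketch under that assumption; `NAMERS`, `step_name`, and `fallback_step_name` are hypothetical and are not part of flowview's API.

```python
def fallback_step_name(method: str, *args: object, **kwargs: object) -> str:
    """Generic name: the method followed by the repr of each argument."""
    rendered = [repr(a) for a in args]
    rendered += [f"{k}={v!r}" for k, v in kwargs.items()]
    return f"{method}({', '.join(rendered)})"


# Hypothetical formatters for methods that get a friendlier name.
NAMERS = {
    "head": lambda n=5: f"head({n})",
    "rename": lambda mapping: "rename("
    + ", ".join(f"{old}->{new}" for old, new in mapping.items())
    + ")",
}


def step_name(method: str, *args: object, **kwargs: object) -> str:
    """Use a dedicated formatter when one exists, else the fallback."""
    namer = NAMERS.get(method)
    if namer is not None:
        return namer(*args, **kwargs)
    return fallback_step_name(method, *args, **kwargs)


print(step_name("rename", {"price": "unit_price"}))  # rename(price->unit_price)
print(step_name("head", 10))                         # head(10)
print(step_name("explode", "tags"))                  # explode('tags')
```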
Options
@fv.trace(sample_rows=3, show_sample=True, show_schema=True)
def process(df):
    ...
| Option | Type | Default | Description |
|---|---|---|---|
| sample_rows | int | 5 | Number of sample rows to capture at each step |
| show_sample | bool | True | Display sample data tables in the output |
| show_schema | bool | False | Display the full schema at each step |
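The decorator is shown both bare (@fv.trace) and with options (@fv.trace(...)). A common Python pattern for supporting both forms is sketched below. The `trace` function here is a hypothetical stand-in that only records its options, using the defaults from the table above; it is not flowview's implementation.

```python
import functools
from typing import Any, Callable, Optional


def trace(func: Optional[Callable[..., Any]] = None, *, sample_rows: int = 5,
          show_sample: bool = True, show_schema: bool = False):
    """Hypothetical decorator usable as @trace or @trace(option=value)."""
    options = {
        "sample_rows": sample_rows,
        "show_sample": show_sample,
        "show_schema": show_schema,
    }

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # A real tracer would wrap the DataFrame argument here.
            return fn(*args, **kwargs)

        wrapper.options = options  # exposed only so this sketch can be inspected
        return wrapper

    # Bare @trace passes the function directly; @trace(...) passes only keywords.
    return decorate(func) if func is not None else decorate


@trace
def with_defaults(df):
    return df


@trace(sample_rows=3, show_schema=True)
def customized(df):
    return df


print(with_defaults.options)
print(customized.options)
```

Making the options keyword-only keeps the two call forms unambiguous: a positional argument can only ever be the decorated function.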
How It Works
The @fv.trace decorator wraps the first DataFrame argument in a lightweight proxy before calling your function. The proxy intercepts every method call, captures a snapshot of the result (row count, schema, sample rows, timing), and delegates to the real Polars DataFrame underneath. When your function returns, the proxy is unwrapped and you get back a regular pl.DataFrame.
There is no monkey-patching and no global state. Each decorated call is fully isolated.
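The proxy approach can be illustrated without Polars. The sketch below uses a toy `Table` class in place of a DataFrame; `Table`, `TracingProxy`, and this `trace` are hypothetical names chosen for illustration, and the real library may differ in every detail.

```python
import functools
import time


class Table:
    """Toy stand-in for an eager DataFrame (not Polars)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, predicate):
        return Table(r for r in self.rows if predicate(r))

    def head(self, n=5):
        return Table(self.rows[:n])


class TracingProxy:
    """Forwards method calls to a Table and records one step per call."""

    def __init__(self, target, steps):
        self._target = target
        self._steps = steps

    def __getattr__(self, name):
        # Only reached for names not set in __init__, i.e. the Table's API.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def traced(*args, **kwargs):
            start = time.perf_counter()
            result = attr(*args, **kwargs)
            elapsed = time.perf_counter() - start
            if isinstance(result, Table):
                self._steps.append((name, len(result.rows), elapsed))
                return TracingProxy(result, self._steps)  # keep tracing the chain
            return result

        return traced


def trace(fn):
    @functools.wraps(fn)
    def wrapper(table, *args, **kwargs):
        steps = []  # created per call, so nothing is shared between calls
        result = fn(TracingProxy(table, steps), *args, **kwargs)
        if isinstance(result, TracingProxy):
            result = result._target  # unwrap before handing back to the caller
        for name, rows, elapsed in steps:
            print(f"{name}: {rows} rows ({elapsed * 1000:.3f} ms)")
        return result

    return wrapper


@trace
def process(t):
    return t.filter(lambda r: r["status"] == "active").head(1)


out = process(Table([{"status": "active"}, {"status": "inactive"}, {"status": "active"}]))
print(type(out).__name__, len(out.rows))
```

Because the step list lives inside `wrapper`, each decorated call starts with a fresh trace, which is one way to get the isolation described above without touching the wrapped class.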
Limitations
- LazyFrame is not supported: df.lazy() exits the proxy. Only eager DataFrames are traced.
- GroupBy shortcuts like .count(), .sum(), .first() on a GroupBy object are not traced; use .agg() instead.
- Pipe internals are not individually traced: df.pipe(fn) produces a single step named after fn, not one step per operation inside fn.
- IDE autocomplete may not show DataFrame methods inside the decorated function body.
- type(df) returns TracedDataFrame inside the decorated function. isinstance(df, pl.DataFrame) works correctly.
- Only the first DataFrame argument is wrapped when a function takes multiple DataFrames.
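The type() versus isinstance() difference noted above is typical of proxy objects in Python. One way it can arise, shown here as an assumption rather than as flowview's actual mechanism, is a proxy that overrides the `__class__` attribute. `Frame` and `TracedFrame` are toy names.

```python
class Frame:
    """Toy stand-in for pl.DataFrame."""


class TracedFrame:
    """Hypothetical proxy that reports Frame as its class."""

    def __init__(self, target):
        self._target = target

    @property
    def __class__(self):
        # isinstance() consults __class__ when the real type does not match.
        return Frame


proxy = TracedFrame(Frame())
print(type(proxy).__name__)      # TracedFrame
print(isinstance(proxy, Frame))  # True
```

Code that branches on `type(df) is pl.DataFrame` would therefore behave differently inside a traced function, while `isinstance` checks keep working.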
License
File details
Details for the file flowview-0.2.0.tar.gz.
File metadata
- Download URL: flowview-0.2.0.tar.gz
- Upload date:
- Size: 18.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bbafe5c131640e9e3fb11c087c2a577fad4d98867c3287aecf0fa6746beef7cd |
| MD5 | 9093e7364ccdd5b953966c00f70aa1fc |
| BLAKE2b-256 | 404a1a363d6bd0ad208871f316b9c73d62d1a5b261be70469f342e4ceff64a3f |
Provenance
The following attestation bundles were made for flowview-0.2.0.tar.gz:
Publisher: cd.yml on guillermodotn/flowview
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: flowview-0.2.0.tar.gz
- Subject digest: bbafe5c131640e9e3fb11c087c2a577fad4d98867c3287aecf0fa6746beef7cd
- Sigstore transparency entry: 1199769169
- Sigstore integration time:
- Permalink: guillermodotn/flowview@24284a3dce4333786a1b1273ba90a1e5232097b3
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/guillermodotn
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: cd.yml@24284a3dce4333786a1b1273ba90a1e5232097b3
- Trigger Event: release
File details
Details for the file flowview-0.2.0-py3-none-any.whl.
File metadata
- Download URL: flowview-0.2.0-py3-none-any.whl
- Upload date:
- Size: 13.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7f20e640f9c229f4fcba290f230ace8bfd91a2dbd56b7fda2b18544e338dd1c8 |
| MD5 | e9234ef5dd2808f64a1efdee2f3f6ee2 |
| BLAKE2b-256 | 475cf25db383b9daeb959fd7fa7de9a4ba0fc12b51b08ca48dfae2ab3c8ab6a1 |
Provenance
The following attestation bundles were made for flowview-0.2.0-py3-none-any.whl:
Publisher: cd.yml on guillermodotn/flowview
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: flowview-0.2.0-py3-none-any.whl
- Subject digest: 7f20e640f9c229f4fcba290f230ace8bfd91a2dbd56b7fda2b18544e338dd1c8
- Sigstore transparency entry: 1199769179
- Sigstore integration time:
- Permalink: guillermodotn/flowview@24284a3dce4333786a1b1273ba90a1e5232097b3
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/guillermodotn
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: cd.yml@24284a3dce4333786a1b1273ba90a1e5232097b3
- Trigger Event: release