Interactive visualization of Spark jobs
spark-board: interactive PySpark dataframes visualization
spark-board provides an interactive way to analyze PySpark data frame execution plans: it renders the DAG of transformations as a static website.
Check out the examples for a quick overview of the features, along with the corresponding example source code.
If you intend to develop spark-board or run it from source, check out the documentation.
Usage
spark-board takes a PySpark data frame and inspects its operations to build the DAG. This is usually the final step of a PySpark script, right before the data frame is written to disk.
Install spark-board
pip install spark-board
Run spark-board
from spark_board.html import dump_dataframe, DefaultSettings

# get the PySpark data frame that will be displayed
df = ...

dump_dataframe(
    df=df,
    output_dir="./spark_board_output",
    overwrite=True,  # overwrite output_dir if it already exists
    default_settings=DefaultSettings(),  # override default settings if desired
)
And that's it! spark-board will generate a static website in the folder given by output_dir. You can now serve the website with any web server and inspect the operations.
You can check out the available default settings here.
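For a quick local preview of the generated site, any static file server will do. The following is a minimal sketch that uses only the Python standard library; the helper name make_server is hypothetical and not part of spark-board, and the directory matches the output_dir used in the example above.

```python
import functools
import http.server


def make_server(directory: str, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Build an HTTP server that serves the static files found in `directory`.

    `make_server` is a hypothetical helper, not part of spark-board.
    Passing port=0 lets the operating system pick a free port.
    """
    # Bind the request handler to the folder that holds the generated site.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Listen on localhost only; this is meant for local inspection.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_server("./spark_board_output").serve_forever() blocks and serves the site at http://127.0.0.1:8000 until interrupted. The one-liner python -m http.server --directory ./spark_board_output does roughly the same from a shell.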
Serving
spark-board is intended to serve as live documentation of PySpark scripts, so it is advisable to run it every time the source code is updated. For example, spark-board can run as part of a CI pipeline, with the generated website uploaded to a static website hosting service such as GitHub Pages or GitLab Pages (this is how the examples in this repository are updated and served).
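In a CI pipeline it can help to fail fast when the site was not generated, before the upload step runs. The sketch below is one possible sanity check; the function name check_site is hypothetical, and the assumption that the generated site has an index.html entry page is ours, not something spark-board documents here, so adjust entry_point to whatever the output folder actually contains.

```python
from pathlib import Path


def check_site(output_dir: str, entry_point: str = "index.html") -> Path:
    """Return the entry page path, or raise if the site looks incomplete.

    Hypothetical helper, not part of spark-board. The default entry point
    "index.html" is an assumption about the generated output.
    """
    root = Path(output_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"output directory not found: {root}")
    page = root / entry_point
    # An empty or missing entry page means there is nothing worth publishing.
    if not page.is_file() or page.stat().st_size == 0:
        raise FileNotFoundError(f"missing or empty entry page: {page}")
    return page
```

A CI job could call check_site("./spark_board_output") right after dump_dataframe; an uncaught exception makes the job exit with a non-zero status, so the upload step is skipped.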
Hashes for spark_board-1.0.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d60f15bb4cb6c6ae413c94496799aee67e5ca2ff5083a62a8e8599107c98e3bb
MD5 | 193759891bbedfe05651da1b2d8fdeff
BLAKE2b-256 | 85347cf1bc9d5240a72f76570334f5583e5e515cc366229842332b15cd20e71f