spark-viewer-tui
A terminal UI for browsing and querying Delta Lake and Parquet tables with Apache Spark.
Built with Textual and PySpark.
GitHub: https://github.com/eritondev-stack/spark-viewer-tui
Features
- Catalog Browser - Sidebar tree with databases and tables
- SQL Editor - Write and execute Spark SQL queries with syntax highlighting
- Results Table - View query results with column types and row count
- print_df - Send DataFrames from any script to the TUI in real time (see below)
- Scan Paths - Auto-register Delta/Parquet folders as Spark tables
- Rescan - Refresh tables on demand (folders are live, Ctrl+R rescans)
- Save/Load Queries - Persist frequently used queries
- Themes - Multiple color themes (Transparent, Dracula, Gruvbox)
- Maximize - Focus on editor or results in full screen
Requirements
- Python 3.9+
- Java 17 (for PySpark), which must be available via JAVA_HOME or as java in your PATH
Java Setup
macOS (Homebrew):
brew install openjdk@17
export JAVA_HOME=$(/usr/libexec/java_home -v 17)
Linux (Debian/Ubuntu):
sudo apt install openjdk-17-jdk
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
Add the export JAVA_HOME=... line to your ~/.bashrc or ~/.zshrc to make it persistent.
Verify:
java -version
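The same check can be scripted before launching the TUI. The following is a minimal sketch, not part of spark-viewer-tui: `find_java` is a hypothetical helper that looks in `JAVA_HOME` first and then on `PATH`, assuming a POSIX layout where the executable lives at `$JAVA_HOME/bin/java`.

```python
import os
import shutil
from typing import Optional


def find_java() -> Optional[str]:
    """Return a path to a `java` executable, or None if none is found.

    Hypothetical helper: checks $JAVA_HOME/bin/java first, then PATH.
    """
    java_home = os.environ.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        # Only accept the candidate if it exists and is executable.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # Fall back to a regular PATH lookup.
    return shutil.which("java")


if __name__ == "__main__":
    path = find_java()
    if path:
        print(f"java found at: {path}")
    else:
        print("java not found; set JAVA_HOME or add java to PATH")
```

This only locates the executable; it does not verify that the version is 17, which `java -version` still has to confirm.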
Installation
pip install spark-viewer-tui
Or with uv:
uv pip install spark-viewer-tui
Usage
spark-viewer
Or run directly from source:
uv run spark-viewer
The Spark session starts automatically on launch. No configuration required to get started — F2 is only needed if you want to connect to an existing metastore or scan paths for Delta/Parquet files.
Keyboard Shortcuts
| Key | Action |
|---|---|
| F2 | Spark Configuration (metastore, warehouse, scan paths) |
| F3 | Save current query |
| F4 | Load saved query |
| Ctrl+R | Rescan configured paths and refresh catalog |
| Ctrl+E | Execute SQL query |
| Ctrl+T | Change theme |
| Ctrl+W | Maximize editor or results |
| Ctrl+C | Exit |
Getting Started
- Run spark-viewer (Spark starts automatically)
- Click a table in the sidebar or write SQL in the editor
- Press Ctrl+E to run the query
To load your own Delta/Parquet files, press F2 and add scan paths.
print_df — Live DataFrame Viewer
Send any Spark or Pandas DataFrame from your script to the running TUI. The DataFrame appears instantly in the sidebar under a database called live and can be queried with SQL.
How it works
your_script.py ──print_df()──► TCP :7891 ──► live.<table> in the TUI
The TUI runs a lightweight TCP server on localhost:7891. print_df connects, sends the DataFrame as JSON, and the TUI registers it as an in-memory Spark table (global_temp.<table>), displayed as live.<table> in the sidebar.
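Since the TUI listens on localhost:7891, a script can probe that port before trying to send anything. The sketch below is an illustration only: `tui_is_listening` is a hypothetical helper, not part of the package, and it merely opens and closes a TCP connection. How the TUI's server reacts to a connection that sends no data is not documented here, so treat this as a diagnostic rather than something to call in a tight loop.

```python
import socket


def tui_is_listening(host: str = "127.0.0.1", port: int = 7891,
                     timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper; the default port is the one stated above.
    """
    try:
        # create_connection raises OSError (e.g. connection refused,
        # timeout) when nothing is listening.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if tui_is_listening():
        print("A server is accepting connections on localhost:7891")
    else:
        print("Nothing is listening on localhost:7891; start spark-viewer first")
```

A successful connection only shows that some process holds the port; it does not prove that the process is spark-viewer.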
Usage
from spark_viewer_tui import print_df
# Works with PySpark DataFrames
print_df(spark_df, "my_table")
# Works with Pandas DataFrames
print_df(pandas_df, "my_table")
The table appears in the sidebar under live. Click it to auto-generate and run SELECT * FROM global_temp.my_table LIMIT 1000, or write your own SQL query.
PySpark example
from pyspark.sql import SparkSession
from pyspark.sql.functions import rand, round as spark_round, when, col
from spark_viewer_tui import print_df
spark = SparkSession.builder.master("local[*]").getOrCreate()
df = spark.range(1, 101).select(
    col("id"),
    when(col("id") % 2 == 0, "Par").otherwise("Ímpar").alias("tipo"),
    spark_round(rand() * 1000, 2).alias("valor"),
)
print_df(df, "minha_tabela")
Pandas example
import pandas as pd
from spark_viewer_tui import print_df
df = pd.DataFrame({
    "produto": ["A", "B", "C"],
    "receita": [1200.50, 850.00, 3400.75],
    "ativo": [True, False, True],
})
print_df(df, "produtos")
Notes
- The TUI must be running before calling print_df
- DataFrames are truncated to 10,000 rows with a warning if larger
- Calling print_df with the same table name replaces the previous data
- Tables in live are in-memory only; they are lost when the TUI closes
- Maximum payload size: 256 MB
Built-in example
Run the included example to see three DataFrames sent to the TUI at once:
# Terminal 1
spark-viewer
# Terminal 2 (after "Spark iniciado!", i.e. "Spark started!", appears in the TUI)
spark-viewer-example
This sends three tables to the live database: vendas, metricas_servidor, and resumo_categorias.
Seed (Example Data)
The package includes a seed command that creates 6 Delta tables with 500 rows each (employees, products, orders, customers, logs, metrics). Useful for testing and exploring the tool.
# Uses paths from spark_config.json
spark-viewer-seed
# Or specify paths manually
spark-viewer-seed --metastore-db ./metastore_db --warehouse-dir ./spark-warehouse
After seeding, run spark-viewer and press Ctrl+R to load the tables.
Scan Paths
Scan paths auto-register Delta and Parquet tables from a directory. Each scan path has a database name and a folder path.
db_name: vendas
path: /data/warehouse
Subfolders are registered as tables:
- Subfolder with _delta_log/ -> Delta table
- Subfolder with .parquet files -> Parquet table
Every Ctrl+R (Refresh Catalog) drops and recreates the databases from scan paths, keeping tables in sync with the filesystem.
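The subfolder rules above can be sketched in a few lines. This is an illustration of the stated rules, not the code in spark_manager.py: `classify_tables` is a hypothetical function, and it only looks for .parquet files directly inside each subfolder, which may differ from how the tool treats partitioned layouts.

```python
from pathlib import Path
from typing import Dict


def classify_tables(scan_path: str) -> Dict[str, str]:
    """Map each immediate subfolder name to "delta" or "parquet".

    Hypothetical helper; subfolders matching neither rule are skipped.
    """
    tables: Dict[str, str] = {}
    for sub in sorted(Path(scan_path).iterdir()):
        if not sub.is_dir():
            continue
        # Check for Delta first: Delta tables also contain .parquet files.
        if (sub / "_delta_log").is_dir():
            tables[sub.name] = "delta"
        elif any(sub.glob("*.parquet")):
            tables[sub.name] = "parquet"
    return tables


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, kind in classify_tables(root).items():
        print(f"{name}: {kind}")
```

The order of the two checks matters: a Delta table stores its data as Parquet files, so testing for .parquet first would misclassify it.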
Configuration
Settings are saved in spark_config.json in the project directory:
{
  "metastore_db": "/tmp/metastore_db",
  "warehouse_dir": "/tmp/spark-warehouse",
  "scan_paths": [
    { "path": "/data/warehouse", "db_name": "vendas" },
    { "path": "/data/lake", "db_name": "analytics" }
  ]
}
All fields are optional. If metastore_db and warehouse_dir are not set, the TUI uses temporary directories under /tmp/spark-viewer-tui/ automatically.
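For scripts that want to inspect the same file, reading it with every field treated as optional might look like the sketch below. `load_config` is a hypothetical helper, not the package's config.py. It returns None for unset directories rather than guessing the temporary paths the TUI would choose.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_config(path: str = "spark_config.json") -> Dict[str, Any]:
    """Read the config file; a missing file or missing fields are fine.

    Hypothetical helper mirroring the three documented fields.
    """
    file = Path(path)
    raw = json.loads(file.read_text(encoding="utf-8")) if file.exists() else {}
    return {
        # None means "not set"; the TUI then picks its own temp directory.
        "metastore_db": raw.get("metastore_db"),
        "warehouse_dir": raw.get("warehouse_dir"),
        # Keep only the two documented keys of each scan path entry.
        "scan_paths": [
            {"path": sp["path"], "db_name": sp["db_name"]}
            for sp in raw.get("scan_paths", [])
        ],
    }


if __name__ == "__main__":
    cfg = load_config()
    for sp in cfg["scan_paths"]:
        print(f'{sp["db_name"]} <- {sp["path"]}')
```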
Themes are stored in ~/.config/spark-viewer-tui/themes.json. The file is created automatically on first run with the default themes. Edit it to customize colors or add new themes.
Project Structure
spark-viewer-tui/
├── src/
│   └── spark_viewer_tui/
│       ├── app.py                  # Main application
│       ├── client.py               # print_df() client API
│       ├── ipc_server.py           # TCP server for receiving DataFrames
│       ├── examples/
│       │   └── spark_example.py    # spark-viewer-example entry point
│       ├── seed.py                 # Seed example Delta tables
│       ├── config.py               # Configuration management
│       ├── spark_manager.py        # Spark session and table registration
│       ├── queries.py              # Query persistence
│       ├── themes.py               # Theme system
│       └── screens/
│           ├── spark_config.py     # Spark config modal (F2)
│           ├── save_query.py       # Save query modal (F3)
│           ├── load_query.py       # Load query modal (F4)
│           └── theme_selector.py   # Theme selector modal (Ctrl+T)
└── pyproject.toml
License
MIT