spark-history-cli

A CLI for querying the Apache Spark History Server REST API.

Prerequisites

  • Python 3.10+
  • A running Spark History Server (default: http://localhost:18080)

Start the History Server:

$SPARK_HOME/sbin/start-history-server.sh
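
Before pointing the CLI at a server, it can help to confirm that the History Server is actually answering. The sketch below is not part of spark-history-cli; it queries the Spark REST API's `/api/v1/version` endpoint directly with the standard library. The helper names `version_url` and `spark_version` are hypothetical, and the code assumes the response is a JSON object with a `spark` field.

```python
import json
import urllib.error
import urllib.request


def version_url(server: str) -> str:
    """Return the REST URL that reports the server's Spark version."""
    return server.rstrip("/") + "/api/v1/version"


def spark_version(server: str = "http://localhost:18080", timeout: float = 5.0):
    """Return the Spark version string, or None if the server is unreachable.

    Assumes the endpoint answers with a JSON object such as {"spark": "..."}.
    """
    try:
        with urllib.request.urlopen(version_url(server), timeout=timeout) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return None
    return data.get("spark") if isinstance(data, dict) else None
```

Calling `spark_version()` with no arguments checks the default address listed above; a `None` result suggests the server is not running or is listening elsewhere.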

Installation

From a source checkout:

cd spark-history-cli
pip install -e .

Or install from PyPI:

pip install spark-history-cli

Install as agent skills

Install skills for any supported agent (Claude Code, Copilot, Cursor, Codex, and 39 more):

npx skills add yaooqinn/spark-history-cli

This installs two skills:

  • spark-history-cli — Query the Spark History Server
  • spark-advisor — Diagnose, compare, and optimize Spark applications

Or install via the bundled CLI command (Copilot CLI / Claude Code only):

spark-history-cli install-skill

Usage

REPL Mode (default)

spark-history-cli
# or specify a server:
spark-history-cli --server http://my-shs:18080

One-Shot Commands

# List applications
spark-history-cli apps
spark-history-cli apps --status completed --limit 10

# Application details
spark-history-cli app <app-id>

# Jobs, stages, executors (requires --app-id or 'use' in REPL)
spark-history-cli --app-id <id> jobs
spark-history-cli --app-id <id> stages
spark-history-cli --app-id <id> executors --all
spark-history-cli --app-id <id> sql
spark-history-cli --app-id <id> env
spark-history-cli --app-id <id> summary

# SQL execution plans
spark-history-cli --app-id <id> sql-plan <exec-id>                # full plan
spark-history-cli --app-id <id> sql-plan <exec-id> --view initial # pre-AQE plan
spark-history-cli --app-id <id> sql-plan <exec-id> --view final   # post-AQE plan
spark-history-cli --app-id <id> sql-plan <exec-id> --dot          # Graphviz DOT
spark-history-cli --app-id <id> sql-plan <exec-id> --dot -o plan.dot  # save to file

# Jobs for a SQL execution
spark-history-cli --app-id <id> sql-jobs <exec-id>

# Download event logs
spark-history-cli --app-id <id> logs output.zip

# JSON output for scripting/agents
spark-history-cli --json apps
spark-history-cli --json --app-id <id> jobs
spark-history-cli --json --app-id <id> sql-plan <exec-id>
spark-history-cli --json --app-id <id> sql-jobs <exec-id>
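
For scripting, the `--json` commands above can be wrapped in a small helper. This is a minimal sketch rather than an official client: `build_command` and `run_json` are hypothetical names, only the flags shown above (`--json`, `--server`, `--app-id`) are used, and it is an assumption that these global options can be combined in this order before the subcommand. The shape of the decoded JSON is not documented here, so the sketch returns it unmodified.

```python
import json
import subprocess


def build_command(*args, app_id=None, server=None):
    """Build an argv list from the flags documented in the usage section."""
    cmd = ["spark-history-cli", "--json"]
    if server:
        cmd += ["--server", server]
    if app_id:
        cmd += ["--app-id", app_id]
    # Subcommand and its arguments go last, e.g. ("sql-plan", 3).
    return cmd + [str(a) for a in args]


def run_json(*args, app_id=None, server=None):
    """Run the CLI and decode stdout as JSON.

    Raises subprocess.CalledProcessError on a non-zero exit status and
    json.JSONDecodeError if the output is not valid JSON.
    """
    proc = subprocess.run(
        build_command(*args, app_id=app_id, server=server),
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)
```

With the package installed and a server running, `run_json("apps")` would return whatever structure `spark-history-cli --json apps` prints.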

REPL Commands

apps                    List applications
app <id>                Show app details and set as current
attempts                List attempts for current app
attempt <id>            Show attempt details
use <id>                Set current app context
jobs                    List jobs for current app
job <id>                Show job details
job-stages <id>         Show stages for a job
stages                  List stages
stage <id> [attempt]    Show stage details
stage-summary <id>      Task metric quantiles (p5-p95)
stage-tasks <id>        List tasks (--length N, --sort-by)
executors [--all]       List executors
sql [id]                List or show SQL executions
sql-plan <id> [opts]    Show SQL plan (--view, --dot, -o)
sql-jobs <id>           Show jobs for a SQL execution
summary                 Application overview (config + workload)
processes               List miscellaneous processes
rdds                    List cached RDDs
env                     Show environment/config
logs [path]             Download event logs
version                 Show Spark version
server <url>            Change server URL
status                  Show session state
help                    Show help
quit                    Exit
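
The `stage-summary` entry above reports task metric quantiles from p5 to p95. In the Spark REST API, per-stage quantiles come from the `taskSummary` endpoint, which accepts a `quantiles` query parameter. The sketch below builds that URL; that the CLI calls this endpoint with exactly these quantiles is an assumption, and `task_summary_url` is a hypothetical helper. For applications with multiple attempts, the History Server also expects an attempt ID in the path, which this sketch omits.

```python
from urllib.parse import quote, urlencode


def task_summary_url(server, app_id, stage_id, attempt=0,
                     quantiles=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Build the taskSummary URL for one stage attempt.

    The default quantiles mirror the p5-p95 range mentioned above.
    """
    path = "/api/v1/applications/{}/stages/{}/{}/taskSummary".format(
        quote(app_id, safe=""), int(stage_id), int(attempt)
    )
    # Keep commas literal so the list reads as 0.05,0.25,... in the URL.
    query = urlencode({"quantiles": ",".join(str(q) for q in quantiles)}, safe=",")
    return server.rstrip("/") + path + "?" + query
```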

Environment Variables

  • SPARK_HISTORY_SERVER — Default server URL (overrides http://localhost:18080)
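
A script that wraps the CLI may want to resolve the server URL the same way. The sketch below shows one plausible order: an explicit value first, then `SPARK_HISTORY_SERVER`, then the built-in default. That an explicit `--server` value takes priority over the environment variable is an assumption, not something stated above, and `resolve_server` is a hypothetical helper.

```python
import os

DEFAULT_SERVER = "http://localhost:18080"


def resolve_server(cli_value=None, environ=None):
    """Pick the server URL: explicit value, then env var, then default.

    `environ` defaults to os.environ; pass a dict to make this testable.
    """
    env = os.environ if environ is None else environ
    return cli_value or env.get("SPARK_HISTORY_SERVER") or DEFAULT_SERVER
```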

API Coverage

Wraps all 20 endpoints of the Spark History Server REST API (/api/v1/):

  • Applications (list, get, attempts)
  • Jobs (list, get)
  • Stages (list, get, attempts, task summary, task list)
  • Executors (active, all)
  • SQL Executions (list, get with plan graph)
  • Storage (RDD list, detail)
  • Environment
  • Event Logs (download as ZIP)
  • Miscellaneous Processes
  • Version
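
To make the list above more concrete, the following sketch builds URLs for a few of the underlying REST resources. The paths come from Spark's monitoring documentation, not from this project's source, so which CLI command hits which path is an assumption; `apps_url`, `app_url`, and `APP_RESOURCES` are hypothetical names, and only a subset of the endpoints is shown.

```python
from urllib.parse import quote, urlencode

API_ROOT = "/api/v1"

# Resource paths relative to /api/v1/applications/{app_id}.
APP_RESOURCES = {
    "jobs": "jobs",
    "stages": "stages",
    "executors": "executors",        # active executors only
    "allexecutors": "allexecutors",  # active and dead executors
    "sql": "sql",
    "rdds": "storage/rdd",
    "environment": "environment",
    "logs": "logs",                  # event logs as a ZIP archive
}


def apps_url(server, status=None, limit=None):
    """URL for the application list, with optional status/limit filters."""
    params = {k: v for k, v in (("status", status), ("limit", limit)) if v is not None}
    url = server.rstrip("/") + API_ROOT + "/applications"
    return url + ("?" + urlencode(params) if params else "")


def app_url(server, app_id, resource):
    """URL for one per-application resource named in APP_RESOURCES."""
    return "{}{}/applications/{}/{}".format(
        server.rstrip("/"), API_ROOT, quote(app_id, safe=""), APP_RESOURCES[resource]
    )
```

For example, `apps_url(server, status="completed", limit=10)` corresponds to the filters used by `apps --status completed --limit 10` in the usage section.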

License

Apache License 2.0

