spark-advisor

AI-powered Apache Spark job analyzer and configuration advisor.

Stop guessing Spark configs. Let data and AI tell you what's wrong.

Install

pip install spark-advisor-cli

Quick Start

# Analyze from event log file (rules-only, free)
spark-advisor analyze /path/to/event-log.json.gz --no-ai

# Analyze with AI recommendations
export ANTHROPIC_API_KEY=sk-ant-...
spark-advisor analyze /path/to/event-log.json.gz

# Analyze from History Server
spark-advisor analyze app-20250101120000-0001 -hs http://yarn:18080

# Agent mode (multi-turn AI analysis)
spark-advisor analyze /path/to/event-log.json.gz --agent

# Scan recent jobs
spark-advisor scan -hs http://yarn:18080 --limit 20
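
To run the rules-only analysis over a whole directory of event logs, the commands above can be generated from a script. The following is a minimal sketch, not part of spark-advisor: the helper name `build_analyze_commands` and the `*.json.gz` file pattern are assumptions made for illustration, and only the `analyze` subcommand and `--no-ai` flag shown above are used.

```python
from pathlib import Path


def build_analyze_commands(log_dir, use_ai=False):
    """Build one `spark-advisor analyze` argv list per event log in log_dir.

    Hypothetical helper: assumes event logs are stored as *.json.gz files.
    """
    commands = []
    for log in sorted(Path(log_dir).glob("*.json.gz")):
        cmd = ["spark-advisor", "analyze", str(log)]
        if not use_ai:
            # Rules-only mode, as in the first Quick Start example.
            cmd.append("--no-ai")
        commands.append(cmd)
    return commands
```

Each returned list can be passed to `subprocess.run(cmd)` once spark-advisor is installed. Building the argument lists separately from running them keeps the helper easy to inspect before anything is executed.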

What it detects

11 deterministic rules: data skew, disk spill, GC pressure, shuffle partitions, executor idle, task failures, small files, broadcast join threshold, serializer choice, dynamic allocation, memory overhead.
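
To give a sense of what a deterministic rule can look like, here is a rough sketch of a data skew check. It is not spark-advisor's implementation: the function name `detect_skew`, the max-over-median heuristic, and the default threshold of 5.0 are all assumptions chosen for illustration.

```python
from statistics import median


def detect_skew(task_durations_ms, ratio_threshold=5.0):
    """Flag a stage as skewed when its slowest task far exceeds the median.

    Hypothetical rule: the heuristic and threshold are illustrative only.
    Returns a finding dict, or None when no skew is detected.
    """
    if not task_durations_ms:
        return None
    med = median(task_durations_ms)
    if med <= 0:
        # Avoid dividing by zero on degenerate input.
        return None
    ratio = max(task_durations_ms) / med
    if ratio >= ratio_threshold:
        return {"rule": "data_skew", "max_over_median": ratio}
    return None
```

A rule of this shape gives the same answer for the same input every time, which is what makes a rules-only run reproducible without calling an AI model.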

License

Apache 2.0
