Deterministic rules engine for Spark job analysis

spark-advisor-rules

Deterministic rules engine for Apache Spark job analysis. Part of the spark-advisor ecosystem.

Install

pip install spark-advisor-rules

What it detects

11 rules that identify common Spark performance problems:

Rule                        Detects
DataSkewRule                Task duration skew (max/median > 5x)
SpillToDiskRule             Disk spill indicating insufficient memory
GCPressureRule              GC time > 20% of task time
ShufflePartitionsRule       Partition count far from optimal (128 MB target)
ExecutorIdleRule            Slot utilization < 40% (CRITICAL if < 20%)
TaskFailureRule             Failed tasks (CRITICAL if >= 10, WARNING if > 0)
SmallFileRule               Avg input bytes per task < 10 MB (CRITICAL if < 1 MB)
BroadcastJoinThresholdRule  Broadcast join disabled or too low
SerializerChoiceRule        Java serializer used with shuffle stages
DynamicAllocationRule       Missing min/max bounds or disabled
ExecutorMemoryOverheadRule  High GC + high memory utilization
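
To make a few of these conditions concrete, here is a minimal standalone sketch of the checks described for DataSkewRule, GCPressureRule, and TaskFailureRule. It does not use the spark-advisor-rules API: the function names, argument names, and return values are hypothetical and only the numeric thresholds are taken from the table above. The package's actual implementation may differ.

```python
from statistics import median

# Threshold values copied from the rule table; everything else is illustrative.
SKEW_RATIO_LIMIT = 5.0     # DataSkewRule: max/median > 5x
GC_FRACTION_LIMIT = 0.20   # GCPressureRule: GC time > 20% of task time


def is_skewed(task_durations_ms):
    """Return True when the slowest task exceeds 5x the median duration."""
    if not task_durations_ms:
        return False  # nothing to compare
    med = median(task_durations_ms)
    if med <= 0:
        return False  # no meaningful median to divide by
    return max(task_durations_ms) / med > SKEW_RATIO_LIMIT


def has_gc_pressure(gc_time_ms, task_time_ms):
    """Return True when GC accounts for more than 20% of task time."""
    if task_time_ms <= 0:
        return False
    return gc_time_ms / task_time_ms > GC_FRACTION_LIMIT


def task_failure_severity(failed_tasks):
    """Map a failed-task count to the severities listed for TaskFailureRule."""
    if failed_tasks >= 10:
        return "CRITICAL"
    if failed_tasks > 0:
        return "WARNING"
    return None  # no finding
```

For example, task durations of 100, 110, 120, and 900 ms have a median of 115 ms, so the max/median ratio is about 7.8 and the skew check fires.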

All thresholds are configurable via the Thresholds model.
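
The fields and constructor of the Thresholds model are not documented here, so the following is only a sketch of the general idea of threshold-driven rules. ExampleThresholds, its field names, and both helper functions are hypothetical stand-ins, not the library's API. The sketch also assumes that "128MB" means 128 MiB and that utilization between 20% and 40% maps to WARNING, neither of which the table states.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleThresholds:
    """Hypothetical stand-in; NOT the library's Thresholds model."""
    target_partition_bytes: int = 128 * 1024 * 1024  # assumed: 128 MiB target
    idle_warning_utilization: float = 0.40           # ExecutorIdleRule: < 40%
    idle_critical_utilization: float = 0.20          # CRITICAL if < 20%


def suggested_partitions(total_shuffle_bytes, thresholds=ExampleThresholds()):
    """Partition count that gives each partition roughly the target size."""
    return max(1, math.ceil(total_shuffle_bytes / thresholds.target_partition_bytes))


def idle_severity(slot_utilization, thresholds=ExampleThresholds()):
    """Classify slot utilization (0.0 to 1.0) against the configured limits."""
    if slot_utilization < thresholds.idle_critical_utilization:
        return "CRITICAL"
    if slot_utilization < thresholds.idle_warning_utilization:
        return "WARNING"  # assumed severity for the 20% to 40% band
    return None  # no finding


# Overriding a default changes the outcome without touching the rule logic.
relaxed = ExampleThresholds(idle_warning_utilization=0.25)
```

With the defaults, 10 GiB of shuffle data suggests 80 partitions, and a slot utilization of 0.30 is flagged; with the relaxed instance the same utilization passes.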

Usage

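from spark_advisor_rules.static_analysis import StaticAnalysisService

# job_analysis is the job data to analyze; building it is not shown in this snippet.
service = StaticAnalysisService()
results = service.analyze(job_analysis)  # run the rules against the job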

License

Apache 2.0
