Ignis
ESLint for Apache Spark jobs. Point it at an event log and get actionable diagnostics for data skew, shuffle size, spill, and bad partitioning.
$ ignis analyze /path/to/spark-event-log
──────────────────── ignis my-spark-app ────────────────────
2 issue(s) found
Severity Rule Stage Message
────────────────────────────────────────────────────────────
WARNING data-skew 2 Stage 2 ('groupBy at job.py:42'):
max task 42,300ms vs median 1,800ms (23.5x ratio)
WARNING partition-count 3 Stage 3 ('join at job.py:71'):
2 shuffle partition(s) across 8 executor core(s)
— cluster is under-utilized
╭───────────────────── data-skew — Stage 2 ──────────────────╮
│ Repartition before the shuffle with a higher partition │
│ count, or salt the join/groupBy key to spread work across │
│ more tasks. │
╰────────────────────────────────────────────────────────────╯
╭─────────────────── partition-count — Stage 3 ──────────────╮
│ Raise spark.sql.shuffle.partitions to at least 16 │
│ (2× your 8 executor cores). │
╰────────────────────────────────────────────────────────────╯
Installation
pip install spark-ignis # core only
pip install "spark-ignis[s3]" # + AWS S3
pip install "spark-ignis[gcs]" # + Google Cloud Storage
pip install "spark-ignis[azure]" # + Azure Data Lake Storage
Or install from source:
git clone https://github.com/skatz1990/ignis
cd ignis
python3 -m venv .venv && source .venv/bin/activate
pip install -e . # local files only
pip install -e ".[s3]" # + AWS S3
pip install -e ".[gcs]" # + Google Cloud Storage
pip install -e ".[azure]" # + Azure Data Lake Storage
Usage
# Analyze a local event log (terminal output, exits 1 if issues found)
ignis analyze /path/to/spark-event-log
# Analyze directly from cloud storage
ignis analyze s3://my-bucket/spark-logs/application_1234_0001
ignis analyze gs://my-bucket/spark-logs/application_1234_0001
ignis analyze abfs://my-container/spark-logs/application_1234_0001
# Machine-readable JSON output — pipe to jq, store in CI artifacts
ignis analyze s3://my-bucket/spark-logs/application_1234_0001 --output json
# Pipe findings directly to a Slack channel (webhook URL from env var)
export IGNIS_SLACK_WEBHOOK="https://hooks.slack.com/services/..."
ignis analyze s3://my-bucket/spark-logs/application_1234_0001 --output json \
| ignis notify slack
# Send findings by email
export IGNIS_SMTP_PASSWORD="pass"
ignis analyze s3://my-bucket/spark-logs/application_1234_0001 --output json \
| ignis notify email ops@example.com \
--from ignis@example.com --smtp-host smtp.example.com
# List all rules with their thresholds
ignis rules
Exits 0 if no issues are found and 1 if any are, in both terminal and JSON modes.
Spark event logs are standard NDJSON files (Spark 3.x) or zstd-compressed directories (Spark 4.0+). Databricks writes them to DBFS, S3, GCS, or ADLS after each job.
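Since an uncompressed event log is just one JSON object per line, it is easy to inspect by hand. A minimal sketch of what parsing looks like, using two hand-crafted lines in the shape Spark's listener events take (the field names follow Spark's event-log schema, but treat the exact keys as an assumption and check a real log):

```python
import json

# Two hand-crafted NDJSON lines resembling Spark event-log entries;
# each line is a JSON object tagged by an "Event" field.
log_lines = [
    '{"Event": "SparkListenerApplicationStart", "App Name": "my-spark-app"}',
    '{"Event": "SparkListenerStageCompleted", "Stage Info": {"Stage ID": 2}}',
]

events = [json.loads(line) for line in log_lines]
stage_ids = [
    e["Stage Info"]["Stage ID"]
    for e in events
    if e["Event"] == "SparkListenerStageCompleted"
]
print(stage_ids)  # [2]
```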
Notifications
ignis notify reads findings JSON from stdin and routes them to a notification channel. It is silent (exit 0, no message) when there are no findings. Pass --always to send a clean-run confirmation.
Slack
export IGNIS_SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
ignis analyze /path/to/spark-event-log --output json | ignis notify slack
The webhook URL can also be passed as a positional argument, but using the environment variable keeps it out of shell history and CI logs. Create an incoming webhook at api.slack.com/apps → your app → Incoming Webhooks.
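Under the hood, an incoming webhook just receives an HTTP POST with a JSON body. A hedged sketch of shaping findings into the simplest webhook payload (the message formatting here is illustrative, not Ignis's actual Slack message):

```python
import json

# One finding in the shape of ignis's JSON output (see the "JSON output"
# section below for the full schema).
findings = [
    {"rule": "data-skew", "severity": "warning", "stage_id": 2,
     "message": "max task 42,300ms vs median 1,800ms (23.5x ratio)"},
]

# Slack incoming webhooks accept {"text": "..."} as their simplest payload.
lines = [
    f"[{f['severity'].upper()}] {f['rule']} (stage {f['stage_id']}): {f['message']}"
    for f in findings
]
payload = json.dumps({"text": "\n".join(lines)})
print(payload)
# Delivery is a single POST with Content-Type: application/json to the
# webhook URL, e.g. via urllib.request.urlopen.
```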
Email
export IGNIS_SMTP_USERNAME="user"
export IGNIS_SMTP_PASSWORD="pass"
ignis analyze /path/to/spark-event-log --output json \
| ignis notify email ops@example.com \
--from ignis@example.com \
--smtp-host smtp.example.com
Sends a plain-text + HTML multipart email via SMTP with STARTTLS. Credentials are read from IGNIS_SMTP_USERNAME / IGNIS_SMTP_PASSWORD environment variables (recommended) or passed via --username / --password flags. --smtp-port defaults to 587.
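For reference, a plain-text + HTML multipart message with STARTTLS delivery looks roughly like this in the Python standard library (a sketch of the mechanism, not Ignis's actual code; the sending lines are commented out because they need a live SMTP server):

```python
import smtplib  # only needed for the commented-out send below
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "ignis: 2 issue(s) found in my-spark-app"
msg["From"] = "ignis@example.com"
msg["To"] = "ops@example.com"
msg.set_content("WARNING data-skew Stage 2: max task 42,300ms vs median 1,800ms")
msg.add_alternative(
    "<p><b>WARNING</b> data-skew Stage 2: max task 42,300ms vs median 1,800ms</p>",
    subtype="html",
)

# with smtplib.SMTP("smtp.example.com", 587) as s:
#     s.starttls()
#     s.login(username, password)
#     s.send_message(msg)
print(msg.get_content_type())  # multipart/alternative
```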
Cloud storage
AWS S3
pip install -e ".[s3]"
ignis analyze s3://my-bucket/spark-logs/application_1234_0001
Credentials from the standard AWS chain:
| Source | How |
|---|---|
| Environment variables | AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY |
| Named profile | AWS_PROFILE=my-profile ignis analyze s3://... |
| Instance role (EC2/ECS) | No configuration needed |
| SSO | aws sso login then run ignis normally |
Google Cloud Storage
pip install -e ".[gcs]"
ignis analyze gs://my-bucket/spark-logs/application_1234_0001
Credentials from the standard GCP chain:
| Source | How |
|---|---|
| User credentials | gcloud auth application-default login |
| Service account key | GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json |
| Workload Identity (GKE) | No configuration needed |
Azure Data Lake Storage (ADLS Gen2)
pip install -e ".[azure]"
ignis analyze abfs://my-container/spark-logs/application_1234_0001
Credentials from the standard Azure chain:
| Source | How |
|---|---|
| Service principal | AZURE_TENANT_ID + AZURE_CLIENT_ID + AZURE_CLIENT_SECRET |
| Azure CLI | az login then run ignis normally |
| Managed identity | No configuration needed |
Rules
| Rule | What it detects | Default threshold |
|---|---|---|
| data-skew | One task takes far longer than its peers in a shuffle stage | max ≥ 5× median task duration |
| shuffle-size | A stage writes an excessive amount of data to shuffle files | total shuffle write ≥ 1 GB |
| spill | Tasks spill execution data to disk or show significant memory pressure | any disk spill (WARNING); memory spill ≥ 500 MB (INFO) |
| partition-count | Shuffle partition count leaves the cluster idle or overwhelms the driver | < 2× executor cores or > 10,000 partitions |
| failed-tasks | High rate of task failures or speculative task launches in a stage | failure rate ≥ 10% (WARNING); speculation rate ≥ 25% (INFO) |
| gc-pressure | JVM garbage collection consumes a large fraction of executor run time | GC time ≥ 10% of executor run time (WARNING) |
Run ignis rules for a live summary with thresholds.
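As an illustration of the data-skew rule's documented threshold (max ≥ 5× median task duration), here is a minimal re-implementation sketch; the real rule lives under `ignis/rules/` and may differ in detail:

```python
from statistics import median

def is_skewed(task_durations_ms, ratio_threshold=5.0):
    """Flag a stage when its slowest task dwarfs the median task."""
    med = median(task_durations_ms)
    return med > 0 and max(task_durations_ms) / med >= ratio_threshold

# Nine healthy tasks around 1.8s plus one 42.3s straggler: a 23.5x ratio,
# matching the sample finding shown earlier.
durations = [1800] * 9 + [42_300]
print(is_skewed(durations))       # True
print(is_skewed([1800] * 10))     # False
```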
JSON output
--output json emits a structured document to stdout:
{
"app_id": "application_1234_0001",
"app_name": "my-spark-app",
"finding_count": 1,
"findings": [
{
"rule": "data-skew",
"severity": "warning",
"stage_id": 2,
"stage_name": "groupBy at job.py:42",
"message": "Stage 2 ('groupBy at job.py:42'): max task 42,300ms vs median 1,800ms (23.5x ratio)",
"recommendation": "Repartition before the shuffle with a higher partition count, or salt the join/groupBy key to spread work across more tasks."
}
]
}
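The schema above is straightforward to post-process. A hedged sketch of a CI gate that reads the report and fails only on warnings (an alternative to relying on the exit code; in CI you would read `sys.stdin` instead of a literal):

```python
import json

# Sample report in the documented output shape, abbreviated for illustration.
report = json.loads("""{
  "app_id": "application_1234_0001",
  "finding_count": 1,
  "findings": [{"rule": "data-skew", "severity": "warning", "stage_id": 2}]
}""")

warnings = [f for f in report["findings"] if f["severity"] == "warning"]
for f in warnings:
    print(f"stage {f['stage_id']}: {f['rule']}")
exit_code = 1 if warnings else 0
print(exit_code)  # 1
```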
Development
git clone https://github.com/skatz1990/ignis
cd ignis
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
Project layout
ignis/
parser/ NDJSON event log parsing → Application/Stage/Task models
rules/ Diagnostic rules (one module per rule)
reporter/ Terminal (rich) and JSON output
cli.py Entry point — ignis analyze <path>, ignis rules
tests/
fixtures/ Hand-crafted NDJSON snippets that trigger each rule
docs/
rules.md Detailed explanation of each rule and its detection logic
File details
Details for the file spark_ignis-0.3.1.tar.gz.
File metadata
- Download URL: spark_ignis-0.3.1.tar.gz
- Size: 97.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 37ed30e7e55956aefec9d9dd54672c25657e9f571960fec77811ca842ee568c7 |
| MD5 | 0b0c292f3eb048312da69ad8a4917a39 |
| BLAKE2b-256 | 4ed8b669f72f5e15c630dc380823ef39804f537ebbc4ded66f9574f0b1677965 |
Provenance
The following attestation bundles were made for spark_ignis-0.3.1.tar.gz:
Publisher: publish.yml on skatz1990/ignis
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: spark_ignis-0.3.1.tar.gz
- Subject digest: 37ed30e7e55956aefec9d9dd54672c25657e9f571960fec77811ca842ee568c7
- Sigstore transparency entry: 1385482450
- Permalink: skatz1990/ignis@65d0b9b63c0e17cad52ea5e383dc42a3e7305413
- Branch / Tag: refs/tags/v0.3.1
- Owner: https://github.com/skatz1990
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@65d0b9b63c0e17cad52ea5e383dc42a3e7305413
- Trigger Event: push
File details
Details for the file spark_ignis-0.3.1-py3-none-any.whl.
File metadata
- Download URL: spark_ignis-0.3.1-py3-none-any.whl
- Size: 22.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 789fffac22cbb19e3d23aa52b07ef10360478baa9afd23c42b7cdd9994acbb88 |
| MD5 | f8045ae8d45b1299d3b753c32fc534a1 |
| BLAKE2b-256 | 3f263896e0ee169d9b409d905fb779ee7f7ef08f40461a5e1be89d31efeab230 |
Provenance
The following attestation bundles were made for spark_ignis-0.3.1-py3-none-any.whl:
Publisher: publish.yml on skatz1990/ignis
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: spark_ignis-0.3.1-py3-none-any.whl
- Subject digest: 789fffac22cbb19e3d23aa52b07ef10360478baa9afd23c42b7cdd9994acbb88
- Sigstore transparency entry: 1385482468
- Permalink: skatz1990/ignis@65d0b9b63c0e17cad52ea5e383dc42a3e7305413
- Branch / Tag: refs/tags/v0.3.1
- Owner: https://github.com/skatz1990
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@65d0b9b63c0e17cad52ea5e383dc42a3e7305413
- Trigger Event: push