Detect configuration drift in AWS Glue jobs against a source-of-truth YAML
glue-drift
Detect configuration drift in AWS Glue jobs.
glue-drift compares your live AWS Glue job configurations against a source-of-truth YAML file and reports exactly what has drifted — field by field.
Built for data engineering teams managing multi-environment Glue deployments (DEV / QA / UAT / PROD).
Why glue-drift?
AWS Glue jobs can drift from their intended configuration due to:
- Manual edits in the AWS Console
- Failed or partial deployments
- Auto-injected AWS keys polluting comparisons
- Key-order differences creating false positives
glue-drift reports real changes from the first two and normalizes away the noise from the last two, so the report shows only genuine drift.
Installation
pip install glue-drift
Quickstart
1. Create your source-of-truth jobs.yaml:
jobs:
  my-glue-job:
    Name: my-glue-job
    Role: arn:aws:iam::123456789012:role/my-glue-role
    GlueVersion: "4.0"
    WorkerType: G.1X
    NumberOfWorkers: 2
    Timeout: 120
    MaxRetries: 0
    Command:
      Name: glueetl
      ScriptLocation: s3://my-bucket/scripts/my_script.py
      PythonVersion: "3"
    DefaultArguments:
      --enable-metrics: "true"
      --TempDir: s3://my-temp-bucket/
2. Run the drift check:
glue-drift check --config jobs.yaml
3. Example output:
============================================================
GLUE DRIFT REPORT
============================================================
Jobs checked : 3
OK : 1
Drifted : 1
Missing : 1
============================================================
✔ my-glue-job-ok
✘ my-glue-job-drifted [DRIFTED]
    Field: WorkerType
      Expected: G.1X
      Actual: G.2X
✘ my-glue-job-missing [MISSING in AWS]
    Job 'my-glue-job-missing' not found in AWS Glue.
============================================================
❌ Drift detected! Review the above jobs.
============================================================
CLI Options
glue-drift check --config jobs.yaml [OPTIONS]
Options:
  -c, --config PATH    Path to source-of-truth YAML config [required]
  -r, --region TEXT    AWS region [default: us-east-2]
  -p, --profile TEXT   AWS CLI profile name (optional)
  -o, --output PATH    Write JSON report to file (e.g. report.json)
  --fail-on-drift      Exit code 1 if drift found (for CI/CD pipelines)
  --version            Show version and exit
  --help               Show this message and exit
CI/CD Integration
Use --fail-on-drift to block deployments when drift is detected:
# In your buildspec.yaml or GitHub Actions workflow:
- name: Check Glue job drift
  run: glue-drift check --config jobs.yaml --output drift-report.json --fail-on-drift
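For CI systems that run Python steps rather than shell steps, the same gate could be sketched with `subprocess`. This is a minimal sketch, not part of glue-drift: the helper names `build_check_command` and `run_drift_gate` are hypothetical, and it uses only the flags listed under CLI Options. It assumes, as the option description states, that `--fail-on-drift` makes the CLI exit with code 1 when drift is found.

```python
import subprocess
from typing import List, Optional


def build_check_command(config: str, output: Optional[str] = None,
                        region: Optional[str] = None,
                        profile: Optional[str] = None) -> List[str]:
    """Assemble a glue-drift invocation using only documented flags."""
    cmd = ["glue-drift", "check", "--config", config]
    if region:
        cmd += ["--region", region]
    if profile:
        cmd += ["--profile", profile]
    if output:
        cmd += ["--output", output]
    # Documented behavior: exit code 1 if drift is found.
    cmd.append("--fail-on-drift")
    return cmd


def run_drift_gate(config: str, **kwargs) -> int:
    """Run the check and return the CLI's exit code (hypothetical helper)."""
    completed = subprocess.run(build_check_command(config, **kwargs))
    return completed.returncode


# Example (in a CI step, with glue-drift installed):
#   raise SystemExit(run_drift_gate("jobs.yaml", output="drift-report.json"))
```

Passing the command as a list avoids shell quoting issues with paths and profile names.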
Python API
Use glue-drift programmatically:
from glue_drift import check_all_jobs, print_terminal_report

# Compare every job in jobs.yaml against its live definition in AWS Glue
results = check_all_jobs(config_path="jobs.yaml", region="us-east-2")
print_terminal_report(results)

# Inspect individual drifts programmatically
for result in results:
    if result.has_drift:
        print(f"Job {result.job_name} has drifted!")
        for drift in result.drifts:
            print(f"  {drift.field}: expected={drift.expected}, actual={drift.actual}")
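If you want to feed the results into your own tooling, one option is to flatten them into plain dictionaries. The sketch below relies only on the attributes used in the example above (`job_name`, `has_drift`, `drifts`, `field`, `expected`, `actual`). The `summarize` function and the `FakeResult` / `FakeDrift` stand-ins are hypothetical; they are not glue-drift's real types, and the resulting structure is not necessarily the format written by `--output`.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


def summarize(results: Iterable[Any]) -> List[Dict[str, Any]]:
    """Flatten drift results into JSON-friendly dicts."""
    summary = []
    for result in results:
        summary.append({
            "job": result.job_name,
            "has_drift": result.has_drift,
            "drifts": [
                {"field": d.field, "expected": d.expected, "actual": d.actual}
                for d in result.drifts
            ],
        })
    return summary


# Stand-in classes so the sketch runs without AWS access.
# They only mimic the attributes shown in the example above.
@dataclass
class FakeDrift:
    field: str
    expected: Any
    actual: Any


@dataclass
class FakeResult:
    job_name: str
    has_drift: bool
    drifts: List[FakeDrift]


results = [
    FakeResult("my-glue-job-ok", False, []),
    FakeResult("my-glue-job-drifted", True,
               [FakeDrift("WorkerType", "G.1X", "G.2X")]),
]
print(json.dumps(summarize(results), indent=2))
```

In real use, the list returned by `check_all_jobs` would replace the stand-in `results`, assuming its items expose the same attributes.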
What glue-drift normalizes automatically
- AWS auto-injected keys stripped: --job-language, --class
- AWS managed metadata ignored: CreatedOn, LastModifiedOn, LastModifiedBy
- JSON key ordering normalized, so key-order differences produce no false positives
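The normalization rules above can be illustrated with a small sketch. This is not glue-drift's actual implementation: `normalize` and `diff_fields` are hypothetical names, and the sketch assumes the auto-injected keys live under `DefaultArguments` and that the managed metadata fields sit at the top level of the job definition.

```python
from typing import Any, Dict, List, Tuple

# Keys named in the list above, treated here as illustrative constants.
AUTO_INJECTED_ARGS = {"--job-language", "--class"}
MANAGED_METADATA = {"CreatedOn", "LastModifiedOn", "LastModifiedBy"}


def normalize(job: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a job definition with noisy keys removed."""
    cleaned = {k: v for k, v in job.items() if k not in MANAGED_METADATA}
    args = cleaned.get("DefaultArguments")
    if isinstance(args, dict):
        cleaned["DefaultArguments"] = {
            k: v for k, v in args.items() if k not in AUTO_INJECTED_ARGS
        }
    return cleaned


def diff_fields(expected: Dict[str, Any],
                actual: Dict[str, Any]) -> List[Tuple[str, Any, Any]]:
    """Compare the top-level fields named in the expected config.

    Python dict equality ignores key order, so nested mappings such as
    DefaultArguments compare equal regardless of ordering.
    """
    exp, act = normalize(expected), normalize(actual)
    return [
        (name, exp[name], act.get(name))
        for name in sorted(exp)
        if exp[name] != act.get(name)
    ]


expected = {
    "WorkerType": "G.1X",
    "DefaultArguments": {"--enable-metrics": "true",
                         "--TempDir": "s3://my-temp-bucket/"},
}
actual = {
    "WorkerType": "G.2X",
    "CreatedOn": "2024-01-01",
    "DefaultArguments": {"--TempDir": "s3://my-temp-bucket/",
                         "--job-language": "python",
                         "--enable-metrics": "true"},
}
print(diff_fields(expected, actual))  # [('WorkerType', 'G.1X', 'G.2X')]
```

Only `WorkerType` is reported: the reordered `DefaultArguments`, the injected `--job-language` key, and the `CreatedOn` timestamp are all ignored.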
Authentication
glue-drift uses standard boto3 credential resolution:
- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
- IAM role (recommended for EC2 / Lambda / CI runners)
- AWS CLI profile via --profile
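As a rough illustration of how a profile and region might map onto boto3, the sketch below builds a session and fetches one live job definition. It is an assumption about typical boto3 usage, not a description of glue-drift's internals; `session_kwargs` and `fetch_job` are hypothetical helpers. Leaving a value unset lets boto3 fall back to its default resolution (environment variables, shared config, or an attached IAM role).

```python
from typing import Any, Dict, Optional


def session_kwargs(profile: Optional[str] = None,
                   region: Optional[str] = None) -> Dict[str, str]:
    """Build keyword arguments for boto3.Session, omitting unset values
    so boto3 applies its default credential and region resolution."""
    kwargs: Dict[str, str] = {}
    if profile:
        kwargs["profile_name"] = profile
    if region:
        kwargs["region_name"] = region
    return kwargs


def fetch_job(job_name: str, profile: Optional[str] = None,
              region: Optional[str] = None) -> Dict[str, Any]:
    """Fetch one live Glue job definition (needs boto3 and credentials)."""
    import boto3  # imported lazily so session_kwargs works without boto3

    glue = boto3.Session(**session_kwargs(profile, region)).client("glue")
    # get_job returns the definition under the "Job" key.
    return glue.get_job(JobName=job_name)["Job"]


print(session_kwargs(profile="dev", region="us-east-2"))
```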
Development
git clone https://github.com/Pushpalatha58/glue-drift
cd glue-drift
pip install -e ".[dev]"
pytest
License
MIT
File details
Details for the file glue_drift-0.1.0.tar.gz.
File metadata
- Download URL: glue_drift-0.1.0.tar.gz
- Upload date:
- Size: 8.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 22c3f0af14c3cf9e9a4ecd7786ae841f64bc07ebba7fd258df264cef2bd8a472 |
| MD5 | 9e8087d0ffe9d24a7ac9f41fb2408518 |
| BLAKE2b-256 | 3ef79bb0be65789203ab29637cfe2b06c004794f214f402995ced53341240eb4 |
File details
Details for the file glue_drift-0.1.0-py3-none-any.whl.
File metadata
- Download URL: glue_drift-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ab805717c6f44118f09775d22e5289c6c0f2eb81eb82c377e5ba903b103eb7f9 |
| MD5 | e1a7419f5d8b5cdaec56610f31dd09e8 |
| BLAKE2b-256 | f0391aadff61e04df22dabe4757c55b69d9c15d5fee7492d82fb744523d008dd |