Great Expectations plugin that emits OpenLineage events for automated incident correlation

🔗 correlator-ge

Connect Great Expectations validations to incident correlation


What It Does

Links Great Expectations validation results to data pipeline incidents:

  • Connects validation failures to upstream job runs that caused data issues
  • Provides navigation from data quality alerts to root cause
  • Integrates with your existing OpenLineage infrastructure
  • Works alongside your current GE checkpoint workflows

Quick Start

# Install
pip install correlator-ge

# Configure endpoint
export CORRELATOR_ENDPOINT=http://localhost:8080/api/v1/lineage/events

# Add to your GE checkpoint (great_expectations.yml)
# See Configuration section for details

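Once the checkpoint is configured, validation results are correlated with data lineage. Note that the current release is a skeleton (see Current Status), so this describes the intended behavior.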
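To make the endpoint setting concrete, here is a minimal sketch of how a client might deliver one event to the URL in CORRELATOR_ENDPOINT. It assumes the endpoint accepts a single JSON-encoded OpenLineage event per HTTP POST, which this README does not confirm. The helper names build_request and send_event are hypothetical and are not part of correlator-ge.

```python
import json
import os
import urllib.request
from typing import Optional

# Default taken from the Quick Start above; override it with CORRELATOR_ENDPOINT.
DEFAULT_ENDPOINT = "http://localhost:8080/api/v1/lineage/events"


def build_request(event: dict, endpoint: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying one event as JSON.

    Hypothetical helper: assumes the endpoint takes one event per request.
    """
    url = endpoint or os.environ.get("CORRELATOR_ENDPOINT", DEFAULT_ENDPOINT)
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_event(event: dict, endpoint: Optional[str] = None, timeout: float = 5.0) -> int:
    """POST the event and return the HTTP status code."""
    request = build_request(event, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes the payload easy to inspect in a test without a running server.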


How It Works

correlator-ge hooks into Great Expectations checkpoint execution and emits OpenLineage events:

  1. START - Emits validation start event when checkpoint begins
  2. Validate - GE runs your expectation suites
  3. Parse - Extracts validation results and data quality metrics
  4. Emit - Sends events with DataQualityAssertions facets
  5. COMPLETE/FAIL - Emits completion event based on validation outcome

See Architecture for technical details.
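The lifecycle above can be sketched with plain dictionaries shaped like OpenLineage run events. This is an illustration under stated assumptions, not the package's implementation: the function names, the producer URI, the schema URLs, and the placement of the assertions facet under inputFacets on the final event are all assumptions chosen for the example. For brevity, steps 4 and 5 are folded into a single closing event.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical values for illustration; check the OpenLineage spec for current URLs.
PRODUCER = "https://example.invalid/correlator-ge"
RUN_EVENT_SCHEMA = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent"
FACET_SCHEMA = (
    "https://openlineage.io/spec/facets/1-0-0/"
    "DataQualityAssertionsDatasetFacet.json#/$defs/DataQualityAssertionsDatasetFacet"
)


def run_event(event_type, run_id, job_name, inputs=None, namespace="great_expectations"):
    """Return a dict shaped like an OpenLineage RunEvent."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": inputs or [],
        "outputs": [],
        "producer": PRODUCER,
        "schemaURL": RUN_EVENT_SCHEMA,
    }


def assertions_facet(results):
    """Turn (expectation, column_or_None, success) tuples into a facet dict."""
    assertions = []
    for expectation, column, success in results:
        entry = {"assertion": expectation, "success": success}
        if column is not None:
            entry["column"] = column  # table-level expectations have no column
        assertions.append(entry)
    return {"_producer": PRODUCER, "_schemaURL": FACET_SCHEMA, "assertions": assertions}


def checkpoint_events(job_name, dataset_namespace, dataset_name, results):
    """Return the START event and the closing COMPLETE or FAIL event for one run."""
    results = list(results)
    run_id = str(uuid.uuid4())  # both events share one run ID
    dataset = {
        "namespace": dataset_namespace,
        "name": dataset_name,
        "inputFacets": {"dataQualityAssertions": assertions_facet(results)},
    }
    final_type = "COMPLETE" if all(ok for _, _, ok in results) else "FAIL"
    return [
        run_event("START", run_id, job_name),
        run_event(final_type, run_id, job_name, inputs=[dataset]),
    ]
```

Sharing a single run ID across both events is what lets a downstream consumer pair the start of a validation run with its outcome.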


Why It Matters

The Problem: When data quality checks fail, teams have to trace back through pipeline runs and lineage graphs to find which upstream job introduced the bad data.

What You Get: correlator-ge automatically connects your validation failures to their upstream causes, making it easier to identify which job run introduced the data quality issue.

Key Benefits:

  • Faster triage: Validation failures linked to upstream job runs
  • Context in one place: Data quality results correlated with lineage
  • Standard integration: Uses OpenLineage DataQualityAssertions facets
  • Non-invasive setup: Adds to existing checkpoint configuration

Built on Standards: Uses OpenLineage, the industry standard for data lineage. No vendor lock-in, no proprietary formats.


Versioning

This package follows Semantic Versioning with the following guidelines:

  • 0.x.y versions (e.g., 0.1.0, 0.2.0) indicate the initial development phase:

    • The API is not yet stable and may change between minor versions
    • Features may be added, modified, or removed without major version changes
    • Not recommended for production-critical systems without pinned versions
  • 1.0.0 and above will indicate a stable API with semantic versioning guarantees:

    • MAJOR version for incompatible API changes
    • MINOR version for backwards-compatible functionality additions
    • PATCH version for backwards-compatible bug fixes

The current version is in an early development stage, so expect API changes until the 1.0.0 release.
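As a rough illustration of these guidelines, the sketch below checks whether an upgrade between two versions might break the API. It rests on several assumptions: versions are plain MAJOR.MINOR.PATCH strings, the candidate is newer than the installed version, and patch releases within a 0.x series are treated as safe, which the guidelines above do not explicitly promise. The function names are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Split a plain 'MAJOR.MINOR.PATCH' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def may_break(installed: str, candidate: str) -> bool:
    """Return True if upgrading from installed to candidate may change the API.

    Assumes candidate is newer than installed.
    """
    old = parse_version(installed)
    new = parse_version(candidate)
    if new[0] != old[0]:
        return True  # a MAJOR bump signals incompatible changes
    if old[0] == 0:
        return new[1] != old[1]  # during 0.x, a MINOR bump may also break
    return False  # from 1.0.0 on, MINOR and PATCH bumps are backwards-compatible


print(may_break("0.1.0", "0.2.0"))  # True
print(may_break("1.2.0", "1.3.0"))  # False
```

In practice, pinning an exact version in your requirements file during the 0.x phase avoids picking up such changes unintentionally.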


Current Status

This is a skeleton release for CI pipeline testing and PyPI name reservation. Core functionality will be implemented after research into Great Expectations checkpoint action interfaces is complete.


License

Apache 2.0 - See LICENSE for details.
