Prophecy Lineage Extractor Documentation

Description

The Prophecy Lineage Extractor is a tool to extract lineage information from Prophecy projects and pipelines. It allows users to specify a project, pipeline, and branch, and outputs the extracted lineage to a specified directory. Optional features include email notifications.


Usage

python -m prophecy_lineage_extractor --project-id <PROJECT_ID> --pipeline-id <PIPELINE_ID> --output-dir <OUTPUT_DIRECTORY> [--send-email] [--branch <BRANCH_NAME>] [--run-for-all]
  • The environment variables PROPHECY_URL and PROPHECY_PAT must be set before running.
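As a minimal sketch of how the usage line maps to an argument list, the helper below assembles the command so it can be handed to subprocess.run. The function name build_command, its parameters, and the lineage_out directory are hypothetical and not part of the package; only the flags shown in the usage line are used.

```python
import sys


def build_command(project_id, pipeline_id, output_dir,
                  branch=None, send_email=False, run_for_all=False):
    """Assemble the extractor command line as a list of arguments.

    Hypothetical helper: it only mirrors the flags in the usage line.
    """
    cmd = [
        sys.executable, "-m", "prophecy_lineage_extractor",
        "--project-id", str(project_id),
        "--pipeline-id", str(pipeline_id),
        "--output-dir", str(output_dir),
    ]
    # Optional flags are appended only when requested.
    if send_email:
        cmd.append("--send-email")
    if branch:
        cmd += ["--branch", branch]
    if run_for_all:
        cmd.append("--run-for-all")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; "lineage_out" is a made-up directory.
    print(" ".join(build_command(
        "36587", "36587/pipelines/customer_orders_demo", "lineage_out", branch="dev")))
```

Building the command as a list, rather than a single string, avoids shell quoting problems when an ID or path contains spaces.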

Arguments

Required Arguments

  • --project-id

    • Type: str
    • Description: Prophecy Project ID.
    • Required: Yes
  • --pipeline-id

    • Type: str
    • Description: Prophecy Pipeline ID.
    • Required: Yes with the lineage reader; optional with the knowledge-graph reader.
  • --output-dir

    • Type: str
    • Description: Output directory inside the project where lineage files will be stored.
    • Required: Yes

Optional Arguments

  • --reader

    • Type: str
    • Description: Reading adapter to use, one of:
      • Spark Lineage (lineage)
        • Note that --pipeline-id is currently a mandatory argument for this reader.
      • SQL Knowledge Graph (knowledge-graph)
  • --writer

    • Type: str
    • Description: Output format to write, one of:
      • Excel files (xlsx)
        • A file named lineage__(<optional_pipeline-ids>.xlsx) is created in <output_dir>.
        • An Excel sheet is created for each pipeline mentioned in the query. If --run-for-all is used, a sheet is created for every pipeline.
        • If run-for-all flag is used, an additional Overall Project sheet will be created. NOTE that it
      • OpenLineage format
        • Dummy run event JSON files in OpenLineage format are saved in the output-dir/ folder.
        • An API call is attempted to an OpenLineage-compatible frontend such as Marquez or DataHub.
        • Column-level lineage and project-level lineage are both supported with this writer.
  • --run-for-all

    • Type: boolean flag
    • Description: If specified, a project-level lineage Excel file (Overall Project) is also created.
  • --send-email

    • Type: flag
    • Description: If specified, sends an email with the generated lineage report to the address in the RECEIVER_EMAIL environment variable.
    • The following environment variables must be set when this option is passed:
      • SMTP_HOST
      • SMTP_PORT
      • SMTP_USERNAME
      • SMTP_PASSWORD
      • RECEIVER_EMAIL
  • --branch

    • Type: str
    • Description: Branch to run the lineage extractor on.
    • Default: the default branch in Prophecy, generally 'main' or 'master'.
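To make the OpenLineage writer option more concrete, the sketch below builds one run event as a plain dictionary and prints it as JSON. It follows the general shape of an OpenLineage run event (eventType, eventTime, run, job, inputs, outputs); it is not the extractor's actual output. The function make_run_event, the producer URI, the schema URL version, the prophecy-demo namespace, and the dataset names are all assumptions made for illustration. Column-level lineage, which OpenLineage carries in a facet on output datasets, is left out to keep the example short.

```python
import json
import uuid
from datetime import datetime, timezone

# Assumed values for illustration; check them against the OpenLineage spec version you target.
PRODUCER = "https://example.com/lineage-sketch"
SCHEMA_URL = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent"


def make_run_event(namespace, job_name, inputs, outputs):
    """Build one COMPLETE run event as a plain dict.

    inputs and outputs are lists of dataset names within the same namespace.
    """
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        "schemaURL": SCHEMA_URL,
        "run": {"runId": str(uuid.uuid4())},  # a fresh random run ID per event
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [{"namespace": namespace, "name": n} for n in inputs],
        "outputs": [{"namespace": namespace, "name": n} for n in outputs],
    }


# Hypothetical job and dataset names.
event = make_run_event(
    "prophecy-demo", "customer_orders_demo",
    inputs=["customers", "orders"], outputs=["customer_orders"],
)
print(json.dumps(event, indent=2))
```

A frontend such as Marquez would typically receive an event like this as the body of an HTTP POST; the exact endpoint depends on the frontend and is not shown here.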

Running

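  • Run the extractor as shown below. It requires the PROPHECY_URL and PROPHECY_PAT environment variables.
  • The SMTP credentials are only needed when the --send-email argument is passed.
  • The ${{ secrets.* }} values below are GitHub Actions expressions; in a local shell, substitute the actual values.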
export PROPHECY_URL=https://app.prophecy.io
export PROPHECY_PAT=${{ secrets.PROPHECY_PAT }}

# These are needed only if you use the --send-email option
export SMTP_HOST=smtp.gmail.com
export SMTP_PORT=587
export SMTP_USERNAME=${{ secrets.SMTP_USERNAME }}
export SMTP_PASSWORD=${{ secrets.SMTP_PASSWORD }}
export RECEIVER_EMAIL=ashish@prophecy.io

python -m prophecy_lineage_extractor --project-id 36587 --pipeline-id 36587/pipelines/customer_orders_demo --output-dir <OUTPUT_DIRECTORY> --send-email --branch dev
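Before invoking the extractor, it can help to check that the environment variables are actually set. The sketch below is one possible way to do that; the function missing_env_vars is a hypothetical helper and not part of the package. The variable names are the ones listed in this document.

```python
import os

# Always required, per the Usage section.
BASE_VARS = ["PROPHECY_URL", "PROPHECY_PAT"]
# Required only when --send-email is passed.
EMAIL_VARS = ["SMTP_HOST", "SMTP_PORT", "SMTP_USERNAME", "SMTP_PASSWORD", "RECEIVER_EMAIL"]


def missing_env_vars(send_email=False, env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    required = BASE_VARS + (EMAIL_VARS if send_email else [])
    return [name for name in required if not env.get(name)]


# Report what is missing for a run that uses --send-email.
missing = missing_env_vars(send_email=True)
print("Missing:", ", ".join(missing) if missing else "none")
```

Passing env explicitly makes the check easy to exercise without touching the real environment.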

GitHub Action Guide

  • This extractor can be set up in a GitHub Action for a Prophecy project to email the lineage on every commit to main.

  • A sample GitHub Action for the default branch: Github Action default branch

  • A sample GitHub Action for a custom branch: Github Action custom branch


GitLab Action Guide

  • A sample GitLab action for a branch: Gitlab Action guide
  • Note: GitLab CI/CD variables (secrets) must be created before they can be used in the YML file, e.g. SMTP_USER.
  • Additionally, an ACCESS_TOKEN must be set up to allow the job to commit, if commit is enabled.
