Prophecy Lineage Extractor Documentation

Description

The Prophecy Lineage Extractor is a tool to extract lineage information from Prophecy projects and pipelines. It allows users to specify a project, pipeline, and branch, and outputs the extracted lineage to a specified directory. Optional features include email notifications.


Usage

python -m prophecy_lineage_extractor --project-id <PROJECT_ID> --pipeline-id <PIPELINE_ID> --output-dir <OUTPUT_DIRECTORY> [--send-email] [--branch <BRANCH_NAME>] [--run-for-all]
  • The environment variables PROPHECY_URL and PROPHECY_PAT must be set before running.
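For scripted runs, the usage line above can be assembled programmatically. The following is a minimal sketch, not part of the package: `build_command` is a hypothetical helper, and it uses only the flags listed in the usage line.

```python
def build_command(project_id, pipeline_id, output_dir,
                  branch=None, send_email=False, run_for_all=False):
    """Assemble the documented CLI invocation as an argv list (hypothetical helper)."""
    cmd = [
        "python", "-m", "prophecy_lineage_extractor",
        "--project-id", project_id,
        "--pipeline-id", pipeline_id,
        "--output-dir", output_dir,
    ]
    # Optional arguments are appended only when requested.
    if branch:
        cmd += ["--branch", branch]
    if send_email:
        cmd.append("--send-email")
    if run_for_all:
        cmd.append("--run-for-all")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once PROPHECY_URL and PROPHECY_PAT are present in the environment.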

Arguments

Required Arguments

  • --project-id

    • Type: str
    • Description: Prophecy Project ID.
    • Required: Yes
  • --pipeline-id

    • Type: str
    • Description: Prophecy Pipeline ID.
    • Required: Yes with the lineage reader; optional with the knowledge-graph reader.
  • --output-dir

    • Type: str
    • Description: Output directory inside the project where lineage files will be stored.
    • Required: Yes
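The conditional requirement on --pipeline-id can be checked before the extractor is invoked. This is a sketch under assumptions: `check_pipeline_id` is a hypothetical wrapper-side helper, and the reader names are the ones given for --reader (lineage, knowledge-graph). It does not reflect how the package validates arguments internally.

```python
KNOWN_READERS = ("lineage", "knowledge-graph")

def check_pipeline_id(reader, pipeline_id=None):
    """Raise ValueError when the chosen reader needs a pipeline ID and none was given."""
    if reader not in KNOWN_READERS:
        raise ValueError(f"unknown reader: {reader!r}")
    # Only the Spark lineage reader requires a pipeline ID.
    if reader == "lineage" and not pipeline_id:
        raise ValueError("--pipeline-id is required with the lineage reader")
    return True
```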

Optional Arguments

  • --reader

    • Type: str
    • Description: Reading adapter to use, one of:
      • Spark Lineage (lineage)
        • Note that --pipeline-id is currently mandatory for this reader.
      • SQL Knowledge Graph (knowledge-graph)
  • --writer

    • Type: str
    • Description: Output format to write, one of:
      • Excel files (.xlsx)
        • A file named lineage__(<optional_pipeline-ids>.xlsx) is created in <output_dir>.
        • An Excel sheet is created for each pipeline named in the query. If --run-for-all is used, a sheet is created for every pipeline.
        • If the --run-for-all flag is used, an additional Overall Project sheet is created. NOTE that it
      • OpenLineage format
        • Dummy run-event JSON files in OpenLineage format are saved in the <output_dir>/ folder.
        • The extractor attempts an API call to an OpenLineage-compatible frontend such as Marquez or DataHub.
        • Column-level lineage and project-level lineage are both supported by this writer.
  • --run-for-all

    • Type: boolean flag
    • Description: If specified, project-level lineage is also written to the Excel output as an Overall Project sheet.
  • --send-email

    • Type: flag
    • Description: If specified, emails the generated lineage report to the address in the RECEIVER_EMAIL environment variable.
    • The following environment variables must be set when this option is passed:
      • SMTP_HOST
      • SMTP_PORT
      • SMTP_USERNAME
      • SMTP_PASSWORD
      • RECEIVER_EMAIL
  • --branch

    • Type: str
    • Description: Branch to run the lineage extractor on.
    • Default: the project's default branch in Prophecy, generally main or master.
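To give a sense of what an OpenLineage run event with column-level lineage looks like, here is a simplified sketch. It is not the extractor's actual output: `make_run_event`, the `prophecy-demo` namespace, the producer URL, and the dataset and column names are all hypothetical. The top-level keys follow the OpenLineage RunEvent shape; the `_producer` and `_schemaURL` metadata that the specification expects on each facet is omitted for brevity, and the `schemaURL` value should match the spec version in use.

```python
import json
import uuid
from datetime import datetime, timezone

def make_run_event(job_name, inputs, output, column_map, namespace="prophecy-demo"):
    """Build a minimal OpenLineage-style COMPLETE run event as a plain dict.

    column_map maps each output column to a list of
    (input_dataset, input_column) pairs it is derived from.
    """
    fields = {
        out_col: {
            "inputFields": [
                {"namespace": namespace, "name": dataset, "field": column}
                for dataset, column in sources
            ]
        }
        for out_col, sources in column_map.items()
    }
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/lineage-sketch",  # hypothetical producer URI
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [{"namespace": namespace, "name": name} for name in inputs],
        "outputs": [
            {
                "namespace": namespace,
                "name": output,
                # Simplified facet: per-facet metadata fields are left out here.
                "facets": {"columnLineage": {"fields": fields}},
            }
        ],
    }

event = make_run_event(
    "customer_orders_demo",
    inputs=["customers", "orders"],
    output="customer_orders",
    column_map={"customer_name": [("customers", "name")]},
)
print(json.dumps(event, indent=2))
```

A dict like this can be written to disk with `json.dump`, or sent to a compatible backend; Marquez, for instance, accepts OpenLineage events over HTTP at its `/api/v1/lineage` endpoint.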

Running

  • Run the extractor as follows; it requires the environment variables shown below.
  • The SMTP credentials only need to be set if the --send-email argument is passed.
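# ${{ secrets.* }} is GitHub Actions expression syntax; substitute real values when running in a local shell
export PROPHECY_URL=https://app.prophecy.io
export PROPHECY_PAT=${{ secrets.PROPHECY_PAT }}

# These are needed only if you use the --send-email option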
export SMTP_HOST=smtp.gmail.com
export SMTP_PORT=587
export SMTP_USERNAME=${{ secrets.SMTP_USERNAME }}
export SMTP_PASSWORD=${{ secrets.SMTP_PASSWORD }}
export RECEIVER_EMAIL=ashish@prophecy.io

python -m prophecy_lineage_extractor --project-id 36587 --pipeline-id 36587/pipelines/customer_orders_demo --send-email --branch dev
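Because a missing variable is an easy mistake in CI, a wrapper script might verify the environment before launching the extractor. The sketch below assumes nothing about the package itself: `missing_env` is a hypothetical helper, and the variable names are the ones listed in this document.

```python
import os

REQUIRED_VARS = ("PROPHECY_URL", "PROPHECY_PAT")
SMTP_VARS = ("SMTP_HOST", "SMTP_PORT", "SMTP_USERNAME", "SMTP_PASSWORD", "RECEIVER_EMAIL")

def missing_env(send_email, environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    # SMTP settings are only needed when --send-email will be passed.
    names = REQUIRED_VARS + (SMTP_VARS if send_email else ())
    return [name for name in names if not env.get(name)]
```

Calling `missing_env(send_email=True)` before the run and aborting on a non-empty result gives a clearer failure than a mid-run error.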

Github Action Guide

  • This extractor can be set up as a GitHub Action in a Prophecy project to email the lineage report on every commit to main.

  • A sample GitHub Action for use on the default branch: Github Action default branch

  • A sample GitHub Action for use on a custom branch: Github Action custom branch


Gitlab Action Guide

  • A sample GitLab action for use on a branch: Gitlab Action guide
  • Note: GitLab CI/CD variables (secrets) must be created before they can be used in the YML file, e.g. SMTP_USER.
  • Additionally, an ACCESS_TOKEN must be set up to allow the job to commit, if commit is enabled.
