The VDK Lineage plugin collects lineage information (input -> job -> output) and sends it to a pre-configured destination.
Project description
VDK Lineage
The VDK Lineage plugin collects lineage information (input data -> job -> output data) and sends it to a pre-configured destination.
The plugin is currently at proof-of-concept (POC) level. It collects lineage information for each job run and for each executed query. Query lineage is currently recorded before the query is executed, so no query status is logged.
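To make the "input data -> job -> output data" shape concrete, here is a minimal sketch of a simplified OpenLineage-style run event built as a plain Python dict. It is an illustration only, not the plugin's implementation: the function name `build_run_event`, the namespace, the dataset names, and the producer value are all hypothetical, and a real event carries more fields than shown.

```python
import uuid
from datetime import datetime, timezone


def build_run_event(job_name, inputs, outputs, namespace="example-namespace"):
    """Build a simplified OpenLineage-style run event as a plain dict.

    All argument names and default values here are hypothetical.
    """

    def dataset(name):
        return {"namespace": namespace, "name": name}

    return {
        # START matches the POC behaviour described above: the event is
        # created before the query runs, so no final status is known.
        "eventType": "START",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [dataset(name) for name in inputs],
        "outputs": [dataset(name) for name in outputs],
        # Placeholder producer identifier, not the plugin's real value.
        "producer": "https://example.com/hypothetical-producer",
    }


event = build_run_event("some-job", inputs=["raw_orders"], outputs=["daily_orders"])
print(event["job"]["name"], event["inputs"], event["outputs"])
```

Each dataset is identified by a namespace and a name, and the job sits between the input and output lists, which is how the arrow notation above maps onto a single record.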
Usage
pip install vdk-lineage
Once installed, the plugin starts collecting lineage from job runs and SQL queries.
To send data using OpenLineage, set VDK_OPENLINEAGE_URL. For example:
export VDK_OPENLINEAGE_URL=http://localhost:5002
vdk marquez-server --start
vdk run some-job
# check UI for lineage
# stopping the server will delete any lineage data.
vdk marquez-server --stop
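Before running a job, it can help to confirm that VDK_OPENLINEAGE_URL is set to a well-formed URL. The sketch below is a standalone check using only the Python standard library. The helper name `openlineage_endpoint` is hypothetical and is not part of the plugin, and the plugin's own handling of this variable may differ.

```python
import os
from urllib.parse import urlparse


def openlineage_endpoint(env=None):
    """Return (host, port) parsed from VDK_OPENLINEAGE_URL, or None if unset.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    url = env.get("VDK_OPENLINEAGE_URL")
    if not url:
        return None
    parsed = urlparse(url)
    # Reject values such as "localhost:5002" that lack an http(s) scheme.
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"VDK_OPENLINEAGE_URL is not a valid HTTP(S) URL: {url!r}")
    return parsed.hostname, parsed.port


print(openlineage_endpoint({"VDK_OPENLINEAGE_URL": "http://localhost:5002"}))
```

Passing the environment as an argument keeps the check easy to try with different values without changing the real shell environment.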
Build and testing
To build and test a plugin, go to the plugin directory and run the ../build-plugin.sh script.
Source Distribution
File details
Details for the file vdk-lineage-0.1.6.tar.gz.
File metadata
- Download URL: vdk-lineage-0.1.6.tar.gz
- Upload date:
- Size: 10.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.10.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 054837e8cb9f22248a001db2215e7620843d3d6fd192f8cf2968e69b11333fa0
MD5 | 1a15534ad135771a17789bd86578a379
BLAKE2b-256 | bf375d59f8b8a82f5328844a28dd1ea1350768bbd8e0fb4673c43a7106bf3c9a