Open Source Data Lineage Tool For AWS and GCP

Data Lineage for Databases and Data Lakes

Data Lineage is an open source application to query and visualize data lineage in databases, data warehouses and data lakes in AWS and GCP.

Features

  • Generates lineage from SQL query history.
  • Supports ANSI SQL queries.
  • Integrates with Jupyter Notebook.
  • Visualizes data lineage using Plotly.
  • Lets you select a source or target table.
  • Lets you pan, zoom, and select within the graph.
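
To make the first feature concrete, here is a minimal sketch of how table-level lineage can be derived from a query history. It is not the package's parser or API: the function name `lineage_edges`, the regular expressions, and the sample queries are all hypothetical, and the regex approach only handles simple `INSERT INTO ... SELECT ... FROM/JOIN` statements, whereas a real implementation would need a full SQL parser.

```python
import re
from typing import List, Set, Tuple

# Hypothetical patterns (assumption): only plain INSERT ... SELECT statements
# with unquoted table names are recognised.
_TARGET = re.compile(r"\binsert\s+into\s+([\w.]+)", re.IGNORECASE)
_SOURCE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def lineage_edges(queries: List[str]) -> Set[Tuple[str, str]]:
    """Return (source, target) table pairs found in INSERT ... SELECT statements."""
    edges: Set[Tuple[str, str]] = set()
    for sql in queries:
        target = _TARGET.search(sql)
        if target is None:
            continue  # read-only query: it does not write to any table
        for source in _SOURCE.findall(sql):
            edges.add((source.lower(), target.group(1).lower()))
    return edges


# Made-up query history for illustration.
history = [
    "INSERT INTO daily_sales SELECT * FROM orders JOIN customers ON orders.cid = customers.id",
    "INSERT INTO weekly_report SELECT * FROM daily_sales",
    "SELECT count(*) FROM orders",
]
edges = lineage_edges(history)
```

Each pair in `edges` is a directed edge from a source table to the table it feeds, which is the shape of data a graph visualization would consume.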

Check out an example data lineage notebook.

Use Cases

Data Lineage enables the following use cases:

  • Business Rules Verification
  • Change Impact Analysis
  • Data Quality Verification
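
As an illustration of change impact analysis, the sketch below walks a lineage graph breadth-first to list every table downstream of a changed table. It assumes lineage is available as plain `(source, target)` pairs; the function name `downstream_tables` and the example tables are hypothetical and are not part of the package's API.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple


def downstream_tables(edges: Iterable[Tuple[str, str]], changed: str) -> List[str]:
    """Return every table reachable from `changed`, sorted by name."""
    # Adjacency map: source table -> tables that read from it.
    children: Dict[str, Set[str]] = defaultdict(set)
    for source, target in edges:
        children[source].add(target)

    seen: Set[str] = set()
    queue = deque([changed])
    while queue:
        table = queue.popleft()
        for child in children.get(table, ()):
            # Skip tables already visited; this also guards against cycles.
            if child not in seen and child != changed:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Made-up lineage edges for illustration.
example_edges = [
    ("orders", "daily_sales"),
    ("customers", "daily_sales"),
    ("daily_sales", "weekly_report"),
    ("customers", "mailing_list"),
]
impacted = downstream_tables(example_edges, "orders")
```

With these example edges, a schema change to `orders` affects `daily_sales` and, through it, `weekly_report`, while `mailing_list` is untouched.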

Check out the post on using data lineage for cost control for an example of how data lineage can be used in production.

Quick Start

# Install packages
pip install data-lineage
pip install jupyter

jupyter notebook

# Check out the example notebook: http://tokern.io/docs/data-lineage/example/

Supported Technologies

  • Postgres

Coming Soon

  • MySQL
  • AWS Redshift
  • SparkSQL
  • Presto

Developer Setup

# Install dependencies
pipenv install --dev

# Set up pre-commit and pre-push hooks
pipenv run pre-commit install -t pre-commit
pipenv run pre-commit install -t pre-push

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

data-lineage-0.2.0.tar.gz (11.5 kB)

Built Distribution

data_lineage-0.2.0-py3-none-any.whl (11.7 kB, Python 3)
