Python package to extract column-level lineage (CLL) from dbt files

Project description

py-dbt-cll

py-dbt-cll is a Python package that extracts column-level lineage (CLL) from dbt models based on their metadata in the manifest file. It does not require a database connection; it relies solely on SQLGlot to extract the column-level lineage from a SQL query. Before the query is passed to SQLGlot, it is enriched with additional information from the manifest file so that the column lineage can be determined accurately.

Installation

You can install the package using pip:

pip install py-dbt-cll

Usage

Import the class from the module.

from py_dbt_cll.dbt_lineage import DbtCLL

Load your manifest from its JSON file and create a class instance with it. You can then call the extract_cll method to extract column lineage information from your SQL queries.

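import json

# Load the dbt manifest (this path points at the project's test fixture).
with open("tests/manifest.json", "r", encoding="utf-8") as file:
    manifest_data = json.load(file)
cll = DbtCLL(manifest_data)

sql = """
    select *
    from (
        select *
        from ...
    ) as final
"""
# Names of the output columns whose lineage should be traced.
columns = ["academic_year_id", "date_id"]
lineage = cll.extract_cll(sql, columns, debug=False)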

Parameters:

  • sql (str): The SQL query from which to extract column lineage.
  • columns (list): A list of column names to extract lineage for.
  • debug (bool): Whether to enable debug mode for more verbose output (default is False).
  • dialect (str): The SQL dialect to use for parsing the SQL query (default is "tsql").
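
The columns list does not have to be typed by hand. As a minimal sketch, assuming the usual dbt manifest layout (a top-level "nodes" mapping whose entries carry "resource_type", "name" and a "columns" mapping keyed by column name), the names can be read from the manifest itself. The helper model_columns and the sample manifest below are hypothetical and not part of py-dbt-cll; a manifest only lists columns that are documented in the project's YAML files, so the result may be incomplete.

```python
from typing import Any


def model_columns(manifest: dict[str, Any], model_name: str) -> list[str]:
    """Return the documented column names of a dbt model, in manifest order.

    Hypothetical helper, not part of py-dbt-cll.
    """
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") == "model" and node.get("name") == model_name:
            # "columns" is a mapping keyed by column name.
            return list(node.get("columns", {}))
    raise KeyError(f"model {model_name!r} not found in manifest")


# Tiny stand-in for a real manifest.json, for illustration only.
sample_manifest = {
    "nodes": {
        "model.my_project.dim_date": {
            "resource_type": "model",
            "name": "dim_date",
            "columns": {
                "academic_year_id": {"name": "academic_year_id"},
                "date_id": {"name": "date_id"},
            },
        }
    }
}

columns = model_columns(sample_manifest, "dim_date")
print(columns)  # ['academic_year_id', 'date_id']
```

The resulting list can be passed as the columns argument of extract_cll in place of a hard-coded one.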

Returns:

  • dict: A dictionary mapping column names to their lineage information.
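
Because the result is keyed by column name, a quick sanity check is to compare its keys with the columns that were requested. The sketch below relies only on that fact. It makes no assumption about the shape of each lineage entry, and the description above does not say whether unresolved columns are omitted from the result, so treat this as a defensive check. The helper missing_columns is hypothetical, and the lineage dict is a placeholder standing in for a real extract_cll result.

```python
def missing_columns(requested: list[str], lineage: dict) -> list[str]:
    """Return the requested column names that have no entry in the lineage dict.

    Hypothetical helper, not part of py-dbt-cll.
    """
    return [name for name in requested if name not in lineage]


# Placeholder result; the real values come from extract_cll and their
# structure is not specified here.
lineage = {"academic_year_id": "<lineage info>"}
requested = ["academic_year_id", "date_id"]

print(missing_columns(requested, lineage))  # ['date_id']
```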

Download files

Source Distribution

py_dbt_cll-0.1.3.tar.gz (4.5 kB view details)

Uploaded Source

Built Distribution

py_dbt_cll-0.1.3-py3-none-any.whl (5.2 kB view details)

Uploaded Python 3

File details

Details for the file py_dbt_cll-0.1.3.tar.gz.

File metadata

  • Download URL: py_dbt_cll-0.1.3.tar.gz
  • Size: 4.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for py_dbt_cll-0.1.3.tar.gz
Algorithm Hash digest
SHA256 0978e159796774ff56a5a1910b78b4c35d21d9c64b64d58637ab63af0afeac38
MD5 e1748b4ee4de7fde2f7167db548316b1
BLAKE2b-256 82a09e6b3b923645b435e81d981705125fb33e2603bc19dd2b55a6934bb329e4

Provenance

The following attestation bundles were made for py_dbt_cll-0.1.3.tar.gz:

Publisher: publish.yml on ngmiduc/py-dbt-cll

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file py_dbt_cll-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: py_dbt_cll-0.1.3-py3-none-any.whl
  • Size: 5.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for py_dbt_cll-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 5f9109aa915e743cdf613fabc5bb8338dce793fa9baa54b8e2a8f9b7439abaeb
MD5 ad17fcc76b1926be3c43a7db1007d443
BLAKE2b-256 1fc5afc462408b00d3dc059853bb5d2043b454b20a1ef362eb6a6168864eba70

Provenance

The following attestation bundles were made for py_dbt_cll-0.1.3-py3-none-any.whl:

Publisher: publish.yml on ngmiduc/py-dbt-cll

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
