A tool to analyze and extract information from Jinja used in dbt projects.

dbt extractor

This repository contains a tool that processes the most common Jinja value templates in dbt model files. The tool depends on tree-sitter and the tree-sitter-jinja2 library.

Strategy

The current strategy is for this processor to return values only when it is 100% certain that it can extract them accurately from a given model file. Anything less than 100% certainty raises an exception so that the model can be rendered with Python Jinja instead.

There are two cases we want to avoid because they would put the correctness of users' projects at risk:

  1. Confidently extracting values that would not be extracted by Python Jinja (false positives).
  2. Confidently extracting a set of values that omits values Python Jinja would have extracted (misses).

If we instead raise an error when we could have confidently extracted values, there is no correctness risk to the user, only an opportunity to expand the rules to cover that class of cases as well.
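The fallback behaviour described above can be sketched roughly as follows. This is a toy illustration, not this tool's implementation: the regex-based `static_extract_refs`, the `CannotExtract` error, and the `full_render` callback are all hypothetical names, and the only template shape the sketch recognises is `{{ ref('name') }}`.

```python
import re
from typing import Callable, List, Tuple


class CannotExtract(Exception):
    """Hypothetical error: the static path is not fully certain."""


# The only template shape this toy extractor recognises: {{ ref('name') }}
REF_CALL = re.compile(r"\{\{\s*ref\(\s*'(\w+)'\s*\)\s*\}\}")


def static_extract_refs(source: str) -> List[str]:
    """Return ref names, or raise if any unrecognised Jinja remains."""
    leftover = REF_CALL.sub("", source)
    if "{{" in leftover or "{%" in leftover or "{#" in leftover:
        # Erring here costs only speed; guessing could cost correctness.
        raise CannotExtract("unrecognised Jinja; defer to full rendering")
    return REF_CALL.findall(source)


def extract_refs(
    source: str, full_render: Callable[[str], List[str]]
) -> Tuple[List[str], str]:
    """Try the fast static path; fall back to the authoritative renderer."""
    try:
        return static_extract_refs(source), "static"
    except CannotExtract:
        return full_render(source), "rendered"
```

The second element of the returned tuple only records which path produced the values. In this sketch, a model that mixes a supported `ref` with an unsupported `{% if %}` block is rejected as a whole rather than partially extracted, which is what keeps both false positives and misses out of the static path.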

Even though Jinja in dbt is not a typed language, the type checker statically determines whether the current implementation can confidently extract values without relying on Python Jinja rendering, which is when these errors would otherwise surface. The type checker will become more permissive over time as this tool expands to cover more dbt and Jinja features.

Architecture

This architecture is optimized for value extraction and for future flexibility. It is expected to change, and is written as functional-programming-style stages to make those changes easier.

This processor is composed of several stages:

  1. parser
  2. type checker
  3. extractor
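As a rough picture of what composing stages like these can look like, the sketch below chains three pure functions left to right. The stage bodies are placeholders invented for illustration (they split on whitespace rather than parse Jinja), and the `pipeline` helper is an assumption, not part of this tool.

```python
from functools import reduce
from typing import Any, Callable

Stage = Callable[[Any], Any]


def pipeline(*stages: Stage) -> Stage:
    """Compose stages left to right into a single function."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)


# Placeholder stages: each takes the previous stage's output and returns new data.
def parse(text: str) -> list:
    return text.split()


def type_check(tokens: list) -> list:
    # Any token the checker cannot vouch for aborts the whole pipeline.
    if not all(token.isidentifier() for token in tokens):
        raise TypeError("unsupported token")
    return tokens


def extract(tokens: list) -> dict:
    return {"refs": sorted(set(tokens))}


process = pipeline(parse, type_check, extract)
```

Because each stage only depends on its input, a stage can be replaced or a new one inserted without touching the others, which is the flexibility the staged design is aiming for.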

Additionally, the following tools utilize the above processor:

  1. browser-based demo of dbt extraction as you type

The tree-sitter parser is located in the tree-sitter-jinja2 library. The Rust bindings are used to traverse the concrete syntax tree that tree-sitter creates in order to build a typed abstract syntax tree in the type checking stage. Errors from the type checking stage are not raised to the user; developers use them to debug tests.

The parser is solely responsible for turning text into recognized values, while the type checker performs arity checking and enforces argument list types. For example, a nested function call like {{ config(my_ref=ref('table')) }} will parse but not type check, even though it is valid dbt syntax. The tool does not currently have an agreed serialization for communicating refs as config values, but could in the future.
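A minimal sketch of this kind of check is shown below, assuming a simplified rule set: `ref` takes one or two string literals, `source` takes exactly two, and `config` takes only keyword arguments whose values are not function calls. The node classes, the `TypeCheckError` name, and the exact rules are illustrative assumptions and may differ from what the tool enforces.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Union


@dataclass
class Str:
    value: str


@dataclass
class Call:
    name: str
    args: List["Expr"] = field(default_factory=list)
    kwargs: Dict[str, "Expr"] = field(default_factory=dict)


Expr = Union[Str, Call]


class TypeCheckError(Exception):
    """Hypothetical error for anything the checker cannot vouch for."""


def _require_strings(call: Call, allowed_arity: Set[int]) -> None:
    # Arity check first, then argument types.
    if call.kwargs or len(call.args) not in allowed_arity:
        raise TypeCheckError(f"{call.name}: wrong number of arguments")
    if not all(isinstance(arg, Str) for arg in call.args):
        raise TypeCheckError(f"{call.name}: arguments must be string literals")


def type_check(call: Call) -> Call:
    """Return the call unchanged if it passes, otherwise raise."""
    if call.name == "ref":
        _require_strings(call, {1, 2})
    elif call.name == "source":
        _require_strings(call, {2})
    elif call.name == "config":
        if call.args:
            raise TypeCheckError("config: only keyword arguments are supported")
        for key, value in call.kwargs.items():
            if isinstance(value, Call):
                raise TypeCheckError(f"config: nested call in '{key}'")
    else:
        raise TypeCheckError(f"unknown function: {call.name}")
    return call
```

Under these assumed rules, `Call("config", kwargs={"my_ref": Call("ref", [Str("table")])})` is a well-formed tree that the checker still rejects, mirroring the nested-call case described above.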

The extractor uses the typed abstract syntax tree to easily identify all the refs, sources, and configs present and extract them.
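To illustrate why a typed tree makes this step simple, here is a small sketch in which every node is already known to be a ref, a source, or a config, so extraction reduces to a single pass that sorts nodes into buckets. The node classes and the shape of the result dictionary are assumptions made for this example, not the tool's actual output format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union


@dataclass(frozen=True)
class Ref:
    name: str
    package: Optional[str] = None


@dataclass(frozen=True)
class Source:
    source_name: str
    table_name: str


@dataclass(frozen=True)
class Config:
    values: Tuple[Tuple[str, object], ...]


TypedNode = Union[Ref, Source, Config]


def extract(nodes: List[TypedNode]) -> Dict[str, list]:
    """Collect refs, sources, and config key/value pairs in document order."""
    result: Dict[str, list] = {"refs": [], "sources": [], "configs": []}
    for node in nodes:
        if isinstance(node, Ref):
            ref = [node.name] if node.package is None else [node.package, node.name]
            result["refs"].append(ref)
        elif isinstance(node, Source):
            result["sources"].append((node.source_name, node.table_name))
        elif isinstance(node, Config):
            result["configs"].extend(node.values)
    return result
```

No validation happens here: by the time nodes reach this stage, anything uncertain has already been rejected by the type checker.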

Running The Demo App

To see the full implementation extract dbt values live as you type in a browser, run:

make demo

It may take a moment for the demo to compile an optimized version of itself.

Kill the server with ctrl+c to end the demo.

Testing The Project

make test

Future Work

  • Refactor the tree-sitter Jinja parser into its own repository, potentially to open source it and engage with the community on implementing improvements.
  • Remove ref, source, and config type checking as hard-coded rules and instead read these function types from external function definition statements.
  • Create an input path for a manifest file so that the tool can be run on any project without additional pre-processing.
