A dbt-core plugin to import public nodes in multi-project deployments.

dbt-loom

dbt-loom is a dbt Core plugin that weaves together multi-project deployments. dbt-loom works by fetching public model definitions from your dbt artifacts, and injecting those models into your dbt project.

flowchart LR

    classDef black fill:#f2f2ebff, stroke:#000, color:#000
    classDef background fill:#f2f2ebff, stroke:#000, color:#000
    classDef hidden fill:#BADC3F, stroke:#BADC3F, color:#BADC3F

   style TOP fill:#BADC3F, stroke:#000

  subgraph TOP[Your Infrastructure]
    direction TB
    dbt_runtime[dbt Core]:::background
    proprietary_plugin[Open Source Metadata Plugin]:::background

    files[Local and Remote Files]:::background
    object_storage[Object Storage]:::background
    discovery_api[dbt Cloud APIs]:::background

    discovery_api --> proprietary_plugin
    files --> proprietary_plugin
    object_storage --> proprietary_plugin
    proprietary_plugin --> dbt_runtime
  end

  Project:::black --> TOP --> Warehouse:::black

dbt-loom currently supports obtaining model definitions from:

  • Local manifest files
  • Remote manifest files via http(s)
  • dbt Cloud
  • GCS
  • S3-compatible object storage services
  • Azure Storage

:warning: dbt Core's plugin functionality is still in beta, so dbt-loom may break as dbt Labs solidifies the plugin API in future versions.

Getting Started

To begin, install the dbt-loom Python package.

pip install dbt-loom

Next, create a dbt-loom configuration file. This configuration file provides the paths for your upstream project's manifest files.

manifests:
  - name: project_name # This should match the project's real name
    type: file
    config:
      # A path to your manifest. This can be either a local path, or a remote
      # path accessible via http(s).
      path: path/to/manifest.json
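
Because the `path` value can be either a local path or an http(s) URL, a loader has to tell the two apart. The following is a minimal sketch of one way to do that, not dbt-loom's actual implementation; the function name `is_remote_path` is hypothetical.

```python
from urllib.parse import urlparse


def is_remote_path(path: str) -> bool:
    """Return True when the path looks like an http(s) URL (hypothetical helper)."""
    # Anything without an http or https scheme is treated as a local file path.
    return urlparse(path).scheme in ("http", "https")
```

For example, `is_remote_path("https://example.com/manifest.json")` is true, while `is_remote_path("path/to/manifest.json")` is false.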

By default, dbt-loom will look for dbt_loom.config.yml in your working directory. You can also set the DBT_LOOM_CONFIG environment variable.
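
The lookup described above can be sketched roughly as follows. This is an illustration only: the helper name `resolve_config_path` is hypothetical, and it assumes that `DBT_LOOM_CONFIG`, when set, takes precedence over the default file name.

```python
import os
from pathlib import Path

DEFAULT_CONFIG_NAME = "dbt_loom.config.yml"


def resolve_config_path(environ=None, cwd=None) -> Path:
    """Pick the config file location (hypothetical helper, not dbt-loom's code)."""
    env = os.environ if environ is None else environ
    # Assumption: a non-empty DBT_LOOM_CONFIG overrides the default location.
    override = env.get("DBT_LOOM_CONFIG")
    if override:
        return Path(override)
    # Otherwise fall back to dbt_loom.config.yml in the working directory.
    base = Path.cwd() if cwd is None else Path(cwd)
    return base / DEFAULT_CONFIG_NAME
```

Passing `environ` and `cwd` explicitly keeps the helper easy to exercise without touching the real process environment.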

Using dbt Cloud as an artifact source

You can use dbt-loom to fetch model definitions from dbt Cloud by setting up a dbt_cloud manifest in your dbt-loom config, and setting the DBT_CLOUD_API_TOKEN environment variable in your execution environment.

manifests:
  - name: project_name
    type: dbt_cloud
    config:
      account_id: <YOUR DBT CLOUD ACCOUNT ID>

      # Job ID pertains to the job that you'd like to fetch artifacts from.
      job_id: <REFERENCE JOB ID>

      api_endpoint: <DBT CLOUD ENDPOINT>
      # dbt Cloud has multiple regions with different URLs. Update this to
      # your appropriate dbt cloud endpoint.

      step_id: <JOB STEP>
      # If your job generates multiple artifacts, you can set the step from
      # which to fetch artifacts. Defaults to the last step.

Using an S3-compatible object store as an artifact source

You can use dbt-loom to fetch manifest files from S3-compatible object stores by setting up an s3 manifest in your dbt-loom config. This approach supports all standard boto3-compatible environment variables and authentication mechanisms. See the boto3 documentation for more details.

manifests:
  - name: project_name
    type: s3
    config:
      bucket_name: <YOUR S3 BUCKET NAME>
      # The name of the bucket where your manifest is stored.

      object_name: <YOUR OBJECT NAME>
      # The object name of your manifest file.

Using GCS as an artifact source

You can use dbt-loom to fetch manifest files from Google Cloud Storage by setting up a gcs manifest in your dbt-loom config.

manifests:
  - name: project_name
    type: gcs
    config:
      project_id: <YOUR GCP PROJECT ID>
      # The alphanumeric ID of the GCP project that contains your target bucket.

      bucket_name: <YOUR GCS BUCKET NAME>
      # The name of the bucket where your manifest is stored.

      object_name: <YOUR OBJECT NAME>
      # The object name of your manifest file.

      credentials: <PATH TO YOUR SERVICE ACCOUNT JSON CREDENTIALS>
      # The OAuth2 Credentials to use. If not passed, falls back to the default inferred from the environment.

Using Azure Storage as an artifact source

You can use dbt-loom to fetch manifest files from Azure Storage by setting up an azure manifest in your dbt-loom config. The azure type authenticates with the DefaultAzureCredential class, so it supports all of that class's environment variables and authentication mechanisms. Alternatively, set the AZURE_STORAGE_CONNECTION_STRING environment variable to authenticate via a connection string.

manifests:
  - name: project_name
    type: azure
    config:
      account_name: <YOUR AZURE STORAGE ACCOUNT NAME> # The name of your Azure Storage account
      container_name: <YOUR AZURE STORAGE CONTAINER NAME> # The name of your Azure Storage container
      object_name: <YOUR OBJECT NAME> # The object name of your manifest file.

Using environment variables

You can easily incorporate your own environment variables into the config file. This allows for dynamic configuration values that can change based on the environment. To specify an environment variable in the dbt-loom config file, use one of the following formats:

${ENV_VAR} or $ENV_VAR

Example:

manifests:
  - name: revenue
    type: gcs
    config:
      project_id: ${GCP_PROJECT}
      bucket_name: ${GCP_BUCKET}
      object_name: ${MANIFEST_PATH}
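
To illustrate how the `${ENV_VAR}` and `$ENV_VAR` forms might be substituted, here is a minimal sketch built on Python's standard `string.Template`, which understands both forms. It is not dbt-loom's actual implementation; the helper names `expand_env` and `expand_config` are hypothetical, and leaving unset variables untouched is an assumption of this sketch.

```python
from string import Template


def expand_env(value: str, environ) -> str:
    """Substitute $VAR and ${VAR} references from a mapping (hypothetical helper)."""
    # safe_substitute leaves references to unset variables in place.
    return Template(value).safe_substitute(environ)


def expand_config(node, environ):
    """Walk a parsed config and expand every string value."""
    if isinstance(node, dict):
        return {key: expand_config(val, environ) for key, val in node.items()}
    if isinstance(node, list):
        return [expand_config(item, environ) for item in node]
    if isinstance(node, str):
        return expand_env(node, environ)
    return node
```

In real use the mapping would typically be `os.environ`; a plain dictionary works just as well for trying the helper out.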

Gzipped files

dbt-loom natively supports decompressing gzipped manifest files. This is useful to reduce object storage size and to minimize loading times when reading manifests from object storage. Compressed file detection is triggered when the file path for the manifest is suffixed with .gz.

manifests:
  - name: revenue
    type: s3
    config:
      bucket_name: example_bucket_name
      object_name: manifest.json.gz
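
The suffix-based detection can be sketched as below. This is an illustrative approximation rather than dbt-loom's own code; the function name `load_manifest_bytes` is hypothetical, and it assumes the raw bytes have already been fetched from disk or object storage.

```python
import gzip
import json


def load_manifest_bytes(path: str, raw: bytes) -> dict:
    """Parse manifest bytes, decompressing first when the path ends in .gz."""
    if path.endswith(".gz"):
        # Only the file name decides whether to decompress, not the content.
        raw = gzip.decompress(raw)
    return json.loads(raw)
```

Keying on the suffix keeps the check cheap, but it also means a gzipped object stored without the .gz suffix would not be decompressed under this sketch.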

How does it work?

As of dbt-core 1.6.0-b8, dbt-core provides a dbtPlugin class that defines functions which can be called by dbt-core's PluginManager. During different parts of the dbt-core lifecycle (such as graph linking and manifest writing), the PluginManager is called and all plugins registered with the appropriate hook are executed.

dbt-loom implements a get_nodes hook, and uses a configuration file to parse manifests, identify public models, and inject those public models when called by dbt-core.
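
The "identify public models" step can be approximated with a filter over a parsed manifest. The sketch below assumes the manifest layout used by dbt artifacts (a top-level `nodes` mapping whose entries carry `resource_type` and `access` fields); the function name `public_models` is hypothetical, and dbt-loom's real selection logic may differ.

```python
def public_models(manifest: dict) -> dict:
    """Return the manifest nodes that are models with public access."""
    return {
        unique_id: node
        for unique_id, node in manifest.get("nodes", {}).items()
        # Keep only models explicitly marked as public.
        if node.get("resource_type") == "model" and node.get("access") == "public"
    }
```

Nodes that are not models, or models whose access is protected or private, are dropped by this filter.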

Known Caveats

Cross-project dependencies are a relatively new development, and dbt-core plugins are still in beta. As such, there are a number of caveats to be aware of when using this tool.

  1. dbt plugins are only supported in dbt-core version 1.6.0-b8 and newer. This means you must be using a dbt adapter compatible with this version.
  2. PluginNodeArgs are not fully-realized dbt ManifestNodes, so documentation generated by dbt docs generate may be sparse when viewing injected models.

