dbt-watsonx-spark

IBM watsonx.data spark plugin for dbt

Project description

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.

dbt-watsonx-spark

The dbt-watsonx-spark package contains all of the code that enables dbt to work with IBM Spark on watsonx.data. For details, read the official documentation on using watsonx.data with dbt-watsonx-spark.

Getting started

Install dbt
Read the introduction and viewpoint

Installation

To install the dbt-watsonx-spark plugin, use pip:

$ pip install dbt-watsonx-spark
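
After installing, it can be useful to confirm from Python which version of the plugin is present in the active environment. The sketch below relies only on the standard library's importlib.metadata; the helper name installed_version is hypothetical and is not part of the plugin.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "dbt-watsonx-spark") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "dbt-watsonx-spark is not installed")
```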

Configuration

Make sure a query server is running in watsonx.data, then create an entry in your ~/.dbt/profiles.yml file using the template below.

You can view the connection details by clicking the three-dot menu for the query server.
You can also copy the connection information from the Configuration tab -> Connection Information -> Data Build Tool (DBT).

dbt_wxd:

  target: dev
  outputs:
    dev:
      type: watsonx_spark
      method: "http"
      
      # Number of threads for dbt operations; see https://docs.getdbt.com/docs/running-a-dbt-project/using-threads
      threads: 1

      # Name of an existing schema in Data Manager in watsonx.data, or of a new schema to create there
      schema: '<wxd_schema>'
      
      # Hostname of your watsonx.data console (ex: us-south.lakehouse.cloud.ibm.com)
      host: https://<your-host>.com

      # URI of your query server running on watsonx.data
      uri: "/lakehouse/api/v2/spark_engines/<spark_engine_id>/sql_servers/<server_id>/connect/cliservice"
      
      # Catalog linked to your Spark engine within the query server
      catalog: "<wxd_catalog>"
      
      # Optional: set to false to disable SSL verification
      use_ssl: false

      # Optional: Control automatic schema creation (default: true)
      # Set to false if schemas are managed externally (e.g., by Ops team)
      create_schemas: true

      # Optional: Control automatic LOCATION clause in CREATE TABLE (default: true)
      # Set to false if table locations are managed externally or to avoid permission issues
      auto_location: false

      auth:
        # In case of SaaS, set it as CRN of watsonx.data service
        # In case of Software, set it as instance id of watsonx.data
        instance: "<CRN/InstanceId>"
        
        # In case of SaaS, set it as your email id
        # In case of Software, set it as your username
        user: "<user@example.com/username>"

        # This must be your API Key
        apikey: "<apikey>"
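
Because the template ships with angle-bracket placeholders, a quick check for values that were never filled in can catch a common setup mistake before running dbt. The following is a minimal sketch, not part of the plugin: the helper unfilled_placeholders is hypothetical, the profile is written as a plain Python dict rather than loaded from profiles.yml, and the concrete values (analytics, my_catalog, user@example.com) are made up for illustration.

```python
import re
from typing import Any, List

# Matches template placeholders such as <wxd_schema> or <CRN/InstanceId>.
PLACEHOLDER = re.compile(r"<[^<>]+>")


def unfilled_placeholders(node: Any, path: str = "") -> List[str]:
    """Return dotted paths of string values that still contain a <placeholder>."""
    if isinstance(node, dict):
        found: List[str] = []
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            found.extend(unfilled_placeholders(value, child))
        return found
    if isinstance(node, str) and PLACEHOLDER.search(node):
        return [path]
    return []


# The "dev" output from the template, partially filled in with made-up values.
dev_output = {
    "type": "watsonx_spark",
    "method": "http",
    "threads": 1,
    "schema": "analytics",
    "host": "https://<your-host>.com",
    "uri": "/lakehouse/api/v2/spark_engines/<spark_engine_id>/sql_servers/<server_id>/connect/cliservice",
    "catalog": "my_catalog",
    "auth": {
        "instance": "<CRN/InstanceId>",
        "user": "user@example.com",
        "apikey": "<apikey>",
    },
}

print(unfilled_placeholders(dev_output))
# → ['host', 'uri', 'auth.instance', 'auth.apikey']
```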

Schema Creation Control

By default, dbt-watsonx-spark automatically creates schemas if they don't exist. However, in some environments where schema creation is managed by an operations team or through automation, you may want to disable this behavior.

You can control schema creation at three levels:

Profile level (applies to all models in the profile):

dbt_wxd:
  target: dev
  outputs:
    dev:
      type: watsonx_spark
      # ... other settings ...
      create_schemas: false  # Disable automatic schema creation

Project level (in dbt_project.yml):

models:
  my_project:
    +create_schemas: false  # Disable for all models in project

Model level (in model config or dbt_project.yml):

# In model file
{{ config(create_schemas=false) }}

# Or in dbt_project.yml
models:
  my_project:
    my_folder:
      +create_schemas: false  # Disable for specific folder

When create_schemas is set to false, dbt will skip schema creation and assume the schema already exists. This is useful when:

Schemas are created by an external automation or Ops team
You want to enforce strict schema management policies
You need to prevent accidental schema creation in production environments
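
One way to picture the three levels is as a fallback chain. The sketch below assumes that the most specific level wins (model over project over profile) and that an unset flag falls back to the documented default of true. That precedence is an assumption based on how dbt generally resolves model configs; it is not stated above, and resolve_create_schemas is a hypothetical helper, not a plugin function.

```python
from typing import Optional


def resolve_create_schemas(
    profile_value: Optional[bool] = None,
    project_value: Optional[bool] = None,
    model_value: Optional[bool] = None,
    default: bool = True,
) -> bool:
    """Pick the effective flag; None means "not set at this level".

    Assumed precedence (not confirmed by the README):
    model > project > profile > default.
    """
    for value in (model_value, project_value, profile_value):
        if value is not None:
            return value
    return default


if __name__ == "__main__":
    # Disabled in the profile, but re-enabled for one model.
    print(resolve_create_schemas(profile_value=False, model_value=True))
```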

Table Location Control

By default, dbt-watsonx-spark automatically adds a LOCATION clause when creating tables, based on the location_root configuration. However, if table locations are managed externally, or if you want to avoid S3 permission issues, you may want to disable this behavior.

You can control automatic location setting at three levels:

Profile level (applies to all models in the profile):

dbt_wxd:
  target: dev
  outputs:
    dev:
      type: watsonx_spark
      # ... other settings ...
      auto_location: false  # Disable automatic LOCATION clause

Project level (in dbt_project.yml):

models:
  my_project:
    +auto_location: false  # Disable for all models in project

Model level (in model config or dbt_project.yml):

# In model file
{{ config(auto_location=false) }}

# Or in dbt_project.yml
models:
  my_project:
    my_folder:
      +auto_location: false  # Disable for specific folder

When auto_location is set to false, dbt will not add a LOCATION clause to CREATE TABLE statements, allowing the database to use its default location or a location specified by external schema management. This is useful when:

Table locations are managed by an external automation or Ops team
You want to avoid S3 permission issues related to specific paths
The schema already has a default location configured
You need to comply with strict data governance policies
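
The effect of the flag on the generated SQL can be illustrated with a small sketch. This is not the adapter's actual macro: create_table_sql is a hypothetical helper, and the path layout (location_root followed by the table name) is an assumption for illustration, as is the example bucket name.

```python
from typing import Optional


def create_table_sql(
    schema: str,
    table: str,
    select_sql: str,
    location_root: Optional[str] = None,
    auto_location: bool = True,
) -> str:
    """Build a CREATE TABLE ... AS statement, optionally with a LOCATION clause."""
    parts = [f"CREATE TABLE {schema}.{table}"]
    # Add LOCATION only when the feature is enabled and a root path is configured.
    if auto_location and location_root:
        root = location_root.rstrip("/")
        parts.append(f"LOCATION '{root}/{table}'")
    parts.append(f"AS {select_sql}")
    return "\n".join(parts)


if __name__ == "__main__":
    # With auto_location enabled, the statement carries an explicit path.
    print(create_table_sql("sales", "orders", "SELECT 1 AS id",
                           location_root="s3a://example-bucket/warehouse"))
    # With auto_location disabled, the engine's default location applies.
    print(create_table_sql("sales", "orders", "SELECT 1 AS id",
                           location_root="s3a://example-bucket/warehouse",
                           auto_location=False))
```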

Project details

Release history

Version	Release date
0.1.6 (this version)	Jun 14, 2026
0.1.5	Jun 8, 2026
0.1.4	May 14, 2026
0.1.3	Apr 7, 2026
0.1.2	Dec 1, 2025
0.1.1	Oct 13, 2025
0.1.0	Sep 9, 2025
0.0.9	Aug 19, 2025
0.0.8	Dec 5, 2024
0.0.7	Nov 14, 2024
0.0.6	Nov 13, 2024
0.0.5	Sep 25, 2024
0.0.4	Sep 2, 2024
0.0.3	Sep 2, 2024
0.0.2	Sep 2, 2024
0.0.1	Sep 2, 2024
0.0.1b3 (pre-release)	May 14, 2026

Download files

Download the file for your platform.

Source Distribution

dbt_watsonx_spark-0.1.6.tar.gz (49.8 kB)

Uploaded Jun 14, 2026 Source

Built Distribution

dbt_watsonx_spark-0.1.6-py3-none-any.whl (60.6 kB)

Uploaded Jun 14, 2026 Python 3

File details

Details for the file dbt_watsonx_spark-0.1.6.tar.gz.

File metadata

Download URL: dbt_watsonx_spark-0.1.6.tar.gz
Upload date: Jun 14, 2026
Size: 49.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for dbt_watsonx_spark-0.1.6.tar.gz
Algorithm	Hash digest
SHA256	`53574135ff884c9181f73f3ebdd7ceb0ff046520e80d1adbef54fb976fa3f763`
MD5	`e8add00e4143b75b4b41c542d189a74d`
BLAKE2b-256	`9c1a3c5d593c8bb66e178b9932d1ed54a3d732d3c4c6069fdcf4155451275692`

File details

Details for the file dbt_watsonx_spark-0.1.6-py3-none-any.whl.

File metadata

Download URL: dbt_watsonx_spark-0.1.6-py3-none-any.whl
Upload date: Jun 14, 2026
Size: 60.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for dbt_watsonx_spark-0.1.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`82239eace33a8053c52b1a11a6ef24d8074396264c9b886be74fe7f2d2650672`
MD5	`f763b4d4b97a6efc73050254094dabd5`
BLAKE2b-256	`fa4392adc117daeaeba889313365b88b1ae39e26c6eabec6d80efb6bcd0df165`
