IBM watsonx.data Spark plugin for dbt

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.

dbt-watsonx-spark

The dbt-watsonx-spark package contains all of the code enabling dbt to work with IBM Spark on watsonx.data. For details, see the official documentation for using watsonx.data with dbt-watsonx-spark.

Getting started

Installation

To install the dbt-watsonx-spark plugin, use pip:

$ pip install dbt-watsonx-spark
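
As a quick sanity check after installing, the following sketch looks up the installed version using Python's standard importlib.metadata module. The helper name installed_version is illustrative only and is not part of the plugin.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the plugin is installed, otherwise None.
print(installed_version("dbt-watsonx-spark"))
```

If this prints None, the package is probably installed in a different Python environment from the one running the check.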

Configuration

Make sure a query server is running in watsonx.data, then create an entry in your ~/.dbt/profiles.yml file with the following options:

  • To view the connection details, click the three-dot menu for the query server.
  • You can also copy the connection information from the Configuration tab -> Connection Information -> Data Build Tool (DBT).
  • Use the following template to construct and configure the profile:

dbt_wxd:

  target: dev
  outputs:
    dev:
      type: watsonx_spark
      method: "http"
      
      # Number of threads for dbt operations; see https://docs.getdbt.com/docs/running-a-dbt-project/using-threads
      threads: 1

      # Name of an existing schema in Data Manager in watsonx.data, or of a new schema to create there
      schema: '<wxd_schema>'
      
      # Hostname of your watsonx.data console (ex: us-south.lakehouse.cloud.ibm.com)
      host: https://<your-host>.com

      # URI of your query server running on watsonx.data
      uri: "/lakehouse/api/v2/spark_engines/<spark_engine_id>/sql_servers/<server_id>/connect/cliservice"
      
      # Catalog linked to your Spark engine within the query server
      catalog: "<wxd_catalog>"
      
      # Optional: set to false to disable SSL verification
      use_ssl: false

      auth:
        # For SaaS, set this to the CRN of the watsonx.data service
        # For Software, set this to the instance ID of watsonx.data
        instance: "<CRN/InstanceId>"
        
        # For SaaS, set this to your email address
        # For Software, set this to your username
        user: "<user@example.com/username>"

        # This must be your API Key
        apikey: "<apikey>"
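
Before running dbt, it can help to confirm that every placeholder in the template has been replaced. The sketch below is one way to do that, assuming the target has already been loaded into a Python dict. The helper names build_uri and unfilled_placeholders, and the sample IDs, are hypothetical and not part of the plugin. The URI pattern is the one shown in the template above.

```python
import re

# Matches template placeholders such as <wxd_schema> or <apikey>.
PLACEHOLDER = re.compile(r"<[^<>]+>")


def build_uri(spark_engine_id: str, server_id: str) -> str:
    """Fill in the query-server URI pattern from the profile template."""
    return (
        f"/lakehouse/api/v2/spark_engines/{spark_engine_id}"
        f"/sql_servers/{server_id}/connect/cliservice"
    )


def unfilled_placeholders(target: dict, prefix: str = "") -> list:
    """Return dotted key paths whose string values still contain a <placeholder>."""
    found = []
    for key, value in target.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections such as 'auth'.
            found.extend(unfilled_placeholders(value, prefix=f"{path}."))
        elif isinstance(value, str) and PLACEHOLDER.search(value):
            found.append(path)
    return found


# Hypothetical, partially filled-in target for illustration only.
target = {
    "type": "watsonx_spark",
    "method": "http",
    "threads": 1,
    "schema": "analytics",
    "host": "https://<your-host>.com",
    "uri": build_uri("spark123", "server456"),
    "catalog": "<wxd_catalog>",
    "auth": {
        "instance": "<CRN/InstanceId>",
        "user": "analyst",
        "apikey": "<apikey>",
    },
}

print(unfilled_placeholders(target))
# → ['host', 'catalog', 'auth.instance', 'auth.apikey']
```

An empty list means no template placeholders remain. It does not verify that the values themselves are valid for your watsonx.data instance.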
        
