The watsonx.data Spark plugin for dbt

Project description

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.

dbt-watsonx-spark

The dbt-watsonx-spark package contains all of the code that enables dbt to work with IBM Spark on watsonx.data. Read the official documentation for using watsonx.data with dbt-watsonx-spark:

  • Documentation for IBM Cloud and SaaS offerings
  • Documentation for IBM watsonx.data software

Getting started

Installation

To install the dbt-watsonx-spark plugin, use pip:

$ pip install dbt-watsonx-spark
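
After installing, it can be useful to confirm that the package is visible to the Python environment dbt runs in. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the plugin.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "dbt-watsonx-spark" is the distribution name of this package on PyPI.
found = installed_version("dbt-watsonx-spark")
print(found if found else "dbt-watsonx-spark is not installed in this environment")
```

If this reports that the package is missing, pip most likely installed it into a different interpreter or virtual environment than the one being checked.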

Configuration

Ensure you have started a Spark SQL server from watsonx.data. Then create an entry in your ~/.dbt/profiles.yml file using the following options:

  • You can view the connection details by clicking the three-dot menu for the SQL server.
  • You can construct and configure the profile using the template below:

dbt_wxd:
  target: dev
  outputs:
    dev:
      type: watsonx_spark
      method: "http"
      
      # Number of threads for dbt operations, see: https://docs.getdbt.com/docs/running-a-dbt-project/using-threads
      threads: 1

      # The value of 'schema' must be one of the schemas defined in Data Manager in watsonx.data
      schema: '<wxd_schema>'
      
      # Hostname of your watsonx.data console (e.g. us-south.lakehouse.cloud.ibm.com)
      host: https://<your-host>.com

      # URI of your Spark SQL server running on watsonx.data
      uri: "/lakehouse/api/v2/spark_engines/<spark_engine_id>/sql_servers/<server_id>/connect/cliservice"

      auth:
        # For SaaS, set this to the CRN of the watsonx.data service
        # For Software, set this to the instance ID of watsonx.data
        instance: "<CRN/InstanceId>"
        
        # For SaaS, set this to your email ID
        # For Software, set this to your username
        user: "<user@example.com/username>"

        # This must be your API key
        apikey: "<apikey>"
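
Because the template above marks every value you need to supply with angle brackets (for example `<wxd_schema>` or `<apikey>`), a quick check for leftover placeholders can catch an incomplete profile before dbt tries to connect. The sketch below is illustrative only: the `find_placeholders` helper and the sample text are hypothetical, and it scans the file as plain text rather than parsing YAML.

```python
import re

# Matches template placeholders such as <wxd_schema> or <CRN/InstanceId>.
PLACEHOLDER = re.compile(r"<[^<>\s]+>")


def find_placeholders(profile_text: str) -> list[tuple[int, str]]:
    """Return (line_number, placeholder) pairs for unfilled template values."""
    found = []
    for number, line in enumerate(profile_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip full-line YAML comments
        for match in PLACEHOLDER.finditer(stripped):
            found.append((number, match.group(0)))
    return found


# A shortened copy of the template with only the host filled in.
sample = """\
dbt_wxd:
  target: dev
  outputs:
    dev:
      type: watsonx_spark
      schema: '<wxd_schema>'
      host: https://us-south.lakehouse.cloud.ibm.com
      auth:
        apikey: "<apikey>"
"""

for number, placeholder in find_placeholders(sample):
    print(f"line {number}: replace {placeholder}")
```

To check a real profile, pass the file contents instead of the sample, for example `Path("~/.dbt/profiles.yml").expanduser().read_text()` with `Path` imported from `pathlib`.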
