IBM watsonx.data spark plugin for dbt
Project description
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.
dbt-watsonx-spark
The dbt-watsonx-spark package contains all of the code enabling dbt to work with IBM Spark on watsonx.data. Read the official documentation for using watsonx.data with dbt-watsonx-spark.
Getting started
- Install dbt
- Read the introduction and viewpoint
Installation
To install the dbt-watsonx-spark plugin, use pip:
$ pip install dbt-watsonx-spark
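To confirm that the install succeeded in the active environment, the package metadata can be queried from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the plugin.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="dbt-watsonx-spark"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("dbt-watsonx-spark is not installed in this environment")
    else:
        print(f"dbt-watsonx-spark {found} is installed")
```

Running this with the same interpreter that dbt uses helps catch the common case where pip installed the plugin into a different virtual environment.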
Configuration
Ensure you have started a query server from watsonx.data. Create an entry in your ~/.dbt/profiles.yml file using the following options:
- You can view the connection details by clicking the three-dot menu for the query server.
- You can construct and configure the profile using the template below.
- You can also copy your connection details from the Configuration tab -> Connection Information -> Data Build Tool (DBT).
dbt_wxd:
  target: dev
  outputs:
    dev:
      type: watsonx_spark
      method: "http"

      # number of threads for DBT operations, refer: https://docs.getdbt.com/docs/running-a-dbt-project/using-threads
      threads: 1

      # value of 'schema' for an existing schema in Data Manager in watsonx.data or to create a new one in watsonx.data
      schema: '<wxd_schema>'

      # Hostname of your watsonx.data console (ex: us-south.lakehouse.cloud.ibm.com)
      host: https://<your-host>.com

      # URI of your query server running on watsonx.data
      uri: "/lakehouse/api/v2/spark_engines/<spark_engine_id>/sql_servers/<server_id>/connect/cliservice"

      # Catalog linked to your Spark engine within the query server
      catalog: "<wxd_catalog>"

      # Optional: Disable SSL verification
      use_ssl: false

      auth:
        # In case of SaaS, set it as CRN of watsonx.data service
        # In case of Software, set it as instance id of watsonx.data
        instance: "<CRN/InstanceId>"

        # In case of SaaS, set it as your email id
        # In case of Software, set it as your username
        user: "<user@example.com/username>"

        # This must be your API Key
        apikey: "<apikey>"
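A frequent mistake with a template like the one above is leaving a `<...>` placeholder in place or dropping a key. The following sketch checks one `outputs` entry, passed as a plain dictionary, for both problems. The function name `check_output` and the decision to treat every key listed in the template (except `threads` and `use_ssl`) as required are assumptions of this example, not documented behavior of the adapter.

```python
import re

# Keys that appear in the template above. Treating all of them as required is
# an assumption of this sketch, not a documented rule of the adapter.
REQUIRED_KEYS = ("type", "method", "schema", "host", "uri", "catalog", "auth")
REQUIRED_AUTH_KEYS = ("instance", "user", "apikey")

# Matches template placeholders such as <wxd_schema> or <apikey>.
PLACEHOLDER = re.compile(r"<[^<>]+>")


def check_output(output):
    """Return a list of problems found in one entry under `outputs`."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in output:
            problems.append(f"missing key: {key}")

    auth = output.get("auth")
    if not isinstance(auth, dict):
        auth = {}
    for key in REQUIRED_AUTH_KEYS:
        if key not in auth:
            problems.append(f"missing key: auth.{key}")

    # Flatten top-level and auth values so placeholders are checked in one pass.
    flat = {k: v for k, v in output.items() if k != "auth"}
    flat.update({f"auth.{k}": v for k, v in auth.items()})
    for key, value in flat.items():
        if isinstance(value, str) and PLACEHOLDER.search(value):
            problems.append(f"placeholder not replaced: {key}")
    return problems


if __name__ == "__main__":
    # Hypothetical values for illustration; the host placeholder is left in on purpose.
    dev = {
        "type": "watsonx_spark",
        "method": "http",
        "schema": "analytics",
        "host": "https://<your-host>.com",
        "uri": "/lakehouse/api/v2/spark_engines/engine1/sql_servers/server1/connect/cliservice",
        "catalog": "my_catalog",
        "auth": {"instance": "my-instance", "user": "user@example.com", "apikey": "secret"},
    }
    for problem in check_output(dev):
        print(problem)
```

In practice the dictionary could come from loading `~/.dbt/profiles.yml` with PyYAML's `yaml.safe_load` and selecting `profile["dbt_wxd"]["outputs"]["dev"]`. This check only inspects the file's shape; it does not contact watsonx.data.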
File details
Details for the file dbt_watsonx_spark-0.0.7.tar.gz.
File metadata
- Download URL: dbt_watsonx_spark-0.0.7.tar.gz
- Upload date:
- Size: 63.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7a6a0d1aaff02012fa94776ab62e5c9b5722209ab25c75592e8d6d24084dd122
MD5 | 96618ccef3cd4583099ae257307681a1
BLAKE2b-256 | 9262eaa24c3776c4ed0ed3b8f7b1582c565efc4dfa3cc79013e32a2af06b0501
File details
Details for the file dbt_watsonx_spark-0.0.7-py3-none-any.whl.
File metadata
- Download URL: dbt_watsonx_spark-0.0.7-py3-none-any.whl
- Upload date:
- Size: 74.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 14afdf062a96f362624622327c8ce905fc8288cf2f3a3a3acee2193b3d3a9b4e
MD5 | c93e52c235d0eb87c270b6ba582a64d7
BLAKE2b-256 | d33f799e6bbf6e6d2b39a3413b1c4218fbb0b02bccbeedb2f21483454ba9521e
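A manually downloaded file can be compared against the SHA256 digests listed above. This is a minimal sketch using the standard library's `hashlib`; the helper names `sha256_of` and `matches` are hypothetical, and the script expects the file path and the expected digest as command-line arguments.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: verify.py <file> <expected-sha256>")
    file_path, expected = sys.argv[1], sys.argv[2]
    print("OK" if matches(file_path, expected) else "MISMATCH")
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a package this small but makes the helper reusable for larger downloads.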