
The Apache Spark adapter plugin for dbt

Project description

  • This is a fork of the dbt-spark adapter with added compatibility for the <catalog>.<schema>.<table> format, which allows creating tables across different catalogs in a single dbt run (see the sketch after this list).
  • Works locally with the VS Code Databricks extension. Sign in with databricks auth login and ensure that the .databricks/.databricks.env file is created by the extension.
  • Works with Databricks job clusters by using the Spark session.
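
For example, a single dbt run could write models into two different catalogs with a dbt_project.yml along the lines of the sketch below. This is a hypothetical illustration that assumes the fork maps dbt's database config to the Spark catalog; the catalog and schema names are made up:

models:
  my_project:
    staging:
      +database: dev_catalog     # hypothetical catalog; tables land in dev_catalog.staging.<table>
      +schema: staging
    marts:
      +database: prod_catalog    # hypothetical catalog; tables land in prod_catalog.analytics.<table>
      +schema: analytics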


dbt

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.

dbt-spark

dbt-spark enables dbt to work with Apache Spark. For more information on using dbt with Spark, consult the docs.

Getting started

Review the repository README.md, as most of that information pertains to dbt-spark.
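
This fork is published on PyPI as dbt-spark-dbx-compat (matching its distribution file names), so installation should be the usual pip flow:

pip install dbt-spark-dbx-compat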

Running locally

A docker-compose environment starts a Spark Thrift server and a Postgres database that serves as the Hive metastore backend. Note: dbt-spark now supports Spark 3.3.2.

The following command starts the two Docker containers:

docker-compose up -d

It will take a bit of time for the instance to start; you can check the logs of the two containers. If the instance doesn't start correctly, try the complete reset command listed below and then try starting again.
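
For example, to follow the logs of both containers while they come up:

docker-compose logs -f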

Create a profile like this one:

spark_testing:
  target: local
  outputs:
    local:
      type: spark
      method: thrift
      host: 127.0.0.1
      port: 10000
      user: dbt
      schema: analytics
      connect_retries: 5
      connect_timeout: 60
      retry_all: true
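
With this profile in place (typically saved to ~/.dbt/profiles.yml) and a dbt project that references it, you can verify the connection:

dbt debug --target local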

Connecting to the local Spark instance:

  • The Spark UI should be available at http://localhost:4040/sqlserver/
  • The endpoint for SQL-based testing is http://localhost:10000. It can be reached with the Hive or Spark JDBC drivers using the connection string jdbc:hive2://localhost:10000 and the default credentials dbt:dbt (see the example below).
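
As a quick smoke test, the endpoint can be queried with Beeline, assuming a local Hive client is installed:

beeline -u jdbc:hive2://localhost:10000 -n dbt -p dbt -e 'SHOW DATABASES;'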

Note that the Hive metastore data is persisted under ./.hive-metastore/, and the Spark-produced data under ./.spark-warehouse/. To completely reset your environment, run the following:

docker-compose down
rm -rf ./.hive-metastore/
rm -rf ./.spark-warehouse/

Additional configuration for macOS

If installing on macOS, use Homebrew to install the required dependencies.

brew install unixodbc


