dlt destination for Firebolt (HTTP upload or S3 staging + COPY INTO)

dlt-firebolt

Community dlt destination for Firebolt.

Load data into Firebolt with dlt using direct HTTP upload (default) or S3 staging + COPY INTO for large loads.

Requires dlt-firebolt 0.2.0+ for upload mode. Earlier PyPI releases (0.1.x) support S3 staging only.

Two ways to load

| Mode | Best for | Firebolt load path |
|---|---|---|
| upload (default) | Firebolt Core, local dev, quick starts | HTTP multipart → READ_PARQUET('upload://…') |
| s3 | Managed Firebolt production, large bulk loads | Parquet on S3 → COPY INTO |

On managed Firebolt, set FIREBOLT_STAGING_MODE=s3 for now. Upload is the code default, but the managed engine does not yet accept multipart uploads, so upload mode fails there with a clear error. Upload mode is verified on Firebolt Core and in local workflows.
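
The guidance above can be sketched as a small helper that picks a staging mode from the environment. This is a minimal illustration, not part of dlt-firebolt: the function name `choose_staging_mode` is hypothetical, and unlike the package (whose default is upload) it falls back to s3 whenever the target does not look like Firebolt Core.

```python
import os


def choose_staging_mode(env=None):
    """Pick a staging mode from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env

    # An explicit FIREBOLT_STAGING_MODE always wins.
    explicit = env.get("FIREBOLT_STAGING_MODE")
    if explicit:
        return explicit.strip().lower()

    # FIREBOLT_USE_CORE=1 marks a Firebolt Core target, where upload is verified.
    if env.get("FIREBOLT_USE_CORE") == "1":
        return "upload"

    # Assumption of this sketch: anything else is managed Firebolt, which needs s3.
    return "s3"
```

Calling the helper before building the pipeline and exporting its result as FIREBOLT_STAGING_MODE avoids hitting the upload error on a managed engine.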

Install

pip install "dlt-firebolt>=0.2.0"

Requires Python 3.10+.

Register the destination once in your project:

import firebolt_dest  # registers destination="firebolt"

Prerequisites

Upload mode (Core / local)

  1. Firebolt Core running locally (or another environment that supports upload://), or a managed account once upload is supported there.

No S3 bucket, external location, or AWS credentials required for the Firebolt load step. dlt still writes Parquet to a local staging directory during normalize.

S3 mode (managed production)

  1. Firebolt service account with access to your database and engine.
  2. S3 bucket for dlt filesystem staging.
  3. Firebolt external location for that bucket prefix:
CREATE LOCATION "your_location_name" WITH
  SOURCE = 'CLOUD_STORAGE'
  URL = 's3://your-bucket/your-prefix/'
  CREDENTIALS = (AWS_ROLE_ARN = 'arn:aws:iam::...:role/...');

Set FIREBOLT_S3_LOCATION_NAME (or s3_location_name in secrets) to the exact location name from CREATE LOCATION. Set FIREBOLT_STAGING_MODE=s3.

Quick start

Firebolt Core (upload, no S3)

export FIREBOLT_USE_CORE=1
export FIREBOLT_CORE_URL=http://localhost:3473
export FIREBOLT_STAGING_MODE=upload   # default

import dlt
import firebolt_dest
from firebolt_dest.configuration import make_firebolt_pipeline

@dlt.resource(name="orders", write_disposition="append")
def orders():
    yield {"order_id": 1, "customer": "Acme"}

pipeline = make_firebolt_pipeline(
    pipeline_name="my_pipeline",
    dataset_name="my_dataset",
)

pipeline.run(orders(), loader_file_format="parquet")

Managed Firebolt (S3)

Set service-account credentials and FIREBOLT_STAGING_MODE=s3 (see Configuration). The same pipeline code applies; only env vars change.

Tables are created as {dataset}_{table} (for example my_dataset_orders).
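
As a quick illustration of that naming rule, the sketch below builds the expected Firebolt table name from a dataset and a dlt table name. The helper `firebolt_table_name` is hypothetical and only mirrors the `{dataset}_{table}` pattern stated above; it is not the package's own function.

```python
def firebolt_table_name(dataset_name: str, table_name: str) -> str:
    """Return the table name following the {dataset}_{table} pattern."""
    return f"{dataset_name}_{table_name}"


# The example from the text: dataset "my_dataset", resource "orders".
print(firebolt_table_name("my_dataset", "orders"))  # my_dataset_orders
```

This can be handy when writing follow-up SQL against the loaded tables.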

Using .dlt/secrets.toml

Environment variables are the primary path used in the e2e scripts. You can also load configuration from .dlt/secrets.toml with from_secrets=True (verified on Firebolt Core with upload mode; see .dlt/secrets.toml.example).

Note: In dlt's credential block, the field names do not match their Firebolt meaning: host holds the Firebolt database name (FIREBOLT_DATABASE) and database holds the Firebolt engine name (FIREBOLT_ENGINE). This mapping is intentional, not a swap.

pipeline = make_firebolt_pipeline(
    pipeline_name="my_pipeline",
    dataset_name="my_dataset",
    from_secrets=True,
)

Example for managed Firebolt (S3 mode):

[destination.firebolt]
staging_mode = "s3"
s3_location_name = "your_location_name"
s3_prefix = "dlt-landing"

[destination.firebolt.credentials]
host = "YOUR_FIREBOLT_DATABASE"
database = "YOUR_FIREBOLT_ENGINE"
username = "YOUR_CLIENT_ID"
password = "YOUR_CLIENT_SECRET"
account_name = "YOUR_ACCOUNT_NAME"

[destination.filesystem]
bucket_url = "s3://your-bucket/dlt-landing/dlt/staging"

See .dlt/secrets.toml.example for a Firebolt Core upload template.
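
Because the host/database mapping is easy to get wrong, here is a minimal sketch that derives the credential block from the environment variables. The function name `credentials_from_env` is hypothetical, and the pairing of username/password with FIREBOLT_CLIENT_ID/FIREBOLT_CLIENT_SECRET is inferred from the placeholder values in the example above rather than confirmed by the package.

```python
import os


def credentials_from_env(env=None) -> dict:
    """Map Firebolt environment variables onto the secrets.toml credential keys."""
    env = os.environ if env is None else env
    return {
        # host carries the Firebolt *database* name.
        "host": env["FIREBOLT_DATABASE"],
        # database carries the Firebolt *engine* name.
        "database": env["FIREBOLT_ENGINE"],
        # Assumed mapping: service account client ID and secret.
        "username": env["FIREBOLT_CLIENT_ID"],
        "password": env["FIREBOLT_CLIENT_SECRET"],
        "account_name": env["FIREBOLT_ACCOUNT_NAME"],
    }
```

A missing variable raises KeyError, which surfaces configuration gaps before the pipeline runs.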

Configuration

Managed Firebolt

| Variable | Required | Description |
|---|---|---|
| FIREBOLT_CLIENT_ID | yes | Service account client ID |
| FIREBOLT_CLIENT_SECRET | yes | Service account secret |
| FIREBOLT_ACCOUNT_NAME | yes | Firebolt account name |
| FIREBOLT_DATABASE | yes | Target database |
| FIREBOLT_ENGINE | yes | Engine name |
| FIREBOLT_STAGING_MODE | no | s3 for managed production (recommended) |
| FIREBOLT_S3_LOCATION_NAME | s3 mode | Firebolt external location name |
| S3_BUCKET | s3 mode | Staging bucket |
| S3_PREFIX | no | Key prefix (default: dlt-landing) |

The destination resolves the engine URL from your account; you do not set an HTTP endpoint manually.
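
A preflight check for the managed variables listed above might look like the sketch below. It is an illustration only: `missing_managed_vars` is a hypothetical helper, not something dlt-firebolt exposes, and it simply reports which required names are unset or empty.

```python
import os

# Always required on managed Firebolt, per the table above.
REQUIRED = (
    "FIREBOLT_CLIENT_ID",
    "FIREBOLT_CLIENT_SECRET",
    "FIREBOLT_ACCOUNT_NAME",
    "FIREBOLT_DATABASE",
    "FIREBOLT_ENGINE",
)
# Additionally required when FIREBOLT_STAGING_MODE=s3.
REQUIRED_FOR_S3 = ("FIREBOLT_S3_LOCATION_NAME", "S3_BUCKET")


def missing_managed_vars(env=None) -> list:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    names = list(REQUIRED)
    # upload is the code default when the variable is not set.
    if env.get("FIREBOLT_STAGING_MODE", "upload").lower() == "s3":
        names += REQUIRED_FOR_S3
    return [name for name in names if not env.get(name)]
```

Running it at the top of a pipeline script and aborting when the list is non-empty gives a clearer failure than an authentication error later on.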

Firebolt Core

| Variable | Required | Description |
|---|---|---|
| FIREBOLT_USE_CORE | yes | Set to 1 |
| FIREBOLT_CORE_URL | no | Core HTTP endpoint (default: http://localhost:3473) |
| FIREBOLT_CORE_DATABASE | no | Database name (default: firebolt) |
| FIREBOLT_STAGING_MODE | no | upload (default) |

| Variable | Description |
|---|---|
| DLT_LOCAL_STAGING_DIR | Local path for upload-mode normalize staging |
| DLT_DATASET_NAME | Default dataset name (default: demo) |

Set credentials via environment variables or .dlt/secrets.toml. Do not commit secrets.
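
To make the Firebolt Core defaults concrete, the sketch below resolves the variables from the tables above, falling back to the documented defaults. The helper `core_settings` and the keys of the returned dict are hypothetical names chosen for illustration; the package may resolve its configuration differently.

```python
import os


def core_settings(env=None) -> dict:
    """Resolve Firebolt Core settings, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        "use_core": env.get("FIREBOLT_USE_CORE") == "1",
        "core_url": env.get("FIREBOLT_CORE_URL", "http://localhost:3473"),
        "database": env.get("FIREBOLT_CORE_DATABASE", "firebolt"),
        "staging_mode": env.get("FIREBOLT_STAGING_MODE", "upload"),
        "dataset_name": env.get("DLT_DATASET_NAME", "demo"),
        # No default is documented for the staging directory, so None means unset.
        "local_staging_dir": env.get("DLT_LOCAL_STAGING_DIR"),
    }
```

With only FIREBOLT_USE_CORE=1 exported, every other field takes its default value.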

Supported capabilities

| Feature | Support |
|---|---|
| Loader format | Parquet only |
| Staging | upload (HTTP, default) or s3 (COPY INTO) |
| append | Yes |
| replace | truncate-and-insert, insert-from-staging |
| merge | delete-insert (single-table and nested) |
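
For readers unfamiliar with the delete-insert merge strategy, the following in-memory sketch shows its effect on rows. It is a conceptual illustration only: the function `delete_insert_merge` is hypothetical, operates on Python lists, and does not reflect the SQL the destination actually runs.

```python
def delete_insert_merge(target, staged, key):
    """Delete target rows whose key appears in staging, then insert all staged rows."""
    staged_keys = {row[key] for row in staged}
    # Delete step: drop target rows that the staged batch replaces.
    kept = [row for row in target if row[key] not in staged_keys]
    # Insert step: append every staged row.
    return kept + list(staged)


target = [
    {"order_id": 1, "customer": "Acme"},
    {"order_id": 2, "customer": "Globex"},
]
staged = [
    {"order_id": 1, "customer": "Acme Corp"},  # replaces order 1
    {"order_id": 3, "customer": "Initech"},    # new row
]
merged = delete_insert_merge(target, staged, key="order_id")
```

After the merge, order 2 is untouched, order 1 carries the staged values, and order 3 is new.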

Development

Clone the repository and install in editable mode with dev dependencies:

git clone https://github.com/firebolt-db/dlt-firebolt.git
cd dlt-firebolt
pip install -e ".[dev]"
cp .env.example .env   # fill in Firebolt credentials (and S3 for integration tests)
pytest -m "not integration"

Core e2e (upload, no S3):

bash scripts/core_e2e.sh my_dataset

GitHub stargazers example (blog):

bash scripts/github_stars_core_e2e.sh oss_analytics 1

Optional live integration tests (these require Firebolt, S3, and AWS credentials):

FIREBOLT_RUN_INTEGRATION=1 pytest -m integration -v

License

Apache License 2.0. See LICENSE.

Status

Community package maintained by Firebolt. Not part of core dlt.

  • Published on PyPI (pip install dlt-firebolt)
  • Append, merge, and replace dispositions
  • Nested multi-table merge
  • HTTP upload mode (0.2.0+, Firebolt Core)
  • Website integration docs
  • Optional listing on dlt community destinations page
