dlt destination for Firebolt (HTTP upload or S3 staging + COPY INTO)

dlt-firebolt

Community dlt destination for Firebolt.

Load data into Firebolt with dlt using direct HTTP upload (default) or S3 staging + COPY INTO for large loads.

Upload mode and the simplified Core/managed configuration require dlt-firebolt 0.3.0 or later.

Two ways to load

Mode | Best for | Firebolt load path
upload (default) | Firebolt Core, local dev, quick starts | HTTP multipart → READ_PARQUET('upload://…')
s3 | Managed Firebolt production, large bulk loads | Parquet on S3 → COPY INTO

On managed Firebolt today, set FIREBOLT_STAGING_MODE=s3. Upload is the default in code, but the managed engine does not yet accept multipart upload, so upload mode fails there with a clear error. In s3 mode, the runner needs AWS credentials to write staging files to S3, and Firebolt reads those files through your external location. Staging objects are not deleted automatically after COPY INTO.
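
As a rough illustration of the guidance above, a project could pick the staging mode from its environment before building the pipeline. The helper below is a hypothetical sketch and not part of dlt-firebolt: the name choose_staging_mode is invented here, and it assumes that a set FIREBOLT_CORE_URL means the target is Firebolt Core.

```python
import os
from typing import Mapping, Optional


def choose_staging_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "upload" or "s3" (hypothetical helper, not a dlt-firebolt API)."""
    env = os.environ if env is None else env

    # An explicit FIREBOLT_STAGING_MODE always wins.
    explicit = env.get("FIREBOLT_STAGING_MODE", "").strip().lower()
    if explicit:
        if explicit not in ("upload", "s3"):
            raise ValueError(f"unsupported staging mode: {explicit!r}")
        return explicit

    # Assumption: FIREBOLT_CORE_URL being set means we target Firebolt Core,
    # where upload works; otherwise we assume managed Firebolt and use s3.
    return "upload" if env.get("FIREBOLT_CORE_URL") else "s3"
```

Passing a plain dict instead of reading os.environ keeps the decision easy to check in unit tests.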

Install

pip install "dlt-firebolt>=0.3.0"

Requires Python 3.10+.

Register the destination once in your project:

import firebolt_dest  # registers destination="firebolt"

Prerequisites

Upload mode (Core / local)

  1. Firebolt Core running locally (or another environment that supports upload://), or a managed account once upload is supported there.

No S3 bucket, external location, or AWS credentials are required for the Firebolt load step. dlt still writes Parquet to a local staging directory during normalize.

S3 mode (managed production)

  1. Firebolt service account with access to your database and engine.
  2. S3 bucket for dlt filesystem staging.
  3. Firebolt external location for that bucket prefix:
CREATE LOCATION "your_location_name" WITH
  SOURCE = 'CLOUD_STORAGE'
  URL = 's3://your-bucket/your-prefix/'
  CREDENTIALS = (AWS_ROLE_ARN = 'arn:aws:iam::...:role/...');

Set FIREBOLT_S3_LOCATION_NAME (or s3_location_name in secrets) to the exact location name from CREATE LOCATION. Set FIREBOLT_STAGING_MODE=s3.
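
If you create locations for several buckets or prefixes, it may help to render the statement from parameters. The sketch below only reproduces the CREATE LOCATION text shown above; render_create_location is a hypothetical helper, and it rejects quote characters instead of attempting SQL escaping.

```python
def render_create_location(name: str, bucket: str, prefix: str, role_arn: str) -> str:
    """Render the CREATE LOCATION statement from this README (hypothetical helper)."""
    for label, value in (
        ("name", name),
        ("bucket", bucket),
        ("prefix", prefix),
        ("role_arn", role_arn),
    ):
        # Refuse quote characters rather than trying to escape them.
        if '"' in value or "'" in value:
            raise ValueError(f"{label} must not contain quote characters")

    # Normalize the prefix so the URL always ends with exactly one slash.
    prefix = prefix.strip("/")
    url = f"s3://{bucket}/{prefix}/" if prefix else f"s3://{bucket}/"

    return (
        f'CREATE LOCATION "{name}" WITH\n'
        "  SOURCE = 'CLOUD_STORAGE'\n"
        f"  URL = '{url}'\n"
        f"  CREDENTIALS = (AWS_ROLE_ARN = '{role_arn}');"
    )
```

The name passed here is the value that FIREBOLT_S3_LOCATION_NAME must match exactly.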

Quick start

Firebolt Core (upload, no S3)

export FIREBOLT_CORE_URL=http://localhost:3473

import dlt
import firebolt_dest
from firebolt_dest.configuration import make_firebolt_pipeline

@dlt.resource(name="orders", write_disposition="append")
def orders():
    yield {"order_id": 1, "customer": "Acme"}

pipeline = make_firebolt_pipeline(
    pipeline_name="my_pipeline",
    dataset_name="my_dataset",
)

pipeline.run(orders(), loader_file_format="parquet")

Managed Firebolt (S3)

Set service-account credentials and FIREBOLT_STAGING_MODE=s3 (see Configuration). The same pipeline code applies; only env vars change.

Tables are created as {dataset}_{table} (for example my_dataset_orders).
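
To predict which table a resource lands in, the naming pattern can be written out as a one-line function. This is a minimal sketch of the documented {dataset}_{table} pattern only; firebolt_table_name is a hypothetical name, and any further identifier normalization the destination may apply is not modeled.

```python
def firebolt_table_name(dataset_name: str, table_name: str) -> str:
    """Mirror the documented {dataset}_{table} naming (hypothetical helper)."""
    return f"{dataset_name}_{table_name}"


# The quick-start pipeline writes the "orders" resource into "my_dataset".
print(firebolt_table_name("my_dataset", "orders"))
```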

Using .dlt/secrets.toml

Environment variables are the primary path used in the e2e scripts. You can also load configuration from .dlt/secrets.toml with from_secrets=True (verified on Firebolt Core with upload mode; see .dlt/secrets.toml.example).

Note: In dlt's credential block, host holds the Firebolt database name (FIREBOLT_DATABASE) and database holds the Firebolt engine name (FIREBOLT_ENGINE). This mapping is intentional; the two values are not swapped.

pipeline = make_firebolt_pipeline(
    pipeline_name="my_pipeline",
    dataset_name="my_dataset",
    from_secrets=True,
)

Example for managed Firebolt (S3 mode):

[destination.firebolt]
staging_mode = "s3"
s3_location_name = "your_location_name"
s3_prefix = "dlt-landing"

[destination.firebolt.credentials]
host = "YOUR_FIREBOLT_DATABASE"
database = "YOUR_FIREBOLT_ENGINE"
username = "YOUR_CLIENT_ID"
password = "YOUR_CLIENT_SECRET"
account_name = "YOUR_ACCOUNT_NAME"

[destination.filesystem]
bucket_url = "s3://your-bucket/dlt-landing/dlt/staging"

See .dlt/secrets.toml.example for a Firebolt Core upload template.
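
Because the host and database keys are easy to mix up, the correspondence between the credential block and the environment variables can be spelled out in code. The mapping below is a sketch read off the example above and the Configuration tables; credentials_from_env is a hypothetical helper, not something the package exports.

```python
import os
from typing import Dict, Mapping, Optional

# Key in [destination.firebolt.credentials] -> environment variable.
CREDENTIAL_ENV_VARS = {
    "host": "FIREBOLT_DATABASE",      # host carries the database name
    "database": "FIREBOLT_ENGINE",    # database carries the engine name
    "username": "FIREBOLT_CLIENT_ID",
    "password": "FIREBOLT_CLIENT_SECRET",
    "account_name": "FIREBOLT_ACCOUNT_NAME",
}


def credentials_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build the credential block from env vars (hypothetical helper).

    Raises KeyError if any of the variables is not set.
    """
    env = os.environ if env is None else env
    return {key: env[var] for key, var in CREDENTIAL_ENV_VARS.items()}
```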

Configuration

Managed Firebolt

Variable | Required | Description
FIREBOLT_CLIENT_ID | yes | Service account client ID
FIREBOLT_CLIENT_SECRET | yes | Service account secret
FIREBOLT_ACCOUNT_NAME | yes | Firebolt account name
FIREBOLT_DATABASE | yes | Target database
FIREBOLT_ENGINE | yes | Engine name
FIREBOLT_STAGING_MODE | no | s3 for managed production (recommended)
FIREBOLT_S3_LOCATION_NAME | s3 mode | Firebolt external location name
S3_BUCKET | s3 mode | Staging bucket
S3_PREFIX | no | Key prefix (default: dlt-landing)

The destination resolves the engine URL from your account; you do not set an HTTP endpoint manually.
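
A pre-flight check can catch missing variables before a run starts. The sketch below encodes the Required column of the managed table, including the variables that are needed only in s3 mode; missing_managed_vars is a hypothetical helper and does not reflect how the destination itself validates configuration.

```python
from typing import List, Mapping

REQUIRED_ALWAYS = (
    "FIREBOLT_CLIENT_ID",
    "FIREBOLT_CLIENT_SECRET",
    "FIREBOLT_ACCOUNT_NAME",
    "FIREBOLT_DATABASE",
    "FIREBOLT_ENGINE",
)
REQUIRED_IN_S3_MODE = ("FIREBOLT_S3_LOCATION_NAME", "S3_BUCKET")


def missing_managed_vars(env: Mapping[str, str]) -> List[str]:
    """List the managed-Firebolt variables that are unset or empty."""
    # upload is the code default when FIREBOLT_STAGING_MODE is not set.
    mode = env.get("FIREBOLT_STAGING_MODE", "upload").strip().lower()
    required = list(REQUIRED_ALWAYS)
    if mode == "s3":
        required.extend(REQUIRED_IN_S3_MODE)
    return [name for name in required if not env.get(name)]
```

Calling missing_managed_vars(os.environ) at the top of a pipeline script and exiting when the list is non-empty gives a clearer failure than a connection error later on.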

Firebolt Core

Variable | Required | Description
FIREBOLT_CORE_URL | yes | Core HTTP endpoint (selects Core when set)
FIREBOLT_CORE_DATABASE | no | Database name (default: firebolt)
FIREBOLT_STAGING_MODE | no | upload (default)

Variable | Description
DLT_LOCAL_STAGING_DIR | Local path for upload-mode normalize staging
DLT_DATASET_NAME | Default dataset name (default: demo)
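
The defaults listed for Firebolt Core can be summarized as a small resolver. This is an illustrative sketch of the documented defaults only; CoreSettings and resolve_core_settings are hypothetical names, and the package's own configuration code may resolve values differently.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CoreSettings:
    url: str
    database: str
    staging_mode: str
    dataset_name: str


def resolve_core_settings(env: Mapping[str, str]) -> CoreSettings:
    """Apply the documented Core defaults to an environment mapping."""
    url = env.get("FIREBOLT_CORE_URL")
    if not url:
        # Core is selected only when FIREBOLT_CORE_URL is set.
        raise ValueError("FIREBOLT_CORE_URL is required to select Firebolt Core")
    return CoreSettings(
        url=url,
        database=env.get("FIREBOLT_CORE_DATABASE", "firebolt"),
        staging_mode=env.get("FIREBOLT_STAGING_MODE", "upload"),
        dataset_name=env.get("DLT_DATASET_NAME", "demo"),
    )
```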

Set credentials via environment variables or .dlt/secrets.toml. Do not commit secrets.

Supported capabilities

Feature | Support
Loader format | Parquet only
Staging | upload (HTTP, default) or s3 (COPY INTO)
append | Yes
replace | truncate-and-insert, insert-from-staging
merge | delete-insert (single-table and nested)
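
For readers new to the delete-insert merge strategy, the idea can be shown on in-memory rows: rows in the target whose key appears in the incoming batch are deleted, then the whole batch is inserted. The function below is a conceptual sketch only; it is not the SQL the destination issues, and delete_insert_merge is a hypothetical name.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def delete_insert_merge(target: List[Row], staged: Iterable[Row], key: str) -> List[Row]:
    """Return the target rows after a delete-insert merge on `key`."""
    staged_rows = list(staged)
    staged_keys = {row[key] for row in staged_rows}

    # Delete step: drop target rows whose key appears in the staged batch.
    kept = [row for row in target if row[key] not in staged_keys]

    # Insert step: append every staged row.
    return kept + staged_rows


target = [
    {"order_id": 1, "customer": "Acme"},
    {"order_id": 2, "customer": "Globex"},
]
staged = [
    {"order_id": 2, "customer": "Globex Corp"},
    {"order_id": 3, "customer": "Initech"},
]
merged = delete_insert_merge(target, staged, key="order_id")
```

Here order 1 is untouched, order 2 is replaced by its staged version, and order 3 is new.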

Development

Clone the repository and install in editable mode with dev dependencies:

git clone https://github.com/firebolt-db/dlt-firebolt.git
cd dlt-firebolt
pip install -e ".[dev]"
cp .env.example .env   # fill in Firebolt credentials (and S3 for integration tests)
pytest -m "not integration"

Core e2e (upload, no S3):

bash scripts/core_e2e.sh my_dataset          # nested merge
bash scripts/hn_core_e2e.sh hn_blog 30     # blog example

Optional live integration tests (require Firebolt, S3, and AWS credentials):

FIREBOLT_RUN_INTEGRATION=1 pytest -m integration -v

License

Apache License 2.0. See LICENSE.

Status

Community package maintained by Firebolt. Not part of core dlt.

  • Published on PyPI (pip install dlt-firebolt)
  • Append, merge, and replace dispositions
  • Nested multi-table merge
  • HTTP upload mode (0.3.0+, Firebolt Core)
  • Website integration docs
  • Optional listing on dlt community destinations page
