dlt destination for Firebolt (staged Parquet + COPY INTO)

dlt-firebolt

Prototype dlt destination for Firebolt.

Loads dlt pipelines into Firebolt using filesystem staging (Parquet on S3) + COPY INTO, the same pattern as dlt's Snowflake and Redshift destinations.

Status

Spike complete. Hardening done; packaging and upstream prep in progress.

Phase  What it proved
1      dlt → S3 Parquet → manual COPY INTO
2      Generic sqlalchemy destination is not viable on Firebolt
3      Native destination="firebolt" end-to-end
4      Append / merge / replace disposition scripts

See SPIKE.md for spike notes.

License

Apache License 2.0 — see LICENSE.

Install

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
cp .env.example .env   # fill in Firebolt + S3 creds

Or install dependencies only (no editable package):

pip install -r requirements.txt
pip install -r requirements-dev.txt

Quick start (Phase 3 demo)

Requires:

  • Firebolt CREATE LOCATION for your S3 bucket — set FIREBOLT_S3_LOCATION_NAME to the location name (e.g. sprinto_s3)
  • HubSpot private app token in .env (demo only)
  • AWS credentials for S3 staging

export AWS_PROFILE=your-profile
python phase3_hubspot_to_firebolt.py

Optional: copy .dlt/secrets.toml.example to .dlt/secrets.toml and run with DLT_USE_SECRETS=1.

Before running demos, validate credentials:

python check_firebolt_env.py

Disposition checks (Phase 4)

Run each command separately (do not paste inline comments):

python phase4_dispositions.py --mode merge
python phase4_dispositions.py --mode append
python phase4_dispositions.py --mode append
python phase4_dispositions.py --mode replace

For append, run the command twice and confirm the row count grows.

Verify in Firebolt (default dataset demo):

SELECT COUNT(*) FROM demo_hubspot_contacts;

Usage in a dlt pipeline

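import sys
from pathlib import Path

# Make the local firebolt_dest package importable when running from a repository checkout.
sys.path.insert(0, str(Path(__file__).resolve().parent))

import dlt
from firebolt_dest.configuration import make_firebolt_pipeline


@dlt.resource(name="orders")
def my_resource():
    # Placeholder resource for illustration; replace with your own data source.
    yield {"id": 1, "status": "new"}


pipeline = make_firebolt_pipeline(
    pipeline_name="my_pipeline",
    dataset_name="my_dataset",
)

# Loading goes through Parquet files staged on S3, so request the parquet loader format.
pipeline.run(my_resource(), loader_file_format="parquet")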

Or with .dlt/secrets.toml:

pipeline = make_firebolt_pipeline(
    pipeline_name="my_pipeline",
    dataset_name="my_dataset",
    from_secrets=True,
)

Tables land as {dataset}_{table} (e.g. my_dataset_orders).
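
The naming rule above can be sketched as a one-line helper. The function name firebolt_table_name is hypothetical and not part of firebolt_dest. The sketch also ignores any identifier normalization that dlt or the destination may apply to dataset and table names.

```python
def firebolt_table_name(dataset: str, table: str) -> str:
    """Join dataset and table names with an underscore, i.e. {dataset}_{table}.

    Hypothetical helper for illustration only; not part of firebolt_dest.
    """
    return f"{dataset}_{table}"


print(firebolt_table_name("my_dataset", "orders"))  # my_dataset_orders
```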

Connection details come from environment variables (see .env.example) or from .dlt/secrets.toml (see .dlt/secrets.toml.example).

Layout

firebolt_dest/          # destination implementation (forked from the Redshift COPY pattern)
  factory.py            # registers destination="firebolt"
  client.py             # COPY load jobs
  sql_client.py         # Firebolt SQLAlchemy client
  copy_sql.py           # COPY INTO SQL generation
  configuration.py      # credentials + S3 location config
phase1_*.py             # spike: dlt → S3 only
phase2_*.py             # spike: dialect smoke test
phase3_*.py             # spike: full native destination demo
phase4_*.py             # append / merge / replace disposition checks
.dlt/config.toml        # non-sensitive dlt defaults (parquet loader)
.dlt/secrets.toml.example
tests/                  # unit tests (no Firebolt connection)

Configuration

Variable                   Required  Description
FIREBOLT_CLIENT_ID         yes       Service account client ID
FIREBOLT_CLIENT_SECRET     yes       Service account secret
FIREBOLT_ACCOUNT_NAME      yes       Firebolt account name
FIREBOLT_DATABASE          yes       Target database
FIREBOLT_ENGINE            yes       Engine name
FIREBOLT_S3_LOCATION_NAME  yes*      Firebolt external location name (must match CREATE LOCATION; e.g. sprinto_s3)
S3_BUCKET                  yes       Staging bucket
S3_PREFIX                  no        Key prefix (default: dlt-landing)
DLT_DATASET_NAME           no        Demo dataset (default: demo)

Credentials belong in .env (gitignored) or .dlt/secrets.toml (gitignored). See .dlt/secrets.toml.example.
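
As a rough illustration of how the variables above fit together, the sketch below checks the required names and fills in the documented defaults. It is not the code in check_firebolt_env.py or configuration.py. The names load_settings, REQUIRED and DEFAULTS are hypothetical, and it assumes FIREBOLT_S3_LOCATION_NAME (marked yes* above) is always required.

```python
import os
from typing import Mapping

# Required variable names, taken from the configuration table.
# Assumption: FIREBOLT_S3_LOCATION_NAME ("yes*") is treated as always required.
REQUIRED = (
    "FIREBOLT_CLIENT_ID",
    "FIREBOLT_CLIENT_SECRET",
    "FIREBOLT_ACCOUNT_NAME",
    "FIREBOLT_DATABASE",
    "FIREBOLT_ENGINE",
    "FIREBOLT_S3_LOCATION_NAME",
    "S3_BUCKET",
)

# Optional variables with the defaults listed in the table.
DEFAULTS = {"S3_PREFIX": "dlt-landing", "DLT_DATASET_NAME": "demo"}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return a settings dict, raising ValueError if a required name is unset or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name) or default
    return settings
```

Passing a plain dict instead of os.environ, for example load_settings({"FIREBOLT_CLIENT_ID": "..."}), makes the check easy to exercise in unit tests without touching real credentials.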

Tests

pip install -r requirements-dev.txt
pytest

# Optional: live Firebolt + S3 (requires .env and AWS creds)
FIREBOLT_RUN_INTEGRATION=1 pytest -m integration -v
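
One common way to gate tests on an environment variable is a conftest.py hook that skips marked tests unless the variable is set. The sketch below shows that general pytest pattern. It is an assumption about how such gating could work, not a copy of this project's test setup, and it presumes an integration marker is registered in the pytest configuration. The helper name integration_enabled is hypothetical.

```python
# conftest.py (illustrative sketch, not the project's actual file)
import os

import pytest


def integration_enabled(env=os.environ) -> bool:
    """True only when FIREBOLT_RUN_INTEGRATION is set to exactly "1"."""
    return env.get("FIREBOLT_RUN_INTEGRATION") == "1"


def pytest_collection_modifyitems(config, items):
    """Skip tests marked @pytest.mark.integration unless the gate is enabled."""
    if integration_enabled():
        return
    skip = pytest.mark.skip(reason="set FIREBOLT_RUN_INTEGRATION=1 to run")
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip)
```

With a hook like this, a plain pytest run reports integration tests as skipped rather than failing on missing credentials.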

Roadmap

  • Package as installable module (pip install -e . / dlt-firebolt)
  • Config via env vars and .dlt/secrets.toml (both live-tested)
  • Merge/append/replace dispositions (merge via delete-insert; replace via truncate-and-insert or insert-from-staging)
  • Unit tests for COPY and merge SQL generation
  • Integration test harness (env-gated)
  • Destination README (dlt-style setup doc)
  • PyPI publish (pip install dlt-firebolt from PyPI)
  • Upstream PR to dlt or community listing
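
The delete-insert merge mentioned in the roadmap can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite database and generic SQL so it runs anywhere. It is not the SQL that firebolt_dest generates, and Firebolt's dialect may differ. The function merge_delete_insert and the table names are hypothetical.

```python
import sqlite3


def merge_delete_insert(conn, target: str, staging: str, key: str) -> None:
    """Merge staging rows into target: delete rows with matching keys, then insert.

    Identifiers are interpolated directly, so only pass trusted names.
    """
    with conn:  # run both statements in one transaction
        conn.execute(
            f"DELETE FROM {target} WHERE {key} IN (SELECT {key} FROM {staging})"
        )
        conn.execute(f"INSERT INTO {target} SELECT * FROM {staging}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER, email TEXT)")
conn.execute("CREATE TABLE contacts_staging (id INTEGER, email TEXT)")
conn.executemany("INSERT INTO contacts VALUES (?, ?)", [(1, "a@old"), (2, "b@old")])
conn.executemany(
    "INSERT INTO contacts_staging VALUES (?, ?)", [(2, "b@new"), (3, "c@new")]
)

merge_delete_insert(conn, "contacts", "contacts_staging", "id")
rows = conn.execute("SELECT id, email FROM contacts ORDER BY id").fetchall()
print(rows)  # [(1, 'a@old'), (2, 'b@new'), (3, 'c@new')]
```

Row 1 is untouched, row 2 is replaced by the staged version, and row 3 is new, which is the behavior expected of a merge keyed on id.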

Related

Customer connector demos that consume this pattern live separately in sprinto-connectors (private).
