BigQuery SCD Type 2 dimension builder with SQLX generation
SCD2 BigQuery Engine
Version: 0.1.0
License: MIT
Status: ✅ Production Ready
Overview
BigQuery-first SCD Type 2 dimension builder with SQLX template generation for Dataform/dbt.
Generates production-ready SQLX files that implement full SCD Type 2 logic including:
- Hash-based change detection
- Effective date tracking
- Soft deletes
- Late arrival handling
- BigQuery optimizations (partitioning, clustering)
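To make the list above concrete, here is a minimal in-memory sketch of hash-based change detection, effective dating, and soft deletes. It is not the SQLX this package generates. The function names (`row_hash`, `apply_snapshot`) and the column names (`row_hash`, `effective_from`, `effective_to`, `is_current`, `is_deleted`) are assumptions chosen for illustration, and the sketch assumes each source snapshot holds at most one row per business key.

```python
import hashlib
from datetime import date


def row_hash(row, tracked_columns, algorithm="md5"):
    """Hash the tracked columns of a row; missing values become empty strings."""
    payload = "|".join(
        "" if row.get(col) is None else str(row[col]) for col in tracked_columns
    )
    return hashlib.new(algorithm, payload.encode("utf-8")).hexdigest()


def apply_snapshot(dimension, source_rows, key, tracked_columns, as_of):
    """Apply one full source snapshot to an SCD2 dimension held as a list of dicts."""
    current = {r[key]: r for r in dimension if r["is_current"]}
    seen = set()

    for src in source_rows:
        seen.add(src[key])
        new_hash = row_hash(src, tracked_columns)
        old = current.get(src[key])
        if old is not None and old["row_hash"] == new_hash and not old["is_deleted"]:
            continue  # unchanged: keep the open version as it is
        if old is not None:
            # Changed (or previously deleted): close the open version.
            old["effective_to"] = as_of
            old["is_current"] = False
        new_row = {key: src[key]}
        new_row.update({col: src.get(col) for col in tracked_columns})
        new_row.update(
            row_hash=new_hash,
            effective_from=as_of,
            effective_to=None,
            is_current=True,
            is_deleted=False,
        )
        dimension.append(new_row)

    # Soft delete: keys that vanished from the source get a tombstone version.
    for k, old in current.items():
        if k not in seen and not old["is_deleted"]:
            old["effective_to"] = as_of
            old["is_current"] = False
            tombstone = dict(old)
            tombstone.update(
                effective_from=as_of,
                effective_to=None,
                is_current=True,
                is_deleted=True,
            )
            dimension.append(tombstone)
    return dimension
```

Comparing one hash per row instead of every tracked column keeps the change check to a single equality test, which is the usual motivation for this pattern.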
Why This Module?
- dbt snapshots only work inside dbt and are optimized for Snowflake rather than BigQuery
- Existing SCD2 tutorials use stored procedures (hard to version control)
- No declarative, config-driven SCD2 solution for BigQuery
Features (v0.1.0 MVP)
- ✅ SQLX template generator for standard SCD2 pattern
- ✅ Hash-based change detection (MD5/SHA256)
- ✅ Effective dating with late arrival handling
- ✅ Soft delete support
- ✅ YAML config → SQLX output
- ✅ CLI + Python API
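Late arrival handling is the least obvious item in the feature list, so here is one common way to think about it: when a record arrives with an effective date earlier than the current version, the validity windows for that business key are recomputed in date order. This is a sketch of the idea only, not the package's generated logic; `rebuild_windows` and `insert_late_arrival` are hypothetical names, as are the `effective_from`, `effective_to`, and `is_current` fields.

```python
from datetime import date


def rebuild_windows(versions):
    """Recompute effective_to and is_current for one business key.

    `versions` is a list of dicts that each carry an `effective_from` date.
    The dicts are updated in place and returned in date order.
    """
    ordered = sorted(versions, key=lambda v: v["effective_from"])
    for this, nxt in zip(ordered, ordered[1:]):
        # Each version ends where the next one begins.
        this["effective_to"] = nxt["effective_from"]
        this["is_current"] = False
    if ordered:
        # The latest version stays open-ended.
        ordered[-1]["effective_to"] = None
        ordered[-1]["is_current"] = True
    return ordered


def insert_late_arrival(versions, late_version):
    """Slot a late-arriving version into the history of one business key."""
    return rebuild_windows(versions + [late_version])


history = [
    {"job_code": "E1", "effective_from": date(2024, 1, 1)},
    {"job_code": "E3", "effective_from": date(2024, 6, 1)},
]
timeline = insert_late_arrival(
    history, {"job_code": "E2", "effective_from": date(2024, 3, 1)}
)
```

After the call, the `E1` version ends on 2024-03-01, the late `E2` version covers 2024-03-01 to 2024-06-01, and `E3` remains the current row.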
Quick Start
Installation
pip install scd2-bq-engine
Initialize Configuration
# Create a new dimension configuration
scd2-bq init dim_employee \
  --source-table project.dataset.stg_employees \
  --business-keys employee_number \
  --tracked-columns first_name,last_name,job_code,department
Generate SQLX
# Generate SCD2 dimension SQLX file
scd2-bq generate \
  --config dim_employee_config.yaml \
  --output-file dim_employee.sqlx
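For readers unfamiliar with config-to-SQLX rendering, the sketch below shows the general shape of that step using only the standard library. It is an assumption-laden illustration, not the package's actual template or output: the project uses Jinja2 templates, whereas this uses `string.Template`, and the template text, `render_sqlx`, and `hash_args` are all hypothetical. It emits only a Dataform-style `config` block and a hash column, not the full SCD2 logic.

```python
from string import Template

# Hypothetical, heavily simplified template. The real templates are not shown
# in this README and will differ.
SQLX_TEMPLATE = Template("""\
config {
  type: "incremental",
  bigquery: {
    clusterBy: [$cluster_by]
  }
}

SELECT
  $select_list,
  TO_HEX($hash_fn(CONCAT($hash_args))) AS row_hash,
  CURRENT_TIMESTAMP() AS effective_from
FROM `$source_table`
""")


def hash_args(tracked_columns):
    """Build NULL-safe, delimiter-separated arguments for CONCAT."""
    return ", '|', ".join(
        f"COALESCE(CAST({col} AS STRING), '')" for col in tracked_columns
    )


def render_sqlx(source_table, business_keys, tracked_columns, hash_fn="MD5"):
    """Render the template for one dimension; hash_fn is MD5 or SHA256."""
    if hash_fn not in ("MD5", "SHA256"):
        raise ValueError(f"unsupported hash function: {hash_fn}")
    return SQLX_TEMPLATE.substitute(
        cluster_by=", ".join(f'"{k}"' for k in business_keys),
        select_list=",\n  ".join(business_keys + tracked_columns),
        hash_fn=hash_fn,
        hash_args=hash_args(tracked_columns),
        source_table=source_table,
    )


sqlx_text = render_sqlx(
    "project.dataset.stg_employees",
    ["employee_number"],
    ["first_name", "last_name", "job_code", "department"],
)
```

Casting every tracked column to `STRING` and wrapping it in `COALESCE` keeps a single `NULL` from turning the whole `CONCAT` result into `NULL`, which would otherwise hide changes.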
Python API
from scd2_bq_engine import SCD2Generator, SCD2Config

# Create configuration
config = SCD2Config(
    dimension_name="dim_employee",
    source_table="project.dataset.stg_employees",
    business_keys=["employee_number"],
    tracked_columns=["first_name", "last_name", "job_code"],
)

# Generate SQLX
generator = SCD2Generator(config)
generator.write_sqlx("dim_employee.sqlx")
Directory Structure
scd2-bq-engine/
├── src/scd2_bq_engine/
│ ├── generator.py # Core SQLX generator
│ ├── templates/ # Jinja2 SQLX templates
│ └── validators.py # Config validation
├── tests/
├── examples/
└── docs/
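The tree lists a `validators.py` for config validation, but the README does not describe its rules. The sketch below shows the kind of checks such a validator might perform. Everything here is an assumption: `validate_config` is a hypothetical function, the rules are guesses at sensible constraints, and `hash_algorithm` is an illustrative field name. Only `source_table`, `business_keys`, and `tracked_columns` come from the examples above.

```python
import re

# Assumed shape of a fully qualified BigQuery table: project.dataset.table
TABLE_RE = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_]+$")


def validate_config(cfg):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []

    if not TABLE_RE.match(cfg.get("source_table") or ""):
        errors.append("source_table must look like project.dataset.table")

    keys = cfg.get("business_keys") or []
    tracked = cfg.get("tracked_columns") or []
    if not keys:
        errors.append("business_keys must not be empty")
    if not tracked:
        errors.append("tracked_columns must not be empty")

    # A business key identifies the entity, so it should not also be tracked.
    overlap = sorted(set(keys) & set(tracked))
    if overlap:
        errors.append("columns listed as both key and tracked: " + ", ".join(overlap))

    # `hash_algorithm` is a hypothetical field name used for illustration.
    if str(cfg.get("hash_algorithm", "md5")).lower() not in {"md5", "sha256"}:
        errors.append("hash_algorithm must be md5 or sha256")

    return errors


problems = validate_config(
    {
        "source_table": "project.dataset.stg_employees",
        "business_keys": ["employee_number"],
        "tracked_columns": ["first_name", "last_name", "job_code"],
    }
)
```

Returning a list of problems instead of raising on the first one lets a CLI report every issue in a config file in a single pass.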
License
MIT - see LICENSE
File details
Details for the file scd2_bq_engine-0.1.0.tar.gz.
File metadata
- Download URL: scd2_bq_engine-0.1.0.tar.gz
- Upload date:
- Size: 10.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a42681ba0cda0e36ed0f46abbadd431674fd6b1362eea77cb14831cb4db40c71 |
| MD5 | 0670f158a4569d464a63dacb8b400da6 |
| BLAKE2b-256 | 8a073b7a53a0c618d5bfbf1dcd3d99fe4230d7e8ac0c00b829b63e4c3d789efe |
File details
Details for the file scd2_bq_engine-0.1.0-py3-none-any.whl.
File metadata
- Download URL: scd2_bq_engine-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2861f96d7641cb85b9d828db5912545a5fa2614718d7d7272610b2ed2085941e |
| MD5 | cea5c7c5bab1a5af6ee8b1bb31d53c25 |
| BLAKE2b-256 | 1a7b75083272e77841120af363a5febe14da845c0afdb8aa7c361084991707ce |