Standardizing models

Bollhav

A lightweight config class for defining data pipeline models.

Usage

Basic table

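# Assumed import path: the README does not show one, so adjust it to the installed package layout.
from bollhav import Database, Model, PostgresColumn, PostgresType, WriteMode

model = Model(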
    name="orders",
    source_table="raw.orders",
    table="orders",
    schema="public",
    write_mode=WriteMode.TRUNCATE_INSERT,
    database=Database.POSTGRES,
    columns=[
        PostgresColumn(name="id", data_type=PostgresType.BIGINT, primary_key=True, nullable=False),
        PostgresColumn(name="customer_id", data_type=PostgresType.BIGINT),
        PostgresColumn(name="amount", data_type=PostgresType.NUMERIC, precision=10, scale=2),
        PostgresColumn(name="created_at", data_type=PostgresType.TIMESTAMPTZ),
    ],
)

View

model = Model(
    name="orders_view",
    source_table="raw.orders",
    table="orders_view",
    schema="public",
    model_type=ModelType.VIEW,
    write_mode=WriteMode.VIEW,
    database=Database.POSTGRES,
    columns=[
        PostgresColumn(name="id", data_type=PostgresType.BIGINT),
        PostgresColumn(name="amount", data_type=PostgresType.NUMERIC, precision=10, scale=2),
    ],
)

With a schedule

model = Model(
    name="orders",
    source_table="raw.orders",
    database=Database.POSTGRES,
    columns=[
        PostgresColumn(name="id", data_type=PostgresType.BIGINT),
        PostgresColumn(name="amount", data_type=PostgresType.NUMERIC, precision=10, scale=2),
    ],
    cron="0 3 * * *",
)
model.batch_size  # BatchSize.DAILY

batch_size is inferred from the cron expression and is read-only.

BatchSize   Example cron
YEARLY      0 0 1 1 *
MONTHLY     0 0 1 * *
WEEKLY      0 0 * * 0
DAILY       0 3 * * *
HOURLY      0 * * * *
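
One way such an inference could work is sketched below. This is not Bollhav's actual implementation: the `infer_batch_size` function and the local `BatchSize` enum are hypothetical stand-ins, and the rule (pick the coarsest cron field that is not `*`) is an assumption chosen only because it reproduces the table above.

```python
from enum import Enum


class BatchSize(Enum):
    # Local stand-in for illustration; not imported from bollhav.
    YEARLY = "yearly"
    MONTHLY = "monthly"
    WEEKLY = "weekly"
    DAILY = "daily"
    HOURLY = "hourly"


def infer_batch_size(cron: str) -> BatchSize:
    """Guess a batch size from a standard five-field cron expression."""
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")
    minute, hour, day_of_month, month, day_of_week = fields

    # Check the coarsest field first: a pinned month means a yearly run.
    if month != "*":
        return BatchSize.YEARLY
    if day_of_month != "*":
        return BatchSize.MONTHLY
    if day_of_week != "*":
        return BatchSize.WEEKLY
    if hour != "*":
        return BatchSize.DAILY
    # Only the minute is pinned (or nothing is): treat it as hourly.
    return BatchSize.HOURLY
```

The sketch ignores step and range syntax such as `*/15` or `1-5`, which a real implementation would need to handle.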

With dynamic kwargs

def my_ddl(table_name: str, schema: str, **kwargs) -> str:
    return f"CREATE TABLE {schema}.{table_name} (id SERIAL PRIMARY KEY);"

model = Model(
    name="orders",
    source_table="raw.orders",
    database=Database.POSTGRES,
    columns=[
        PostgresColumn(name="id", data_type=PostgresType.BIGINT, primary_key=True, nullable=False),
    ],
    ddl=my_ddl,
    table_name="orders",
    schema="public",
)
model.extra["ddl"]  # resolved DDL string

Callables passed in **kwargs are called at init with the non-callable kwargs as keyword arguments, and the results are stored in model.extra.
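
The resolution step can be pictured with a small standalone function. This is a minimal sketch under assumptions, not the library's code: `resolve_kwargs` is a hypothetical helper, and it assumes every callable receives all non-callable kwargs as keyword arguments.

```python
def resolve_kwargs(**kwargs):
    """Call every callable kwarg with the non-callable kwargs as arguments."""
    # Plain values are passed through and also serve as call arguments.
    static = {key: value for key, value in kwargs.items() if not callable(value)}
    # Each callable is replaced by its return value.
    resolved = {key: fn(**static) for key, fn in kwargs.items() if callable(fn)}
    return {**static, **resolved}


def my_ddl(table_name: str, schema: str, **kwargs) -> str:
    return f"CREATE TABLE {schema}.{table_name} (id SERIAL PRIMARY KEY);"


extra = resolve_kwargs(ddl=my_ddl, table_name="orders", schema="public")
print(extra["ddl"])  # CREATE TABLE public.orders (id SERIAL PRIMARY KEY);
```

Accepting `**kwargs` in the callable, as `my_ddl` does, keeps it from failing when extra non-callable values are passed alongside the ones it needs.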

With debug

model = Model(
    name="orders",
    source_table="raw.orders",
    debug=True,
)

Write modes

WriteMode          ModelType   Description
APPEND             TABLE       Insert without truncating
TRUNCATE_INSERT    TABLE       Truncate then insert
OVERWRITE_INSERT   TABLE       Overwrite matching rows
MERGE              TABLE       Upsert based on keys
VIEW               VIEW        Create or replace view

ModelType and WriteMode are validated against each other at init.
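
A cross-check of this kind could look like the following. The enums here are local stand-ins that mirror the table above, and `validate_write_mode` and `ALLOWED_WRITE_MODES` are hypothetical names; the library's real check and its exception type may differ.

```python
from enum import Enum


class ModelType(Enum):
    # Local stand-in for illustration; not imported from bollhav.
    TABLE = "table"
    VIEW = "view"


class WriteMode(Enum):
    # Local stand-in for illustration; not imported from bollhav.
    APPEND = "append"
    TRUNCATE_INSERT = "truncate_insert"
    OVERWRITE_INSERT = "overwrite_insert"
    MERGE = "merge"
    VIEW = "view"


# Allowed combinations, taken from the write modes table.
ALLOWED_WRITE_MODES = {
    ModelType.TABLE: {
        WriteMode.APPEND,
        WriteMode.TRUNCATE_INSERT,
        WriteMode.OVERWRITE_INSERT,
        WriteMode.MERGE,
    },
    ModelType.VIEW: {WriteMode.VIEW},
}


def validate_write_mode(model_type: ModelType, write_mode: WriteMode) -> None:
    """Raise ValueError when the write mode does not fit the model type."""
    if write_mode not in ALLOWED_WRITE_MODES[model_type]:
        raise ValueError(
            f"{write_mode.name} is not valid for model type {model_type.name}"
        )
```

Under this sketch, `validate_write_mode(ModelType.VIEW, WriteMode.VIEW)` passes silently, while pairing `ModelType.VIEW` with `WriteMode.MERGE` raises.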

PostgresColumn

Field         Type           Default    Notes
name          str            required
data_type     PostgresType   required
nullable      bool           True
order         int | None     None
primary_key   bool           False      Implies nullable=False
unique        bool           False
precision     int | None     None       For NUMERIC, DECIMAL
scale         int | None     None       For NUMERIC, DECIMAL
length        int | None     None       For VARCHAR, CHAR, BIT

Combining primary_key=True with nullable=True raises an error at init.
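
An init-time check like this is commonly written in a dataclass `__post_init__`. The class below is a hypothetical stand-in, not `PostgresColumn` itself, and it assumes the conflict surfaces as a `ValueError`. It also assumes the `nullable=True` default is not adjusted automatically, which is why the sketch requires `nullable=False` to be passed explicitly, as the basic table example does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSketch:
    # Hypothetical stand-in with only the fields needed for the check.
    name: str
    nullable: bool = True
    primary_key: bool = False

    def __post_init__(self) -> None:
        # A primary key column cannot hold NULLs, so reject the combination.
        if self.primary_key and self.nullable:
            raise ValueError(
                f"column {self.name!r}: primary_key=True requires nullable=False"
            )


# Valid: the primary key is explicitly non-nullable.
id_column = ColumnSketch(name="id", nullable=False, primary_key=True)
```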

Tags

Tags are optional, freeform strings.

model = Model(
    name="orders",
    source_table="raw.orders",
    database=Database.POSTGRES,
    columns=[
        PostgresColumn(name="id", data_type=PostgresType.BIGINT),
    ],
    tags=["finance", "critical"],
)
