Standardizing models

Bollhav

A lightweight config class for defining data pipeline models. The goal is simple: standardize how models are declared across a project without boxing anyone in.

Usage

Basic table

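import polars as pl

# Import path assumed; adjust to wherever your install exposes these names.
from bollhav import Model, WriteMode

model = Model(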
    name="orders",
    source_table="raw.orders",
    destination_table="orders",
    destination_schema="public",
    write_mode=WriteMode.TRUNCATE_INSERT,
    columns={
        "id": pl.Int64,
        "customer_id": pl.Int64,
        "amount": pl.Float64,
        "created_at": pl.Datetime,
    },
)

View

model = Model(
    name="orders_view",
    source_table="raw.orders",
    destination_table="orders_view",
    destination_schema="public",
    model_type=ModelType.VIEW,
    write_mode=WriteMode.VIEW,
    columns={
        "id": pl.Int64,
        "amount": pl.Float64,
    },
)

With a schedule

model = Model(
    name="orders",
    source_table="raw.orders",
    destination_table="orders",
    destination_schema="public",
    write_mode=WriteMode.APPEND,
    columns={"id": pl.Int64, "amount": pl.Float64},
    cron="0 3 * * *",
)

model.batch_size  # BatchSize.DAILY

batch_size is inferred from the cron expression: a daily cron means each run pulls a day's worth of data, a monthly cron a month's worth, and so on. It is a read-only attribute; you do not set it yourself.

BatchSize   Example cron
YEARLY      0 0 1 1 *
MONTHLY     0 0 1 * *
WEEKLY      0 0 * * 0
DAILY       0 3 * * *
HOURLY      0 * * * *
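
The table can be read as a rule over the five cron fields: the coarsest field pinned to a value decides the batch size. Below is a minimal sketch of such a rule. The function `infer_batch_size` and the local `BatchSize` enum are hypothetical stand-ins written for illustration, not Bollhav's actual implementation, and the sketch only handles plain five-field expressions.

```python
from enum import Enum


class BatchSize(Enum):
    """Local stand-in for illustration; not imported from Bollhav."""

    YEARLY = "yearly"
    MONTHLY = "monthly"
    WEEKLY = "weekly"
    DAILY = "daily"
    HOURLY = "hourly"


def infer_batch_size(cron: str) -> BatchSize:
    """Guess a batch size from a five-field cron expression (assumed rule)."""
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}: {cron!r}")
    _minute, hour, day_of_month, month, day_of_week = fields

    # Check from the coarsest field down; the first pinned field wins.
    if month != "*":
        return BatchSize.YEARLY
    if day_of_month != "*":
        return BatchSize.MONTHLY
    if day_of_week != "*":
        return BatchSize.WEEKLY
    if hour != "*":
        return BatchSize.DAILY
    return BatchSize.HOURLY


print(infer_batch_size("0 3 * * *"))  # BatchSize.DAILY
```

Step values such as `*/15` or ranges are not treated specially here; a real implementation would need to decide how those map to a batch size.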

With dynamic DDL via callable

def my_ddl(table_name: str, schema: str, **kwargs) -> str:
    return f"CREATE TABLE {schema}.{table_name} (id SERIAL PRIMARY KEY);"

model = Model(
    name="orders",
    source_table="raw.orders",
    destination_table="orders",
    destination_schema="public",
    columns={"id": pl.Int64},
    destination_ddl=my_ddl,
    table_name="orders",
    schema="public",
)
# model.extra["destination_ddl"] is the resolved DDL string

Callables passed in **kwargs are resolved at init time: each one is called with the non-callable values from the same kwargs as its arguments, and the result is stored under the same key in model.extra. This means you can define DDL, index creation, or any other dynamic logic as functions and pass them in alongside the data they need, with no subclassing required.
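
To make the resolution step concrete, here is a minimal sketch of how it could work. The helper `resolve_extras` is a hypothetical name used for illustration; Bollhav's internals may differ in detail.

```python
from typing import Any


def resolve_extras(**kwargs: Any) -> dict[str, Any]:
    """Hypothetical helper: call each callable kwarg with the plain kwargs."""
    # Non-callable values are the arguments made available to every callable.
    plain = {key: value for key, value in kwargs.items() if not callable(value)}

    resolved = dict(plain)
    for key, value in kwargs.items():
        if callable(value):
            # The callable's return value replaces it under the same key.
            resolved[key] = value(**plain)
    return resolved


def my_ddl(table_name: str, schema: str, **kwargs: Any) -> str:
    return f"CREATE TABLE {schema}.{table_name} (id SERIAL PRIMARY KEY);"


extra = resolve_extras(destination_ddl=my_ddl, table_name="orders", schema="public")
print(extra["destination_ddl"])
# CREATE TABLE public.orders (id SERIAL PRIMARY KEY);
```

Accepting `**kwargs` in each callable, as `my_ddl` does, lets several callables share one set of plain values without each needing to declare every key.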

Column definitions

Columns are defined using Polars dtypes as the source of truth. Conversion to the target database's type system (e.g. pl.Int64 -> BIGINT for Postgres) is left to the implementor — Bollhav makes no assumptions about your destination.
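
Since the conversion is up to you, one possible shape for a Postgres implementor is sketched below. The mapping table and the helpers `dtype_name`, `postgres_type`, and `column_ddl` are assumptions made for illustration and are not part of Bollhav. The lookup is keyed by dtype name so the sketch has no third-party imports.

```python
# Assumed mapping from Polars dtype names to Postgres types; extend as needed.
POSTGRES_TYPES = {
    "Int32": "INTEGER",
    "Int64": "BIGINT",
    "Float64": "DOUBLE PRECISION",
    "Boolean": "BOOLEAN",
    "String": "TEXT",
    "Date": "DATE",
    "Datetime": "TIMESTAMP",
}


def dtype_name(dtype: object) -> str:
    """Name of a dtype given either as a class or as an instance."""
    return dtype.__name__ if isinstance(dtype, type) else type(dtype).__name__


def postgres_type(dtype: object) -> str:
    """Look up the Postgres type for a dtype; fail loudly if unmapped."""
    name = dtype_name(dtype)
    try:
        return POSTGRES_TYPES[name]
    except KeyError:
        raise TypeError(f"no Postgres mapping for dtype {name!r}") from None


def column_ddl(columns: dict) -> str:
    """Render a columns dict as a comma-separated column definition list."""
    return ", ".join(f"{col} {postgres_type(dt)}" for col, dt in columns.items())
```

With Polars installed, `column_ddl({"id": pl.Int64, "amount": pl.Float64})` would be expected to produce `id BIGINT, amount DOUBLE PRECISION`. Raising on an unmapped dtype, rather than falling back to a default type, keeps silent schema drift out of the destination.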

Write modes

WriteMode         ModelType  Description
APPEND            TABLE      Insert without truncating
TRUNCATE_INSERT   TABLE      Truncate then insert
OVERWRITE_INSERT  TABLE      Overwrite matching rows
MERGE             TABLE      Upsert based on keys
VIEW              VIEW       Create or replace view

ModelType and WriteMode are validated against each other at init: passing WriteMode.VIEW with ModelType.TABLE, or a table write mode with ModelType.VIEW, raises an exception immediately.
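
The pairing rule amounts to a single consistency check. The sketch below uses local stand-in enums and a hypothetical `validate_modes` function; the choice of `ValueError` is an assumption, since the exception type Bollhav raises is not stated here.

```python
from enum import Enum


class ModelType(Enum):
    """Local stand-in for illustration; not imported from Bollhav."""

    TABLE = "table"
    VIEW = "view"


class WriteMode(Enum):
    """Local stand-in for illustration; not imported from Bollhav."""

    APPEND = "append"
    TRUNCATE_INSERT = "truncate_insert"
    OVERWRITE_INSERT = "overwrite_insert"
    MERGE = "merge"
    VIEW = "view"


def validate_modes(model_type: ModelType, write_mode: WriteMode) -> None:
    """Raise if a view is paired with a table write mode, or the reverse."""
    is_view_type = model_type is ModelType.VIEW
    is_view_mode = write_mode is WriteMode.VIEW
    if is_view_type != is_view_mode:
        # ValueError is an assumed choice of exception type.
        raise ValueError(
            f"{write_mode.name} cannot be used with model type {model_type.name}"
        )


validate_modes(ModelType.TABLE, WriteMode.MERGE)  # valid pairing, no error
validate_modes(ModelType.VIEW, WriteMode.VIEW)  # valid pairing, no error
```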

Tags

Tags are optional and freeform. Use them however makes sense for your project — filtering, grouping, documentation.

model = Model(
    name="orders",
    source_table="raw.orders",
    columns={"id": pl.Int64},
    tags=["finance", "critical"],
)
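
As one example of putting tags to work, the sketch below selects models by tag. `TaggedModel` is a stand-in carrying only the two fields the example needs, and `with_tag` is a hypothetical helper; neither is part of Bollhav.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedModel:
    """Stand-in with only the fields this sketch needs; not Bollhav's Model."""

    name: str
    tags: list[str] = field(default_factory=list)


def with_tag(models: list[TaggedModel], tag: str) -> list[TaggedModel]:
    """Return the models that carry the given tag, preserving order."""
    return [model for model in models if tag in model.tags]


models = [
    TaggedModel("orders", tags=["finance", "critical"]),
    TaggedModel("page_views", tags=["analytics"]),
    TaggedModel("invoices", tags=["finance"]),
]

print([model.name for model in with_tag(models, "finance")])
# ['orders', 'invoices']
```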
