Standardizing models

Bollhav

A lightweight config class for defining data pipeline models. The goal is simple: standardize how models are declared across a project without boxing anyone in.

Usage

Basic table

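import polars as pl

# Import path assumed; adjust it to wherever your install exposes these names.
from bollhav import Model, WriteMode

model = Model(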
    name="orders",
    source_table="raw.orders",
    destination_table="orders",
    destination_schema="public",
    write_mode=WriteMode.TRUNCATE_INSERT,
    columns={
        "id": pl.Int64,
        "customer_id": pl.Int64,
        "amount": pl.Float64,
        "created_at": pl.Datetime,
    },
)

View

model = Model(
    name="orders_view",
    source_table="raw.orders",
    destination_table="orders_view",
    destination_schema="public",
    model_type=ModelType.VIEW,
    write_mode=WriteMode.VIEW,
    columns={
        "id": pl.Int64,
        "amount": pl.Float64,
    },
)

With a schedule

model = Model(
    name="orders",
    source_table="raw.orders",
    destination_table="orders",
    destination_schema="public",
    write_mode=WriteMode.APPEND,
    columns={"id": pl.Int64, "amount": pl.Float64},
    cron="0 3 * * *",
)

model.batch_size  # BatchSize.DAILY

batch_size is inferred automatically from the cron expression — a daily cron means you're pulling a day's worth of data, a monthly cron a month's worth, and so on. It is read-only and not something you set manually.

BatchSize   Example cron
YEARLY      0 0 1 1 *
MONTHLY     0 0 1 * *
WEEKLY      0 0 * * 0
DAILY       0 3 * * *
HOURLY      0 * * * *
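To make the inference concrete, here is a minimal sketch of one way a batch size could be derived from a cron expression. It is not Bollhav's actual implementation: the BatchSize enum below is a stand-in that only mirrors the names in the table, infer_batch_size is a hypothetical helper, and the rule it applies (the coarsest pinned cron field wins) is an assumption that happens to reproduce the table above. It does not try to interpret step or list expressions such as */15.

```python
from enum import Enum


class BatchSize(Enum):
    # Stand-in mirroring the names in the table above, not Bollhav's own enum.
    YEARLY = "yearly"
    MONTHLY = "monthly"
    WEEKLY = "weekly"
    DAILY = "daily"
    HOURLY = "hourly"


def infer_batch_size(cron: str) -> BatchSize:
    """Guess a batch size from a five-field cron expression (hypothetical helper)."""
    fields = cron.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}: {cron!r}")
    minute, hour, day_of_month, month, day_of_week = fields

    # Check the coarsest pinned field first and work down to the finest.
    if month != "*":
        return BatchSize.YEARLY
    if day_of_month != "*":
        return BatchSize.MONTHLY
    if day_of_week != "*":
        return BatchSize.WEEKLY
    if hour != "*":
        return BatchSize.DAILY
    return BatchSize.HOURLY


print(infer_batch_size("0 3 * * *"))  # BatchSize.DAILY
```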

With dynamic DDL via callable

def my_ddl(table_name: str, schema: str, **kwargs) -> str:
    return f"CREATE TABLE {schema}.{table_name} (id SERIAL PRIMARY KEY);"

model = Model(
    name="orders",
    source_table="raw.orders",
    destination_table="orders",
    destination_schema="public",
    columns={"id": pl.Int64},
    destination_ddl=my_ddl,
    table_name="orders",
    schema="public",
)
# model.extra["destination_ddl"] is the resolved DDL string

Callables in **kwargs are resolved at init time: each one is called with the non-callable values from the same kwargs as its arguments. This means you can define DDL, index creation, or any other dynamic logic as functions and pass them in alongside the data they need, with no subclassing required.
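The resolution step can be pictured with a small standalone sketch. This is an illustration of the behavior described above, not Bollhav's source: resolve_kwargs is a hypothetical name, and it assumes every non-callable value is passed to every callable as a keyword argument, which is why the callable accepts **kwargs.

```python
from typing import Any


def resolve_kwargs(**kwargs: Any) -> dict[str, Any]:
    """Call every callable value with the non-callable values as keyword arguments."""
    plain = {key: value for key, value in kwargs.items() if not callable(value)}
    resolved = dict(plain)
    for key, value in kwargs.items():
        if callable(value):
            # Each callable sees all plain values, so it should accept **kwargs.
            resolved[key] = value(**plain)
    return resolved


def my_ddl(table_name: str, schema: str, **kwargs: Any) -> str:
    return f"CREATE TABLE {schema}.{table_name} (id SERIAL PRIMARY KEY);"


extra = resolve_kwargs(destination_ddl=my_ddl, table_name="orders", schema="public")
print(extra["destination_ddl"])
# CREATE TABLE public.orders (id SERIAL PRIMARY KEY);
```

One consequence of this design is that a callable only ever sees plain values, never the result of another callable, so the order in which kwargs are passed does not matter.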

Column definitions

Columns are defined using Polars dtypes as the source of truth. Conversion to the target database's type system (e.g. pl.Int64 -> BIGINT for Postgres) is left to the implementor — Bollhav makes no assumptions about your destination.
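Since the conversion is left to you, a small lookup table is often enough to start with. The sketch below is one possible approach for a Postgres destination; POSTGRES_TYPES, postgres_type, and column_ddl are hypothetical names, the mapping is deliberately incomplete, and the type choices (for example TIMESTAMP rather than TIMESTAMPTZ) are assumptions you should adapt. It looks dtypes up by class name so that both pl.Datetime and an instance such as pl.Datetime("us") resolve the same way.

```python
try:
    import polars as pl
except ImportError:
    # Stand-in dtypes so the sketch still runs where Polars is not installed.
    from types import SimpleNamespace

    pl = SimpleNamespace(
        **{name: type(name, (), {}) for name in ("Int64", "Float64", "Datetime")}
    )

# Hypothetical, deliberately incomplete mapping for a Postgres destination.
POSTGRES_TYPES = {
    "Int64": "BIGINT",
    "Float64": "DOUBLE PRECISION",
    "Datetime": "TIMESTAMP",
    "Boolean": "BOOLEAN",
    "Date": "DATE",
}


def postgres_type(dtype) -> str:
    """Translate a Polars dtype (class or instance) to a Postgres type name."""
    name = dtype.__name__ if isinstance(dtype, type) else type(dtype).__name__
    try:
        return POSTGRES_TYPES[name]
    except KeyError:
        raise TypeError(f"no Postgres mapping defined for {name}") from None


def column_ddl(columns: dict) -> str:
    """Render a columns mapping as the body of a CREATE TABLE statement."""
    return ", ".join(f"{col} {postgres_type(dtype)}" for col, dtype in columns.items())


columns = {"id": pl.Int64, "amount": pl.Float64, "created_at": pl.Datetime}
print(column_ddl(columns))
# id BIGINT, amount DOUBLE PRECISION, created_at TIMESTAMP
```

Failing loudly on an unmapped dtype is a choice made here so that a missing entry surfaces when the DDL is built rather than at load time.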

Write modes

WriteMode         ModelType  Description
APPEND            TABLE      Insert without truncating
TRUNCATE_INSERT   TABLE      Truncate then insert
OVERWRITE_INSERT  TABLE      Overwrite matching rows
MERGE             TABLE      Upsert based on keys
VIEW              VIEW       Create or replace view

ModelType and WriteMode are validated against each other at init: passing WriteMode.VIEW with ModelType.TABLE, or a table write mode with ModelType.VIEW, raises an error immediately.
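The check itself is simple enough to sketch. The enums below are stand-ins that mirror the names in the table above, validate_write_mode is a hypothetical function, and raising ValueError is an assumption; the exception type Bollhav raises is not stated here.

```python
from enum import Enum


class ModelType(Enum):
    TABLE = "table"
    VIEW = "view"


class WriteMode(Enum):
    APPEND = "append"
    TRUNCATE_INSERT = "truncate_insert"
    OVERWRITE_INSERT = "overwrite_insert"
    MERGE = "merge"
    VIEW = "view"


def validate_write_mode(model_type: ModelType, write_mode: WriteMode) -> None:
    """Raise if the write mode does not belong to the model type."""
    is_view_type = model_type is ModelType.VIEW
    is_view_mode = write_mode is WriteMode.VIEW
    # A view must use WriteMode.VIEW, and only a view may use it.
    if is_view_type != is_view_mode:
        raise ValueError(f"{write_mode.name} is not valid for {model_type.name}")


validate_write_mode(ModelType.TABLE, WriteMode.MERGE)  # passes silently
validate_write_mode(ModelType.VIEW, WriteMode.VIEW)    # passes silently
```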

Tags

Tags are optional and freeform. Use them however makes sense for your project — filtering, grouping, documentation.

model = Model(
    name="orders",
    source_table="raw.orders",
    columns={"id": pl.Int64},
    tags=["finance", "critical"],
)
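As one example of putting tags to work, the sketch below selects models by tag. It uses a stand-in TaggedModel dataclass instead of the real Model, and it assumes models expose name and tags as plain attributes; with_tag is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedModel:
    # Stand-in with only the two attributes this sketch needs.
    name: str
    tags: list[str] = field(default_factory=list)


def with_tag(models: list[TaggedModel], tag: str) -> list[TaggedModel]:
    """Return the models that carry the given tag."""
    return [model for model in models if tag in model.tags]


models = [
    TaggedModel("orders", tags=["finance", "critical"]),
    TaggedModel("page_views", tags=["marketing"]),
    TaggedModel("invoices", tags=["finance"]),
]
print([model.name for model in with_tag(models, "finance")])
# ['orders', 'invoices']
```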
