Utilities for SQLMesh
Project description
SQLMesh Utils
This repository contains things that are not included in SQLMesh Core for various reasons but may be useful in very specific cases.
For example, the custom materializations included here will typically do things like relax constraints / guarantees of the SQLMesh Core model kinds to work around environment-specific limitations.
Usage
Install the sqlmesh-utils library into your SQLMesh Python environment:
$ pip install sqlmesh-utils
Custom Materializations
After installing the sqlmesh-utils library above, you can reference the custom materializations via the CUSTOM model kind like so:
MODEL (
name my_db.my_model,
kind CUSTOM (
materialization 'custom_materialization_name',
materialization_properties (
config_key = 'config_value'
)
)
);
Before using any of these materializations, you should take the time to understand the tradeoffs. There is generally a reason they are not in upstream SQLMesh.
Non-Idempotent Incremental By Time Range
This behaves similar to INCREMENTAL_BY_TIME_RANGE model kind in SQLMesh, but with one important difference - it loads and restates data using a MERGE statement instead of INSERT OVERWRITE or DELETE+INSERT.
The reason you might want to use it is to prevent dirty reads during data restatement on engines that do not support atomically replacing a partition of data, such as Trino on Iceberg / Delta Lake.
Due to the use of a MERGE statement, this materialization type supports upserts only. That is, if records are deleted from the source data, these deletions will not be reflected in the target table. So the downside of using this materialization is that it's possible to end up with ghost records in your target table after a restatement. For this reason, we call it "non-idempotent".
[!NOTE] Note that some engines can propagate deletes in a
MERGEstatement using syntax likeWHEN NOT MATCHED [IN SOURCE] THEN DELETE. However, this is not part of ANSI SQL so engines like Trino and Postgres do not implement it.
Usage
This mostly follows the same usage as INCREMENTAL_BY_TIME_RANGE:
MODEL (
name my_db.my_model,
kind CUSTOM (
materialization 'non_idempotent_incremental_by_time_range',
materialization_properties (
time_column = event_timestamp,
primary_key = (event_id, event_source)
)
)
);
SELECT event_id, event_source, event_data, event_timestamp
FROM upstream.table
The properties are as follows:
time_column
This is the column in the dataset that contains the timestamp. It follows the same syntax as upstream INCREMENTAL_BY_TIME_RANGE and also the same rules with regards to respecting the project time_column_format property and being automatically added to the model partition_by field list.
primary_key
This is the column or combination of columns that uniquely identifies a record.
The columns listed here are used in the ON clause of the SQL Merge to join the source and target datasets.
Note that the time_column is not automatically injected into this list (to allow timestamps on records to be updated), so if the time_column does actually form part of the primary key in your dataset then it needs to be added here.
partition_by_time_column
By default, the time_column will get added to the list of fields in the model partitioned_by property, causing it to be included in the table partition key. This may be undesirable in some circumstances.
To opt out of this behaviour, you can set partition_by_time_column = false like so:
MODEL (
name my_db.my_model,
kind CUSTOM (
materialization 'non_idempotent_incremental_by_time_range',
materialization_properties (
...,
partition_by_time_column = false
)
)
);
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sqlmesh_utils-0.1.1.tar.gz.
File metadata
- Download URL: sqlmesh_utils-0.1.1.tar.gz
- Upload date:
- Size: 10.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9cc15a01e8238bc91dffcaa3d24a961adbce78b2df07bc4ce57faa1f99028de1
|
|
| MD5 |
a51b3f6888b4c50cc77c0c46ebc1352e
|
|
| BLAKE2b-256 |
a5fb4126bd4bf713cbea82c175831a7273b36bf8c2ae2daa3cb4f0ec20b459c9
|
Provenance
The following attestation bundles were made for sqlmesh_utils-0.1.1.tar.gz:
Publisher:
publish.workflow.yaml on TobikoData/sqlmesh-utils
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
sqlmesh_utils-0.1.1.tar.gz -
Subject digest:
9cc15a01e8238bc91dffcaa3d24a961adbce78b2df07bc4ce57faa1f99028de1 - Sigstore transparency entry: 546123695
- Sigstore integration time:
-
Permalink:
TobikoData/sqlmesh-utils@afe6b3a3ac7dc4475704030a7a04dd9d962fa6f5 -
Branch / Tag:
refs/tags/v0.1.1 - Owner: https://github.com/TobikoData
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.workflow.yaml@afe6b3a3ac7dc4475704030a7a04dd9d962fa6f5 -
Trigger Event:
push
-
Statement type:
File details
Details for the file sqlmesh_utils-0.1.1-py3-none-any.whl.
File metadata
- Download URL: sqlmesh_utils-0.1.1-py3-none-any.whl
- Upload date:
- Size: 10.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
49bb05f3420ccfe5032c32590cd8c0a26ae199453781f3511dc915dffb52e930
|
|
| MD5 |
87014ae3ac7e5dc4b215ff2c21314612
|
|
| BLAKE2b-256 |
03b0904f890dcf60fdcbb96b398377b1775f7f71bb5a702d80ddff2e1b825e50
|
Provenance
The following attestation bundles were made for sqlmesh_utils-0.1.1-py3-none-any.whl:
Publisher:
publish.workflow.yaml on TobikoData/sqlmesh-utils
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
sqlmesh_utils-0.1.1-py3-none-any.whl -
Subject digest:
49bb05f3420ccfe5032c32590cd8c0a26ae199453781f3511dc915dffb52e930 - Sigstore transparency entry: 546123754
- Sigstore integration time:
-
Permalink:
TobikoData/sqlmesh-utils@afe6b3a3ac7dc4475704030a7a04dd9d962fa6f5 -
Branch / Tag:
refs/tags/v0.1.1 - Owner: https://github.com/TobikoData
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.workflow.yaml@afe6b3a3ac7dc4475704030a7a04dd9d962fa6f5 -
Trigger Event:
push
-
Statement type: