Shared utilities for dlt data pipelines with multi-company support
dlt_utils
Shared utilities for dlt data pipelines with multi-company support.
Features
- PartitionedIncremental: Incremental state tracking per partition key (e.g., company_id)
- Date utilities: Generate (year, week) and (year, month) tuples for time-based partitioning
- Schema utilities: Ensure tables exist in destination database
Installation
# From PyPI
pip install dlt_utils
# For development
pip install -e ".[dev]"
Usage
PartitionedIncremental
Track incremental state per company (or any partition key):
import dlt
from dlt_utils import PartitionedIncremental

@dlt.resource
def sync_resource():
    state = dlt.current.resource_state()
    inc = PartitionedIncremental(
        state=state,
        state_key="sequences",
        cursor_path="sequenceNumber",
        initial_value=0,
    )
    for company_id in ["company_a", "company_b"]:
        # Resume from the last sequence number seen for this company
        start_seq = inc.get_last_value(company_id)
        # fetch_data is a placeholder for your own API client call
        for record in fetch_data(company_id, since=start_seq):
            inc.track(company_id, record["sequenceNumber"])
            yield record
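To make the idea concrete, here is a minimal sketch of how per-partition cursor tracking could work on top of a plain state dictionary. It is not the dlt_utils implementation: the class name `PartitionCursorSketch` and its internals are assumptions for illustration, and only the method names `get_last_value` and `track` mirror the usage above.

```python
from typing import Any, Dict


class PartitionCursorSketch:
    """Hypothetical stand-in for PartitionedIncremental (illustration only)."""

    def __init__(self, state: Dict[str, Any], state_key: str, initial_value: int = 0) -> None:
        # Keep all cursors under one key of the mutable state dict, so
        # whatever persists the dict also persists the cursors.
        self._cursors: Dict[str, int] = state.setdefault(state_key, {})
        self._initial_value = initial_value

    def get_last_value(self, partition: str) -> int:
        # Partitions that were never tracked start from the initial value.
        return self._cursors.get(partition, self._initial_value)

    def track(self, partition: str, value: int) -> None:
        # Only move the cursor forward; out-of-order records never rewind it.
        if value > self.get_last_value(partition):
            self._cursors[partition] = value


state: Dict[str, Any] = {}
inc = PartitionCursorSketch(state, state_key="sequences")
inc.track("company_a", 5)
inc.track("company_a", 3)  # ignored: lower than the stored cursor
inc.track("company_b", 7)
print(state)  # {'sequences': {'company_a': 5, 'company_b': 7}}
```

Because each partition key has its own cursor, a slow or failing company does not hold back the others on the next run.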
Date utilities
Generate time periods for partitioned data extraction:
from dlt_utils import generate_year_weeks, generate_year_months
# Generate weeks from 2024 to now + 52 weeks
weeks = generate_year_weeks(start_year=2024)
# [(2024, 1), (2024, 2), ..., (2025, 52)]
# Generate months from October 2024 to February 2025
months = generate_year_months(2024, 10, 2025, 2)
# [(2024, 10), (2024, 11), (2024, 12), (2025, 1), (2025, 2)]
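The period generation can be sketched with the standard library alone. The helpers below are hypothetical (`year_months_sketch`, `year_weeks_sketch`) and are not the dlt_utils functions; in particular, treating weeks as ISO 8601 weeks is an assumption, since the README does not say which week numbering the package uses.

```python
from datetime import date, timedelta
from typing import List, Tuple


def year_months_sketch(start_year: int, start_month: int,
                       end_year: int, end_month: int) -> List[Tuple[int, int]]:
    """Inclusive list of (year, month) tuples (hypothetical helper)."""
    result = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        result.append((year, month))
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return result


def year_weeks_sketch(start: date, end: date) -> List[Tuple[int, int]]:
    """Inclusive list of ISO (year, week) tuples between two dates."""
    result = []
    # Step week by week, starting at the Monday of the week containing `start`.
    current = start - timedelta(days=start.weekday())
    while current <= end:
        iso = current.isocalendar()
        result.append((iso[0], iso[1]))
        current += timedelta(weeks=1)
    return result


print(year_months_sketch(2024, 10, 2025, 2))
# [(2024, 10), (2024, 11), (2024, 12), (2025, 1), (2025, 2)]
print(year_weeks_sketch(date(2024, 12, 23), date(2025, 1, 6)))
# [(2024, 52), (2025, 1), (2025, 2)]
```

Note that with ISO weeks the week starting 30 December 2024 already belongs to 2025, which matters when the tuples are used as partition keys.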
Schema utilities
Ensure tables exist before running pipeline:
from dlt_utils import ensure_all_tables_exist, ensure_tables_for_resources
# Create all tables from schema
ensure_all_tables_exist(pipeline)
# Create only specific resource tables (including child tables)
ensure_tables_for_resources(pipeline, ["trade_items", "organizations"])
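What "including child tables" means can be illustrated without a database. The sketch below works on plain dictionaries and assumes each table description carries an optional `parent` table name and, for root tables, a `resource` name. Those key names and the function `tables_for_resources_sketch` are assumptions for illustration; this is not how dlt_utils is necessarily implemented.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical table metadata: table name -> {"resource": ..., "parent": ...}
TableMeta = Dict[str, Optional[str]]


def tables_for_resources_sketch(tables: Dict[str, TableMeta],
                                resources: Iterable[str]) -> List[str]:
    """Return the root tables of the given resources plus all descendants."""
    wanted = set(resources)
    selected = {name for name, meta in tables.items()
                if meta.get("resource") in wanted}
    # Repeatedly add tables whose parent is already selected, until stable.
    changed = True
    while changed:
        changed = False
        for name, meta in tables.items():
            if name not in selected and meta.get("parent") in selected:
                selected.add(name)
                changed = True
    return sorted(selected)


example_tables: Dict[str, TableMeta] = {
    "trade_items": {"resource": "trade_items", "parent": None},
    "trade_items__prices": {"resource": None, "parent": "trade_items"},
    "trade_items__prices__tiers": {"resource": None, "parent": "trade_items__prices"},
    "organizations": {"resource": "organizations", "parent": None},
    "invoices": {"resource": "invoices", "parent": None},
}
print(tables_for_resources_sketch(example_tables, ["trade_items"]))
# ['trade_items', 'trade_items__prices', 'trade_items__prices__tiers']
```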
Development
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run linter
ruff check dlt_utils/
CI/CD Pipeline
The pipeline runs automatically on:
- Push to main: runs the tests
- Tag with a v* prefix: runs the tests and publishes to PyPI
Pipeline Workflow
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Push/Tag     │────▶│   Test Stage    │────▶│  Publish Stage  │
│     to repo     │     │    (always)     │     │   (tags only)   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │                       │
                                 ▼                       ▼
                          - Install deps          - Build package
                          - Run pytest            - Upload to PyPI
                          - Publish results
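The trigger rules can be summarised as a small decision function. This is only a sketch of the behaviour described in this section, not the actual pipeline definition: the function name `stages_for_ref` and the stage labels are hypothetical, and returning no stages for other refs is an assumption.

```python
from typing import List


def stages_for_ref(ref: str) -> List[str]:
    """Map a full Git ref to the pipeline stages that would run for it."""
    if ref == "refs/heads/main":
        return ["test"]
    if ref.startswith("refs/tags/v"):
        return ["test", "publish"]
    # Assumption: other branches and tags do not trigger the pipeline.
    return []


for ref in ["refs/heads/main", "refs/tags/v0.2.0", "refs/heads/feature-x"]:
    print(ref, "->", stages_for_ref(ref))
# refs/heads/main -> ['test']
# refs/tags/v0.2.0 -> ['test', 'publish']
# refs/heads/feature-x -> []
```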
Releasing a New Version
Option 1: Via the Git CLI
# 1. Make sure all changes are committed
git add .
git commit -m "Release v0.2.0"
# 2. Create a tag
git tag v0.2.0
# 3. Push the commit and the tag to the remote
git push origin main
git push origin v0.2.0
Option 2: Via Azure DevOps
- Go to Repos → Tags
- Click New tag
- Fill in:
  - Name: v0.2.0 (must start with v)
  - Based on: select the commit or branch (e.g. main)
  - Description: optional, e.g. "Added new feature X"
- Click Create
The pipeline is triggered automatically and publishes to PyPI.
Versioning
Use Semantic Versioning: vMAJOR.MINOR.PATCH (e.g. v1.2.3)
- MAJOR: Breaking changes
- MINOR: New features (backwards compatible)
- PATCH: Bug fixes
⚠️ Important: Remember to update the version in pyproject.toml before tagging!
License
MIT
Download files
File details
Details for the file dlt_utils-0.1.0.tar.gz.
File metadata
- Download URL: dlt_utils-0.1.0.tar.gz
- Upload date:
- Size: 11.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f591c2a1d1677d5a841e89b05edfc689d540537e7ebb53a2be7c70bda9a11574 |
| MD5 | 3d7427cf43093aca2e0e7a6d412f3964 |
| BLAKE2b-256 | bb5d572edffa2f025addb65526cadb9a12cccb790af6192c5db5bbe98b25a7e9 |
File details
Details for the file dlt_utils-0.1.0-py3-none-any.whl.
File metadata
- Download URL: dlt_utils-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ed7e25ee34c77dafec0ec7a9b8a70f869c2201c17bf863b58a4653c6d3bff356 |
| MD5 | 50c206d9915f3be76f66b7c574ca1000 |
| BLAKE2b-256 | 78bc18faba499c803449dc42b06c619b8a11b054eebbe6e33542368ec021e664 |