SSB POC Statlog Model

This repo is a proof of concept (POC) of models for logging data from a statistics production run.

Features

  • Contains JSON Schema models for the data to be logged.
  • Contains Pydantic models: Python data validation classes generated from the JSON Schemas.

Requirements

  • Python >= 3.13

Installation

You can install SSB POC Statlog Model via pip from PyPI:

pip install ssb-poc-statlog-model

Usage

Please see the Reference Guide for API details. A quick example using the generated ChangeDataLog model:

from datetime import datetime, timezone
from ssb_poc_statlog_model.change_data_log import ChangeDataLog, DataChangeType

change = ChangeDataLog(
    statistics_name="arblonn",
    data_source=["gs://ssb-prod-superteam-data-produkt/arblonn/inndata/arbeidloenn_p2023-12_v1.parquet"],
    data_target="gs://ssb-prod-superteam-data-produkt/arblonn/klargjorte-data/arbeidloenn_p2023-12_v1.parquet",
    data_period="2023-12",
    change_event="A",
    change_event_reason="OTHER_SOURCE",
    change_datetime=datetime(2024, 1, 10, 15, 0, tzinfo=timezone.utc),
    changed_by="user@example.com",
    data_change_type=DataChangeType.NEW,
    change_comment="Opprettet ny enhet (person) fra ny datakilde ...",
    change_details={
        "detail_type": "unit",
        "unit_id": [
            {"unit_id_variable": "fnr", "unit_id_value": "170598nnnnn"},
            {"unit_id_variable": "orgnr", "unit_id_value": "123456789"}
        ],
        "new_value": [
            {"variable_name": "bostedskommune", "value": "0101"},
            {"variable_name": "type_loenn", "value": "time"},
            {"variable_name": "loenn", "value": "38000"},
            {"variable_name": "overtid_loenn", "value": "3000"}
        ]
    }
)

print(change.model_dump_json())

Tip about timestamps in JSON: use ISO 8601 with timezone information (e.g. …Z for UTC) to satisfy Pydantic’s AwareDatetime requirement used in several models.
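
To illustrate what "timezone-aware" means in the tip above, here is a minimal standard-library sketch that checks whether an ISO 8601 string carries a UTC offset. It is not how Pydantic validates; `is_aware_iso8601` is a hypothetical helper name, and the trailing `Z` is rewritten to `+00:00` because `datetime.fromisoformat` only accepts `Z` from Python 3.11 onwards.

```python
from datetime import datetime


def is_aware_iso8601(text: str) -> bool:
    """Return True if `text` parses as ISO 8601 and includes a UTC offset."""
    # Normalise a trailing "Z" to an explicit offset for older Python versions.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return False
    # A datetime is "aware" when it can report its offset from UTC.
    return parsed.utcoffset() is not None


print(is_aware_iso8601("2024-01-10T15:00:00Z"))       # aware: explicit UTC
print(is_aware_iso8601("2024-01-10T16:00:00+01:00"))  # aware: explicit offset
print(is_aware_iso8601("2024-01-10T15:00:00"))        # naive: no offset
```

A string for which this check returns False is the kind of value an aware-datetime field would be expected to reject.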

Project structure

  • src/model → JSON Schemas for the domain models (source of truth)
    • example_logs/*.json → Example payloads used in tests
  • src/ssb_poc_statlog_model → Generated Pydantic models (Python)
  • src/scripts → Scripts for generating examples from the Pydantic models
  • tests → Pytest suite validating models and examples

Schema versioning

The schema version is defined in the schema_version and $id fields in the root of each schema. It uses semantic versioning (major.minor.patch) with the following rules:

  • Major (breaking change):
    • Field removed or renamed
    • Field type changed
    • Required field added
  • Minor (backward compatible):
    • New optional field added
    • Enum extended
  • Patch (non-structural change):
    • Description changes
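
The rules above can be sketched as a small lookup, shown here only as an illustration and not as tooling that exists in this repository. The change labels and the `next_version` function are hypothetical names; resetting the lower components to zero follows the usual semantic versioning convention.

```python
# Hypothetical labels for the change kinds listed above, mapped to a bump level.
BUMP_FOR_CHANGE = {
    "field_removed": "major",
    "field_renamed": "major",
    "field_type_changed": "major",
    "required_field_added": "major",
    "optional_field_added": "minor",
    "enum_extended": "minor",
    "description_changed": "patch",
}

# Severity order used to pick the most significant change in a release.
_SEVERITY = {"patch": 0, "minor": 1, "major": 2}


def next_version(current: str, changes: list[str]) -> str:
    """Return the next major.minor.patch version for a set of change kinds."""
    if not changes:
        return current
    major, minor, patch = (int(part) for part in current.split("."))
    # The most severe change decides which component is incremented.
    level = max((BUMP_FOR_CHANGE[c] for c in changes), key=_SEVERITY.__getitem__)
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(next_version("1.2.3", ["description_changed"]))             # 1.2.4
print(next_version("1.2.3", ["optional_field_added"]))            # 1.3.0
print(next_version("1.2.3", ["enum_extended", "field_renamed"]))  # 2.0.0
```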

Development

Set up the environment (installs runtime + dev tools):

poetry install

Run tests:

poetry run pytest -v

Code style and quality:

poetry run pre-commit run --all-files

Regenerate the Pydantic models from JSON Schemas

This repository keeps the source of truth for the models as JSON Schema files in src/model. Python classes are generated into src/ssb_poc_statlog_model using datamodel-code-generator via a small helper CLI.

You can run the generator using the console script (defined in pyproject.toml). All examples assume you are in the project root.

# Ensure dev dependencies are available (only needed once)
poetry install

# Generate models for all *-json-schema.json files under src/model
poetry run generate-ssb-models

Useful options:

  • Generate a single schema only:
poetry run generate-ssb-models --schemas src/model/change-data-log-json-schema.json
  • Use explicit directories (defaults shown):
poetry run generate-ssb-models \
  --schemas-dir src/model \
  --out-dir src/ssb_poc_statlog_model
  • Forward extra flags directly to datamodel-code-generator (repeatable):
poetry run generate-ssb-models \
  --extra-arg --collapse-root \
  --extra-arg --use-schema-description

What the helper does under the hood:

  • Discovers *-json-schema.json files in src/model (or uses --schemas if given)
  • Runs datamodel-code-generator targeting Pydantic v2 with options compatible with Python 3.10+ (see src/ssb_poc_statlog_model/generate_python.py for the exact flags)
  • Writes the generated models into src/ssb_poc_statlog_model
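
As a rough illustration of the discovery step described above, the sketch below maps each schema file to a target module path. It is not the helper's actual code: `plan_outputs` is a hypothetical function, and the naming rule (drop the `-json-schema.json` suffix, turn hyphens into underscores) is an assumption inferred from `change-data-log-json-schema.json` and the `ssb_poc_statlog_model.change_data_log` module used earlier.

```python
from pathlib import Path

SCHEMA_SUFFIX = "-json-schema.json"


def plan_outputs(schemas_dir: Path, out_dir: Path) -> dict[Path, Path]:
    """Map each *-json-schema.json file in schemas_dir to a target module path."""
    plan: dict[Path, Path] = {}
    for schema in sorted(schemas_dir.glob("*" + SCHEMA_SUFFIX)):
        # Assumed naming rule: drop the suffix and turn hyphens into underscores.
        module = schema.name[: -len(SCHEMA_SUFFIX)].replace("-", "_") + ".py"
        plan[schema] = out_dir / module
    return plan
```

Calling `plan_outputs(Path("src/model"), Path("src/ssb_poc_statlog_model"))` from the project root would list the schema files the generator is expected to pick up, alongside the module each one would plausibly produce.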

After regenerating, commit the updated Python files to version control.

Contributing

Contributions are very welcome. To learn more, see the Contributor Guide.

License

Distributed under the terms of the MIT license, SSB POC Statlog Model is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was generated from Statistics Norway's SSB PyPI Template.
