
Library for logging and data quality tracking (lakehouse).


logsteplib

Package Description

A package that provides a standard format for Python's logging module, along with data quality (DQ) logging for a lakehouse.

Usage

Stream Console Logs

From a script:

# Initialise a logger for the process and log an informational message
from logsteplib.streamer import StreamLogger

logger = StreamLogger(name="my_process").logger

logger.info(msg="Something to log!")
# 2025-11-02 00:00:01 - my_process           - INFO     - Something to log!
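The sample output line suggests a fixed-width, dash-separated layout. As a minimal sketch of how such a layout can be produced with the standard library alone, the snippet below pads the logger name to 20 characters and the level to 8. The format string and the `build_logger` helper are assumptions inferred from the sample line, not logsteplib's actual implementation.

```python
import io
import logging

# Assumed format, inferred from the sample output line (not taken from logsteplib)
LOG_FORMAT = "%(asctime)s - %(name)-20s - %(levelname)-8s - %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_logger(name, stream=None):
    """Return a logger that writes fixed-width, dash-separated lines."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:  # attach the handler only once per logger name
        handler = logging.StreamHandler(stream)  # stream=None means stderr
        handler.setFormatter(logging.Formatter(fmt=LOG_FORMAT, datefmt=DATE_FORMAT))
        logger.addHandler(handler)
    return logger


# Write to an in-memory buffer so the formatted line can be inspected
buffer = io.StringIO()
demo_logger = build_logger("my_process", stream=buffer)
demo_logger.info("Something to log!")
line = buffer.getvalue().rstrip("\n")
print(line)
```

Padding the name and level keeps the message column aligned when several processes log to the same console.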

Lakehouse DQ Logs

From a SQL script:

-- Create the Delta lakehouse table for tracking SharePoint file uploads
-- Includes metadata such as file details, user info, and processing status
-- Table (example): workspace.default.sharepoint_uploader_monitoring_logs
DROP TABLE IF EXISTS workspace.default.sharepoint_uploader_monitoring_logs;
CREATE TABLE workspace.default.sharepoint_uploader_monitoring_logs (
  target STRING,
  key STRING,
  input_file_name STRING,
  file_name STRING,
  user_name STRING,
  user_email STRING,
  modify_date STRING,     -- Should be TIMESTAMP
  file_size STRING,       -- Should be INT
  file_row_count STRING,  -- Should be INT
  status STRING,
  rejection_reason STRING,
  file_web_url STRING
)
USING DELTA;
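The column comments note that three fields are stored as strings although they represent a timestamp and two integers. A minimal sketch of converting those fields when reading rows back in Python follows. The `coerce_log_row` helper and the `%Y-%m-%d` date format are assumptions for illustration, not part of logsteplib.

```python
from datetime import datetime


def coerce_log_row(row, date_format="%Y-%m-%d"):
    """Return a copy of a monitoring-log row with typed date and count fields."""
    typed = dict(row)  # leave the caller's row untouched
    typed["modify_date"] = datetime.strptime(row["modify_date"], date_format)
    typed["file_size"] = int(row["file_size"])
    typed["file_row_count"] = int(row["file_row_count"])
    return typed


# Hypothetical row, shaped like a subset of the table columns above
sample_row = {
    "modify_date": "2025-11-02",
    "file_size": "204800",
    "file_row_count": "15000",
    "status": "FAIL",
}
typed_row = coerce_log_row(sample_row)
print(typed_row["modify_date"].year, typed_row["file_size"], typed_row["file_row_count"])
```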

From a Python script:

from logsteplib.dq import DQStatusCode

print(DQStatusCode.get_description("SchemaMismatch"))  # DQ FAIL: SCHEMA MISMATCH
print(DQStatusCode.get_description("UnknownCode"))     # UNKNOWN STATUS CODE
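One plausible way to get this lookup-with-fallback behaviour is a plain dictionary with a default value. The sketch below uses the codes and descriptions documented in this README. `STATUS_DESCRIPTIONS` and `describe_status` are hypothetical names, and the real `DQStatusCode` class may be implemented differently.

```python
# Codes and descriptions copied from this README's status code table
STATUS_DESCRIPTIONS = {
    "NA": "NOT APPLICABLE",
    "EmptyFile": "DQ FAIL: EMPTY FILE",
    "SchemaMismatch": "DQ FAIL: SCHEMA MISMATCH",
    "SchemaMismatchAndEmptyFile": "DQ FAIL: SCHEMA MISMATCH AND EMPTY FILE",
    "InvalidNumericFormat": "DQ FAIL: INVALID NUMERIC FORMAT",
    "InvalidDateFormat": "DQ FAIL: INVALID DATE FORMAT",
}
UNKNOWN_DESCRIPTION = "UNKNOWN STATUS CODE"


def describe_status(code):
    """Return the description for a status code, or a fallback if unknown."""
    return STATUS_DESCRIPTIONS.get(code, UNKNOWN_DESCRIPTION)


print(describe_status("SchemaMismatch"))
print(describe_status("UnknownCode"))
```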

Status Code Table

| Code | Description |
|------|-------------|
| NA | NOT APPLICABLE |
| EmptyFile | DQ FAIL: EMPTY FILE |
| SchemaMismatch | DQ FAIL: SCHEMA MISMATCH |
| SchemaMismatchAndEmptyFile | DQ FAIL: SCHEMA MISMATCH AND EMPTY FILE |
| InvalidNumericFormat | DQ FAIL: INVALID NUMERIC FORMAT |
| InvalidDateFormat | DQ FAIL: INVALID DATE FORMAT |

From a Python script:

from logsteplib.dq import DQWriter
from logsteplib.dq import DQMetadata
from logsteplib.dq import DQStatusCode

# Init DQWriter
monitoring_table = "workspace.default.sharepoint_uploader_monitoring_logs"
dq_writer = DQWriter(table_name=monitoring_table)

# Create a DQMetadata instance containing metadata about a processed file
# This metadata can be used for logging, auditing, or writing to a lakehouse table
metadata = DQMetadata(
    target="my_folder/my_system",
    key="customer_20251031",
    input_file_name="raw_customers.csv",
    file_name="clean_customers.csv",
    user_name="Parker, Peter",
    user_email="peter.parker@example.com",
    modify_date="2025-11-02",
    file_size="204800",
    file_row_count="15000",
    status="FAIL",
    rejection_reason=DQStatusCode.get_description("SchemaMismatch"),
    file_web_url="https://lakehouse.company.com/files/clean_customers.parquet"
)

# Write the metadata (DQMetadata) to the lakehouse monitoring table
dq_writer.write_metadata(metadata=metadata)
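
To illustrate how a metadata object can map one-to-one onto the table columns, here is a minimal sketch using a standard-library dataclass. `FileAuditRecord` is a hypothetical stand-in, not the library's `DQMetadata`, and the sketch only builds a row dictionary; it does not write to a lakehouse.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FileAuditRecord:
    """Hypothetical record whose fields mirror the monitoring table columns."""

    target: str
    key: str
    input_file_name: str
    file_name: str
    user_name: str
    user_email: str
    modify_date: str
    file_size: str
    file_row_count: str
    status: str
    rejection_reason: str
    file_web_url: str


record = FileAuditRecord(
    target="my_folder/my_system",
    key="customer_20251031",
    input_file_name="raw_customers.csv",
    file_name="clean_customers.csv",
    user_name="Parker, Peter",
    user_email="peter.parker@example.com",
    modify_date="2025-11-02",
    file_size="204800",
    file_row_count="15000",
    status="FAIL",
    rejection_reason="DQ FAIL: SCHEMA MISMATCH",
    file_web_url="https://lakehouse.company.com/files/clean_customers.parquet",
)

# asdict keeps the field declaration order, which matches the column order
row = asdict(record)
print(list(row))
```

Keeping every field a string matches the all-STRING table definition, so a row built this way needs no type conversion before it is written.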

Email Notifications

# Init email notifications (EmailNotifier must be imported first; its module path is not given here)
notifier = EmailNotifier(
    smtp_server="smtp.example.com",
    smtp_port=587,
    username="user",
    password="pass",
    sender_email="sender@example.com"
)

# Send email notification
notifier.send_email(
    recipient_email="recipient@example.com",
    subject="Test Email",
    message_body="This is a test."
)
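
For readers who want to see what such a notification involves underneath, the following is a minimal sketch using only the standard library. `build_message` and `send_message` are hypothetical helpers, and the use of STARTTLS is an assumption based on port 587. This is not how `EmailNotifier` is known to be implemented.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender_email, recipient_email, subject, message_body):
    """Build a plain-text email message."""
    message = EmailMessage()
    message["From"] = sender_email
    message["To"] = recipient_email
    message["Subject"] = subject
    message.set_content(message_body)
    return message


def send_message(message, smtp_server, smtp_port, username, password):
    """Send a message over SMTP, upgrading the connection with STARTTLS."""
    with smtplib.SMTP(smtp_server, smtp_port, timeout=30) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(message)


# Only the message is built here; send_message needs a reachable SMTP server
message = build_message(
    sender_email="sender@example.com",
    recipient_email="recipient@example.com",
    subject="Test Email",
    message_body="This is a test.",
)
print(message["Subject"])
```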

Installation

Install Python and pip if you have not already done so.

Then run:

pip install pip --upgrade

For production:

pip install logsteplib

This will install the package and all of its Python dependencies.

If you want to install the project for development:

git clone https://github.com/aghuttun/logsteplib.git
cd logsteplib
pip install -e ".[dev]"
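
After either install, one way to confirm that the package is visible to the current interpreter is to query its installed version. This is a small sketch using the standard library's `importlib.metadata`; the `installed_version` helper is a hypothetical name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the version string if logsteplib is installed, otherwise None
print(installed_version("logsteplib"))
```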

Docstring

The package's docstrings follow the numpydoc style.
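
For readers unfamiliar with numpydoc, a docstring in that style looks roughly like the sketch below. The `bytes_to_kilobytes` function is a made-up example and is not part of logsteplib.

```python
def bytes_to_kilobytes(size_bytes):
    """
    Convert a file size from bytes to kilobytes.

    Parameters
    ----------
    size_bytes : int
        File size in bytes.

    Returns
    -------
    float
        File size in kilobytes, with one kilobyte taken as 1024 bytes.

    Examples
    --------
    >>> bytes_to_kilobytes(204800)
    200.0
    """
    return size_bytes / 1024


print(bytes_to_kilobytes(204800))
```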

License

BSD License (see license file)

