General utility library for Python applications running in Keboola Connection environment
Python Utility library
Introduction
The library provides a useful set of utility functions frequently used when creating Python components for Keboola Connection. The utility library should be used in cooperation with the main Python Component library.
The Python Utility library is developed by the Keboola Data Services team and is officially supported by Keboola. The library aims to ease component creation by removing the need to rewrite frequently used functions for every component.
Links
- API Documentation: API Docs
- Source code: https://github.com/keboola/python-utils
- PyPI project: https://pypi.org/project/keboola.utils
- Documentation: https://developers.keboola.com/extend/component/python-component-library
Quick start
Installation
The package can be installed via pip using:
pip install keboola.utils
Structure and functionality
The package currently contains three core modules:
- keboola.utils.date: a set of methods for date manipulation.
- keboola.utils.helpers: general helper functions and classes that are relevant in the Keboola Connection environment.
- keboola.utils.header_normalizer: different strategies to convert column names to a valid KBC format.
Helpers
The module contains general helper functions and classes that are relevant in the Keboola Connection environment.
Date Utilities
The module contains all date-related functions, which can be used to work effectively with dates when creating components for Keboola Connection.
Initialization
All utility functions can be imported from the keboola.utils module:
from keboola.utils import *
To import only the functions of a particular module, import that module directly:
import keboola.utils.date
Getting converted date period from string
The function parse_datetime_interval() parses two date strings into a pair of Python datetime objects or, if the strformat parameter is specified, into a pair of strings in that format.
The positional arguments period_from and period_to can be specified in relative format (e.g. 3 days ago, 2 months ago) or in absolute format (e.g. 2020-01-01). For the full list of supported formats, please refer to the dateparser documentation.
from keboola.utils import *
dt_str_1 = '5 days ago'
dt_str_2 = 'today'
dt_format = '%Y-%m-%d'
start_date, end_date = parse_datetime_interval(dt_str_1, dt_str_2, dt_format)
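To make the relative versus absolute distinction concrete, here is a minimal standard-library sketch of the idea. It is not the library's implementation: the names parse_simple and parse_interval_sketch and the now parameter are hypothetical, and the sketch understands only "today", "N days ago", "N weeks ago" and YYYY-MM-DD, which is a small subset of what dateparser accepts.

```python
import re
from datetime import datetime, timedelta

# Hypothetical pattern for this sketch only: "<N> day(s) ago" or "<N> week(s) ago".
_RELATIVE = re.compile(r"^(\d+)\s+(day|week)s?\s+ago$")


def parse_simple(value, now):
    """Parse 'today', 'N days ago', 'N weeks ago' or an ISO date (YYYY-MM-DD)."""
    text = value.strip().lower()
    if text == "today":
        return now
    match = _RELATIVE.match(text)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        # timedelta accepts 'days' and 'weeks' as keyword arguments.
        return now - timedelta(**{unit + "s": amount})
    # Anything else is treated as an absolute date.
    return datetime.strptime(text, "%Y-%m-%d")


def parse_interval_sketch(period_from, period_to, strformat=None, now=None):
    """Return (start, end) as datetimes, or as strings when strformat is given."""
    now = now or datetime.now()
    start = parse_simple(period_from, now)
    end = parse_simple(period_to, now)
    if strformat:
        return start.strftime(strformat), end.strftime(strformat)
    return start, end


# A fixed reference time keeps the example reproducible.
fixed_now = datetime(2021, 1, 10, 12, 0)
result = parse_interval_sketch("5 days ago", "today", "%Y-%m-%d", now=fixed_now)
print(result)
```

Passing the reference time explicitly is a choice made for this sketch so that the relative expressions resolve to the same dates on every run.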
Generating date period chunks
The function split_dates_to_chunks() splits a time interval into chunks of a specified size.
import keboola.utils.date as dutils
from datetime import date
dt_1 = date(2021, 1, 1)
dt_2 = date(2021, 1, 10)
dt_format = '%Y-%m-%d'
intervals = dutils.split_dates_to_chunks(dt_1, dt_2, intv=2, strformat=dt_format)
for intv in intervals:
    print(intv['start_date'], intv['end_date'])
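For readers who want to see what chunking a date range involves, the following is a rough standard-library sketch. It is an illustration only: chunk_dates_sketch is a hypothetical name, the chunk size is assumed to be in days, and the assumption that consecutive chunks share a boundary date may not match what split_dates_to_chunks() actually returns.

```python
from datetime import date, timedelta


def chunk_dates_sketch(start, end, days, strformat="%Y-%m-%d"):
    """Split the range start..end into consecutive chunks of at most `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    chunks = []
    cursor = start
    while cursor < end:
        # The last chunk is shortened so it never runs past the end date.
        chunk_end = min(cursor + timedelta(days=days), end)
        chunks.append({
            "start_date": cursor.strftime(strformat),
            "end_date": chunk_end.strftime(strformat),
        })
        # Assumption of this sketch: the next chunk starts where this one ended.
        cursor = chunk_end
    return chunks


chunks = chunk_dates_sketch(date(2021, 1, 1), date(2021, 1, 10), days=2)
for chunk in chunks:
    print(chunk["start_date"], chunk["end_date"])
```

Chunking like this is typically useful when an API limits how large a date range a single request may cover.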
Usage Example
import keboola.utils.date as dutils
dt_str_1 = '5 days ago'
dt_str_2 = 'today'
dt_format = '%Y-%m-%d'
start_date, end_date = dutils.parse_datetime_interval(dt_str_1, dt_str_2)
intervals = dutils.split_dates_to_chunks(start_date, end_date, intv=2, strformat=dt_format)
for intv in intervals:
    print(intv['start_date'], intv['end_date'])
Header normalizer
This module provides different strategies to normalize CSV column names to a format supported by the Keboola Connection Storage:
Only alphanumeric characters and underscores are allowed in a column name, and a name must not begin with an underscore.
Example:
import keboola.utils.header_normalizer as hnorm
head_norm = hnorm.get_normalizer(strategy=hnorm.NormalizerStrategy.ENCODER, char_encoder="unicode")
header = ["dactor#fd", "a*ruas$", "48DHBb#@"]
norm_headers = head_norm.normalize_header(header)
# Results in: ['dactor_35_fd', 'a_42_ruas_36_', '48DHBb_35__64_']
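The result above suggests that each disallowed character is replaced by its Unicode code point wrapped in underscores. The sketch below reproduces that behaviour with the standard library only. The function name encode_header_sketch is hypothetical, and stripping leading underscores is an assumption drawn from the Storage naming rule rather than confirmed library behaviour.

```python
import re

# Any character outside letters, digits and underscore is treated as invalid.
_INVALID = re.compile(r"[^A-Za-z0-9_]")


def encode_header_sketch(columns):
    """Replace each invalid character with _<code point>_ and drop leading underscores."""
    normalized = []
    for name in columns:
        encoded = _INVALID.sub(lambda m: "_{}_".format(ord(m.group())), name)
        # Assumption: a leading underscore is removed to satisfy the naming rule.
        normalized.append(encoded.lstrip("_"))
    return normalized


print(encode_header_sketch(["dactor#fd", "a*ruas$", "48DHBb#@"]))
```

Encoding the code point instead of dropping the character keeps distinct source names distinct, for example "a#b" and "a$b" do not collapse into the same column name.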