
General utility library for Python applications running in Keboola Connection environment

Reason this release was yanked:

The camel_tools dependency pulls in Torch, which is very large.


Python Utility library

Introduction


The library provides a useful set of utility functions frequently used when creating Python components for Keboola Connection. The utility library should be used in cooperation with the main Python Component library.

The Python Utility library is developed by the Keboola Data Services team and is officially supported by Keboola. It aims to ease component creation by removing the need to rewrite frequently used functions.


Quick start

Installation

The package can be installed via pip using:

pip install keboola.utils

Structure and functionality

The package currently contains three modules:

  • keboola.utils.date - a set of methods for date manipulation.
  • keboola.utils.helpers - general helper functions and classes that are relevant in the Keboola Connection environment.
  • keboola.utils.header_normalizer - different strategies to convert column names to a valid KBC format.

Helpers

The module contains general helper functions and classes that are relevant in the Keboola Connection environment.

Date Utilities

The module contains date-related functions for working effectively with dates when creating components for Keboola Connection.

Initialization

All utility functions can be imported from the keboola.utils module.

from keboola.utils import *

or

import keboola.utils.date

to import only functions from a certain module.

Getting converted date period from string

The function parse_datetime_interval() parses a pair of date strings into Python datetime objects or, if the strformat parameter is specified, into datetime strings in that format.

The positional arguments period_from and period_to can be specified in a relative format (e.g. 3 days ago, 2 months ago) or in an absolute format (e.g. 2020-01-01). For the full list of supported formats, please refer to the dateparser documentation.

from keboola.utils import *

dt_str_1 = '5 days ago'
dt_str_2 = 'today'
dt_format = '%Y-%m-%d'

start_date, end_date = parse_datetime_interval(dt_str_1, dt_str_2, dt_format)
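
To make the two return modes concrete, the standard-library sketch below shows what a relative interval such as '5 days ago' to 'today' could resolve to, with and without a format string. The helper resolve_interval and its days_back parameter are hypothetical and are not part of keboola.utils; the sketch does no string parsing and only illustrates the shape of the result.

```python
from datetime import datetime, timedelta


def resolve_interval(now, days_back, strformat=None):
    """Hypothetical helper: return (start, end) for 'days_back days ago' .. 'now'.

    Returns datetime objects, or formatted strings when strformat is given.
    """
    start = now - timedelta(days=days_back)
    end = now
    if strformat is None:
        return start, end
    return start.strftime(strformat), end.strftime(strformat)


# A fixed reference time keeps the example deterministic.
reference = datetime(2021, 1, 10, 12, 0, 0)

as_datetimes = resolve_interval(reference, 5)
as_strings = resolve_interval(reference, 5, '%Y-%m-%d')
print(as_strings)  # ('2021-01-05', '2021-01-10')
```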

Generating date period chunks

The function split_dates_to_chunks() splits a time interval into chunks of a specified size.

import keboola.utils.date as dutils
from datetime import date

dt_1 = date(2021, 1, 1)
dt_2 = date(2021, 1, 10)
dt_format = '%Y-%m-%d'

intervals = dutils.split_dates_to_chunks(dt_1, dt_2, intv=2, strformat=dt_format)

for intv in intervals:
    print(intv['start_date'], intv['end_date'])
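
As a rough illustration of the chunking idea, the sketch below splits an interval using only the standard library. The function chunk_dates is hypothetical and is not the library's implementation. It assumes the chunk size is counted in days and that each chunk starts where the previous one ended; the boundaries that split_dates_to_chunks() actually produces may differ.

```python
from datetime import date, timedelta


def chunk_dates(start, end, days, strformat='%Y-%m-%d'):
    """Hypothetical helper: yield consecutive chunks of at most `days` days."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    step = timedelta(days=days)
    cursor = start
    while cursor < end:
        # The last chunk is shortened so it never runs past `end`.
        chunk_end = min(cursor + step, end)
        yield {
            'start_date': cursor.strftime(strformat),
            'end_date': chunk_end.strftime(strformat),
        }
        cursor = chunk_end


chunks = list(chunk_dates(date(2021, 1, 1), date(2021, 1, 10), days=2))
for chunk in chunks:
    print(chunk['start_date'], chunk['end_date'])
```

With these inputs the sketch yields five chunks, the last one covering only a single day (2021-01-09 to 2021-01-10).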

Usage Example

import keboola.utils.date as dutils

dt_str_1 = '5 days ago'
dt_str_2 = 'today'
dt_format = '%Y-%m-%d'

start_date, end_date = dutils.parse_datetime_interval(dt_str_1, dt_str_2)

intervals = dutils.split_dates_to_chunks(start_date, end_date, intv=2, strformat=dt_format)

for intv in intervals:
    print(intv['start_date'], intv['end_date'])

Header normalizer

This module provides different strategies to normalize CSV column names to a format supported by the Keboola Connection Storage:

Only alphanumeric characters and underscores are allowed in a column name. A column name must not begin with an underscore.
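
The rule above can be expressed as a small check. This is a minimal sketch, assuming "alphanumeric" means ASCII letters and digits; the function is_valid_column_name is hypothetical and is not part of keboola.utils.

```python
import re

# Letters, digits and underscores only; the first character must not be an underscore.
VALID_COLUMN = re.compile(r'[A-Za-z0-9][A-Za-z0-9_]*')


def is_valid_column_name(name):
    """Hypothetical check for the column-name rule described above."""
    return VALID_COLUMN.fullmatch(name) is not None


print(is_valid_column_name('order_id'))  # True
print(is_valid_column_name('_id'))       # False
print(is_valid_column_name('a*ruas$'))   # False
```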

Example:

import keboola.utils.header_normalizer as hnorm

head_norm = hnorm.get_normalizer(strategy=hnorm.NormalizerStrategy.ENCODER, char_encoder="unicode")
header = ["dactor#fd", "a*ruas$", "48DHBb#@"]
norm_headers = head_norm.normalize_header(header)

# Results in: ['dactor_35_fd', 'a_42_ruas_36_', '48DHBb_35__64_']
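
The output above suggests that each disallowed character is replaced by its Unicode code point wrapped in underscores (for example, '#' is code point 35). The sketch below reproduces that behaviour with the standard library only. It is inferred from the example output, not taken from the library's source, and the function encode_header is hypothetical.

```python
import re


def encode_header(columns):
    """Replace each disallowed character with its code point wrapped in underscores."""
    return [
        re.sub(r'[^A-Za-z0-9_]', lambda m: f'_{ord(m.group())}_', column)
        for column in columns
    ]


encoded = encode_header(["dactor#fd", "a*ruas$", "48DHBb#@"])
print(encoded)  # ['dactor_35_fd', 'a_42_ruas_36_', '48DHBb_35__64_']
```

The sketch does not handle a leading underscore, which the actual normalizer would also need to deal with.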

Transliteration example:

import keboola.utils.header_normalizer as hnorm

head_norm = hnorm.get_normalizer(strategy=hnorm.NormalizerStrategy.TRANSLITERATE, transliterator_mapper="ar2safebw")
header = ["اسم","قيمة"]
norm_headers = head_norm.normalize_header(header)

# Results in: ['Asm', 'qymp']

Available transliterator mappers:

Mapper Description
ar2bw Arabic to Buckwalter
ar2safebw Arabic to Safe Buckwalter
ar2xmlbw Arabic to XML Buckwalter
ar2hsb Arabic to Habash-Soudi-Buckwalter
bw2ar Buckwalter to Arabic
bw2safebw Buckwalter to Safe Buckwalter
bw2xmlbw Buckwalter to XML Buckwalter
bw2hsb Buckwalter to Habash-Soudi-Buckwalter
safebw2ar Safe Buckwalter to Arabic
safebw2bw Safe Buckwalter to Buckwalter
safebw2xmlbw Safe Buckwalter to XML Buckwalter
safebw2hsb Safe Buckwalter to Habash-Soudi-Buckwalter
xmlbw2ar XML Buckwalter to Arabic
xmlbw2bw XML Buckwalter to Buckwalter
xmlbw2safebw XML Buckwalter to Safe Buckwalter
xmlbw2hsb XML Buckwalter to Habash-Soudi-Buckwalter
hsb2ar Habash-Soudi-Buckwalter to Arabic
hsb2bw Habash-Soudi-Buckwalter to Buckwalter
hsb2safebw Habash-Soudi-Buckwalter to Safe Buckwalter
hsb2xmlbw Habash-Soudi-Buckwalter to XML Buckwalter


License

MIT licensed, see LICENSE file.

