Formatter

Lightweight formatter objects. The fmtutil package parses and formats string values that match a format pattern built on Python regular expressions.

:dart: The first objective of this project is to include the formatter objects that any data-component package needs, so that we can parse complicated names on a data source and ingest the right names into an in-house system or data target.

:round_pushpin: Installation

pip install -U fmtutil[all]

Supported Python versions:

| Python Version | Installation | Support Fixed Bug |
|---|---|---|
| == 3.8 | pip install "fmtutil>=0.4,<0.5.0" | :x: |
| >=3.9,<3.14 | pip install -U fmtutil | :heavy_check_mark: |

[!NOTE] This package has one dependency, python-dateutil, which is used only to support adding and subtracting datetime values on the Datetime formatter. If you do not want to install it, you can use pip install -U fmtutil instead.

:beers: Introduction

For example, suppose we want to get filenames with a format like filename_20220101.csv from file system storage, and we want to incrementally ingest the latest file, dated 2022-03-25. We can use the Datetime object and parse that filename with it:

import datetime

from fmtutil import Datetime

assert (
    Datetime.parse('filename_20220101.csv', 'filename_%Y%m%d.csv').value
    == datetime.datetime(2022, 1, 1, 0)
)

The above example is :yawning_face: NOT SURPRISING to you, right? Python already provides the built-in datetime module, which parses with datetime.strptime and formats with {dt}.strftime :banana:.
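For comparison, here is a minimal sketch of the same round trip using only the standard library. It does not use fmtutil at all; the variable names are chosen for illustration.

```python
from datetime import datetime

filename = "filename_20220101.csv"
fmt = "filename_%Y%m%d.csv"

# Parse: literal characters in the format must match the input exactly.
parsed = datetime.strptime(filename, fmt)

# Format: render the same value back into a filename.
rendered = parsed.strftime(fmt)

print(parsed)    # 2022-01-01 00:00:00
print(rendered)  # filename_20220101.csv
```

This works well while the filename carries a single datetime and nothing else that varies.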

This package becomes special when we group more than one formattable object together, such as Naming, Version, and Datetime. Consider a complex filename format like this :triumph::

{filename:%s}_{datetime:%Y_%m_%d}.{version:%m.%n.%c}.csv

[!WARNING] Disclaimer: For the above filename format, Python's built-in datetime module is not enough :snake:, but you can handle it with your own function or create a better package than this project :dash:.
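To give a feel for what "handle it with your own function" could involve, here is a hand-written sketch using only re and datetime. It is not how fmtutil works internally. It assumes that %s means a snake_case name and that %m.%n.%c means a major.minor.micro version; ParsedName, PATTERN, and parse_filename are hypothetical names introduced for this illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple


class ParsedName(NamedTuple):
    name: str
    date: datetime
    version: tuple[int, int, int]


# Assumed layout: snake_case name, YYYY_MM_DD date, major.minor.micro version.
PATTERN = re.compile(
    r"(?P<name>[a-z0-9]+(?:_[a-z0-9]+)*)"
    r"_(?P<date>\d{4}_\d{2}_\d{2})"
    r"\.(?P<major>\d+)\.(?P<minor>\d+)\.(?P<micro>\d+)"
    r"\.csv"
)


def parse_filename(filename: str) -> ParsedName:
    """Split a filename into its name, date, and version parts."""
    match = PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not match the expected format")
    return ParsedName(
        name=match["name"],
        date=datetime.strptime(match["date"], "%Y_%m_%d"),
        version=(int(match["major"]), int(match["minor"]), int(match["micro"])),
    )


parsed = parse_filename("sales_report_2022_01_01.1.2.3.csv")
print(parsed.name)     # sales_report
print(parsed.version)  # (1, 2, 3)
```

Every new filename layout needs another pattern and another conversion step like this one, which is the repetition that grouped formatter objects aim to remove.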

[!NOTE] Every formatter object implements the self.valid method, which helps us validate a formatted string value as in the example scenario above:

this_date = Datetime.parse('20220101', '%Y%m%d')
assert this_date.valid('any_files_20220101.csv', 'any_files_%Y%m%d.csv')

:tada: Usage

If you have filenames in multiple formats in the data source directory, and you want to dynamically get the max datetime from these filenames in your app, you can use a formatter group.

from fmtutil import (
  make_group, Naming, Datetime, FormatterGroup, FormatterGroupType, FormatterArgumentError,
)

name: Naming = Naming.parse('Google Map', fmt='%t')

fmt_group: FormatterGroupType = make_group({
    "naming": name.to_const(),
    "timestamp": Datetime,
})

rs: list[FormatterGroup] = []
for file in (
    'googleMap_20230101.json',
    'googleMap_20230103.json',
    'googleMap_20230103_bk.json',
    'googleMap_with_usage_20230105.json',
    'googleDrive_with_usage_20230105.json',
):
    try:
        rs.append(
            fmt_group.parse(file, fmt=r'{naming:c}_{timestamp:%Y%m%d}\.json')
        )
    except FormatterArgumentError:
        continue

repr(max(rs).groups['timestamp'])
>>> <Datetime.parse('2023-01-03 00:00:00.000000', '%Y-%m-%d %H:%M:%S.%f')>

[!TIP] The above example converts name, a Naming instance, to a Constant instance before passing it to the formatter group, because the group should not parse this format dynamically when it finds matching filenames at the destination path.
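The idea of the formatter group example (a fixed name, a dynamic date, skip what does not match, take the max) can be sketched with the standard library alone. This is only a conceptual illustration, not fmtutil's implementation; FILES, PATTERN, and the other names are chosen for this sketch.

```python
import re
from datetime import datetime

FILES = (
    "googleMap_20230101.json",
    "googleMap_20230103.json",
    "googleMap_20230103_bk.json",
    "googleMap_with_usage_20230105.json",
    "googleDrive_with_usage_20230105.json",
)

# The constant name is escaped and fixed; only the date part is dynamic.
PATTERN = re.compile(re.escape("googleMap") + r"_(?P<timestamp>\d{8})\.json")

timestamps = []
for file in FILES:
    match = PATTERN.fullmatch(file)
    if match is None:
        continue  # skip names that do not fit the format
    timestamps.append(datetime.strptime(match["timestamp"], "%Y%m%d"))

latest = max(timestamps)
print(latest)  # 2023-01-03 00:00:00
```

Only the first two filenames match the full pattern, so the backup file and the with_usage files do not affect the result.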

:dart: Next Step

I will change formatter object construction from overriding methods inside the class to an asset design. The code is already implemented and is in the testing stage in the file __assets.py.

That means you can create any formatter object with a dynamic asset-changing strategy.

class Datetime(Formatter, asset=DATETIME_ASSET, config=DATETIME_CONF, level=10):
    """Datetime Formatter object."""
    ...
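Class keyword arguments such as asset=, config=, and level= are typically received through Python's __init_subclass__ hook. The following is a minimal sketch of that mechanism under that assumption. BaseFormatter, DateFormatter, DATE_ASSET, and DATE_CONF are hypothetical names, and the contents of the dictionaries are invented for illustration; none of this is taken from __assets.py.

```python
from typing import Any, ClassVar, Optional


class BaseFormatter:
    """Hypothetical base class; not fmtutil's actual Formatter."""

    asset: ClassVar[dict] = {}
    config: ClassVar[dict] = {}
    level: ClassVar[int] = 1

    def __init_subclass__(
        cls,
        asset: Optional[dict] = None,
        config: Optional[dict] = None,
        level: int = 1,
        **kwargs: Any,
    ) -> None:
        super().__init_subclass__(**kwargs)
        # Copy the mappings so each subclass owns its own data.
        cls.asset = dict(asset or {})
        cls.config = dict(config or {})
        cls.level = level


# Hypothetical asset: maps each format directive to a regular expression.
DATE_ASSET = {"%Y": r"\d{4}", "%m": r"\d{2}", "%d": r"\d{2}"}
DATE_CONF = {"default_fmt": "%Y%m%d"}


class DateFormatter(BaseFormatter, asset=DATE_ASSET, config=DATE_CONF, level=10):
    """Formatter whose behavior comes entirely from the class keywords."""


print(DateFormatter.level)          # 10
print(sorted(DateFormatter.asset))  # ['%Y', '%d', '%m']
```

With this pattern, a new formatter is declared by supplying different data rather than by overriding methods.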

:speech_balloon: Contribute

I do not think this project will go around the world because it has a specific purpose, and for a long-term solution you can write your own code without depending on this project. So, for now, you can open a GitHub issue on this project :raised_hands: to report a bug or request a new feature.
