The hieroskopia package is a library for inferring properties such as date formats or numeric separators in pandas series of type object or string.

Hieroskopia

Support

Date-times:

  • Supports date and datetime formats.
  • The library takes a series as input and tries to return a dictionary with the formats found in the series, expressed in 1989 C standard (the default), Snowflake, or Java SimpleDateFormat format codes.
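
The relationship between the three format-code families can be illustrated with a small translation table. This is only a sketch and not the library's implementation: the `translate_format` helper and both mapping tables are hypothetical and cover just a handful of common directives.

```python
import re

# Hypothetical mapping tables (not part of hieroskopia); only a few
# common 1989 C directives are covered.
C_TO_JAVA = {"%Y": "yyyy", "%m": "MM", "%d": "dd",
             "%H": "HH", "%M": "mm", "%S": "ss"}
C_TO_SNOWFLAKE = {"%Y": "yyyy", "%m": "mm", "%d": "dd",
                  "%H": "hh24", "%M": "mi", "%S": "ss"}


def translate_format(c_format, target):
    """Translate a C-style format string into 'java' or 'snowflake' codes.

    Separators such as '-' or '/' are copied unchanged. A directive that
    is missing from the table raises KeyError.
    """
    table = {"java": C_TO_JAVA, "snowflake": C_TO_SNOWFLAKE}[target]
    return re.sub(r"%[A-Za-z]", lambda m: table[m.group(0)], c_format)


print(translate_format("%Y-%m-%d", "java"))       # yyyy-MM-dd
print(translate_format("%Y/%m/%d", "snowflake"))  # yyyy/mm/dd
```

Note that the month code differs only by case between the two targets: Java uses `MM` for months and `mm` for minutes, while Snowflake format codes are case-insensitive and use `mi` for minutes.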

Numeric:

  • The library takes a series as input and tries to return a dictionary with the three-digit (thousands) separator and the decimal separator characters.

Usage

Infer datetime

>>> import pandas as pd
>>> from hieroskopia import InferDatetime
>>> InferDatetime.infer(pd.Series(["2019-11-27",
...                                "2019/11/28",
...                                "2018-11-08"]))
{'formats': ['%Y-%m-%d', '%Y/%m/%d'], 'type': 'datetime'}

Using the return_format parameter

>>> import pandas as pd
>>> from hieroskopia import InferDatetime
>>> InferDatetime.infer(pd.Series(["2019-11-27",
...                                "2019/11/28",
...                                "2018-11-08"]), return_format='snowflake')
{'formats': ['yyyy-mm-dd', 'yyyy/mm/dd'], 'type': 'datetime'}
>>> InferDatetime.infer(pd.Series(["2019-11-27",
...                                "2019/11/28",
...                                "2018-11-08"]), return_format='java')
{'formats': ['yyyy-MM-dd', 'yyyy/MM/dd'], 'type': 'datetime'}

The method above uses a best-guess approach: it detects the formats present in an object-type series and tries to return the datetime.strftime/strptime, Snowflake date, or Java SimpleDateFormat format strings that parse the majority of the samples.
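
The best-guess idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not hieroskopia's actual algorithm (the to-do list below mentions regular expressions, whereas this sketch simply tries `datetime.strptime`). The candidate list and the `matching_format` and `guess_formats` names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical candidate list; a real implementation would know many more.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%Y/%m/%d", "%d-%m-%Y", "%d/%m/%Y"]


def matching_format(value):
    """Return the first candidate format that parses value, or None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return fmt
        except ValueError:
            continue
    return None


def guess_formats(values):
    """Return the formats seen in values, most frequent first."""
    matches = (matching_format(v) for v in values)
    counts = Counter(fmt for fmt in matches if fmt is not None)
    return [fmt for fmt, _ in counts.most_common()]


print(guess_formats(["2019-11-27", "2019/11/28", "2018-11-08"]))
# ['%Y-%m-%d', '%Y/%m/%d']
```

Ordering the result by frequency puts the format that covers most samples first, which mirrors the "majority of the samples" behaviour described above.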

Infer numeric

>>> import pandas as pd
>>> from hieroskopia import InferNumeric
>>> InferNumeric.infer(pd.Series(['767313628196.2', '76731362819.546', '767313628196']))
{'three_digit_separator': '', 'decimal_separator': '.', 'type': 'float'}

The method above tries to detect and return properties of an object-type series, such as the data type and the three_digit_separator and decimal_separator characters, that cover the majority of the samples.
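
One way to picture separator detection is a per-value heuristic followed by a majority vote. The sketch below is an assumption-laden illustration, not the library's implementation: `guess_separators` and `infer_numeric_sketch` are hypothetical names, only '.' and ',' are considered, and a single lone separator is assumed to be a decimal mark.

```python
from collections import Counter


def guess_separators(value):
    """Return (three_digit_separator, decimal_separator) for one string."""
    seps = [ch for ch in value if ch in ".,"]
    if not seps:
        return ("", "")
    last = seps[-1]
    others = set(seps) - {last}
    if others:
        # Two different separators: the right-most one is the decimal mark.
        return (others.pop(), last)
    if len(seps) > 1:
        # The same separator repeated can only be grouping digits.
        return (last, "")
    # A single separator is ambiguous; this sketch assumes a decimal mark.
    return ("", last)


def infer_numeric_sketch(values):
    """Majority vote over the per-value guesses; None for empty input."""
    votes = Counter(guess_separators(v) for v in values)
    if not votes:
        return None
    (three_digit, decimal), _ = votes.most_common(1)[0]
    return {
        "three_digit_separator": three_digit,
        "decimal_separator": decimal,
        "type": "float" if decimal else "int",
    }


print(infer_numeric_sketch(["767313628196.2", "76731362819.546", "767313628196"]))
# {'three_digit_separator': '', 'decimal_separator': '.', 'type': 'float'}
```

The single-separator case is the weak point of any such heuristic: a value like "1,234" could be either one thousand two hundred thirty-four or a decimal number, so the vote across the whole series matters more than any one sample.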

To do:

  • Add more regular expressions.
  • Add time formats.
  • Develop multiple algorithms to improve precision.
