Hieroskopia

The hieroskopia package is a library for inferring properties such as date formats or numeric separators in pandas Series of object or string dtype.

Support

Date-times:

  • Supports date and datetime formats.
  • The library takes a series as input and tries to return a dictionary with the formats found in the series, expressed in either the 1989 C standard (default) or the Snowflake date-time format codes.
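To illustrate how the two format notations relate, here is a minimal sketch that rewrites a C-style format string into Snowflake-style elements. The `C_TO_SNOWFLAKE` table and the `to_snowflake` helper are hypothetical names invented for this example and are not part of hieroskopia; the mapping covers only a handful of directives.

```python
# Hypothetical mapping from 1989 C directives to Snowflake format elements.
# Only a few common directives are covered; this is not the library's own table.
C_TO_SNOWFLAKE = {
    "%Y": "yyyy",
    "%m": "mm",
    "%d": "dd",
    "%H": "hh24",
    "%M": "mi",
    "%S": "ss",
}


def to_snowflake(fmt):
    """Replace each known C directive in fmt with its Snowflake element."""
    for directive, element in C_TO_SNOWFLAKE.items():
        fmt = fmt.replace(directive, element)
    return fmt


print(to_snowflake("%Y-%m-%d"))
print(to_snowflake("%Y/%m/%d %H:%M:%S"))
```

Directives missing from the table are left untouched, so an unsupported format is easy to spot in the result.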

Numeric:

  • The library takes a series as input and tries to return a dictionary with the three-digit (thousands) separator and the decimal separator characters.

Usage

Infer datetime

>>> import pandas as pd
>>> from hieroskopia import InferDatetime
>>> InferDatetime.infer(pd.Series(["2019-11-27",
...                                "2019/11/28",
...                                "2018-11-08"]))
{'formats': ['%Y-%m-%d', '%Y/%m/%d'], 'type': 'datetime'}

Using the return_format parameter

>>> import pandas as pd
>>> from hieroskopia import InferDatetime
>>> InferDatetime.infer(pd.Series(["2019-11-27",
...                                "2019/11/28",
...                                "2018-11-08"]), return_format='snowflake')
{'formats': ['yyyy-mm-dd', 'yyyy/mm/dd'], 'type': 'datetime'}

The method above uses a best-guess approach: it detects the formats present in an object-type series and tries to return datetime.strftime/strptime or Snowflake formats that parse the majority of the samples.
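The best-guess idea can be sketched with the standard library alone. This is not hieroskopia's implementation (the To do list suggests the library relies on regular expressions); it is an illustrative stand-in that tries a fixed list of candidate formats with `datetime.strptime` and ranks them by how many samples they parse. `CANDIDATE_FORMATS` and `guess_formats` are hypothetical names.

```python
from collections import Counter
from datetime import datetime

# Hypothetical candidate list; the library's own patterns may differ.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%Y/%m/%d", "%d-%m-%Y", "%d/%m/%Y"]


def guess_formats(samples, candidates=CANDIDATE_FORMATS):
    """Return the candidate formats that parse some sample, most frequent first."""
    hits = Counter()
    for value in samples:
        for fmt in candidates:
            try:
                datetime.strptime(value, fmt)
            except ValueError:
                continue  # this candidate does not fit, try the next one
            hits[fmt] += 1
            break  # count each sample once, under the first format that parses it
    return [fmt for fmt, _ in hits.most_common()]


print(guess_formats(["2019-11-27", "2019/11/28", "2018-11-08"]))
```

Samples that match none of the candidates are simply ignored, which mirrors the "majority of the samples" behaviour described above.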

Infer numeric

>>> import pandas as pd
>>> from hieroskopia import InferNumeric
>>> InferNumeric.infer(pd.Series(['767313628196.2', '76731362819.546', '767313628196']))
{'three_digit_separator': '', 'decimal_separator': '.', 'type': 'float'}

The method above tries to detect and return properties of an object-type series, such as the data type and the three_digit_separator and decimal_separator characters, that cover the majority of the samples.
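A majority-vote heuristic for the separators might look like the sketch below. It is an assumption-laden illustration, not the library's algorithm: `classify` and `guess_separators` are hypothetical helpers, only `.` and `,` are considered, a separator that appears once is assumed to be the decimal mark, and malformed numbers are not validated.

```python
from collections import Counter


def classify(value):
    """Return a (thousands, decimal) guess for one string, or None if it has no separator."""
    seps = [ch for ch in value if ch in ".,"]
    if not seps:
        return None  # plain digits carry no evidence either way
    if len(set(seps)) == 2:
        return seps[0], seps[-1]  # both kinds present: the last one is the decimal mark
    if len(seps) > 1:
        return seps[0], ""  # one kind repeated: it can only be a grouping separator
    return "", seps[0]  # single occurrence: assume it is the decimal mark


def guess_separators(samples):
    """Pick the separator pair that most samples agree on."""
    votes = Counter(guess for guess in map(classify, samples) if guess is not None)
    thousands, decimal = votes.most_common(1)[0][0] if votes else ("", "")
    return {
        "three_digit_separator": thousands,
        "decimal_separator": decimal,
        "type": "float" if decimal else "int",
    }


print(guess_separators(["767313628196.2", "76731362819.546", "767313628196"]))
print(guess_separators(["1.234.567,89", "12.345,60"]))
```

The returned dictionary reuses the key names from the example output above so the two are easy to compare.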

To do:

  • Add more regular expressions
  • Add time formats
  • Develop multiple algorithms to improve precision
