Hieroskopia
The hieroskopia package is a library for inferring properties such as date formats or numeric separators in pandas series of type object or string.
Support
Date-times:
- Supports date and datetime formats
- The library receives a series as input and tries to return a dictionary with the formats found in the series, expressed in the 1989 C standard date-time format codes
Numeric:
- The library receives a series as input and tries to return a dictionary with the three-digit (thousands) separator and the decimal separator characters
Usage
Infer datetime
>>> import pandas as pd
>>> from hieroskopia import InferDatetime
>>> InferDatetime.infer(pd.Series(["2019-11-27",
...                                "2019/11/28",
...                                "2018-11-08"]))
{'formats': ['%Y-%m-%d', '%Y/%m/%d'], 'type': 'datetime'}
The above method uses a best-guess approach to detect a format in an object-type series and tries to return a datetime.strftime/strptime format that will cover or parse the majority of the samples.
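As a rough illustration of how the returned formats might be consumed, the sketch below tries each inferred format in turn with the standard library's datetime.strptime. The result dictionary is copied from the example output above; the helper name parse_with_formats is hypothetical and not part of hieroskopia.

```python
from datetime import datetime

# Result dictionary copied from the example output above.
inferred = {'formats': ['%Y-%m-%d', '%Y/%m/%d'], 'type': 'datetime'}
samples = ["2019-11-27", "2019/11/28", "2018-11-08"]


def parse_with_formats(value, formats):
    """Hypothetical helper: return the first successful parse, or None."""
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            # This format does not match; try the next one.
            continue
    return None


parsed = [parse_with_formats(s, inferred['formats']) for s in samples]
```

Values that match none of the inferred formats come back as None, which fits the "majority of the samples" behaviour described above: the formats are not guaranteed to cover every value.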
Infer numeric
>>> import pandas as pd
>>> from hieroskopia import InferNumeric
>>> InferNumeric.infer(pd.Series(['767313628196.2', '76731362819.546', '767313628196']))
{'three_digit_separator': '', 'decimal_separator': '.', 'type': 'float'}
The above method tries to detect and return certain properties of an object-type series, such as the data type, three_digit_separator, or decimal_separator character, that will cover the majority of the samples.
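One possible way to use the inferred separators is to normalise each string before converting it to a number. The sketch below assumes the dictionary keys shown in the example output above; the helper to_number is hypothetical and not part of hieroskopia.

```python
# Result dictionary copied from the example output above.
inferred = {'three_digit_separator': '', 'decimal_separator': '.', 'type': 'float'}


def to_number(text, three_digit_separator, decimal_separator):
    """Hypothetical helper: strip the grouping separator, then normalise the decimal mark."""
    if three_digit_separator:
        text = text.replace(three_digit_separator, '')
    if decimal_separator and decimal_separator != '.':
        text = text.replace(decimal_separator, '.')
    return float(text)


values = [
    to_number(s, inferred['three_digit_separator'], inferred['decimal_separator'])
    for s in ['767313628196.2', '76731362819.546', '767313628196']
]
```

The grouping separator is removed before the decimal mark is replaced, so a series written as 1.234.567,89 (separators '.' and ',') would also convert correctly.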
To do:
- Support additional output format standards, such as Java date formats and Snowflake date format elements.
- Add more regular expressions
- Add support for time formats
- Develop additional algorithms to improve precision.