
A Python library for extracting properties from data.

Project description

Summary

A Python library for extracting properties from data.


Usage

Extract properties from data

e.g. Extract a float value property

>>> from dataproperty import DataProperty
>>> DataProperty(-1.1)
data=-1.1, type=REAL_NUMBER, align=right, ascii_width=4, int_digits=1, decimal_places=1, extra_len=1

e.g. Extract an int value property

>>> from dataproperty import DataProperty
>>> DataProperty(123456789)
data=123456789, type=INTEGER, align=right, ascii_width=9, int_digits=9, decimal_places=0, extra_len=0
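The numeric fields in the two outputs above can be related to the decimal representation of each value. The following is a minimal sketch using only the standard library `decimal` module. `describe_number` is a hypothetical helper written for illustration, not part of DataProperty, and the library's own algorithm may differ (the "at least one integer digit" rule in particular is an assumption). It handles finite numbers only.

```python
from decimal import Decimal


def describe_number(value):
    """Hypothetical helper: approximate the numeric fields shown above."""
    # Convert through str() so that -1.1 becomes Decimal("-1.1")
    # rather than the long binary expansion of the float.
    sign, digits, exponent = Decimal(str(value)).as_tuple()
    return {
        # Width of the plain string form, including any minus sign.
        "ascii_width": len(str(value)),
        # Assumption: report at least one integer digit.
        "int_digits": max(1, len(digits) + exponent),
        "decimal_places": max(0, -exponent),
        # The minus sign accounts for one extra character.
        "extra_len": 1 if sign else 0,
    }


print(describe_number(-1.1))
print(describe_number(123456789))
```

For -1.1 and 123456789 this reproduces the ascii_width, int_digits, decimal_places and extra_len values listed above.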

e.g. Extract a str (ascii) value property

>>> from dataproperty import DataProperty
>>> DataProperty("sample string")
data=sample string, type=STRING, align=left, length=13, ascii_width=13, extra_len=0

e.g. Extract a str (multi-byte) value property

>>> import six
>>> from dataproperty import DataProperty
>>> six.text_type(DataProperty("吾輩は猫である"))
data=吾輩は猫である, type=STRING, align=left, length=7, ascii_width=14, extra_len=0
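In the output above, length counts characters while ascii_width counts display columns, so the seven full-width characters occupy 14 columns. A rough sketch of that distinction, using the standard library `unicodedata` module, is shown here. `display_width` is a hypothetical helper, and treating only the "W" and "F" East Asian width classes as two columns is an assumption; DataProperty may compute the width differently.

```python
import unicodedata


def display_width(text):
    """Hypothetical helper: count wide/fullwidth characters as two columns."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


text = u"吾輩は猫である"
print(len(text), display_width(text))  # 7 14
print(display_width("sample string"))  # 13
```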

e.g. Extract a time (datetime) value property

>>> import datetime
>>> from dataproperty import DataProperty
>>> DataProperty(datetime.datetime(2017, 1, 1, 0, 0, 0))
data=2017-01-01 00:00:00, type=DATETIME, align=left, ascii_width=19, extra_len=0

e.g. Extract a bool value property

>>> from dataproperty import DataProperty
>>> DataProperty(True)
data=True, type=BOOL, align=left, ascii_width=4, extra_len=0

Extract data property for each element from a matrix

The DataPropertyExtractor.to_dp_matrix method takes a data matrix and returns a matrix of DataProperty instances, one for each element. An example data set and the result are as follows:

Sample Code:
import datetime
from dataproperty import DataPropertyExtractor

dp_extractor = DataPropertyExtractor()
dt = datetime.datetime(2017, 1, 1, 0, 0, 0)
inf = float("inf")
nan = float("nan")

dp_matrix = dp_extractor.to_dp_matrix([
    [1, 1.1, "aa", 1, 1, True, inf, nan, dt],
    [2, 2.2, "bbb", 2.2, 2.2, False, "inf", "nan", dt],
    [3, 3.33, "cccc", -3, "ccc", "true", inf, "NAN", "2017-01-01T01:23:45+0900"],
])

for row, dp_list in enumerate(dp_matrix):
    for col, dp in enumerate(dp_list):
        print("row={:d}, col={:d}, {}".format(row, col, str(dp)))
Output:
row=0, col=0, data=1, type=INTEGER, align=right, ascii_width=1, int_digits=1, decimal_places=0, extra_len=0
row=0, col=1, data=1.1, type=REAL_NUMBER, align=right, ascii_width=3, int_digits=1, decimal_places=1, extra_len=0
row=0, col=2, data=aa, type=STRING, align=left, ascii_width=2, length=2, extra_len=0
row=0, col=3, data=1, type=INTEGER, align=right, ascii_width=1, int_digits=1, decimal_places=0, extra_len=0
row=0, col=4, data=1, type=INTEGER, align=right, ascii_width=1, int_digits=1, decimal_places=0, extra_len=0
row=0, col=5, data=True, type=BOOL, align=left, ascii_width=4, extra_len=0
row=0, col=6, data=Infinity, type=INFINITY, align=left, ascii_width=8, extra_len=0
row=0, col=7, data=NaN, type=NAN, align=left, ascii_width=3, extra_len=0
row=0, col=8, data=2017-01-01 00:00:00, type=DATETIME, align=left, ascii_width=19, extra_len=0
row=1, col=0, data=2, type=INTEGER, align=right, ascii_width=1, int_digits=1, decimal_places=0, extra_len=0
row=1, col=1, data=2.2, type=REAL_NUMBER, align=right, ascii_width=3, int_digits=1, decimal_places=1, extra_len=0
row=1, col=2, data=bbb, type=STRING, align=left, ascii_width=3, length=3, extra_len=0
row=1, col=3, data=2.2, type=REAL_NUMBER, align=right, ascii_width=3, int_digits=1, decimal_places=1, extra_len=0
row=1, col=4, data=2.2, type=REAL_NUMBER, align=right, ascii_width=3, int_digits=1, decimal_places=1, extra_len=0
row=1, col=5, data=False, type=BOOL, align=left, ascii_width=5, extra_len=0
row=1, col=6, data=Infinity, type=INFINITY, align=left, ascii_width=8, extra_len=0
row=1, col=7, data=NaN, type=NAN, align=left, ascii_width=3, extra_len=0
row=1, col=8, data=2017-01-01 00:00:00, type=DATETIME, align=left, ascii_width=19, extra_len=0
row=2, col=0, data=3, type=INTEGER, align=right, ascii_width=1, int_digits=1, decimal_places=0, extra_len=0
row=2, col=1, data=3.33, type=REAL_NUMBER, align=right, ascii_width=4, int_digits=1, decimal_places=2, extra_len=0
row=2, col=2, data=cccc, type=STRING, align=left, ascii_width=4, length=4, extra_len=0
row=2, col=3, data=-3, type=INTEGER, align=right, ascii_width=2, int_digits=1, decimal_places=0, extra_len=1
row=2, col=4, data=ccc, type=STRING, align=left, ascii_width=3, length=3, extra_len=0
row=2, col=5, data=True, type=BOOL, align=left, ascii_width=4, extra_len=0
row=2, col=6, data=Infinity, type=INFINITY, align=left, ascii_width=8, extra_len=0
row=2, col=7, data=NaN, type=NAN, align=left, ascii_width=3, extra_len=0
row=2, col=8, data=2017-01-01T01:23:45+0900, type=STRING, align=left, ascii_width=24, length=24, extra_len=0

Full example source code can be found at examples/py/to_dp_matrix.py

Extract property for each column from a matrix

The DataPropertyExtractor.to_column_dp_list method returns a list of DataProperty instances from a data matrix, one instance per column, each describing the properties of that column. An example data set and the result are as follows:


Sample Code:
import datetime
from dataproperty import DataPropertyExtractor

dp_extractor = DataPropertyExtractor()
dt = datetime.datetime(2017, 1, 1, 0, 0, 0)
inf = float("inf")
nan = float("nan")

data_matrix = [
    [1, 1.1,  "aa",   1,   1,     True,   inf,   nan,   dt],
    [2, 2.2,  "bbb",  2.2, 2.2,   False,  "inf", "nan", dt],
    [3, 3.33, "cccc", -3,  "ccc", "true", inf,   "NAN", "2017-01-01T01:23:45+0900"],
]

dp_extractor.headers = ["int", "float", "str", "num", "mix", "bool", "inf", "nan", "time"]
col_dp_list = dp_extractor.to_column_dp_list(dp_extractor.to_dp_matrix(data_matrix))

for col_idx, col_dp in enumerate(col_dp_list):
    print(str(col_dp))
Output:
column=0, type=INTEGER, align=right, ascii_width=3, bit_len=2, int_digits=1, decimal_places=0
column=1, type=REAL_NUMBER, align=right, ascii_width=5, int_digits=1, decimal_places=(min=1, max=2)
column=2, type=STRING, align=left, ascii_width=4
column=3, type=REAL_NUMBER, align=right, ascii_width=4, int_digits=1, decimal_places=(min=0, max=1), extra_len=(min=0, max=1)
column=4, type=STRING, align=left, ascii_width=3, int_digits=1, decimal_places=(min=0, max=1)
column=5, type=BOOL, align=left, ascii_width=5
column=6, type=INFINITY, align=left, ascii_width=8
column=7, type=NAN, align=left, ascii_width=3
column=8, type=STRING, align=left, ascii_width=24

Full example source code can be found at examples/py/to_column_dp_list.py
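Several of the ascii_width values in the column output above are consistent with taking the widest of the header and the cells in that column. The sketch below illustrates that reading only. `simple_column_width` is a hypothetical helper, not a DataProperty function. It ignores type conversion and multi-byte width, so it does not reproduce every column (column 3, for example, is reported as 4).

```python
def simple_column_width(header, cells):
    """Hypothetical helper: widest of the header and the stringified cells."""
    return max(len(header), max(len(str(cell)) for cell in cells))


# Header widens the column: "int" is wider than any single digit.
print(simple_column_width("int", [1, 2, 3]))  # 3
# Widest cell wins over the header.
print(simple_column_width("str", ["aa", "bbb", "cccc"]))  # 4
print(simple_column_width("bool", [True, False, "true"]))  # 5
```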

Installation

Install from PyPI

pip install DataProperty

Install from PPA (for Ubuntu)

sudo add-apt-repository ppa:thombashi/ppa
sudo apt update
sudo apt install python3-dataproperty

Dependencies

Python 2.7+ or 3.5+

Optional dependencies

  • logbook
    • Logs through logbook if the package is installed

Test dependencies


Download files

Download the file for your platform.

Source Distribution

DataProperty-0.43.1.tar.gz (30.7 kB view details)

Uploaded Source

Built Distribution


DataProperty-0.43.1-py2.py3-none-any.whl (24.6 kB view details)

Uploaded Python 2, Python 3

File details

Details for the file DataProperty-0.43.1.tar.gz.

File metadata

  • Download URL: DataProperty-0.43.1.tar.gz
  • Upload date:
  • Size: 30.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.3

File hashes

Hashes for DataProperty-0.43.1.tar.gz
Algorithm Hash digest
SHA256 ac2bc221a53fcd2afa140d1feb45cadbe62b335f667dba1e08e3c43a93339103
MD5 8280a85691959d988e58f52c0ed3d311
BLAKE2b-256 9568fb3498145a106f700ff304158b6d0f0c76eb3001ee5129a4f16c9846b338
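A downloaded file can be checked against the SHA256 digest listed above. This is a minimal sketch using the standard library `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the path passed in is assumed to be wherever the archive was saved locally.

```python
import hashlib

# SHA256 digest published above for DataProperty-0.43.1.tar.gz.
EXPECTED_SHA256 = "ac2bc221a53fcd2afa140d1feb45cadbe62b335f667dba1e08e3c43a93339103"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Hypothetical helper: True if the file's digest equals `expected`."""
    return sha256_of_file(path) == expected
```

Calling `matches_published_hash("DataProperty-0.43.1.tar.gz")` on the downloaded archive should return True only when the file is byte-for-byte identical to the published one.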


File details

Details for the file DataProperty-0.43.1-py2.py3-none-any.whl.

File metadata

  • Download URL: DataProperty-0.43.1-py2.py3-none-any.whl
  • Upload date:
  • Size: 24.6 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.3

File hashes

Hashes for DataProperty-0.43.1-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 4835c2c9fdcf9ea7e721dce5c32cc7e8b502476b83b80a055aa246c0e8d163fd
MD5 c8f9e0cf9d9aa7766cddcffce48f8295
BLAKE2b-256 55693d304c499d701b5349ad829c20e8640d256312e710b2fb4eee61356c628b

