Data type system for different data structures (arrays, lists of dictionaries, etc.).

DataTypeSystem

This Python package provides a type system for data structures that are coercible to full arrays. It is a Python translation of the Raku package "Data::TypeSystem", [AAp1].


Installation

Install from GitHub

pip install -e git+https://github.com/antononcube/Python-packages.git#egg=DataTypeSystem-antononcube\&subdirectory=DataTypeSystem

Install from PyPI

pip install DataTypeSystem
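
After either installation route, a quick check from Python can confirm that the distribution is visible to the interpreter. The snippet below is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of DataTypeSystem, and the distribution name is assumed to match the one used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    Hypothetical helper for illustration; not part of DataTypeSystem.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumption: the distribution is registered under the name used with pip.
found = installed_version("DataTypeSystem")
print(found if found is not None else "DataTypeSystem is not installed")
```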

Usage examples

The type system conventions follow those of Mathematica's Dataset -- see the presentation "Dataset improvements".

Here we get the Titanic dataset, keep four of its columns, rename the "pclass" column to "class", and show the dataset's dimensions:

import pandas

dfTitanic = pandas.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/titanic.csv')
dfTitanic = dfTitanic[["sex", "age", "pclass", "survived"]]
dfTitanic = dfTitanic.rename(columns={"pclass": "class"})
dfTitanic.shape
(891, 4)

Here is a sample of the dataset's records:

from DataTypeSystem import *

dfTitanic.sample(3)
      sex   age  class  survived
555  male  62.0      1         0
278  male   7.0      3         0
266  male  16.0      3         0

Here is the type of a single record:

deduce_type(dfTitanic.iloc[12].to_dict())
Struct([age, class, sex, survived], [float, int, str, int])
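
To make the shape of that result easier to read, the sketch below pairs the sorted keys of a record with the type names of their values, using plain Python only. It is an illustration of the idea, not the package's implementation; the helper describe_record and the sample record are hypothetical.

```python
def describe_record(record):
    """Return (sorted keys, type names of the corresponding values).

    Hypothetical helper; it only mimics the layout of the Struct
    description shown above and is not part of DataTypeSystem.
    """
    keys = sorted(record)
    type_names = [type(record[k]).__name__ for k in keys]
    return keys, type_names


# Made-up record with the same fields as a Titanic row.
record = {"sex": "male", "age": 22.0, "class": 3, "survived": 0}

keys, type_names = describe_record(record)
print(f"Struct([{', '.join(keys)}], [{', '.join(type_names)}])")
# → Struct([age, class, sex, survived], [float, int, str, int])
```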

Here is the type of a single record's values:

deduce_type(dfTitanic.iloc[12].to_dict().values())
Tuple([Atom(<class 'str'>), Atom(<class 'float'>), Atom(<class 'int'>), Atom(<class 'int'>)])
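
Note that the Struct output above lists the keys alphabetically, while the Tuple follows the order of the record's values. The plain-Python sketch below shows where that order comes from: a dictionary's values() view preserves insertion order. The sample record is made up for illustration.

```python
# Made-up record whose keys are inserted in the DataFrame's column order.
record = {"sex": "male", "age": 22.0, "class": 3, "survived": 0}

# dict.values() yields values in insertion order, not in sorted-key order.
value_types = [type(v).__name__ for v in record.values()]
print(value_types)
# → ['str', 'float', 'int', 'int']

# Sorting the keys first gives the order seen in the Struct description.
sorted_value_types = [type(record[k]).__name__ for k in sorted(record)]
print(sorted_value_types)
# → ['float', 'int', 'str', 'int']
```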

Here is the type of the whole dataset:

deduce_type(dfTitanic.to_dict())
Assoc(Atom(<class 'str'>), Assoc(Atom(<class 'int'>), Atom(<class 'str'>), 891), 4)
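
By default, pandas' DataFrame.to_dict() returns a dictionary that maps each column name to a dictionary of row index to value, which is why the deduced type is an association nested inside an association. The sketch below builds a tiny stand-in for that structure by hand and counts its keys; reading the 4 and 891 in the output above as the number of columns and rows is an interpretation, and the data here is made up.

```python
# Hand-written stand-in for dfTitanic.to_dict():
# {column name -> {row index -> value}}, with three made-up rows.
columns = {
    "sex": {0: "male", 1: "female", 2: "female"},
    "age": {0: 22.0, 1: 38.0, 2: 26.0},
    "class": {0: 3, 1: 1, 2: 3},
    "survived": {0: 0, 1: 1, 2: 1},
}

# Outer keys are column names (str); inner keys are row indices (int).
outer_key_types = {type(name).__name__ for name in columns}
inner_key_types = {type(i).__name__ for col in columns.values() for i in col}

n_columns = len(columns)
n_rows = len(next(iter(columns.values())))

print(outer_key_types, inner_key_types, n_columns, n_rows)
```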

Here is the type of the "values only" records, that is, the row records with their row-index keys dropped:

valArr = dfTitanic.transpose().to_dict().values()
deduce_type(valArr)
Vector(Struct([age, class, sex, survived], [float, int, str, int]), 891)
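
One way to read this result is that all 891 records share the same keys and value types, so a single Struct describes every element. The sketch below checks that property for a small list of records in plain Python. It is an assumption about what makes the records homogeneous, not the package's algorithm; record_signature and the sample data are hypothetical.

```python
def record_signature(record):
    """Return a hashable (key, type name) signature with keys sorted.

    Hypothetical helper for illustration; not part of DataTypeSystem.
    """
    return tuple((k, type(record[k]).__name__) for k in sorted(record))


# Made-up records shaped like dfTitanic.transpose().to_dict().values().
records = [
    {"sex": "male", "age": 22.0, "class": 3, "survived": 0},
    {"sex": "female", "age": 38.0, "class": 1, "survived": 1},
    {"sex": "female", "age": 26.0, "class": 3, "survived": 1},
]

# The records are homogeneous when they all share one signature.
signatures = {record_signature(r) for r in records}
is_homogeneous = len(signatures) == 1

print(is_homogeneous, len(records))
# → True 3
```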

References

[AAp1] Anton Antonov, Data::TypeSystem Raku package, (2023), GitHub/antononcube.

