Data type system for different data structures (arrays, lists of dictionaries, etc.).

DataTypeSystem

This Python package provides a type system for different data structures that are coercible to full arrays. It is a Python translation of the Raku package "Data::TypeSystem" [AAp1].


Installation

Install from GitHub

pip install -e git+https://github.com/antononcube/Python-packages.git#egg=DataTypeSystem-antononcube\&subdirectory=DataTypeSystem

Install from PyPI

pip install DataTypeSystem

Usage examples

The type system conventions follow those of Mathematica's Dataset -- see the presentation "Dataset improvements".

Here we get the Titanic dataset, keep the columns "sex", "age", "pclass", and "survived", rename "pclass" to "class", and show the dataset's dimensions:

import pandas

dfTitanic = pandas.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/titanic.csv')
dfTitanic = dfTitanic[["sex", "age", "pclass", "survived"]]
dfTitanic = dfTitanic.rename(columns={"pclass": "class"})
dfTitanic.shape
(891, 4)

Here is a sample of the dataset's records:

from DataTypeSystem import *

dfTitanic.sample(3)
      sex   age  class  survived
555  male  62.0      1         0
278  male   7.0      3         0
266  male  16.0      3         0

Here is the type of a single record:

deduce_type(dfTitanic.iloc[12].to_dict())
Struct([age, class, sex, survived], [float, int, str, int])
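
To make the notation concrete, here is a minimal standard-library sketch of how a Struct description like the one above could be derived from a dictionary. It is not the package's implementation: `sketch_struct_type` is a hypothetical helper, the record is made-up data, and sorting the keys is an assumption chosen to match the alphabetical key order in the output shown.

```python
def sketch_struct_type(record: dict) -> str:
    """Describe a dict as Struct([keys], [value type names]).

    Hypothetical helper for illustration; not part of DataTypeSystem.
    """
    keys = sorted(record)  # assumption: keys are listed alphabetically
    type_names = [type(record[k]).__name__ for k in keys]
    return f"Struct([{', '.join(keys)}], [{', '.join(type_names)}])"


# A made-up record with the same columns as the Titanic example
record = {"sex": "male", "age": 39.0, "class": 3, "survived": 0}
print(sketch_struct_type(record))
# Struct([age, class, sex, survived], [float, int, str, int])
```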

Here is the type of a single record's values:

deduce_type(dfTitanic.iloc[12].to_dict().values())
Tuple([Atom(<class 'str'>), Atom(<class 'float'>), Atom(<class 'int'>), Atom(<class 'int'>)])
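
Without keys, the values form a heterogeneous sequence, so each element is described on its own. The sketch below mimics that output format with the standard library only. `sketch_tuple_type` is a hypothetical helper, not the package's API, and the record is made-up data. Note that the values keep the dictionary's insertion order, which is consistent with the str, float, int, int order shown above.

```python
def sketch_tuple_type(values) -> str:
    """Describe a sequence of values as Tuple([Atom(<type>), ...]).

    Hypothetical helper for illustration; not part of DataTypeSystem.
    """
    atoms = [f"Atom({type(v)!r})" for v in values]
    return f"Tuple([{', '.join(atoms)}])"


# A made-up record; values() preserves insertion order
record = {"sex": "male", "age": 39.0, "class": 3, "survived": 0}
print(sketch_tuple_type(record.values()))
# Tuple([Atom(<class 'str'>), Atom(<class 'float'>), Atom(<class 'int'>), Atom(<class 'int'>)])
```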

Here is the type of the whole dataset:

deduce_type(dfTitanic.to_dict())
Assoc(Atom(<class 'str'>), Assoc(Atom(<class 'int'>), Atom(<class 'str'>), 891), 4)
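
The nesting in this result follows the shape of the dictionary that pandas returns: with its default orientation, `DataFrame.to_dict()` gives `{column: {row_index: value}}`. Reading the trailing numbers of each Assoc as sizes (an interpretation of the output above, not a statement from the package documentation), 4 is the number of columns and 891 the number of rows. The sketch below shows the same shape with three made-up rows and plain dictionaries.

```python
# Column-oriented shape, as produced by pandas' DataFrame.to_dict()
# with its default orientation: {column: {row_index: value}}.
# The three rows are made-up illustration data.
columns = {
    "sex": {0: "male", 1: "female", 2: "female"},
    "age": {0: 22.0, 1: 38.0, 2: 26.0},
    "class": {0: 3, 1: 1, 2: 3},
    "survived": {0: 0, 1: 1, 2: 1},
}

n_columns = len(columns)                    # size of the outer dictionary
n_rows = len(next(iter(columns.values())))  # size of one inner dictionary
print(n_columns, n_rows)
# 4 3
```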

Here is the type of "values only" records:

valArr = dfTitanic.transpose().to_dict().values()
deduce_type(valArr)
Vector(Struct([age, class, sex, survived], [float, int, str, int]), 891)
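
When every record has the same keys and value types, the collection can be summarized as one element type plus a length, which is what the Vector result expresses. The following standard-library sketch illustrates that idea. `describe_record` and `sketch_vector_type` are hypothetical helpers, the records are made-up data, and raising an error for mixed records is a choice of this sketch, not a description of how the package behaves.

```python
def describe_record(record: dict) -> str:
    """Describe one dict as Struct([keys], [value type names]) (hypothetical helper)."""
    keys = sorted(record)  # assumption: keys are listed alphabetically
    type_names = [type(record[k]).__name__ for k in keys]
    return f"Struct([{', '.join(keys)}], [{', '.join(type_names)}])"


def sketch_vector_type(records) -> str:
    """Describe same-typed records as Vector(<record type>, <count>) (hypothetical helper)."""
    records = list(records)
    kinds = {describe_record(r) for r in records}
    if len(kinds) != 1:
        # Sketch-only behavior: refuse empty or mixed-type collections
        raise ValueError("records do not share a single type")
    return f"Vector({kinds.pop()}, {len(records)})"


# Two made-up records with the same columns as the Titanic example
records = [
    {"sex": "male", "age": 22.0, "class": 3, "survived": 0},
    {"sex": "female", "age": 38.0, "class": 1, "survived": 1},
]
print(sketch_vector_type(records))
# Vector(Struct([age, class, sex, survived], [float, int, str, int]), 2)
```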

References

[AAp1] Anton Antonov, Data::TypeSystem Raku package, (2023), GitHub/antononcube.
