Data type system for different data structures (arrays, lists of dictionaries, etc.).
DataTypeSystem
This Python package provides a type system for different data structures that are coercible to full arrays. It is a Python translation of the code of the Raku package "Data::TypeSystem", [AAp1].
Installation
Install from GitHub
pip install -e git+https://github.com/antononcube/Python-packages.git#egg=DataTypeSystem-antononcube\&subdirectory=DataTypeSystem
Install from PyPI
pip install DataTypeSystem
Usage examples
The type system conventions follow those of Mathematica's Dataset; see the presentation "Dataset improvements".
Here we get the Titanic dataset, keep four of its columns, rename the "pclass" column to "class", and show the dataset's dimensions:
import pandas
dfTitanic = pandas.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/titanic.csv')
dfTitanic = dfTitanic[["sex", "age", "pclass", "survived"]]
dfTitanic = dfTitanic.rename(columns ={"pclass": "class"})
dfTitanic.shape
(891, 4)
Here is a sample of the dataset's records:
from DataTypeSystem import *
dfTitanic.sample(3)
| | sex | age | class | survived |
|---|---|---|---|---|
| 555 | male | 62.0 | 1 | 0 |
| 278 | male | 7.0 | 3 | 0 |
| 266 | male | 16.0 | 3 | 0 |
Here is the type of a single record:
deduce_type(dfTitanic.iloc[12].to_dict())
Struct([age, class, sex, survived], [float, int, str, int])
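The result above can be read as a record signature: the keys, listed alphabetically, paired with the type of each value. The following is a minimal sketch of that idea in plain Python. The function `sketch_struct_type` and the sample record are hypothetical, made up for illustration; this is not how the package implements `deduce_type`.

```python
def sketch_struct_type(record):
    """Return a Struct-like signature string for a flat dictionary.

    Hypothetical helper for illustration only; not part of DataTypeSystem.
    """
    keys = sorted(record)  # keys are listed alphabetically, as in the output above
    types = [type(record[k]).__name__ for k in keys]
    return f"Struct([{', '.join(keys)}], [{', '.join(types)}])"


# A made-up record with the same columns as the Titanic example
record = {"sex": "male", "age": 39.0, "class": 3, "survived": 0}

print(sketch_struct_type(record))
# Struct([age, class, sex, survived], [float, int, str, int])
```

The sketch handles only flat dictionaries with plain Python values; nested structures are outside its scope.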
Here is the type of a single record's values:
deduce_type(dfTitanic.iloc[12].to_dict().values())
Tuple([Atom(<class 'str'>), Atom(<class 'float'>), Atom(<class 'int'>), Atom(<class 'int'>)])
Here is the type of the whole dataset:
deduce_type(dfTitanic.to_dict())
Assoc(Atom(<class 'str'>), Assoc(Atom(<class 'int'>), Atom(<class 'str'>), 891), 4)
Here is the type of "values only" records:
valArr = dfTitanic.transpose().to_dict().values()
deduce_type(valArr)
Vector(Struct([age, class, sex, survived], [float, int, str, int]), 891)
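The last two results differ because the data is laid out differently. By default, `DataFrame.to_dict()` returns a column-oriented dictionary (column name to row index to value), which is why the whole dataset is reported as an association with 4 keys whose values are associations with 891 keys. Transposing first gives one dictionary per row, hence a vector of 891 records. The sketch below reproduces the two layouts with plain dictionaries; the two-row data is made up for illustration.

```python
# Column-oriented layout, the shape DataFrame.to_dict() returns by default:
# {column name -> {row index -> value}}
by_column = {
    "sex": {0: "male", 1: "female"},
    "age": {0: 22.0, 1: 38.0},
    "class": {0: 3, 1: 1},
    "survived": {0: 0, 1: 1},
}

# Transposing gives the row-oriented layout:
# {row index -> {column name -> value}}
by_row = {}
for column, cells in by_column.items():
    for index, value in cells.items():
        by_row.setdefault(index, {})[column] = value

# Dropping the row indices leaves a plain list of records
records = list(by_row.values())

print(len(by_column), len(records))  # number of columns, number of rows
print(records[0])
```

With the real dataset the same counts would be 4 columns and 891 rows, matching the sizes shown in the deduced types.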
References
[AAp1] Anton Antonov, Data::TypeSystem Raku package, (2023), GitHub/antononcube.