Skip to main content

Science as data transformation

Project description

aiuna scientific data for the classroom

WARNING: This project is still subject to major changes, e.g., in the next rewrite.

Bradypus variegatus - By Stefan Laube - (Dreizehenfaultier (Bradypus infuscatus), Gatunsee, Republik Panama), Public Domain

Installation

Examples

Creating data from ARFF file

d = file("iris.arff").data

print(d.Xd)
# ['sepallength', 'sepalwidth', 'petallength', 'petalwidth']
print(d.X[:5])
# [[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]
print(d.y[:5])
# ['Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa' 'Iris-setosa']
print(d.y_pd.value_counts())
# class          
Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
dtype: int64

Acessing a data field as a pandas DataFrame

d = dataset.data  # 'iris' is the default dataset
df = d.X_pd
print(df.head())
#    sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2
mycol = d.X_pd["petal length (cm)"]
print(mycol[:5])
# 0    1.4
1    1.4
2    1.3
3    1.5
4    1.4
Name: petal length (cm), dtype: float64

Creating data from numpy arrays

import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
y = np.array([0, 1, 1])
d = new(X=X, y=y)
print(d)
# {
    "uuid": "06NLDM4mLEMrHPOaJvEBqdo",
    "uuids": {
        "changed": "3Sc2JjUPMlnNtlq3qdx9Afy",
        "X": "13zbQMwRwU3WB8IjMGaXbtf",
        "Y": "1IkmDz3ATFmgzeYnzygvwDu"
    },
    "step": {
        "id": "06NLDM4mLEMrHPOT2pd5lzo",
        "desc": {
            "name": "New",
            "path": "aiuna.step.new",
            "config": {
                "hashes": {
                    "X": "586962852295d584ec08e7214393f8b2",
                    "Y": "f043eb8b1ab0a9618ad1dc53a00d759e"
                }
            }
        }
    },
    "changed": [
        "X",
        "Y"
    ],
    "X": [
        "[[1 2 3]",
        " [4 5 6]",
        " [7 8 9]]"
    ],
    "Y": [
        "[[0]",
        " [1]",
        " [1]]"
    ]
}

Checking history

d = dataset.data  # 'iris' is the default dataset
print(d.history)
# {
    "02o8BsNH0fhOYFF6JqxwaLF": {
        "name": "New",
        "path": "aiuna.step.new",
        "config": {
            "hashes": {
                "X": "19b2d27779bc2d2444c11f5cc24c98ee",
                "Y": "8baa54c6c205d73f99bc1215b7d46c9c",
                "Xd": "0af9062dccbecaa0524ac71978aa79d3",
                "Yd": "04ceed329f7c3eb43f93efd981fde313",
                "Xt": "60d4f429fcd642bbaf1d976002479ea2",
                "Yt": "4660adc31e2c25d02cb751dcb96ecfd3"
            }
        }
    }
}
del d["X"]
print(d.history)
# {
    "02o8BsNH0fhOYFF6JqxwaLF": {
        "name": "New",
        "path": "aiuna.step.new",
        "config": {
            "hashes": {
                "X": "19b2d27779bc2d2444c11f5cc24c98ee",
                "Y": "8baa54c6c205d73f99bc1215b7d46c9c",
                "Xd": "0af9062dccbecaa0524ac71978aa79d3",
                "Yd": "04ceed329f7c3eb43f93efd981fde313",
                "Xt": "60d4f429fcd642bbaf1d976002479ea2",
                "Yt": "4660adc31e2c25d02cb751dcb96ecfd3"
            }
        }
    },
    "06fV1rbQVC1WfPelDNTxEPI": {
        "name": "Del",
        "path": "aiuna.step.delete",
        "config": {
            "field": "X"
        }
    }
}
d["Z"] = 42
print(d.Z, type(d.Z))
# [[42]] <class 'numpy.ndarray'>
print(d.history)
# {
    "02o8BsNH0fhOYFF6JqxwaLF": {
        "name": "New",
        "path": "aiuna.step.new",
        "config": {
            "hashes": {
                "X": "19b2d27779bc2d2444c11f5cc24c98ee",
                "Y": "8baa54c6c205d73f99bc1215b7d46c9c",
                "Xd": "0af9062dccbecaa0524ac71978aa79d3",
                "Yd": "04ceed329f7c3eb43f93efd981fde313",
                "Xt": "60d4f429fcd642bbaf1d976002479ea2",
                "Yt": "4660adc31e2c25d02cb751dcb96ecfd3"
            }
        }
    },
    "06fV1rbQVC1WfPelDNTxEPI": {
        "name": "Del",
        "path": "aiuna.step.delete",
        "config": {
            "field": "X"
        }
    },
    "05eIWbfCJS7vWJsXBXjoUAh": {
        "name": "Let",
        "path": "aiuna.step.let",
        "config": {
            "field": "Z",
            "value": 42
        }
    }
}

Grants

Part of the effort spent in the present code was kindly supported by Fapesp under supervision of Prof. André C. P. L. F. de Carvalho at CEPID-CeMEAI (Grants 2013/07375-0 – 2019/01735-0).

History

The novel ideias presented here are a result of a years-long process of drafts, thinking, trial/error and rewrittings from scratch in several languages from Delphi, passing through Haskell, Java and Scala to Python - including frustration with well stablished libraries at the time. The fundamental concepts were lightly borrowed from basic category theory concepts like algebraic data structures that permeate many recent tendencies, e.g., in programming language design.

For more details, refer to https://github.com/davips/kururu

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aiuna-0.2101.1.tar.gz (77.2 kB view details)

Uploaded Source

Built Distribution

aiuna-0.2101.1-py3-none-any.whl (104.1 kB view details)

Uploaded Python 3

File details

Details for the file aiuna-0.2101.1.tar.gz.

File metadata

  • Download URL: aiuna-0.2101.1.tar.gz
  • Upload date:
  • Size: 77.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.1 CPython/3.9.1

File hashes

Hashes for aiuna-0.2101.1.tar.gz
Algorithm Hash digest
SHA256 564a2bee03496825cfffc2681af8d1a26fd2b2e1e8aac88d42d5846485e4b8c2
MD5 0af337875efaa3d99143b2568595f437
BLAKE2b-256 a4e41c1b8b4193a21432f4aa12e047ce3f936ccfe2a13b0d5010ba8ec1f1d08f

See more details on using hashes here.

File details

Details for the file aiuna-0.2101.1-py3-none-any.whl.

File metadata

  • Download URL: aiuna-0.2101.1-py3-none-any.whl
  • Upload date:
  • Size: 104.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.1 CPython/3.9.1

File hashes

Hashes for aiuna-0.2101.1-py3-none-any.whl
Algorithm Hash digest
SHA256 8ed3f1f8fca976113c6c7b080291decf5ec601f0f1014d84b7fb501f822f75bb
MD5 bcabf5ce7f49a96683d6a3dbe26ebfbc
BLAKE2b-256 fb10aed488c1f251e9e25f217a2de4a953d294ecbbf1867917b8a9d59a30483f

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page