Python dictionary with broadcast support.

Broadcast Dictionary

Python dictionary with broadcast support.

Usage

>>> from bcdict import BCDict
>>> d = BCDict({"a": "hello", "b": "world!"})
>>> d
{'a': 'hello', 'b': 'world!'}

Regular element access:

>>> d['a']
'hello'

Regular element assignment:

>>> d['a'] = "Hello"
>>> d
{'a': 'Hello', 'b': 'world!'}

Calling functions:

>>> d.upper()
{'a': 'HELLO', 'b': 'WORLD!'}

Slicing:

>>> d[1:3]
{'a': 'el', 'b': 'or'}

Applying functions:

>>> d.pipe(len)
{'a': 5, 'b': 6}
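
For comparison, here is what the three broadcast operations above correspond to with a plain dict. This is only a sketch of the observable results shown in the examples, not a description of how BCDict is implemented; it uses nothing but built-in dict comprehensions.

```python
# Plain-dict equivalents of the broadcast call, slice and pipe shown above.
d = {"a": "hello", "b": "world!"}

# d.upper(): call the method on every value
upper = {key: value.upper() for key, value in d.items()}

# d[1:3]: apply the slice to every value
sliced = {key: value[1:3] for key, value in d.items()}

# d.pipe(len): pass every value to the function
lengths = {key: len(value) for key, value in d.items()}

print(upper)    # {'a': 'HELLO', 'b': 'WORLD!'}
print(sliced)   # {'a': 'el', 'b': 'or'}
print(lengths)  # {'a': 5, 'b': 6}
```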

When an attribute of the values has the same name as an attribute of BCDict itself, use the attribute accessor .a to broadcast explicitly:

>>> d.a.upper()
{'a': 'HELLO', 'b': 'WORLD!'}

Slicing with conflicting keys (here the index 1 is also a key of the dictionary):

>>> n = BCDict({1:"hello", 2: "world"})
>>> n[1]
'hello'
>>> # Using the attribute accessor:
>>> n.a[1]
{1: 'e', 2: 'o'}
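
To make the lookup rules above more concrete, the following is a minimal sketch of how such behaviour could be built on top of `dict`. It is not the bcdict implementation: the names `MiniBCDict` and `_Broadcaster` are hypothetical, and the rule "use the key if it exists, otherwise broadcast" is an assumption inferred from the examples.

```python
class _Broadcaster:
    """Hypothetical accessor: always forwards to the values, never to the dict."""

    def __init__(self, data: dict):
        self._data = data

    def __getattr__(self, name: str):
        # Only reached when normal lookup fails, i.e. for attributes of the values.
        if name.startswith("__"):
            raise AttributeError(name)
        attrs = {key: getattr(value, name) for key, value in self._data.items()}
        if not all(callable(attr) for attr in attrs.values()):
            # Plain attributes: return them directly, one per key.
            return MiniBCDict(attrs)

        def call(*args, **kwargs):
            # Methods: call each one and collect the results per key.
            return MiniBCDict(
                {key: attr(*args, **kwargs) for key, attr in attrs.items()}
            )

        return call

    def __getitem__(self, item):
        # Index or slice every value.
        return MiniBCDict({key: value[item] for key, value in self._data.items()})


class MiniBCDict(dict):
    """Hypothetical dict subclass with broadcast support (sketch only)."""

    @property
    def a(self) -> _Broadcaster:
        return _Broadcaster(self)

    def __getattr__(self, name: str):
        # dict's own attributes win; everything else is broadcast.
        if name.startswith("__"):
            raise AttributeError(name)
        return getattr(self.a, name)

    def __getitem__(self, item):
        try:
            if item in self:  # existing keys win over broadcasting
                return super().__getitem__(item)
        except TypeError:  # unhashable item, e.g. a list (or a slice before Python 3.12)
            pass
        return self.a[item]


d = MiniBCDict({"a": "hello", "b": "world!"})
upper = d.upper()
sliced = d[1:3]

n = MiniBCDict({1: "hello", 2: "world"})
by_key = n[1]          # 1 is a key, so this is regular element access
by_position = n.a[1]   # the accessor skips the key lookup and broadcasts
```

In this sketch the `.a` accessor exists precisely because keys and dict attributes take precedence: it gives a way to reach the values when a name or index would otherwise be resolved on the dictionary itself.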

Full example

Here we create a dictionary with 3 datasets and then train, apply and validate a linear regression on all 3 datasets without a single for loop or dictionary comprehension.

from collections.abc import Collection
from pprint import pprint
import numpy as np
import pandas as pd
from bcdict import BCDict
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

def get_random_data(datasets: Collection) -> dict[str, pd.DataFrame]:
    """Just create some random data."""
    columns = list("ABCD") + ["target"]
    dfs = {}
    for name in datasets:
        dfs[name] = pd.DataFrame(
            np.random.random((10, len(columns))), columns=columns
        )
    return dfs

datasets = ["noord", "brabant", "limburg"]

# make dict with three dataframes, one for each grid:
train_dfs = BCDict(get_random_data(datasets))
test_dfs = BCDict(get_random_data(datasets))

features = list("ABCD")
target = "target"

# get the training X, y *for all 3 grids at once*:
X_train = train_dfs[features]
y_train = train_dfs[target]

# get the test X, y *for all 3 grids at once*:
X_test = test_dfs[features]
y_test = test_dfs[target]

# creates models for all 3 grids at once:
# we call the `train` function on each dataframe in X_train, and pass the
# corresponding y_train series into the function.
def train(X: pd.DataFrame, y: pd.Series) -> LinearRegression:
    """We use this function to train a model."""
    model = LinearRegression()
    model.fit(X, y)
    return model

models = X_train.pipe(train, y_train)

# Apply each model to the correct grid.
# `models` is a BCDict.
# When calling the `predict` function, it knows that `test_dfs` is a dict with
# the same keys as `models`. When calling predict on each model, the corresponding
# dataframe from `test_dfs` is passed to the function.
preds = models.predict(X_test)

# Now we pipe the true targets and the corresponding predictions into
# `r2_score`, which yields one score per grid.
scores = y_test.pipe(r2_score, preds)
pprint(scores)
# {'brabant': -2.2075573154836925,
#  'limburg': -1.3066288799673251,
#  'noord': -0.8467452520467658}

assert list(scores.keys()) == datasets
assert all((isinstance(v, float) for v in scores.values()))

# Conclusion: not a single for loop or dict comprehension was needed to train
# 3 models and to predict and evaluate on 3 data sets :)
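
The key matching that `pipe` and `predict` rely on in the example above can be sketched in plain Python. This is only an illustration under an assumption taken from the comments in the example (an argument is broadcast when it is a dict with the same keys); the helper name `pipe_sketch` is hypothetical and the real library may decide differently.

```python
from typing import Any, Callable


def pipe_sketch(data: dict, func: Callable, *args: Any, **kwargs: Any) -> dict:
    """Call func(value, ...) for every value, matching dict arguments by key."""

    def pick(arg: Any, key: Any) -> Any:
        # Assumption: only dicts with exactly the same keys are broadcast;
        # every other argument is passed unchanged to each call.
        if isinstance(arg, dict) and arg.keys() == data.keys():
            return arg[key]
        return arg

    return {
        key: func(
            value,
            *(pick(arg, key) for arg in args),
            **{name: pick(arg, key) for name, arg in kwargs.items()},
        )
        for key, value in data.items()
    }


bases = {"x": 2, "y": 3}
exponents = {"x": 2, "y": 3}

# Matching keys: each base is paired with its own exponent.
matched = pipe_sketch(bases, pow, exponents)
# Non-dict argument: the same exponent is used for every base.
shared = pipe_sketch(bases, pow, 2)

print(matched)  # {'x': 4, 'y': 27}
print(shared)   # {'x': 4, 'y': 9}
```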

Original repository: https://github.com/mariushelf/bcdict

Author: Marius Helf (helfsmarius@gmail.com)

Changelog

v0.2.0

Removed the item() function. Use .a[] instead.

v0.1.0

Initial release.

License

MIT -- see LICENSE

Release history

0.5.0: May 31, 2022
0.4.3: May 2, 2022
0.4.2: Apr 26, 2022
0.4.1: Apr 20, 2022
0.4.0: Apr 20, 2022
0.3.0: Apr 19, 2022
0.2.0: Apr 11, 2022 (this version)
0.1.0: Apr 6, 2022

Download files

Source distribution: bcdict-0.2.0.tar.gz (6.4 kB), uploaded Apr 11, 2022
Built distribution: bcdict-0.2.0-py3-none-any.whl (6.1 kB), uploaded Apr 11, 2022, Python 3

Hashes for bcdict-0.2.0.tar.gz

Algorithm	Hash digest
SHA256	`c4727348f0834690997008eaa93393fa34bab3c6b9d6809694f656178464200a`
MD5	`0a38da3352c0372fab48964c513fea2b`
BLAKE2b-256	`8be828ab667ee3ffac500c87d6d3c534cc9a89bd476a5fc1575e05298dd45081`

Hashes for bcdict-0.2.0-py3-none-any.whl

Algorithm	Hash digest
SHA256	`fa225ef5db2c598ebdc618258c2b2be74cc27fd9a14ac6bbd5e7d23249e28a15`
MD5	`0db168a8727fb2c1dc5a80250c081603`
BLAKE2b-256	`9f90c671fab4441988d7a9485e19d9a70fde56d5faa3233e250249785300fb5b`