Kdict: dict with multi-dimensional, sliceable keys

kdict is like dict for multi-dimensional keys. With kdict, you can easily filter and slice your dictionary by key dimensions.

Example: machine learning model evaluation. Suppose you're evaluating several models on three cross-validation folds, each with a training set and a test set.

Before kdict, you might store evaluation scores in a nested dictionary. But that's cumbersome and error-prone. Here's what it would take to get the mean accuracy for a particular model across all folds:

# To access inner nested data without kdict, you'd need to write iterators like this:
import numpy as np
np.mean(
    [
        data[fold_id][fold_label]["lasso"]
        for fold_id in data.keys()
        for fold_label in data[fold_id].keys()
    ]
)
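To make the loop above concrete, here is a minimal, self-contained sketch of the nested layout it assumes. The scores are made-up placeholder values, and statistics.mean from the standard library stands in for np.mean so the snippet runs without NumPy.

```python
from statistics import mean

# Hypothetical placeholder scores, nested as fold ID -> fold label -> model name.
data = {
    fold_id: {
        fold_label: {"lasso": 0.5, "randomforest": 0.75}
        for fold_label in ["train", "test"]
    }
    for fold_id in range(3)
}

# Same traversal as above: every (fold ID, fold label) pair is walked by hand.
lasso_scores = [
    data[fold_id][fold_label]["lasso"]
    for fold_id in data
    for fold_label in data[fold_id]
]
lasso_mean = mean(lasso_scores)
```

Changing the nesting order (say, model name first) would mean rewriting every loop like this one, which is the maintenance cost the flat, multi-dimensional key avoids.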

kdict makes storing and accessing this type of data a breeze. No more nesting:

# Store data in a three-dimensional kdict.
# Dimensions: fold ID, fold label, model name
data = kdict(...)

# Slice the kdict to get lasso model's mean accuracy across all folds:
# data[:, :, 'lasso'] is a subset of the full dictionary
np.mean(list(data[:, :, 'lasso'].values()))

In this example, data is a three-dimensional kdict that you can slice along any dimension. So how did we make this kdict?

from kdict import kdict
data = kdict() # make a blank kdict
for fold_id in range(3):
    for fold_label in ['train', 'test']:
        for model_name in ['lasso', 'randomforest']:
            # add an entry for each fold ID, fold label, and model name
            # (get_model_score is a placeholder for your own scoring function)
            data[fold_id, fold_label, model_name] = get_model_score(
                fold_id,
                fold_label,
                model_name
            )

The syntax, in a nutshell:

  • Read or write a single element by accessing [key_dimension_1, key_dimension_2] and so on.
  • Or get a subset of the dictionary by slicing, e.g. [:, key_dimension_2].
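One way to picture what a slice selects is a filter over tuple keys in a plain dict. The sketch below only illustrates the semantics described above and is not kdict's implementation; select is a hypothetical helper, and the scores are placeholder values.

```python
def select(flat, pattern):
    """Return the entries of `flat` whose tuple keys match `pattern`.

    `pattern` holds one entry per key dimension: slice(None), which is what
    a bare `:` means inside square brackets, matches anything; any other
    value must equal the key's value in that position.
    """
    return {
        key: value
        for key, value in flat.items()
        if len(key) == len(pattern)
        and all(p == slice(None) or p == k for p, k in zip(pattern, key))
    }


# Placeholder scores keyed by (fold ID, fold label, model name).
flat = {
    (fold_id, fold_label, model_name): 0.5 if model_name == "lasso" else 0.75
    for fold_id in range(3)
    for fold_label in ["train", "test"]
    for model_name in ["lasso", "randomforest"]
}

# Single element: every dimension is given.
single = flat[0, "train", "lasso"]

# Subset: the plain-dict analogue of [:, :, 'lasso'].
lasso_only = select(flat, (slice(None), slice(None), "lasso"))
```

The result of select is again a dict with full-length tuple keys, which mirrors the point that a sliced subset can be filtered further.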

Installation

pip install kdict

Usage

Create a kdict

Import: from kdict import kdict

Create a blank kdict: data = kdict(). Or initialize from an existing dict: data = kdict(existing_dict). You can also use a dict comprehension there, such as:

data = kdict({
    (fold_id, fold_label, model_name): get_model_score(fold_id, fold_label, model_name)
    for model_name in ['lasso', 'randomforest']
    for fold_label in ['train', 'test']
    for fold_id in range(3)
})

Slice a kdict

Access an individual item with data[0, 'train', 'lasso'].

Or get a subset of the dictionary with slices: data[0, :, :] will have all items where the first dimension of the key is 0. This slice is also a kdict, so you can keep slicing and filtering further.

You can also iterate over specific key dimensions:

# get the final (third) dimension of the keys; dimensions are zero-indexed
available_models = data.keys(dimensions=2)

# or get all pairs of first two dimensions
for fold_id, fold_label in data.keys(dimensions=[0, 1]):
    ... # now do something with data[fold_id, fold_label, :]
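In plain-dict terms, iterating over a subset of key dimensions resembles projecting each tuple key onto those positions. The sketch below assumes duplicates are dropped and first-seen order is kept; the text above does not state either, and project_keys is a hypothetical helper rather than part of kdict.

```python
def project_keys(flat, dimensions):
    """Project each tuple key onto the given positions, dropping duplicates."""
    seen = {}  # dicts preserve insertion order, so first-seen order is kept
    for key in flat:
        seen[tuple(key[d] for d in dimensions)] = None
    return list(seen)


# Placeholder scores keyed by (fold ID, fold label, model name).
flat = {
    (fold_id, fold_label, model_name): 0.0
    for fold_id in range(3)
    for fold_label in ["train", "test"]
    for model_name in ["lasso", "randomforest"]
}

models = project_keys(flat, [2])   # one-element tuples, one per model name
pairs = project_keys(flat, [0, 1])  # (fold ID, fold label) pairs
```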

Eject

A kdict behaves just like a dict, except all keys must have the same number of dimensions.

To get a raw dict back, call data.eject().
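Because every key must have the same number of dimensions, it can help to check a plain dict before converting it. This is a minimal sketch using a hypothetical helper, key_dimensions, that is not part of kdict; how kdict itself reports a mismatch is not covered here.

```python
def key_dimensions(flat):
    """Return the shared tuple-key length of `flat`, or raise ValueError."""
    lengths = {len(key) for key in flat}
    if len(lengths) > 1:
        raise ValueError(f"keys have mixed dimensions: {sorted(lengths)}")
    # An empty dict has no keys, so report zero dimensions.
    return lengths.pop() if lengths else 0


consistent = {(0, "train", "lasso"): 0.5, (1, "test", "lasso"): 0.75}
n_dims = key_dimensions(consistent)
```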

Development

Submit PRs against the develop branch, then open a release pull request against master.

# Install requirements
pip install --upgrade pip wheel
pip install -r requirements_dev.txt

# Install local package
pip install -e .

# Install pre-commit
pre-commit install

# Run tests
make test

# Run lint
make lint

# Bump the version before submitting a PR against master (all master commits are deployed)
bump2version patch # possible: major / minor / patch

# Also make sure CHANGELOG.md is updated

Changelog

0.0.1

  • First release on PyPI.
