Lazy dict with predictable deterministic universally unique identifiers

License: GPL v3

idict

A lazy dict with universally unique deterministic identifiers.

Latest release | Current code | API documentation

See also

  • identification package used by idict: GaROUPa
  • only laziness, i.e., without the identification part: ldict

Overview

An idict is an identified dict with str keys. We consider that every value is generated by a process, starting from an empty idict. The process is a sequence of transformation steps done through the operator >>, which symbolizes the ordering of the steps. There are two types of steps:

  • value insertion - represented by dict-like objects
  • function application - represented by ordinary Python functions

Functions, idicts, and values have a deterministic UUID (called hosh - operable hash). Identifiers (hoshes) for idicts and values are predictable through the magic available here. An idict is completely defined by its key-value pairs so that it can be converted from/to a built-in dict.
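
To make the idea of deterministic identifiers concrete, here is a minimal sketch that uses only the standard library. It is not the hosh algorithm from GaROUPa: `value_id` and `dict_id` are hypothetical helpers, BLAKE2b stands in for the real hash, and deriving the dict identifier from the sorted (key, value id) pairs is an assumption made purely for illustration.

```python
import hashlib
import pickle


def value_id(value):
    """Hypothetical helper: identifier derived only from the value's content."""
    blob = pickle.dumps(value, protocol=4)
    return hashlib.blake2b(blob, digest_size=20).hexdigest()


def dict_id(data):
    """Hypothetical helper: identifier derived from the (key, value id) pairs."""
    h = hashlib.blake2b(digest_size=20)
    for key in sorted(data):
        h.update(key.encode())
        h.update(value_id(data[key]).encode())
    return h.hexdigest()


a = {"x": 5, "y": 7}
b = {"y": 7, "x": 5}  # same content, different insertion order
print(dict_id(a) == dict_id(b))  # True
print(dict_id(a) == dict_id({"x": 5, "y": 8}))  # False
```

Because the identifier depends only on content, it can be computed before any object is built, which is the sense in which such identifiers are predictable.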

Creating an idict is no different from creating an ordinary dict. Optionally, it can be created through the >> operator applied to empty or its alias Ø (usually AltGr+Shift+o on most keyboards). The resulting idict always contains two extra entries, id and ids, shown as "_id" and "_ids" in the printed examples below.

Function application is done in the same way. The parameter names define the input fields, while the keys in the returned dict define the output fields.
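
The following sketch shows one way the >> chaining and the field detection could work. `MiniDict` is a hypothetical stand-in, not the idict implementation: it evaluates eagerly, carries no identifiers, and assumes every parameter of the function is an input field.

```python
import inspect


class MiniDict(dict):
    """Hypothetical stand-in for idict: only the '>>' chaining is mimicked."""

    def __rshift__(self, step):
        if callable(step):
            # Function application: parameter names select the input fields.
            names = inspect.signature(step).parameters
            outputs = step(**{name: self[name] for name in names})
            return MiniDict({**self, **outputs})
        # Value insertion: a dict-like step is merged into a new object.
        return MiniDict({**self, **step})


def f(x):
    return {"y": x ** 2}


d = MiniDict() >> {"x": 5} >> {"y": 7, "z": 10} >> f
print(d)  # {'x': 5, 'y': 25, 'z': 10}
```

Each step returns a new object, so the left operand is never modified.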

Once evaluated, a value is not calculated again.
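
A minimal sketch of this evaluate-once behaviour, assuming pending values are stored as zero-argument callables. `LazyDict` and `expensive` are hypothetical names; the real idict tracks dependencies between fields, which is left out here.

```python
class LazyDict:
    """Hypothetical sketch: values may be stored as zero-argument callables."""

    def __init__(self, **entries):
        self._entries = entries

    def __getitem__(self, key):
        value = self._entries[key]
        if callable(value):
            # First access: compute the value and overwrite the placeholder.
            value = value()
            self._entries[key] = value
        return value


calls = []


def expensive():
    calls.append("run")  # record each real computation
    return 7 / 5


d = LazyDict(x=5, y=expensive)
print(d["y"], d["y"])  # 1.4 1.4
print(len(calls))  # 1
```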

Functions can accept parameters, which are distinguished from input fields by having default values.
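
As a rough illustration of that distinction, the sketch below separates fields from parameters by inspecting default values. `split_signature` is a hypothetical helper, and `functools.partial` is only a stand-in for the role played by `let`; neither reflects how idict is implemented.

```python
import inspect
from functools import partial


def split_signature(fn):
    """Hypothetical helper: return (fields, parameters) of a function."""
    fields, params = [], {}
    for name, p in inspect.signature(fn).parameters.items():
        if p.default is inspect.Parameter.empty:
            fields.append(name)  # no default: an input field
        else:
            params[name] = p.default  # has a default: a parameter
    return fields, params


def f(x, y, a=None, b=None):
    return {"z": a * x ** b, "w": y ** b}


fields, params = split_signature(f)
print(fields)  # ['x', 'y']
print(params)  # {'a': None, 'b': None}

# functools.partial plays the role of 'let' here: it fixes the parameters.
g = partial(f, a=7, b=2)
print(g(x=5, y=3))  # {'z': 175, 'w': 9}
```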

Installation

...as a standalone lib

# Set up a virtualenv. 
python3 -m venv venv
source venv/bin/activate

# Install from PyPI...
pip install --upgrade pip
pip install -U idict
pip install -U "idict[full]"  # use the extra 'full' for extra functionality (recommended)

# ...or, install from updated source code.
pip install git+https://github.com/davips/idict

...from source

git clone https://github.com/davips/idict
cd idict
poetry install
poetry install -E full  # use the extra 'full' for extra functionality (recommended)

Examples

Overview

# Creation by direct instantiation.
from idict import idict

d = idict(x=5, y=7, z=10)

# Creation from scratch.
# The expression 'v >> a >> b' means "Value 'v' will be processed by step 'a' then 'b'".
# A step can be a value insertion or a function application.
from idict import empty

d = empty >> {"x": 5} >> {"y": 7, "z": 10}

# Empty alias ('Ø') usage.
from idict import Ø

d = Ø >> {"x": 5} >> {"y": 7, "z": 10}
print(d)
"""
{
    "x": 5,
    "y": 7,
    "z": 10,
    "_id": "SV_4fc23c71a6bb954f6f2ed40e440cfd2b76087",
    "_ids": "GS_cb0fda15eac732cb08351e71fc359058b93bd... +1 ...gk_64fdf435fc9aa10be990397ff8fa92888792c"
}
"""
# Inverting color theme for a white background.
from garoupa import setup

setup(dark_theme=False)
d = idict(x=5, y=7, z=10)
print(d)


"""
{
    "x": 5,
    "y": 7,
    "z": 10,
    "_id": "SV_4fc23c71a6bb954f6f2ed40e440cfd2b76087",
    "_ids": "GS_cb0fda15eac732cb08351e71fc359058b93bd... +1 ...gk_64fdf435fc9aa10be990397ff8fa92888792c"
}
"""
# Function application.
# The input and output fields are detected from the function signature and returned dict.
def f(x):
    return {"y": x ** 2}


d2 = d >> f
print(d2)
"""
{
    "y": "→(x)",
    "x": 5,
    "z": 10,
    "_id": "J.WMNtqzVe2LiO8Ez4Wkgpa6zsBS.o529OTeWhNo",
    "_ids": "j6dsrYpQ-9A6BFtY5T98d-UeFJOS.o529OTeWhNo... +1 ...gk_64fdf435fc9aa10be990397ff8fa92888792c"
}
"""
# Anonymous function application.
# The original 'd' is left unchanged; the result is the new idict 'd2'.
d2 = d >> (lambda y: {"y": y / 5})
print(d)
"""
{
    "x": 5,
    "y": 7,
    "z": 10,
    "_id": "SV_4fc23c71a6bb954f6f2ed40e440cfd2b76087",
    "_ids": "GS_cb0fda15eac732cb08351e71fc359058b93bd... +1 ...gk_64fdf435fc9aa10be990397ff8fa92888792c"
}
"""
# Resulting values are evaluated lazily.
d >>= lambda y: {"y": y / 5}
print(d.y)
"""
1.4
"""
print(d)
"""
{
    "y": 1.4,
    "x": 5,
    "z": 10,
    "_id": "7uAa.i.4XbFyFY7OLx2TfzMQOeYZim1XGTCOzwYg",
    "_ids": "S-Vd.8e3nPaYsNmqhkiGDdvZUvVZim1XGTCOzwYg... +1 ...gk_64fdf435fc9aa10be990397ff8fa92888792c"
}
"""
# Parameterized function application.
# "Parameters" are distinguished from "fields" by having default values.
# When the default value is None, it means it will be explicitly defined later by 'let'.
from idict import let


def f(x, y, a=None, b=None):
    return {"z": a * x ** b, "w": y ** b}


d2 = d >> let(f, a=7, b=2)
print(d2)
"""
{
    "z": "→(a b x y)",
    "w": "→(a b x y)",
    "y": 1.4,
    "x": 5,
    "_id": "cLQzLVSJU.N2iT-5OaZWUEnnYWUHK5qjURoS6ymD",
    "_ids": "u-FenHI5ID.J6D-Hvj.WShqswXAgoL5sYPWsHSoF... +2 ...GS_cb0fda15eac732cb08351e71fc359058b93bd"
}
"""
# Parameterized function application with sampling.
# The default value is one of the following ranges:
#     list, arithmetic progression, geometric progression.
# Each parameter value will be sampled later.
# A random number generator must be given.
from idict import let
from random import Random


def f(x, y, a=None, b=[1, 2, 3], ap=[1, 2, 3, ..., 10], gp=[1, 2, 4, ..., 16]):
    return {"z": a * x ** b, "w": y ** ap * gp}


d2 = d >> Random(0) >> let(f, a=7)
print(d2)
"""
{
    "z": "→(a b ap gp x y)",
    "w": "→(a b ap gp x y)",
    "y": 1.4,
    "x": 5,
    "_id": "JNtKgf-Bz7S5z6QwqzWKKM5OLM4QR7OmcORBo47s",
    "_ids": "IYglBIPS5j1KqhPE.JPs6GD89DSHtNtvgQncZo9u... +2 ...GS_cb0fda15eac732cb08351e71fc359058b93bd"
}
"""
print(d2.z)
"""
175
"""
print(d2)
"""
{
    "z": 175,
    "w": "10.541350399999995",
    "y": 1.4,
    "x": 5,
    "_id": "JNtKgf-Bz7S5z6QwqzWKKM5OLM4QR7OmcORBo47s",
    "_ids": "IYglBIPS5j1KqhPE.JPs6GD89DSHtNtvgQncZo9u... +2 ...GS_cb0fda15eac732cb08351e71fc359058b93bd"
}
"""

Identity example

from idict import idict

a = idict(x=3)
print(a)
"""
{
    "x": 3,
    "_id": "ME_bd0a8d9d8158cdbb9d7d4c7af1659ca1dabc9",
    "_ids": "ME_bd0a8d9d8158cdbb9d7d4c7af1659ca1dabc9"
}
"""
b = idict(y=5)
print(b)
"""
{
    "y": 5,
    "_id": "EI_20378979f4669f2e318ae9742e214fd4880d7",
    "_ids": "EI_20378979f4669f2e318ae9742e214fd4880d7"
}
"""
print(a >> b)
"""
{
    "x": 3,
    "y": 5,
    "_id": "pl_bb7e60e68670707cdef7dfd31096db4c63c91",
    "_ids": "ME_bd0a8d9d8158cdbb9d7d4c7af1659ca1dabc9 EI_20378979f4669f2e318ae9742e214fd4880d7"
}
"""

Lazily applying functions to idict

from idict import idict

a = idict(x=3)
print(a)
"""
{
    "x": 3,
    "_id": "ME_bd0a8d9d8158cdbb9d7d4c7af1659ca1dabc9",
    "_ids": "ME_bd0a8d9d8158cdbb9d7d4c7af1659ca1dabc9"
}
"""
a = a >> idict(y=5) >> {"z": 7} >> (lambda x, y, z: {"r": x ** y // z})
print(a)
"""
{
    "r": "→(x y z)",
    "x": 3,
    "y": 5,
    "z": 7,
    "_id": "kgdz8xfS7IuGtukIPe37KhAUrB2P4S3OFdPs8Gab",
    "_ids": "CXqa2zRRNd7Aj5wI8JTJ0O-7ML0P4S3OFdPs8Gab... +2 ...ZN_eccacd999c26ce18c98f9a17a6f47adcf162a"
}
"""
print(a.r)
"""
34
"""
print(a)
"""
{
    "r": 34,
    "x": 3,
    "y": 5,
    "z": 7,
    "_id": "kgdz8xfS7IuGtukIPe37KhAUrB2P4S3OFdPs8Gab",
    "_ids": "CXqa2zRRNd7Aj5wI8JTJ0O-7ML0P4S3OFdPs8Gab... +2 ...ZN_eccacd999c26ce18c98f9a17a6f47adcf162a"
}
"""

Parameterized functions and sampling

from random import Random

from idict import Ø, let


# A function provides input fields and, optionally, parameters.
# For instance:
# 'a' is sampled from an arithmetic progression
# 'b' is sampled from a geometric progression
# Here, the syntax for default parameter values is borrowed with a new meaning.
def fun(x, y, a=[-100, -99, -98, ..., 100], b=[0.0001, 0.001, 0.01, ..., 100000000]):
    return {"z": a * x + b * y}


def simplefun(x, y):
    return {"z": x * y}


# Creating an empty idict. Alternatively: d = idict().
d = Ø >> {}
d.show(colored=False)
"""
{
    "_id": "0000000000000000000000000000000000000000",
    "_ids": {}
}
"""
# Putting some values. Alternatively: d = idict(x=5, y=7).
d["x"] = 5
d["y"] = 7
print(d)
"""
{
    "x": 5,
    "y": 7,
    "_id": "BB_fad4374ca911f344859dab8e4b016ba2fe65b",
    "_ids": "GS_cb0fda15eac732cb08351e71fc359058b93bd WK_6ba95267cec724067d58b3186ecbcaa4253ad"
}
"""
# A function without parameters involves no sampling:
# repeated applications yield the same identifiers and values.
d1 = d >> simplefun
print(d1)
print(d1.z)
"""
{
    "z": "→(x y)",
    "x": 5,
    "y": 7,
    "_id": "VqfQeuBWL7Xv1FwWe6pzgqJwclRMPNZuFtrAIt6g",
    "_ids": "9KKem6QL-I8C0Yk0q3URBt-aNXHMPNZuFtrAIt6g... +1 ...WK_6ba95267cec724067d58b3186ecbcaa4253ad"
}
35
"""
d2 = d >> simplefun
print(d2)
print(d2.z)
"""
{
    "z": "→(x y)",
    "x": 5,
    "y": 7,
    "_id": "VqfQeuBWL7Xv1FwWe6pzgqJwclRMPNZuFtrAIt6g",
    "_ids": "9KKem6QL-I8C0Yk0q3URBt-aNXHMPNZuFtrAIt6g... +1 ...WK_6ba95267cec724067d58b3186ecbcaa4253ad"
}
35
"""
# Parameter values can also be manually set.
e = d >> let(fun, a=5, b=10)
print(e.z)
"""
95
"""
# Not all parameters need to be set.
e = d >> Random() >> let(fun, a=5)
print("e =", e.z)
"""
e = 70000025.0
"""
# Each run will be a different sample for the missing parameters.
e = e >> Random() >> let(fun, a=5)
print("e =", e.z)
"""
e = 25.007
"""
# We can define the initial state of the random sampler.
# It will be in effect from its position onwards in the expression.
e = d >> Random(0) >> let(fun, a=5)
print(e.z)
"""
725.0
"""
# All runs will yield the same result,
# if starting from the same random number generator seed.
e = e >> Random(0) >> let(fun, a=[555, 777])
print("Let 'a' be a list:", e.z)
"""
Let 'a' be a list: 700003885.0
"""
e = e >> Random(0) >> let(fun, a=[5, 25, 125, ..., 10000])
print("Let 'a' be a geometric progression:", e.z)
"""
Let 'a' be a geometric progression: 700003125.0
"""
# Reproducible different runs are achievable by using a single random number generator.
rnd = Random(0)
e = d >> rnd >> let(fun, a=5)
print(e.z)
e = d >> rnd >> let(fun, a=5)  # Same expression; the generator state has advanced, so the sample differs.
print(e.z)
"""
725.0
700000025.0
"""
# Output fields can be defined dynamically through parameter values.
# Input fields can be defined dynamically through kwargs.
copy = lambda source=None, target=None, **kwargs: {target: kwargs[source]}
d = Ø >> {"x": 5}  # 'Ø' is the alias of 'empty', which is not imported in this example.
d >>= let(copy, source="x", target="y")
print(d)
d.evaluate()
print(d)

"""
{
    "y": "→(source target x)",
    "x": 5,
    "_id": "xmcjrFNT-2nEr3vizzx-44QwV5kwDfaOqWWvzOrq",
    "_ids": "3Tv6p5fZ936EK1DUkkcYAgWPbrmwDfaOqWWvzOrq GS_cb0fda15eac732cb08351e71fc359058b93bd"
}
{
    "y": 5,
    "x": 5,
    "_id": "xmcjrFNT-2nEr3vizzx-44QwV5kwDfaOqWWvzOrq",
    "_ids": "3Tv6p5fZ936EK1DUkkcYAgWPbrmwDfaOqWWvzOrq GS_cb0fda15eac732cb08351e71fc359058b93bd"
}
"""

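The range syntax used for default parameter values above can be pictured with the following sketch. It assumes the form `[a, b, c, ..., end]` with exactly three leading items and a uniform choice among the expanded values; `expand` is a hypothetical helper, and the parsing done by idict itself may differ.

```python
import math
from random import Random


def expand(spec):
    """Hypothetical helper: expand [a, b, c, ..., end] into an explicit list."""
    if Ellipsis not in spec:
        return list(spec)  # a plain list is used as-is
    a, b, c, dots, end = spec
    if dots is not Ellipsis:
        raise ValueError("expected the form [a, b, c, ..., end]")
    if math.isclose(b - a, c - b):
        # Arithmetic progression: constant difference.
        step = b - a
        count = math.floor((end - a) / step + 1e-9) + 1
        return [a + i * step for i in range(count)]
    if math.isclose(b / a, c / b):
        # Geometric progression: constant ratio.
        ratio = b / a
        count = math.floor(math.log(end / a, ratio) + 1e-9) + 1
        return [a * ratio ** i for i in range(count)]
    raise ValueError("neither arithmetic nor geometric")


print(expand([1, 2, 3, ..., 10]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(expand([1, 2, 4, ..., 16]))  # [1.0, 2.0, 4.0, 8.0, 16.0]

rnd = Random(0)
sample = rnd.choice(expand([1, 2, 3]))  # one value drawn uniformly
```

The small tolerance added before `math.floor` guards against floating-point rounding when the last term falls exactly on the end value.
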
Composition of sets of functions

from random import Random

from idict import Ø


# A multistep process can be defined without applying its functions


def g(x, y, a=[1, 2, 3, ..., 10], b=[0.00001, 0.0001, 0.001, ..., 100000]):
    return {"z": a * x + b * y}


def h(z, c=[1, 2, 3]):
    return {"z": c * z}


# In the 'idict' framework 'data is function',
# so the alias Ø represents the 'empty data object' and the 'reflexive function' at the same time.
# In other words: 'inserting nothing' has the same effect as 'doing nothing'.
fun = Ø >> g >> h  # 'empty' or 'Ø' enable the cartesian product of the subsequent sets of functions within the expression.
print(fun)
"""
«λ{} × λ»
"""
# Before a function is applied to a dict-like, the function free parameters remain unsampled.
# The result is an ordered set of composite functions.
d = {"x": 5, "y": 7} >> (Random(0) >> fun)
print(d)
"""
{
    "z": "→(c z→(a b x y))",
    "x": 5,
    "y": 7,
    "_id": "YZWooP03q0mFec8tjNwy3YuAohamevfjG3VAFXL-",
    "_ids": "5PKBLX4-dGDGHUifvK.QYVLeZTgmevfjG3VAFXL-... +1 ...WK_6ba95267cec724067d58b3186ecbcaa4253ad"
}
"""
print(d.z)
"""
105.0
"""
d = {"x": 5, "y": 7} >> (Random(0) >> fun)
print(d.z)
"""
105.0
"""
# Reproducible different runs by passing a stateful random number generator.
rnd = Random(0)
e = d >> rnd >> fun
print(e.z)
"""
105.0
"""
e = d >> rnd >> fun
print(e.z)
"""
14050.0
"""
# Repeating the same results.
rnd = Random(0)
e = d >> rnd >> fun
print(e.z)
"""
105.0
"""
e = d >> rnd >> fun
print(e.z)
"""
14050.0
"""

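The difference between re-seeding and reusing one generator can be reproduced with the standard library alone. This sketch only illustrates the behaviour of `random.Random`; `values` and `draw` are hypothetical and say nothing about how idict consumes the generator internally.

```python
from random import Random

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


def draw(rnd, n=3):
    """Draw n values; each call advances the state of 'rnd'."""
    return [rnd.choice(values) for _ in range(n)]


# Two fresh generators with the same seed give the same samples.
fresh_a = draw(Random(0))
fresh_b = draw(Random(0))
print(fresh_a == fresh_b)  # True

# One shared generator keeps its state between draws.
rnd = Random(0)
first, second = draw(rnd), draw(rnd)
print(first == fresh_a)  # True

# Restarting from the same seed repeats the whole sequence of draws.
rnd = Random(0)
print((draw(rnd), draw(rnd)) == (first, second))  # True
```
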
Concept

An idict is like an ordinary Python dict, but lazy and with extra functionality. It is a mapping from string keys, called fields, to any serializable (picklable) objects. Each idict has two extra entries: id (identifier) and ids (value identifiers).

A custom 40-digit unique identifier (see GaROUPa) can be provided as an attribute for each function. Value objects can have custom identifiers as well, if provided within the entry ids.

Otherwise, identifiers for functions and values will be calculated through blake3 hashing of their content. For functions, the bytecode is used as content. For this reason, such functions should be simple, with minimal external dependencies or with their import statements inside the function body. This decreases the odds of using two functions with identical local code (and, therefore, identical identifiers) performing different calculations.
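
The caveat about identical local code can be demonstrated with a small sketch. It assumes the identifier is derived from the raw bytecode only; `bytecode_id` and `make_scaler` are hypothetical helpers, and BLAKE2b from the standard library stands in for blake3.

```python
import hashlib


def bytecode_id(fn):
    """Hypothetical helper: hash only the function's compiled bytecode."""
    return hashlib.blake2b(fn.__code__.co_code, digest_size=20).hexdigest()


def make_scaler(factor):
    # The same source is compiled twice, but 'scale' is bound differently.
    namespace = {"scale": factor}
    exec("def f(x):\n    return scale * x\n", namespace)
    return namespace["f"]


double, tenfold = make_scaler(2), make_scaler(10)
print(double(3), tenfold(3))  # 6 30
print(bytecode_id(double) == bytecode_id(tenfold))  # True
```

Both functions share the same bytecode yet compute different results because they depend on an external name, which is the situation the recommendation above is meant to avoid.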

Grants

This work was supported by Fapesp under supervision of Prof. André C. P. L. F. de Carvalho at CEPID-CeMEAI (Grants 2013/07375-0 – 2019/01735-0) until 2021-03-31.
