Lazy dict with universally unique identifier for values

License: GPL v3

idict

A lazy dict with universally unique deterministic identifiers.

Latest release | Current code | API documentation

Overview

An idict is a dict with str keys.

See also

  • laziness+identity (ldict)
  • laziness+identity+persistence (cdict)

Identity example

from idict import idict

a = idict(x=3)
print(a)
"""
{
    "x": 3,
    "id": "WB_e55a47230d67db81bcc1aecde8f1b950282cd",
    "ids": {
        "x": "WB_e55a47230d67db81bcc1aecde8f1b950282cd"
    }
}
"""
b = idict(y=5)
print(b)
"""
{
    "y": 5,
    "id": "0U_e2a86ff72e226d5365aea336044f7b4270977",
    "ids": {
        "y": "0U_e2a86ff72e226d5365aea336044f7b4270977"
    }
}
"""
print(a >> b)
"""
{
    "x": 3,
    "y": 5,
    "id": "Uw_1a8f02f49de0195b788bd0fea50125068c67f",
    "ids": {
        "x": "WB_e55a47230d67db81bcc1aecde8f1b950282cd",
        "y": "0U_e2a86ff72e226d5365aea336044f7b4270977",
        "id": "FY_a621a71b3c4ced8c917e5d97b928c065d049a",
        "ids": "j6_0030602f26a251e4c4ac9d691f972f82f27af"
    }
}
"""

We consider that every value is generated by a process, starting from an empty idict. The process is a sequence of transformation steps done through the operator >>, which symbolizes a data flow. There are two types of steps:

  • value insertion - represented by dict-like objects
  • function application - represented by ordinary python functions

Every function, idict, and value has a deterministic UUID, called a hosh (operable hash). Identifiers (hoshes) for idicts and values are predictable through the magic available here. An idict is completely defined by its key-value pairs, so it can be converted to and from a built-in dict.
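
The two kinds of steps can be pictured with a small stand-in class. This is only a sketch of the data-flow idea, not idict's implementation: the Flow class is hypothetical, it evaluates eagerly, and it computes no identifiers.

```python
import inspect


class Flow:
    """Hypothetical stand-in for an idict: fields transformed by >> steps."""

    def __init__(self, **fields):
        self.fields = dict(fields)

    def __rshift__(self, step):
        if callable(step):
            # Function application: parameter names pick the input fields,
            # keys of the returned dict become the output fields.
            names = inspect.signature(step).parameters
            outputs = step(**{name: self.fields[name] for name in names})
        elif isinstance(step, Flow):
            outputs = step.fields
        else:
            # Value insertion from any dict-like object.
            outputs = dict(step)
        return Flow(**{**self.fields, **outputs})


result = Flow() >> {"x": 3} >> Flow(y=5) >> (lambda x, y: {"r": x ** y})
print(result.fields)  # {'x': 3, 'y': 5, 'r': 243}
```

Each >> returns a new object instead of mutating the left operand, which keeps every intermediate state of the process available.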

Creating an idict is no different from creating an ordinary dict. Optionally, it can be created through the >> operator applied after empty or Ø (uppercase, usually AltGr+Shift+o on most keyboards). [image: img.png]

Function application is done in the same way. The parameter names define the input fields, while the keys in the returned dict define the output fields. [image: img_1.png]

The same holds for anonymous functions. [image: img_5.png]

Finally, the result is only evaluated on request. [image: img_6.png]
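
Evaluation on request can be illustrated without idict. This is a minimal sketch under stated assumptions: LazyDict and power are hypothetical names, and this version forces every queued function on the first access, which may differ from how idict schedules individual fields.

```python
import inspect


class LazyDict:
    """Hypothetical sketch: functions are queued by >> and run on first access."""

    def __init__(self, fields=None, pending=()):
        self._fields = dict(fields or {})
        self._pending = list(pending)

    def __rshift__(self, func):
        # Nothing is computed here; the function is only recorded.
        return LazyDict(self._fields, self._pending + [func])

    def __getitem__(self, key):
        # First access forces the queued functions, in order, and caches results.
        while self._pending:
            func = self._pending.pop(0)
            names = inspect.signature(func).parameters
            self._fields.update(func(**{n: self._fields[n] for n in names}))
        return self._fields[key]


calls = []


def power(x, y):
    calls.append((x, y))  # record each real evaluation
    return {"r": x ** y}


d = LazyDict({"x": 3, "y": 5}) >> power
print(len(calls))          # 0: nothing evaluated yet
print(d["r"])              # 243
print(d["r"], len(calls))  # 243 1: the cached value is reused
```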

Installation

...as a standalone lib

# Set up a virtualenv. 
python3 -m venv venv
source venv/bin/activate

# Install from PyPI...
pip install --upgrade pip
pip install -U idict

# ...or, install from updated source code.
pip install git+https://github.com/davips/idict

...from source

git clone https://github.com/davips/idict
cd idict
poetry install
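
After either route, one way to confirm which version ended up in the active environment is to query the package metadata from Python. This is a small sketch using only the standard library (Python 3.8+); the helper name installed_version is an assumption made for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if idict is installed, otherwise None.
print(installed_version("idict"))
```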

Examples

Merging two idicts

from idict import idict

a = idict(x=3)
print(a)
"""
{
    "x": 3,
    "id": "WB_e55a47230d67db81bcc1aecde8f1b950282cd",
    "ids": {
        "x": "WB_e55a47230d67db81bcc1aecde8f1b950282cd"
    }
}
"""
b = idict(y=5)
print(b)
"""
{
    "y": 5,
    "id": "0U_e2a86ff72e226d5365aea336044f7b4270977",
    "ids": {
        "y": "0U_e2a86ff72e226d5365aea336044f7b4270977"
    }
}
"""
print(a >> b)
"""
{
    "x": 3,
    "y": 5,
    "id": "Uw_1a8f02f49de0195b788bd0fea50125068c67f",
    "ids": {
        "x": "WB_e55a47230d67db81bcc1aecde8f1b950282cd",
        "y": "0U_e2a86ff72e226d5365aea336044f7b4270977",
        "id": "FY_a621a71b3c4ced8c917e5d97b928c065d049a",
        "ids": "j6_0030602f26a251e4c4ac9d691f972f82f27af"
    }
}
"""

Lazily applying functions to idict

from idict import idict

a = idict(x=3)
print(a)
"""
{
    "x": 3,
    "id": "WB_e55a47230d67db81bcc1aecde8f1b950282cd",
    "ids": {
        "x": "WB_e55a47230d67db81bcc1aecde8f1b950282cd"
    }
}
"""
a = a >> idict(y=5) >> {"z": 7} >> (lambda x, y, z: {"r": x ** y // z})
print(a)
"""
{
    "r": "→(x y z)",
    "x": 3,
    "y": 5,
    "id": "7PPMD4boB3QfpAP2LkbAOG7xQEAp9MQBdvkLxU2o",
    "ids": {
        "r": "LG.QFClfparJajayDWaiFiehIAtp9MQBdvkLxU2o",
        "x": "WB_e55a47230d67db81bcc1aecde8f1b950282cd",
        "y": "0U_e2a86ff72e226d5365aea336044f7b4270977",
        "id": "FY_a621a71b3c4ced8c917e5d97b928c065d049a",
        "ids": "j6_0030602f26a251e4c4ac9d691f972f82f27af",
        "z": "nX_da0e3a184cdeb1caf8778e34d26f5fd4cc8c8"
    },
    "z": 7
}
"""
print(a.r)
"""
34
"""
print(a)
"""
{
    "r": 34,
    "x": 3,
    "y": 5,
    "id": "7PPMD4boB3QfpAP2LkbAOG7xQEAp9MQBdvkLxU2o",
    "ids": {
        "r": "LG.QFClfparJajayDWaiFiehIAtp9MQBdvkLxU2o",
        "x": "WB_e55a47230d67db81bcc1aecde8f1b950282cd",
        "y": "0U_e2a86ff72e226d5365aea336044f7b4270977",
        "id": "FY_a621a71b3c4ced8c917e5d97b928c065d049a",
        "ids": "j6_0030602f26a251e4c4ac9d691f972f82f27af",
        "z": "nX_da0e3a184cdeb1caf8778e34d26f5fd4cc8c8"
    },
    "z": 7
}
"""

Parameterized functions and sampling

from random import Random

from idict import Ø, let


# A function provides input fields and, optionally, parameters.
# For instance:
# 'a' is sampled from an arithmetic progression
# 'b' is sampled from a geometric progression
# Here, the syntax for default parameter values is borrowed with a new meaning.
def fun(x, y, a=[-100, -99, -98, ..., 100], b=[0.0001, 0.001, 0.01, ..., 100000000]):
    return {"z": a * x + b * y}


def simplefun(x, y):
    return {"z": x * y}


# Creating an empty idict. Alternatively: d = idict().
d = Ø >> {}
d.show(colored=False)
"""
{
    "id": "0000000000000000000000000000000000000000",
    "ids": {}
}
"""
# Putting some values. Alternatively: d = idict(x=5, y=7).
d["x"] = 5
d["y"] = 7
d.show(colored=False)
"""
{
    "x": 5,
    "y": 7,
    "id": "mP_2d615fd34f97ac906e162c6fc6aedadc4d140",
    "ids": {
        "x": ".T_f0bb8da3062cc75365ae0446044f7b3270977",
        "y": "mX_dc5a686049ceb1caf8778e34d26f5fd4cc8c8"
    }
}
"""
# Parameter values are uniformly sampled.
# 'simplefun' declares no parameters, so nothing is sampled and repeated runs agree.
d1 = d >> simplefun
d1.show(colored=False)
print(d1.z)
"""
{
    "z": "→(x y)",
    "x": 5,
    "y": 7,
    "id": "ZAasLu0lIEqhJyS1s8ML8WGeTnradBnjS7VNt6Mg",
    "ids": {
        "z": "iE6rHiYYwfwOBqa4Luh4XCd-myeadBnjS7VNt6Mg",
        "x": ".T_f0bb8da3062cc75365ae0446044f7b3270977",
        "y": "mX_dc5a686049ceb1caf8778e34d26f5fd4cc8c8"
    }
}
35
"""
d2 = d >> simplefun
d2.show(colored=False)
print(d2.z)
"""
{
    "z": "→(x y)",
    "x": 5,
    "y": 7,
    "id": "ZAasLu0lIEqhJyS1s8ML8WGeTnradBnjS7VNt6Mg",
    "ids": {
        "z": "iE6rHiYYwfwOBqa4Luh4XCd-myeadBnjS7VNt6Mg",
        "x": ".T_f0bb8da3062cc75365ae0446044f7b3270977",
        "y": "mX_dc5a686049ceb1caf8778e34d26f5fd4cc8c8"
    }
}
35
"""
# Parameter values can also be manually set.
e = d >> let(fun, a=5, b=10)
print(e.z)
"""
95
"""
# Not all parameters need to be set.
e = d >> let(simplefun, a=5)
print(e.z)
"""
35
"""
# Each run will be a different sample for the missing parameters.
e = e >> let(simplefun, a=5)
print(e.z)
"""
35
"""
# We can define the initial state of the random sampler.
# It takes effect from its position onwards in the expression.
e = d >> Random(0) >> let(fun, a=5)
print(e.z)
"""
725.0
"""
# All runs will yield the same result,
# if starting from the same random number generator seed.
e = e >> Random(0) >> let(fun, a=[555, 777])
print("Let 'a' be a list:", e.z)
"""
Let 'a' be a list: 700003885.0
"""
e = e >> Random(0) >> let(fun, a=[5, 25, 125, ..., 10000])
print("Let 'a' be a geometric progression:", e.z)
"""
Let 'a' be a geometric progression: 700003125.0
"""
# Reproducible different runs are achievable by using a single random number generator.
rnd = Random(0)
e = d >> rnd >> let(fun, a=5)
print(e.z)
e = d >> rnd >> let(fun, a=5)  # Alternative syntax.
print(e.z)
"""
725.0
700000025.0
"""
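
The list-with-ellipsis defaults above can be read as compact descriptions of a progression. The sketch below shows one plausible way to expand and sample such a list. It is an assumption about the idea, not idict's actual code: the helper names expand and sample are hypothetical, and only the five-item shape [a, b, c, ..., end] is handled.

```python
import math
from random import Random


def expand(spec):
    """Expand [a, b, c, ..., end] into an arithmetic or geometric progression."""
    if Ellipsis not in spec:
        return list(spec)
    first, second, third, _, end = spec  # assumes exactly this five-item shape
    if math.isclose(second - first, third - second):
        step = second - first
        count = math.floor((end - first) / step + 1e-9) + 1
        return [first + i * step for i in range(count)]
    if math.isclose(second / first, third / second):
        ratio = second / first
        count = math.floor(math.log(end / first, ratio) + 1e-9) + 1
        return [first * ratio ** i for i in range(count)]
    raise ValueError("neither arithmetic nor geometric")


def sample(spec, rnd):
    """Draw one value uniformly from the expanded list."""
    return rnd.choice(expand(spec))


print(expand([1, 2, 3, ..., 10]))        # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(expand([5, 25, 125, ..., 10000]))  # [5.0, 25.0, 125.0, 625.0, 3125.0]
print(sample([555, 777], Random(0)) == sample([555, 777], Random(0)))  # True
```

The geometric case stops at the last term that does not exceed the end value, so an end value that is not itself a term of the progression is tolerated.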

Composition of sets of functions

from random import Random

from idict import Ø


# A multistep process can be defined without applying its functions


def g(x, y, a=[1, 2, 3, ..., 10], b=[0.00001, 0.0001, 0.001, ..., 100000]):
    return {"z": a * x + b * y}


def h(z, c=[1, 2, 3]):
    return {"z": c * z}


# In the 'idict' framework 'data is function',
# so the alias Ø represents the 'empty data object' and the 'reflexive function' at the same time.
# In other words: 'inserting nothing' has the same effect as 'doing nothing'.
fun = Ø >> g >> h  # 'empty' or 'Ø' enable the cartesian product of the subsequent sets of functions within the expression.
print(fun)
"""
«<function g at 0x7fa86c26d280> × <function h at 0x7fa86c7ceb80>»
"""
# Before a function is applied to a dict-like, the function free parameters remain unsampled.
# The result is an ordered set of composite functions.
d = {"x": 5, "y": 7} >> (Random(0) >> fun)
print(d)
"""
{
    "x": 5,
    "y": 7,
    "z": "→(c z→(a b x y))"
}
"""
print(d.z)
"""
105.0
"""
d = {"x": 5, "y": 7} >> (Random(0) >> fun)
print(d.z)
"""
105.0
"""
# Reproducible different runs by passing a stateful random number generator.
rnd = Random(0)
e = d >> rnd >> fun
print(e.z)
"""
105.0
"""
e = d >> rnd >> fun
print(e.z)
"""
14050.0
"""
# Repeating the same results.
rnd = Random(0)
e = d >> rnd >> fun
print(e.z)
"""
105.0
"""
e = d >> rnd >> fun
print(e.z)
"""
14050.0
"""
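
The reproducibility pattern above relies only on how a seeded generator behaves, so it can be checked with the standard library alone. In this sketch the run helper is hypothetical, and g and h take their parameters explicitly; the candidate sets mirror the defaults above, but the sampled values are not claimed to match idict's.

```python
from random import Random


def g(x, y, a, b):
    return {"z": a * x + b * y}


def h(z, c):
    return {"z": c * z}


def run(rnd, x=5, y=7):
    """Sample a, b, c from the candidate sets, then apply g followed by h."""
    a = rnd.choice(range(1, 11))
    b = rnd.choice([10.0 ** k for k in range(-5, 6)])
    c = rnd.choice([1, 2, 3])
    return h(g(x, y, a, b)["z"], c)["z"]


rnd = Random(0)
first, second = run(rnd), run(rnd)  # one generator: successive draws

rnd = Random(0)
assert (run(rnd), run(rnd)) == (first, second)  # same seed: same sequence
assert run(Random(0)) == first                  # fresh seed: first draw again
```

Passing a fresh Random(0) each time repeats the first sample, while sharing one generator advances through a fixed sequence of samples.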

Concept

An idict is like an ordinary Python dict, but lazy and with extra functionality. It is a mapping between string keys, called fields, and any serializable (picklable) object. The idict id (identifier) and the field ids are also part of the mapping.

The user can provide a unique identifier (hosh) for each function or value object. Otherwise, the identifier is calculated by blake3 hashing of the data content or of the function bytecode. For this reason, such functions should be simple, i.e., have minimal external dependencies. This avoids the unfortunate situation where two functions with identical local code perform different calculations because they call external code that implements different algorithms under the same name.
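
The caveat about external dependencies can be reproduced with a short experiment. This is a sketch under stated assumptions, not idict's hashing scheme: blake2b from the standard library stands in for blake3, and value_id, function_id, and make are hypothetical helpers.

```python
import hashlib
import pickle


def value_id(value):
    """Hash the pickled content of a value (blake2b stands in for blake3)."""
    return hashlib.blake2b(pickle.dumps(value, protocol=4), digest_size=20).hexdigest()


def function_id(func):
    """Hash a function's bytecode and constants; callees are not inspected."""
    code = func.__code__
    payload = code.co_code + repr(code.co_consts).encode()
    return hashlib.blake2b(payload, digest_size=20).hexdigest()


def make(helper):
    def f(x):
        return helper(x)  # behavior depends on code outside f
    return f


f1 = make(abs)
f2 = make(lambda v: v * 2)

print(value_id(3) == value_id(3))          # True
print(function_id(f1) == function_id(f2))  # True
print(f1(-2), f2(-2))                      # 2 -4
```

Both functions share the same local code and therefore the same identifier, yet they compute different results, which is exactly the situation the paragraph above warns about.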
