Those missing flavors of dict

Overview

Those dict flavours that you have probably thought of at some point. Zero dependencies.

Installation

pip install those-dicts

TL;DR

Below are examples of how those_dicts behave under normal dict-style usage. Essentially, they are dicts with a twist.

from those_dicts import BatchedDict, GraphDict, TwoWayDict, OOMDict

my_batched_dict = BatchedDict(nested=True)
client1 = dict(name='Lieutenant', surname='Kowalski',
               address=dict(street='Funny Avenue', city='Elsewhere'))
client2 = dict(name='Thomas', surname='Dison',
               address=dict(street='Lightbulb St.', city='Elsewhere'))
my_batched_dict.update(client1)
my_batched_dict.update(client2)
# >>> my_batched_dict['name']
# ['Lieutenant', 'Thomas']
# >>> my_batched_dict['address']
# {'street': ['Funny Avenue', 'Lightbulb St.'], 'city': ['Elsewhere', 'Elsewhere']}

my_graph_dict = GraphDict(Warsaw='Katowice', Katowice='Gdansk', Gdansk='Warsaw')
flights_to_germany = dict(Warsaw='Berlin', Katowice='Frankfurt')
flights_from_germany = dict(Berlin='Warsaw', Frankfurt='Katowice')
my_graph_dict.update(flights_to_germany)
my_graph_dict.update(flights_from_germany)
# >>> my_graph_dict['Warsaw']
# {'Berlin', 'Katowice'}
# >>> my_graph_dict['Berlin']
# 'Warsaw'

my_twoway_dict = TwoWayDict({('Eric', 'Doe'): ('Ella', 'Moon')})
# >>> my_twoway_dict[('Ella', 'Moon')] == ('Eric', 'Doe')
# True
# >>> my_twoway_dict[('Eric', 'Doe')] == ('Ella', 'Moon')
# True
new_marriage_after_divorce = {('Ella', 'Moon'): ('Benny', 'Hills')}
my_twoway_dict.update(new_marriage_after_divorce)
# >>> my_twoway_dict[('Ella', 'Moon')] == ('Benny', 'Hills')
# True
# >>> my_twoway_dict[('Eric', 'Doe')] is None
# True

from some_lib import ObjWithDefinedSize

my_oom_dict = OOMDict(max_ram_entries=10)
my_oom_dict.update({str(k): ObjWithDefinedSize(mb_size=k) for k in range(1000)})
# first 10 objects are in RAM, the rest is on the disk

del my_oom_dict # clears the disk also

Getting Started

BatchedDict

When you want to aggregate multiple dicts:

from those_dicts import BatchedDict

my_batched_dict = BatchedDict()
my_batched_nested = BatchedDict(nested=True)
client1 = dict(name='Lieutenant', surname='Kowalski',
               address=dict(street='Funny Avenue', city='Elsewhere'))
client2 = dict(name='Thomas', surname='Dison',
               address=dict(street='Lightbulb St.', city='Elsewhere'))
my_batched_dict.update(client1)
my_batched_dict.update(client2)
my_batched_nested.update(client1)
my_batched_nested.update(client2)
# or equivalently, because it is a dict
my_batched_dict = BatchedDict(name='Lieutenant', surname='Kowalski',
                              address=dict(street='Funny Avenue', city='Elsewhere'))
my_batched_nested = BatchedDict(nested=True, name='Lieutenant', surname='Kowalski',
                                address=dict(street='Funny Avenue', city='Elsewhere'))
my_batched_dict.update(client2)
my_batched_nested.update(client2)
# >>> my_batched_dict 
# {'name': ['Lieutenant', 'Thomas'], 'surname': ['Kowalski', 'Dison'], 'address': [{'street': 'Funny Avenue', 'city': 'Elsewhere'}, {'street': 'Lightbulb St.', 'city': 'Elsewhere'}]}
# >>> my_batched_nested
# {'name': ['Lieutenant', 'Thomas'], 'surname': ['Kowalski', 'Dison'], 'address': {'street': ['Funny Avenue', 'Lightbulb St.'], 'city': ['Elsewhere', 'Elsewhere']}}

# straightforward aggregation use case
my_batched_dict = BatchedDict()
my_batched_dict['john_properties'] = 'car'
my_batched_dict['john_properties'] = 'bike'
my_batched_dict['john_properties'] = 'grill'
my_batched_dict['john_properties'] = 'gaming pc'
# >>> my_batched_dict['john_properties']
# ['car', 'bike', 'grill', 'gaming pc']
# >>> my_batched_dict['john_properties'].remove('grill')
# >>> my_batched_dict['john_properties']
# ['car', 'bike', 'gaming pc']

my_batched_dict['ella_properties'] = 'house'
my_batched_dict['ella_properties'] = 'garage'
# >>> my_batched_dict['ella_properties']
# ['house', 'garage']

Essentially it is a dict, so usage is intuitive.
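
The append-on-assignment behavior shown above can be approximated with a small dict subclass. This is only a sketch of the idea, not the library's implementation: the class name AppendingDict is hypothetical, nesting is not handled, and wrapping a first value in a one-element list is an assumption.

```python
class AppendingDict(dict):
    """Hypothetical stand-in: assigning to an existing key appends instead of overwriting."""

    def __setitem__(self, key, value):
        if key in self:
            # Key seen before: extend the existing batch.
            super().__getitem__(key).append(value)
        else:
            # First assignment: start a new one-element batch (an assumption).
            super().__setitem__(key, [value])

    def update(self, other=(), **kwargs):
        # dict.update does not call an overridden __setitem__, so route items through it.
        for key, value in dict(other, **kwargs).items():
            self[key] = value


batched = AppendingDict()
batched["john_properties"] = "car"
batched["john_properties"] = "bike"
batched.update({"john_properties": "grill", "ella_properties": "house"})
print(batched)
# {'john_properties': ['car', 'bike', 'grill'], 'ella_properties': ['house']}
```

Overriding update() matters here: the built-in dict.update writes entries directly and would silently overwrite instead of appending.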

GraphDict

When you want to map one hashable to another hashable that may itself be a key linking further.

from dataclasses import dataclass
from those_dicts import GraphDict

@dataclass(frozen=True)
class Building:
    coordinates: tuple[float, float]
    address: str
    elevation: float
    purpose: str
    history: str

# some big, hashable data structure    
@dataclass(frozen=True)
class City:
    name: str
    country: str
    area: float
    population: int
    top_10_buildings: frozenset[Building]


warsaw = City('Warsaw', ...)
katowice = ...  # you get the point
gdansk = ...
berlin = ...
frankfurt = ...
my_graph_dict = GraphDict({warsaw: katowice, katowice: gdansk, gdansk: warsaw})
flights_to_germany = {warsaw: berlin, katowice: frankfurt}
flights_from_germany = {berlin: warsaw, frankfurt: katowice}
my_graph_dict.update(flights_to_germany)
my_graph_dict.update(flights_from_germany)
# >>> my_graph_dict[warsaw]
# {berlin, katowice}
# >>> my_graph_dict[berlin]
# warsaw
# >>> my_graph_dict
# {katowice: {2, 4}, warsaw: {0, 3}, gdansk: {1}, berlin: {1}, frankfurt: {0}}

GraphDict stores each hashable object only once: every object is a key, and values are just index-wise references to other keys. This can save a lot of memory when the stored objects are big.
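
To make the "values are index-wise references" idea concrete, here is a minimal sketch that stores each node once and keeps links as sets of positions. The function build_index_graph and its internal layout are assumptions for illustration; the real GraphDict internals may differ.

```python
def build_index_graph(edges):
    """Store every node once; represent links as sets of positions in the node list."""
    nodes = []     # each distinct node object is held exactly once
    position = {}  # node -> its index in `nodes`
    links = {}     # index of source node -> set of indices of target nodes

    def index_of(node):
        # Register the node on first sight, then return its stable index.
        if node not in position:
            position[node] = len(nodes)
            nodes.append(node)
            links[position[node]] = set()
        return position[node]

    for source, target in edges:
        links[index_of(source)].add(index_of(target))
    return nodes, links


edges = [("Warsaw", "Katowice"), ("Katowice", "Gdansk"), ("Gdansk", "Warsaw"),
         ("Warsaw", "Berlin"), ("Katowice", "Frankfurt"),
         ("Berlin", "Warsaw"), ("Frankfurt", "Katowice")]
nodes, links = build_index_graph(edges)
print(nodes)  # ['Warsaw', 'Katowice', 'Gdansk', 'Berlin', 'Frankfurt']
print(links)  # {0: {1, 3}, 1: {2, 4}, 2: {0}, 3: {0}, 4: {1}}
```

However many links point to a node, only small integers are duplicated; the node object itself is referenced from one place.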

GraphDict is compatible with dict, with a few twists listed below:

  • .pop() is computationally expensive because it forces reindexing of all the values. Prefer del instead.
  • del graph_dict_instance[some_key] removes all links from and to the given key without removing the key entry itself. Leaving the (disconnected) key entry in place keeps unrelated indices in values as they are (no reindexing).
  • .popitem() is computationally expensive because it forces reindexing of all the values, although less so than .pop() because it returns the last key-value pair.
  • .keys() returns a mapping proxy (like dict), but a key is defined here as a node that has one or more corresponding values (an outgoing connection).
  • .values() returns a mapping proxy (like dict), but a value is defined here as a node that has a corresponding key (an incoming connection).
  • .items() returns a mapping proxy (like dict), but an item is defined here as a pair of nodes (in key-value manner) for every key that is in either keys() or values().
  • .setdefault() raises NotImplementedError; use .get(key, default) instead.
  • .make_loops(keys: Optional[Iterable] = None) is new compared to dict: it connects each provided key to itself, or does so for all keys.
  • .delete_link(key, value) removes the directed connection from key to value if it exists. It does not affect the existence of keys.
  • .disconnect(key, value) removes the connection from key to value and from value to key if they exist. It does not affect the existence of keys.
  • .update() should be used to update a GraphDict the way you would update a regular dict.
  • .merge() should be used to update a GraphDict with another GraphDict.
  • .reindex() removes entries that are totally disconnected and updates the indices stored in values for all entries (because deletion changes the order of keys).
  • .get_dict() returns a regular dict with only the meaningful keys (those whose value is not None).
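
The difference between .delete_link() and .disconnect() described in the list above can be illustrated on a plain adjacency mapping. The two free functions below are hypothetical stand-ins written against a dict of sets; they mirror the documented semantics but are not the library's code.

```python
def delete_link(graph, key, value):
    """Remove only the directed edge key -> value; both nodes stay in the graph."""
    graph.get(key, set()).discard(value)


def disconnect(graph, key, value):
    """Remove the edges key -> value and value -> key; both nodes stay in the graph."""
    delete_link(graph, key, value)
    delete_link(graph, value, key)


graph = {"Warsaw": {"Berlin", "Katowice"}, "Berlin": {"Warsaw"}, "Katowice": {"Warsaw"}}
delete_link(graph, "Warsaw", "Berlin")   # Berlin -> Warsaw survives
disconnect(graph, "Warsaw", "Katowice")  # both directions are removed
print(graph)
# {'Warsaw': set(), 'Berlin': {'Warsaw'}, 'Katowice': set()}
```

In both cases the keys remain present, which is why neither operation would require reindexing.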

TwoWayDict

It is a subclass of GraphDict that is restricted to exclusive two-way connections.
You can access a value through its key and the other way around.

Compared to GraphDict, .merge() and .make_loops() raise NotImplementedError, as they don't make sense for this class.
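
The exclusivity rule can be sketched with an ordinary dict that stores each pair in both directions and breaks old pairs before creating a new one. The class PairDict and its link() method are hypothetical names for illustration. Returning None for unpaired keys mirrors the TL;DR example, but whether the real class also returns None for keys it has never seen is an assumption.

```python
class PairDict:
    """Hypothetical stand-in: every key has at most one partner, reachable from either side."""

    def __init__(self):
        self._partner = {}

    def link(self, a, b):
        # Break any existing pairs first so that links stay exclusive.
        for node in (a, b):
            old = self._partner.pop(node, None)
            if old is not None:
                self._partner.pop(old, None)
        self._partner[a] = b
        self._partner[b] = a

    def __getitem__(self, key):
        # Unpaired (or unknown) keys yield None.
        return self._partner.get(key)


pairs = PairDict()
pairs.link("Eric", "Ella")
pairs.link("Ella", "Benny")  # Ella re-pairs, so Eric becomes unpaired
print(pairs["Ella"], pairs["Benny"], pairs["Eric"])
# Benny Ella None
```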

OOMDict

When you want to limit impact on RAM.

from those_dicts import OOMDict

my_oom_dict = OOMDict(max_ram_entries=10000)  # the default

for name, big_obj in big_obj_generator(num_obj=1000000):
    my_oom_dict[name] = big_obj

# everything above 10000 objects will be stored on the disk

Even if storage is split between RAM and disk, it is just a dict, so use it as usual.
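
One way such a RAM/disk split could work is sketched below using the standard-library shelve module. The class SpillDict, its close() method, and the use of shelve are all assumptions made for illustration; the source does not say how OOMDict stores its overflow. Note that shelve requires string keys.

```python
import os
import shelve
import tempfile


class SpillDict:
    """Hypothetical stand-in: the first `max_ram_entries` keys live in RAM, the rest in a shelve file."""

    def __init__(self, max_ram_entries=10000):
        self.max_ram_entries = max_ram_entries
        self._ram = {}
        self._tmpdir = tempfile.TemporaryDirectory()
        # shelve pickles values to disk; keys must be strings.
        self._disk = shelve.open(os.path.join(self._tmpdir.name, "spill"))

    def __setitem__(self, key, value):
        in_ram = key in self._ram
        has_room = key not in self._disk and len(self._ram) < self.max_ram_entries
        if in_ram or has_room:
            self._ram[key] = value
        else:
            self._disk[key] = value

    def __getitem__(self, key):
        if key in self._ram:
            return self._ram[key]
        return self._disk[key]  # raises KeyError if the key is absent everywhere

    def __len__(self):
        return len(self._ram) + len(self._disk)

    def close(self):
        # Close the shelf and delete its backing files.
        self._disk.close()
        self._tmpdir.cleanup()


store = SpillDict(max_ram_entries=2)
for k in range(5):
    store[str(k)] = [k] * 3
first, last, total = store["0"], store["4"], len(store)
ram_keys = sorted(store._ram)
store.close()
print(first, last, total, ram_keys)
# [0, 0, 0] [4, 4, 4] 5 ['0', '1']
```

Callers read and write through the same item syntax regardless of where an entry lives, which is the property the section describes.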

