Those missing flavors of dict
Project description
Overview
Those dict flavours that you have probably thought of at some point. Zero dependencies.
Installation
pip install those-dicts
TL;DR
Below you may find examples of behavior under normal dist-style usage of those_dicts. Essentially those are dicts but with a twist.
from those_dicts import BatchedDict, GraphDict, TwoWayDict, OOMDict
my_batched_dict = BatchedDict(nested=True)
client1 = dict(name='Lieutenant', surname='Kowalski',
address=dict(street='Funny Avenue', city='Elsewhere'))
client2 = dict(name='Thomas', surname='Dison',
address=dict(street='Lightbulb St.', city='Elsewhere'))
my_batched_dict.update(client1)
my_batched_dict.update(client2)
# >>> my_batched_dict['name']
# ['Lieutenant', 'Thomas']
# >>> my_batched_dict['address']
# {'street': ['Funny Avenue', 'Lightbulb St.'], 'city': ['Elsewhere', 'Elsewhere']}
my_graph_dict = GraphDict(Warsaw='Katowice', Katowice='Gdansk', Gdansk='Warsaw')
flights_to_germany = dict(Warsaw='Berlin', Katowice='Frankfurt')
flights_from_germany = dict(Berlin='Warsaw', Frankfurt='Katowice')
my_graph_dict.update(flights_to_germany)
my_graph_dict.update(flights_from_germany)
# >>> my_graph_dict['Warsaw']
# {'Berlin', 'Katowice'}
# >>> my_graph_dict['Berlin']
# 'Warsaw'
my_twoway_dict = TwoWayDict({('Eric', 'Doe'): ('Ella', 'Moon')})
# >>> my_twoway_dict[('Ella', 'Moon')] == ('Eric', 'Doe')
# True
# >>> my_twoway_dict[('Eric', 'Doe')] == ('Ella', 'Moon')
# True
new_marriage_after_divorce = {('Ella', 'Moon'): ('Benny', 'Hills')}
my_twoway_dict.update(new_marriage_after_divorce)
# >>> my_twoway_dict[('Ella', 'Moon')] == ('Benny', 'Hills')
# True
# >>> my_twoway_dict[('Eric', 'Doe')] is None
# True
from some_lib import ObjWithDefinedSize
my_oom_dict = OOMDict(max_ram_entries=10)
my_oom_dict.update([str(k): ObjWithDefinedSize(mb_size=k) for k in range(1000)])
# first 10 objects are in RAM, the rest is on the disk
del my_oom_dict # clears the disk also
Getting Started
BatchedDict
When you want to aggregate multiple dicts:
from those_dicts import BatchedDict
my_batched_dict = BatchedDict()
my_batched_nested = BatchedDict(nested=True)
client1 = dict(name='Lieutenant', surname='Kowalski',
address=dict(street='Funny Avenue', city='Elsewhere'))
client2 = dict(name='Thomas', surname='Dison',
address=dict(street='Lightbulb St.', city='Elsewhere'))
my_batched_dict.update(client1)
my_batched_dict.update(client2)
my_batched_nested.update(client1)
my_batched_nested.update(client2)
# or equivalently, because it is a dict
my_batched_dict = BatchedDict(name='Lieutenant', surname='Kowalski',
address=dict(street='Funny Avenue', city='Elsewhere'))
my_batched_nested = BatchedDict(nested=True, name='Lieutenant', surname='Kowalski',
address=dict(street='Funny Avenue', city='Elsewhere'))
my_batched_dict.update(client2)
my_batched_nested.update(client2)
# >>> my_batched_dict
# {'name': ['Lieutenant', 'Thomas'], 'surname': ['Kowalski', 'Dison'], 'address': [{'street': 'Funny Avenue', 'city': 'Elsewhere'}, {'street': 'Lightbulb St.', 'city': 'Elsewhere'}]}
# >>> my_batched_nested
# {'name': ['Lieutenant', 'Thomas'], 'surname': ['Kowalski', 'Dison'], 'address': {'street': ['Funny Avenue', 'Lightbulb St.'], 'city': ['Elsewhere', 'Elsewhere']}}
# straightforward aggregation use case
my_batched_dict = BatchedDict()
my_batched_dict['john_properties'] = 'car'
my_batched_dict['john_properties'] = 'bike'
my_batched_dict['john_properties'] = 'grill'
my_batched_dict['john_properties'] = 'gaming pc'
# >>> my_batched_dict['john_properties']
# ['car', 'bike', 'grill', 'gaming pc']
# >>> my_batched_dict['john_properties'].remove('grill')
# >>> my_batched_dict['john_properties']
# ['car', 'bike', 'gaming pc']
my_batched_dict['ella_properties'] = 'house'
my_batched_dict['ella_properties'] = 'garage'
# >>> my_batched_dict['ella_properties']
# ['house', 'garage']
Essentially it is a dict, so usage is intuitive.
GraphDict
When you want to create a mapping from one hashable to another hashable that may traverse further.
from dataclasses import dataclass
from those_dicts import GraphDict
@dataclass(frozen=True)
class Building:
coordinates: tuple[float, float]
address: str
elevation: float
purpose: str
history: str
# some big, hashable data structure
@dataclass(frozen=True)
class City:
name: str
country: str
area: float
population: int
top_10_buildings: frozenset[Building]
warsaw = City('Warsaw', ...)
katowice = ... # you get the point
gdansk = ...
berlin = ...
frankfurt = ...
my_graph_dict = GraphDict({warsaw: katowice, katowice: gdansk, gdansk: warsaw})
flights_to_germany = {warsaw: berlin, katowice: frankfurt}
flights_from_germany = {berlin: warsaw, frankfurt: katowice}
my_graph_dict.update(flights_to_germany)
my_graph_dict.update(flights_from_germany)
# >>> my_graph_dict[warsaw]
# {berlin, katowice}
# >>> my_graph_dict[berlin]
# warsaw
# >>> my_graph_dict
# {katowice: {2, 4}, warsaw: {0, 3}, gdansk: {1}, berlin: {1}, frankfurt: {0}}
GraphDict stores each hashable object only once - here everything is a key. Values are just index-wise references. This means a lot of memory savings for storing big objects.
GraphDict is compatible with dict, but with a twist(s) enlisted below:
- .pop() method is computationally expensive, because forces reindexing all the values. Better to use del instead.
- del graph_dict_instance[some_key] removes all links from and to given key, without removing key entry itself. Leaving (disconnected) key entry allows to keep unrelated indices in values as is (no reindexing).
- .popitem() method is computationally expensive, because forces reindexing all the values, although not so expensive as .pop() because it returns the last key-value pair.
- .keys() method returns a mapping proxy (like dict), but the definition of key here is: a node that has a corresponding value(s) (outgoing connection).
- .values() method returns a mapping proxy (like dict), but the definition of value here is: a node that has a corresponding key (incoming connection).
- .items() method returns a mapping proxy (like dict), but the definition of item here is: a pair of nodes (key-value manner) for every key that is either in keys() or in values().
- .setdefault() raises NotImplementedError - use .get(key, default) instead.
- .make_loops(keys: Optional[Iterable] = None) is new compared to dict - it adds connections to itself for every key provided or to all keys.
- .delete_link(key, value) removes directed connection from key to value if exists. Do not influence existence of keys.
- .disconnect(key, value) removes connection from key to value and from value to key if exist. Do not influence existence of keys.
- .update() shall be used to update GraphDict like you would update regular dict.
- .merge() shall be used to update GraphDict with another GraphDict.
- .reindex() removes entries that are totally disconnected and updates indices stored in values for all entries (because deletion changes the order of keys).
- .get_dict() returns regular dict with meaningful keys (that have other value than None).
TwoWayDict
It is a subclass of GraphDict that is restricted to have only exclusive two-way connections.
You can access value through its key and other way around.
Compared to GraphDict, .merge() and .make_loops() are raising NotImplementedError as those doesn't make sense for this class.
OOMDict
When you want to limit impact on RAM.
from those_dicts import OOMDict
my_oom_dict = OOMDict(max_ram_entries=10000) # the default
for name, big_obj in big_obj_generator(num_obj=1000000):
my_oom_dict[name] = big_obj
# everything above 10000 objects will be stored on the disk
Even if storage is split between RAM and disk, it is just a dict, so use it as usual.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file those_dicts-0.1.1.tar.gz
.
File metadata
- Download URL: those_dicts-0.1.1.tar.gz
- Upload date:
- Size: 9.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.11.9 Linux/6.9.3-76060903-generic
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0233ed473a01fe18c7c4298a3f4cadd2ba645016dddfd6d924be1bd86ef440da |
|
MD5 | 9305e26cc9410aac99bc886e51f0380e |
|
BLAKE2b-256 | c794a364a53189af60cb26f54552b550cd9d9c4e23827acca49598529704df74 |
File details
Details for the file those_dicts-0.1.1-py3-none-any.whl
.
File metadata
- Download URL: those_dicts-0.1.1-py3-none-any.whl
- Upload date:
- Size: 7.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.11.9 Linux/6.9.3-76060903-generic
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 6cf699e9efaa39811e48d0bfffc17be25c7c3c854f021f6008fc1b8547243a29 |
|
MD5 | a9d2a2099c55d5f1b9efa330b2f5c371 |
|
BLAKE2b-256 | ec29ab092cc37b9a010e9f86712267bb4661e344c176bdcb85597376c7980546 |