Remote dictionary backed by cloud services
REMOTE DICT
RemoteDict is a Python library for hosting a dictionary in a cloud backend. Currently, only Azure Blob Storage is supported.
USAGE - AZURE
Get a CONNECTION_STRING from your Azure Blob Storage account, then import AzureDictionary as follows:
>>> from remotedict.azure import AzureDictionary
>>> remote_dict = AzureDictionary(CONNECTION_STRING, container_name="mycontainer", folder_name="myfolder")
>>> remote_dict
Azure Blob Storage. Container: "mycontainer"; Folder: "myfolder"; Num elements: 0
>>> remote_dict["foo"] = "bar"
>>> remote_dict["foo"]
bar
remote_dict is an object that behaves like a Python dictionary. It also provides additional functionality for dealing with large data and concurrency.
HOW IT WORKS
Once remote_dict is instantiated, it can be used to store any kind of data:
import numpy as np
import pandas as pd
remote_dict['example'] = "hello"
remote_dict['example2'] = b"binary data of any size"
remote_dict['example3'] = 42
remote_dict['example4'] = {"this": {"is": b"a subdictionary", "that": "holds", "any": True, "data": 42}}
remote_dict['example5'] = ["even", "lists", "or", "numpy", "and", "pandas"]
remote_dict['example6'] = np.random.randn(10, 3, 1)
remote_dict['example7'] = pd.DataFrame([1,2,3,4])
Several assignments or reads can be grouped into a single atomic operation by passing a list of keys:
remote_dict[[
'example',
'example2',
'example3'
]] = "hello", b"binary data", 42
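The list-of-keys syntax above can be illustrated with a small in-memory stand-in. This is only a sketch of the indexing behaviour: `BatchDict` is a hypothetical class, not part of RemoteDict, it says nothing about how the library makes the operation atomic in the backend, and returning a list for a multi-key read is an assumption.

```python
class BatchDict(dict):
    """Hypothetical in-memory stand-in; not RemoteDict's implementation."""

    def __setitem__(self, key, value):
        # A list key means "assign several entries at once".
        if isinstance(key, list):
            values = list(value)
            if len(values) != len(key):
                raise ValueError("number of keys and values must match")
            for k, v in zip(key, values):
                dict.__setitem__(self, k, v)
        else:
            dict.__setitem__(self, key, value)

    def __getitem__(self, key):
        # A list key means "read several entries at once" (assumed to return a list).
        if isinstance(key, list):
            return [dict.__getitem__(self, k) for k in key]
        return dict.__getitem__(self, key)


batch = BatchDict()
batch[["example", "example2", "example3"]] = "hello", b"binary data", 42
first_two = batch[["example", "example2"]]
```

A plain dict would reject a list key as unhashable; intercepting the list in `__setitem__` and `__getitem__` before it reaches the underlying dict is what makes the syntax possible.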
Each entry is stored as an LZ4-compressed binary in its own file, inside the container and folder that were specified when remote_dict was instantiated.
RemoteDict imposes no soft limit on the size of a value.
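The idea of one compressed file per entry can be sketched as follows. Several things here are assumptions made for illustration: `pickle` as the serializer, the `folder/key` path layout, and the helper names `encode_value`, `decode_value`, and `blob_path`. The standard-library `zlib` is used as a stand-in for LZ4 so the sketch runs without extra dependencies; it is a different compression format from the one RemoteDict uses.

```python
import pickle
import zlib


def encode_value(value) -> bytes:
    """Serialize a Python object and compress it (zlib stands in for LZ4)."""
    return zlib.compress(pickle.dumps(value))


def decode_value(blob: bytes):
    """Reverse of encode_value: decompress, then deserialize."""
    return pickle.loads(zlib.decompress(blob))


def blob_path(folder_name: str, key: str) -> str:
    """Assumed layout: one blob per key under the configured folder."""
    return f"{folder_name}/{key}"


# A plain dict plays the role of the cloud container in this sketch.
fake_container = {}
value = {"this": {"is": b"a subdictionary", "any": True, "data": 42}}
fake_container[blob_path("myfolder", "example4")] = encode_value(value)
restored = decode_value(fake_container["myfolder/example4"])
```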
INDEXES
RemoteDict maintains an index that makes it possible to retrieve all the keys without iterating over the elements in the backend.
The index is stored in a single file, and cloud leases on that file ensure that concurrent access cannot corrupt it.
For this reason, the "Index" folder in the cloud container is reserved and managed automatically by RemoteDict.
Rather than downloading the index file every time the index is checked, the class compares only the file's etag, which is faster than a download. If the remote etag does not match the local one, the index is downloaded again, so the local copy is always up to date.
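The etag check described above can be sketched with a fake backend. `FakeIndexBlob` and `CachedIndex` are hypothetical names invented for this example, the index is modelled as a plain dict rather than a pd.Series, and the etag is a simple counter; the real library talks to Azure Blob Storage instead.

```python
class FakeIndexBlob:
    """Hypothetical stand-in for the remote index file."""

    def __init__(self):
        self._content = {}
        self._etag = 0
        self.downloads = 0  # counts full downloads, for demonstration only

    def write(self, content):
        self._content = dict(content)
        self._etag += 1  # every write changes the etag

    def get_etag(self):
        return self._etag  # cheap metadata check

    def download(self):
        self.downloads += 1  # expensive full download
        return dict(self._content), self._etag


class CachedIndex:
    """Keeps a local copy of the index and refreshes it only when the etag changes."""

    def __init__(self, blob):
        self._blob = blob
        self._local_etag = None
        self._local_index = {}

    @property
    def index(self):
        if self._blob.get_etag() != self._local_etag:
            self._local_index, self._local_etag = self._blob.download()
        return self._local_index


blob = FakeIndexBlob()
blob.write({"example": "myfolder/example"})
cached = CachedIndex(blob)
first_read = cached.index
second_read = cached.index  # etag unchanged, no new download
downloads_before_update = blob.downloads
blob.write({"example": "myfolder/example", "example2": "myfolder/example2"})
third_read = cached.index  # etag changed, index is downloaded again
```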
The index is a pd.Series object that can be accessed as follows:
>>> remote_dict.index
example example/example
example2 example/example2
example3 example/example3
example4 example/example4
example5 example/example5
example6 example/example6
example7 example/example7
Name: name, dtype: object
CONCURRENT ACCESS
Concurrent reads are inherently allowed; concurrent writes are more complex. RemoteDict handles write concurrency by letting you acquire leases on individual elements.
Example to lock an element:
>>> remote_dict.lock_item("example", duration=15) # duration in seconds
Once the element is locked, no other remote_dict instance (anywhere, even on a different machine) can lock or write to this item until it is manually unlocked or the duration expires.
If another instance tries to lock it, that instance waits for the lease to be released (the default behaviour) or raises an exception if wait=False.
While the lease is held, the item can only be written by the object that holds it.
To unlock the element:
>>> remote_dict.unlock_item("example")
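The lease semantics described in this section can be modelled in memory. This is a sketch under stated assumptions, not the library's code: `LeaseTable`, the `owner` argument, and `can_write` are hypothetical, only the wait=False behaviour (raise instead of waiting) is modelled, and an injectable clock replaces real time so that expiry can be shown without sleeping.

```python
import time


class LeaseTable:
    """Hypothetical in-memory model of per-item leases with expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._leases = {}  # key -> (owner, expiry time)

    def _holder(self, key):
        lease = self._leases.get(key)
        if lease is None:
            return None
        owner, expires_at = lease
        if self._clock() >= expires_at:
            del self._leases[key]  # the lease has expired
            return None
        return owner

    def lock_item(self, key, owner, duration):
        holder = self._holder(key)
        if holder is not None and holder != owner:
            # Models wait=False: fail instead of waiting for release.
            raise RuntimeError(f"{key!r} is locked by {holder!r}")
        self._leases[key] = (owner, self._clock() + duration)

    def can_write(self, key, owner):
        holder = self._holder(key)
        return holder is None or holder == owner

    def unlock_item(self, key, owner):
        if self._holder(key) == owner:
            del self._leases[key]


now = [0.0]  # fake clock, in seconds
table = LeaseTable(clock=lambda: now[0])
table.lock_item("example", owner="A", duration=15)
b_can_write_while_locked = table.can_write("example", owner="B")
now[0] = 16.0  # the 15-second lease has expired
b_can_write_after_expiry = table.can_write("example", owner="B")
```

In this model an expired lease is simply forgotten the next time the item is inspected, which mirrors the statement above that the item becomes available again once the duration expires.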