A dictionary-like object that is friendly with multiprocessing and uses key-value databases (e.g., RocksDB) as the underlying storage.
hugedict
![Documentation](https://pypi-camo.freetls.fastly.net/cb2a3c3438551102399986f02b7e576247103cac/68747470733a2f2f72656164746865646f63732e6f72672f70726f6a656374732f68756765646963742f62616467652f3f76657273696f6e3d6c6174657374267374796c653d666c6174)
hugedict provides a drop-in replacement for dictionary objects that are too big to fit in memory. hugedict's dictionary-like objects implement the `typing.Mapping` and `typing.MutableMapping` interfaces using key-value databases (e.g., RocksDB) as the underlying storage. They also work well with Python's multiprocessing.
Installation
From PyPI (using pre-built binaries):
pip install hugedict
To compile from source, run `maturin build -r` inside the project directory. You need Rust, Maturin, CMake, and Clang (to build Rust-RocksDB).
Features
- Create a mutable mapping backed by RocksDB:

      from functools import partial
      from typing import MutableMapping

      from hugedict.prelude import RocksDBDict, RocksDBOptions

      dbpath = "/tmp/example.db"  # example path; point this at your own database

      # replace [str, str] with the key and value types you want,
      # and adjust deser_key, deser_value, and ser_value to match
      mapping: MutableMapping[str, str] = RocksDBDict(
          path=dbpath,  # path (str) to db file
          options=RocksDBOptions(create_if_missing=True),  # create the database if it is missing; check the other options too
          deser_key=partial(str, encoding="utf-8"),  # decode the key from memoryview
          deser_value=partial(str, encoding="utf-8"),  # decode the value from memoryview
          ser_value=str.encode,  # encode the value to bytes
          readonly=False,  # set to True to open the database in read-only mode
          secondary_mode=False,  # set to True to open the database in secondary mode
          secondary_path=None,  # when secondary_mode is True, a string pointing to a directory for storing the data required to operate in secondary mode
      )
- Load huge data from files into RocksDB in parallel: `from hugedict.prelude import rocksdb_load`. This function creates SST files in parallel, ingests them into the database, and (optionally) compacts them.
- Cache a function when doing parallel processing:

      import time

      from hugedict.prelude import Parallel

      pp = Parallel()

      # results of heavy_computing are cached in the database at this path
      @pp.cache_func("/tmp/test.db")
      def heavy_computing(seconds: float):
          time.sleep(seconds)
          return seconds * 2

      output = pp.map(heavy_computing, [0.5, 1, 0.7, 0.3, 0.6], n_processes=3)
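
The `deser_key`, `deser_value`, and `ser_value` arguments in the first feature example are plain Python callables, so their behavior can be checked without opening a database. The sketch below uses only the standard library and round-trips a value through the same callables. The `round_trip` helper and the JSON pair (`ser_json`, `deser_json`) are hypothetical names introduced here for illustration and are not part of hugedict; the JSON pair is one possible way to store non-string values, assuming the deserializer receives a memoryview as the comments in that example state.

```python
import json
from functools import partial

# Same callables as in the RocksDBDict example.
deser_value = partial(str, encoding="utf-8")  # bytes-like object -> str
ser_value = str.encode                        # str -> bytes (UTF-8 by default)


def round_trip(text: str) -> str:
    """Encode text to bytes, wrap it in a memoryview, and decode it back."""
    raw = ser_value(text)
    return deser_value(memoryview(raw))


# Hypothetical serializers for JSON-compatible values (not part of hugedict).
def ser_json(value) -> bytes:
    return json.dumps(value).encode("utf-8")


def deser_json(buf: memoryview):
    # json.loads does not accept a memoryview, so copy it to bytes first.
    return json.loads(bytes(buf))


print(round_trip("héllo"))
print(deser_json(memoryview(ser_json({"a": [1, 2]}))))
```

If the stored values are not strings, a matching serializer and deserializer pair like the one above would replace `str.encode` and `partial(str, encoding="utf-8")`, together with the type parameters on the `MutableMapping` annotation.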