A dictionary-like object that is friendly with multiprocessing and uses key-value databases (e.g., RocksDB) as the underlying storage.
hugedict
![Documentation](https://pypi-camo.freetls.fastly.net/cb2a3c3438551102399986f02b7e576247103cac/68747470733a2f2f72656164746865646f63732e6f72672f70726f6a656374732f68756765646963742f62616467652f3f76657273696f6e3d6c6174657374267374796c653d666c6174)
hugedict provides a drop-in replacement for dictionary objects that are too big to fit in memory. hugedict's dictionary-like objects implement the typing.Mapping and typing.MutableMapping interfaces using key-value databases (e.g., RocksDB) as the underlying storage. Moreover, they work well with Python's multiprocessing.
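Because the objects implement typing.MutableMapping, code written against that interface does not need to know which backend it receives. The following is a minimal sketch that uses a plain dict as a stand-in for a database-backed mapping; the count_tokens helper is hypothetical and not part of hugedict.

```python
from typing import Iterable, MutableMapping


def count_tokens(store: MutableMapping[str, str], tokens: Iterable[str]) -> None:
    """Accumulate token counts in any str -> str mutable mapping."""
    for token in tokens:
        # Values are kept as strings, mirroring a str -> str key-value store.
        store[token] = str(int(store.get(token, "0")) + 1)


# A plain dict stands in for the database-backed mapping here.
counts: MutableMapping[str, str] = {}
count_tokens(counts, ["a", "b", "a"])
```

Any object that satisfies MutableMapping[str, str] could be passed in place of counts.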
Installation
From PyPI (using pre-built binaries):
pip install hugedict
To compile from source, run maturin build -r inside the project directory. You need Rust, Maturin, CMake, and Clang (to build Rust-RocksDB).
Features
- Create a mutable mapping backed by RocksDB:
from functools import partial
from typing import MutableMapping
from hugedict.prelude import RocksDBDict, RocksDBOptions
# Replace [str, str] with the key and value types you need,
# and adjust deser_key, deser_value, and ser_value to match.
mapping: MutableMapping[str, str] = RocksDBDict(
    path=dbpath,  # path (str) to the db file
    options=RocksDBOptions(create_if_missing=create_if_missing),  # whether to create the database if missing; check the other options too
    deser_key=partial(str, encoding="utf-8"),  # decode the key from a memoryview
    deser_value=partial(str, encoding="utf-8"),  # decode the value from a memoryview
    ser_value=str.encode,  # encode the value to bytes
    readonly=False,  # whether to open the database in read-only mode
    secondary_mode=False,  # whether to open the database in secondary mode
    secondary_path=None,  # when secondary_mode is True, a string pointing to a directory for the data required to operate in secondary mode
)
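Values other than strings need their own serializer pair. The sketch below assumes, as the comments in the snippet above state, that the deserializers receive a memoryview and the serializer must return bytes. The names ser_json and deser_json are hypothetical helpers, not part of hugedict; under that assumption they would be passed as ser_value and deser_value.

```python
import json
from typing import Any, Dict


def ser_json(value: Dict[str, Any]) -> bytes:
    """Encode a JSON-compatible dict to UTF-8 bytes."""
    return json.dumps(value).encode("utf-8")


def deser_json(raw: memoryview) -> Dict[str, Any]:
    """Decode a dict from a buffer.

    json.loads accepts str, bytes, or bytearray but not memoryview,
    so the buffer is copied into bytes first.
    """
    return json.loads(bytes(raw))


record = {"id": 7, "tags": ["x", "y"]}
roundtrip = deser_json(memoryview(ser_json(record)))

# str() decodes any bytes-like object, which is why
# partial(str, encoding="utf-8") works on a memoryview.
key = str(memoryview(b"caf\xc3\xa9"), encoding="utf-8")
```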
- Load huge data from files into RocksDB in parallel: from hugedict.prelude import rocksdb_load. This function creates SST files in parallel, ingests them into the db, and (optionally) compacts them.
- Cache a function when doing parallel processing:
import time
from hugedict.prelude import Parallel

pp = Parallel()

@pp.cache_func("/tmp/test.db")
def heavy_computing(seconds: float):
    time.sleep(seconds)
    return seconds * 2

output = pp.map(heavy_computing, [0.5, 1, 0.7, 0.3, 0.6], n_processes=3)
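The general idea of caching a function's results in a mapping can be sketched with the standard library alone. This is an illustration of the technique, not how Parallel.cache_func is implemented; cache_to is a hypothetical decorator, and a plain dict stands in for a database-backed store.

```python
import pickle
from functools import wraps
from typing import Any, Callable, List, MutableMapping


def cache_to(store: MutableMapping[bytes, bytes]) -> Callable:
    """Return a decorator that memoizes results in `store`, keyed by pickled arguments."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Sort keyword arguments so the key does not depend on call order.
            key = pickle.dumps((args, sorted(kwargs.items())))
            if key not in store:
                store[key] = pickle.dumps(func(*args, **kwargs))
            return pickle.loads(store[key])

        return wrapper

    return decorator


calls: List[float] = []
store: MutableMapping[bytes, bytes] = {}


@cache_to(store)
def double(x: float) -> float:
    calls.append(x)  # record real invocations to make cache hits visible
    return x * 2


results = [double(v) for v in [0.5, 1, 0.5]]
```

The third call reuses the stored result for 0.5, so the wrapped function runs only twice.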