A dictionary-like object that is friendly with multiprocessing and uses key-value databases (e.g., RocksDB) as the underlying storage.
hugedict
![Documentation](https://pypi-camo.freetls.fastly.net/cb2a3c3438551102399986f02b7e576247103cac/68747470733a2f2f72656164746865646f63732e6f72672f70726f6a656374732f68756765646963742f62616467652f3f76657273696f6e3d6c6174657374267374796c653d666c6174)
hugedict provides a drop-in replacement for dictionary objects that are too big to fit in memory. Its dictionary-like objects implement the `typing.Mapping` and `typing.MutableMapping` interfaces, using key-value databases (e.g., RocksDB) as the underlying storage. They also work well with Python's multiprocessing.
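To illustrate what implementing the `MutableMapping` interface on top of a key-value store involves, here is a minimal sketch that uses the standard library's `sqlite3` module in place of RocksDB. The class name `SqliteDict` and its table layout are hypothetical and are not part of hugedict; the sketch only shows that defining five methods is enough to get the rest of the dictionary API (`get`, `update`, `in`, and so on) from the abstract base class.

```python
import sqlite3
from collections.abc import MutableMapping
from typing import Iterator


class SqliteDict(MutableMapping):
    """Hypothetical str -> str mapping stored in a SQLite table (not hugedict's code)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k BLOB PRIMARY KEY, v BLOB NOT NULL)"
        )

    def __getitem__(self, key: str) -> str:
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key.encode("utf-8"),)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0].decode("utf-8")  # deserialize the stored bytes

    def __setitem__(self, key: str, value: str) -> None:
        # Keys and values are serialized to bytes before they reach the store.
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)",
            (key.encode("utf-8"), value.encode("utf-8")),
        )
        self.conn.commit()

    def __delitem__(self, key: str) -> None:
        cur = self.conn.execute("DELETE FROM kv WHERE k = ?", (key.encode("utf-8"),))
        if cur.rowcount == 0:
            raise KeyError(key)
        self.conn.commit()

    def __iter__(self) -> Iterator[str]:
        for (k,) in self.conn.execute("SELECT k FROM kv"):
            yield k.decode("utf-8")

    def __len__(self) -> int:
        return self.conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]


d = SqliteDict()
d["a"] = "1"
d.update({"b": "2"})  # update() comes for free from MutableMapping
has_b = "b" in d      # so does membership testing
del d["a"]
remaining = list(d)
```

Because the data lives in the store rather than in a Python `dict`, the mapping can hold more entries than would fit in memory, which is the same idea hugedict applies with RocksDB.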
Installation
From PyPI (using pre-built binaries):

```bash
pip install hugedict
```

To compile from source, run `maturin build -r` inside the project directory. You need Rust, Maturin, CMake, and Clang (to build Rust-RocksDB).
Features
- Create a mutable mapping backed by RocksDB:

```python
from functools import partial
from typing import MutableMapping

from hugedict.prelude import RocksDBDict, RocksDBOptions

dbpath = "/tmp/mapping.db"  # placeholder path; replace with your own
create_if_missing = True  # placeholder value; set as needed

# Replace [str, str] with the key and value types you want,
# and adjust deser_key, deser_value, and ser_value to match.
mapping: MutableMapping[str, str] = RocksDBDict(
    path=dbpath,  # path (str) to the db file
    options=RocksDBOptions(create_if_missing=create_if_missing),  # whether to create the database if missing; check the other options
    deser_key=partial(str, encoding="utf-8"),  # decode the key from a memoryview
    deser_value=partial(str, encoding="utf-8"),  # decode the value from a memoryview
    ser_value=str.encode,  # encode the value to bytes
    readonly=False,  # whether to open the database in read-only mode
    secondary_mode=False,  # whether to open the database in secondary mode
    secondary_path=None,  # when secondary_mode is True, a string pointing to a directory for storing data required to operate in secondary mode
)
```
- Load huge data from files into RocksDB in parallel: `from hugedict.prelude import rocksdb_load`. This function creates SST files in parallel, ingests them into the database, and (optionally) compacts them.
- Cache a function when doing parallel processing:
```python
import time

from hugedict.prelude import Parallel

pp = Parallel()

@pp.cache_func("/tmp/test.db")
def heavy_computing(seconds: float):
    time.sleep(seconds)
    return seconds * 2

output = pp.map(heavy_computing, [0.5, 1, 0.7, 0.3, 0.6], n_processes=3)
```
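The idea behind caching a function on disk can be sketched with the standard library alone. The decorator below, `disk_cache`, is a hypothetical stand-in and not hugedict's implementation: it assumes results can be pickled and stores them in a SQLite file keyed by the pickled arguments. It runs in a single process here; the point is only that a cache kept on disk, unlike an in-memory one, could in principle be consulted by separate worker processes.

```python
import functools
import os
import pickle
import sqlite3
import tempfile


def disk_cache(path):
    """Hypothetical decorator: memoize a function's results in a SQLite file."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The pickled arguments serve as the cache key.
            key = pickle.dumps((args, sorted(kwargs.items())))
            conn = sqlite3.connect(path)
            try:
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS cache (k BLOB PRIMARY KEY, v BLOB)"
                )
                row = conn.execute(
                    "SELECT v FROM cache WHERE k = ?", (key,)
                ).fetchone()
                if row is not None:
                    return pickle.loads(row[0])  # cache hit: skip the computation
                result = func(*args, **kwargs)
                conn.execute(
                    "INSERT OR REPLACE INTO cache (k, v) VALUES (?, ?)",
                    (key, pickle.dumps(result)),
                )
                conn.commit()
                return result
            finally:
                conn.close()

        return wrapper

    return decorator


calls = []  # records which inputs were actually computed


@disk_cache(os.path.join(tempfile.mkdtemp(), "cache.db"))
def double(seconds: float) -> float:
    calls.append(seconds)
    return seconds * 2


results = [double(x) for x in [0.5, 1, 0.5]]
```

The third call repeats the first input, so it is answered from the cache file and `calls` records only two computations.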
Download files
Built Distributions
Hashes for hugedict-2.8.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 7493db350aaa8e469c994c1bf3ea0bde1369a8e3b7f227d0c71fd27f54ee826f
MD5 | 0577a7e8f54d7aaf3120e6b2538a6f27
BLAKE2b-256 | fcbccb6a35ec6b418f56a0c0f0e0e155067bb2c0d54a4e63be55b3ebd55ef33c
Hashes for hugedict-2.8.0-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 08464632828f322eee59534e0e430b85c5ed4274e987d18a044dd7ed1dbf7c6f
MD5 | 9df2949d327613c83c93b8b5a1c6bae7
BLAKE2b-256 | 756472ea8ac61198d4f61732378d380a07f9670dc018d6bdb56790fc59098013
Hashes for hugedict-2.8.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | e5700272526ac97c1751f77d8286245d4fffddb174f82b4f680b9ac1efc2e0b6
MD5 | fcde14f1bfd84ab64efb890e20b809c4
BLAKE2b-256 | c526e93e108707fc9938d1408f25bc518880a5f7b09e24e2713121ec7f9c331c
Hashes for hugedict-2.8.0-cp311-cp311-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
Algorithm | Hash digest
---|---
SHA256 | 6ce4728e0cfdd0e048588c20a5a342e1ab754f1f45d7442256b24ee81876a2f9
MD5 | ae017ebdfdedbee97d2bf6fe07a4dd5b
BLAKE2b-256 | bab06227052fe892c48f8da522f48db4b5ee17646863bc1fe27462fa9e588886
Hashes for hugedict-2.8.0-cp310-none-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 7ecddd2808ccbede386ccea0aa134428257d4193c005eea45a5ebf1acad47ebe
MD5 | e456f6389fe0b42ef2c67cc4b2c0645e
BLAKE2b-256 | 721cd7f53bb4db0d4be6aa726f9e866c88c87c03e808f936ab36a2ab2db87a61
Hashes for hugedict-2.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 67ae79a8a433e37f36c91e2ce8ff2d688a58d9d844210639077a435ded8e6f95
MD5 | c503844fc2ac62a020b828ede99bec93
BLAKE2b-256 | 718b586ec70eccf46ad724c1b6d9471a80b12cbf742ea4c3968e5d9ef583692f
Hashes for hugedict-2.8.0-cp310-cp310-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
Algorithm | Hash digest
---|---
SHA256 | 78ded03eaca756078b84a8c010de7032518c28ae934f5d63691539031739bd18
MD5 | f5c02b04da28d757b96e7550a78de1c7
BLAKE2b-256 | a4358f488472622507b402bdbb160c52d52aab9a963a86cd5ab21b3a25118a17
Hashes for hugedict-2.8.0-cp39-none-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 852b7f4fd5913432df2e60bd7666b076c090d80414a6141a63a5dfbb54841c17
MD5 | e1aa0288fdae91276150433c9b64a6de
BLAKE2b-256 | 066aa22bd7e132851d9e3b420cfcba1e65469185f9fd888825c000da16ffda8d
Hashes for hugedict-2.8.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 735209a85e7401c297948870ade78cbdf075667a515e7b78cacb22444ac251e0
MD5 | 59e9f795abdcc4e8b97c524bf8e1e43e
BLAKE2b-256 | ce728c759f5636873006c89adb9a57b548b1dedaf7cac9449e470179d0111d50
Hashes for hugedict-2.8.0-cp39-cp39-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
Algorithm | Hash digest
---|---
SHA256 | 48e14f2141cf2dc81982be6426d485e01c7474424625b7a3f6aed07f0fcbab4c
MD5 | 98bf3a910773da8f371bb08f00909ca5
BLAKE2b-256 | 9cb6d119899efc4ea1ef0273b3be151d68d0ff9a32b3ebaa850f27ba092a094d
Hashes for hugedict-2.8.0-cp38-none-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 3fbbeab78af2ca2eff8972d216494dee25e72992229e872baa6fc41332a2416b
MD5 | 858b00720c8c2c857149964cb374e664
BLAKE2b-256 | a3dba6b36c2844efb356cbdbf9695c746d8eaaa1827584dbd3101225f1135ac3
Hashes for hugedict-2.8.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 0729302fe25ec8ce5b2df126615313d0dd459dfa3c630be60aa9608c03df9da6
MD5 | 10c9dc2a61963888a8441efe6eb2f775
BLAKE2b-256 | 4195ce5a22832e6a8356bac044c1f6472f61eedc04006d4471b830171833d9b4
Hashes for hugedict-2.8.0-cp38-cp38-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
Algorithm | Hash digest
---|---
SHA256 | 4e8ca99cecc6c345e7bd3191f1ee6ae2aaa14dd125273da93cdf75e4a7099e06
MD5 | 44d0704f8aed91fb44ed368edba38bcf
BLAKE2b-256 | f66f7c5974a7814918ef699c08861223d143739a44639cf71a055d072f77300c