A dictionary-like object that is friendly with multiprocessing and uses key-value databases (e.g., RocksDB) as the underlying storage.
hugedict
![Documentation](https://pypi-camo.freetls.fastly.net/cb2a3c3438551102399986f02b7e576247103cac/68747470733a2f2f72656164746865646f63732e6f72672f70726f6a656374732f68756765646963742f62616467652f3f76657273696f6e3d6c6174657374267374796c653d666c6174)
hugedict provides a drop-in replacement for dictionary objects that are too big to fit in memory. hugedict's dictionary-like objects implement the typing.Mapping and typing.MutableMapping interfaces using key-value databases (e.g., RocksDB) as the underlying storage. They also work well with Python's multiprocessing.
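To illustrate what implementing the MutableMapping interface over a byte-oriented store involves, here is a minimal sketch using only the standard library. It is not hugedict's implementation: the class name `BytesBackedDict`, its `ser_key`/`deser_key`/`ser_value`/`deser_value` parameters, and the in-memory dict standing in for the database are all assumptions made for illustration.

```python
from collections.abc import MutableMapping
from functools import partial
from typing import Dict


class BytesBackedDict(MutableMapping):
    """Hypothetical stand-in: a mapping whose backing store only holds bytes."""

    def __init__(self, ser_key, deser_key, ser_value, deser_value):
        # A plain dict of bytes plays the role of the key-value database here.
        self._store: Dict[bytes, bytes] = {}
        self._ser_key = ser_key
        self._deser_key = deser_key
        self._ser_value = ser_value
        self._deser_value = deser_value

    def __getitem__(self, key):
        raw = self._store[self._ser_key(key)]  # raises KeyError if missing
        return self._deser_value(memoryview(raw))

    def __setitem__(self, key, value):
        self._store[self._ser_key(key)] = self._ser_value(value)

    def __delitem__(self, key):
        del self._store[self._ser_key(key)]

    def __iter__(self):
        return (self._deser_key(memoryview(k)) for k in self._store)

    def __len__(self):
        return len(self._store)


mapping = BytesBackedDict(
    ser_key=str.encode,                          # str -> bytes
    deser_key=partial(str, encoding="utf-8"),    # bytes-like -> str
    ser_value=str.encode,
    deser_value=partial(str, encoding="utf-8"),
)
mapping["a"] = "1"
mapping.update({"b": "2"})  # update() comes for free from MutableMapping
```

Because only the five abstract methods are defined, helpers such as `update`, `get`, `items`, and `in` are supplied by `MutableMapping` itself, which is what lets such an object be passed to code written for a regular dict.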
Installation
From PyPI (using pre-built binaries):
pip install hugedict
To compile from source, run `maturin build -r` inside the project directory. You need Rust, Maturin, CMake, and Clang (to build Rust-RocksDB).
Features
- Create a mutable mapping backed by RocksDB
from functools import partial
from typing import MutableMapping

from hugedict.prelude import RocksDBDict, RocksDBOptions

dbpath = "/tmp/hugedict-example.db"  # example value: path (str) to db file
create_if_missing = True  # example value

# Replace [str, str] with the key and value types you want,
# and adjust deser_key, deser_value, and ser_value to match.
mapping: MutableMapping[str, str] = RocksDBDict(
    path=dbpath,
    # whether to create the database if it is missing; check RocksDBOptions for other options
    options=RocksDBOptions(create_if_missing=create_if_missing),
    deser_key=partial(str, encoding="utf-8"),  # decode the key from a memoryview
    deser_value=partial(str, encoding="utf-8"),  # decode the value from a memoryview
    ser_value=str.encode,  # encode the value to bytes
    readonly=False,  # set to True to open the database in read-only mode
    secondary_mode=False,  # set to True to open the database in secondary mode
    # when secondary_mode is True, a directory (str) for the data required to operate in secondary mode
    secondary_path=None,
)
- Load huge data from files into RocksDB in parallel: `from hugedict.prelude import rocksdb_load`. This function creates SST files in parallel, ingests them into the db, and (optionally) compacts them.
- Cache a function when doing parallel processing
import time

from hugedict.prelude import Parallel

pp = Parallel()

# cache the results of heavy_computing in the database at /tmp/test.db
@pp.cache_func("/tmp/test.db")
def heavy_computing(seconds: float):
    time.sleep(seconds)
    return seconds * 2

output = pp.map(heavy_computing, [0.5, 1, 0.7, 0.3, 0.6], n_processes=3)
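The general idea of caching a function's results on disk, so that repeated calls (including calls from other processes that open the same file) skip the computation, can be sketched with the standard library. This is not how hugedict implements `cache_func`: the `disk_cache` decorator, the SQLite table layout, and the pickled-argument key are assumptions made for illustration, and the demo runs sequentially to keep it short.

```python
import functools
import os
import pickle
import sqlite3
import tempfile


def disk_cache(db_path):
    """Hypothetical decorator: persist results in a SQLite file, keyed by pickled args."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = pickle.dumps(args)
            conn = sqlite3.connect(db_path)
            try:
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS cache (key BLOB PRIMARY KEY, value BLOB)"
                )
                row = conn.execute(
                    "SELECT value FROM cache WHERE key = ?", (key,)
                ).fetchone()
                if row is not None:
                    return pickle.loads(row[0])  # cache hit: skip the computation
                result = func(*args)
                conn.execute(
                    "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
                    (key, pickle.dumps(result)),
                )
                conn.commit()
                return result
            finally:
                conn.close()

        return wrapper

    return decorator


calls = []  # records which inputs were actually computed
db_path = os.path.join(tempfile.mkdtemp(), "cache.db")


@disk_cache(db_path)
def heavy_computing(seconds: float) -> float:
    calls.append(seconds)
    return seconds * 2


first = [heavy_computing(s) for s in [0.5, 1, 0.5]]  # computes 0.5 and 1 once each
second = [heavy_computing(s) for s in [0.5, 1]]      # served entirely from the file
```

Keying on the serialized arguments means the cache survives process restarts, which is the property that makes this pattern useful when the same inputs are processed by several workers.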