On-disk Dict With RocksDB (dbm alternative)
RocksDict
Key-value storage supporting any python object
Abstract
This package enables users to store, query, and delete a large number of key-value pairs on disk.
This is especially useful when the data cannot fit into RAM. If you have hundreds of GBs or many TBs of key-value data to store and query, this is the package for you.
Installation
Prebuilt wheels are available for macOS (x86-64/arm64), Windows (64-bit and 32-bit), and Linux (x86-64).
It can be installed from PyPI with `pip install rocksdict`.
Introduction
Below is a code example that shows how to do the following:
- Create Rdict
- Store something on disk
- Close Rdict
- Open Rdict again
- Check Rdict elements
- Iterate from Rdict
- Batch get
- Delete storage
from rocksdict import Rdict
import numpy as np
import pandas as pd
path = str("./test_dict")
# create a Rdict with default options at `path`
db = Rdict(path)
db[1.0] = 1
db[1] = 1.0
db["huge integer"] = 2343546543243564534233536434567543
db["good"] = True
db["bad"] = False
db["bytes"] = b"bytes"
db["this is a list"] = [1, 2, 3]
db["store a dict"] = {0: 1}
db[b"numpy"] = np.array([1, 2, 3])
db["a table"] = pd.DataFrame({"a": [1, 2], "b": [2, 1]})
# close Rdict
db.close()
# reopen Rdict from disk
db = Rdict(path)
assert db[1.0] == 1
assert db[1] == 1.0
assert db["huge integer"] == 2343546543243564534233536434567543
assert db["good"] == True
assert db["bad"] == False
assert db["bytes"] == b"bytes"
assert db["this is a list"] == [1, 2, 3]
assert db["store a dict"] == {0: 1}
assert np.all(db[b"numpy"] == np.array([1, 2, 3]))
assert np.all(db["a table"] == pd.DataFrame({"a": [1, 2], "b": [2, 1]}))
# iterate through all elements
for k, v in db.items():
    print(f"{k} -> {v}")
# batch get:
print(db[["good", "bad", 1.0]])
# [True, False, 1]
# delete Rdict from disk
db.close()
Rdict.destroy(path)
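One detail of the example above is easy to miss: `db[1.0]` and `db[1]` are stored as two separate entries, as the assertions after reopening show. A builtin `dict` behaves differently, because `1` and `1.0` compare equal and hash equal. The sketch below uses only a plain `dict` (not rocksdict) to show the contrast.

```python
# Builtin dict: 1 and 1.0 are the same key, so the second write overwrites the first.
plain = {}
plain[1.0] = 1
plain[1] = 1.0

print(len(plain))   # 1
print(plain[1])     # 1.0
```

Code that relies on `int` and `float` keys being interchangeable in a `dict` may therefore need to normalize key types before moving to Rdict.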
Supported types:
- key: int, float, bool, str, bytes
- value: int, float, bool, str, bytes, and anything that supports `pickle`.
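To check ahead of time whether a value "supports pickle", a round trip through the standard `pickle` module is a reasonable test. This is a minimal sketch that does not touch rocksdict; `is_picklable` is a hypothetical helper name, and it assumes rocksdict serializes such values with the standard `pickle` module.

```python
import datetime
import pickle


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Nested containers of builtin and standard-library types round-trip cleanly.
record = {"name": "sample", "tags": {"a", "b"}, "when": datetime.date(2022, 1, 1), "z": 1 + 2j}
restored = pickle.loads(pickle.dumps(record))
print(restored == record)

# Lambdas and generators cannot be pickled, so they are not usable as values.
print(is_picklable(lambda x: x))
print(is_picklable(i for i in range(3)))
```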
RocksDB Options
Since the backend is implemented using RocksDB, most RocksDB options are supported:
Example of tuning
from rocksdict import Rdict, Options, SliceTransform, PlainTableFactoryOptions
import os
def db_options():
    opt = Options()
    # create the database if it does not exist yet
    opt.create_if_missing(True)
    # allow more background jobs (one per CPU core)
    opt.set_max_background_jobs(os.cpu_count())
    # configure mem-table to a large value (256 MB)
    opt.set_write_buffer_size(0x10000000)
    opt.set_level_zero_file_num_compaction_trigger(4)
    # configure l0 and l1 size, let them have the same size (1 GB)
    opt.set_max_bytes_for_level_base(0x40000000)
    # 256 MB file size
    opt.set_target_file_size_base(0x10000000)
    # use a smaller compaction multiplier
    opt.set_max_bytes_for_level_multiplier(4.0)
    # use 8-byte prefix (2 ^ 64 is far enough for transaction counts)
    opt.set_prefix_extractor(SliceTransform.create_max_len_prefix(8))
    # set to plain-table for better performance
    opt.set_plain_table_factory(PlainTableFactoryOptions())
    return opt
db = Rdict(str("./some_path"), db_options())
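The hexadecimal constants in the tuning example are easier to reason about when converted to sizes. The sketch below only does that arithmetic with the values from the example. `level_target_bytes` is a hypothetical helper. It assumes static level sizing, where each level's target is the base size times the multiplier per level, and it assumes each flushed mem-table produces roughly one L0 file of write-buffer size.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Values copied from the tuning example above.
write_buffer_size = 0x10000000   # set_write_buffer_size
l0_trigger = 4                   # set_level_zero_file_num_compaction_trigger
level_base = 0x40000000          # set_max_bytes_for_level_base
multiplier = 4.0                 # set_max_bytes_for_level_multiplier

print(write_buffer_size // MIB, "MiB per mem-table")
print(level_base // GIB, "GiB target for L1")

# Approximate L0 size when compaction is triggered: 4 files of 256 MiB = 1 GiB,
# which matches the L1 target ("let them have the same size").
l0_bytes_at_trigger = write_buffer_size * l0_trigger


def level_target_bytes(level, base=level_base, mult=multiplier):
    """Target size of level `level` (>= 1) under static level sizing (assumption)."""
    return int(base * mult ** (level - 1))


for level in range(1, 5):
    print(f"L{level}: {level_target_bytes(level) // GIB} GiB")
```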
Example of Bulk Writing By SstFileWriter
from rocksdict import Rdict, Options, SstFileWriter
import random
# generate some rand bytes
rand_bytes1 = [random.randbytes(200) for _ in range(100000)]
rand_bytes1.sort()
rand_bytes2 = [random.randbytes(200) for _ in range(100000)]
rand_bytes2.sort()
# write to file1.sst
writer = SstFileWriter()
writer.open("file1.sst")
for k, v in zip(rand_bytes1, rand_bytes1):
    writer[k] = v
writer.finish()
# write to file2.sst
writer = SstFileWriter(Options())
writer.open("file2.sst")
for k, v in zip(rand_bytes2, rand_bytes2):
    writer[k] = v
writer.finish()
# Create a new Rdict with default options
d = Rdict("tmp")
d.ingest_external_file(["file1.sst", "file2.sst"])
d.close()
# reopen, check if all key-values are there
d = Rdict("tmp")
for k in rand_bytes2 + rand_bytes1:
    assert d[k] == k
d.close()
# delete tmp
Rdict.destroy("tmp")
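The `.sort()` calls in the example above matter: an SST file writer expects keys to be added in ascending order. The sketch below does not use rocksdict. It shows that Python orders `bytes` lexicographically, byte by byte, and adds a hypothetical `is_strictly_ascending` helper for checking input before writing. It assumes the default bytewise comparator and that rocksdict's encoding of `bytes` keys preserves this order, as the example implies. `os.urandom` stands in for `random.randbytes`, which needs Python 3.9 or newer.

```python
import os


def is_strictly_ascending(keys):
    """Return True if every key is strictly greater than the one before it."""
    return all(a < b for a, b in zip(keys, keys[1:]))


# bytes compare lexicographically, byte by byte; a prefix sorts before its extensions.
print(sorted([b"b", b"a\xff", b"a"]))   # [b'a', b'a\xff', b'b']

keys = [os.urandom(200) for _ in range(1000)]
print(is_strictly_ascending(keys))      # almost certainly False before sorting
keys.sort()
print(is_strictly_ascending(keys))      # True unless two random keys collide
```

Running such a check before the write loop is cheaper than discovering an ordering problem partway through a large file.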