On-disk Dict With RocksDB (dbm alternative)
RocksDict
Key-value storage supporting any python object
Abstract
This package enables users to store, query, and delete a large number of key-value pairs on disk.
This is especially useful when the data cannot fit into RAM. If you have hundreds of GBs or many TBs of key-value data to store and query, this is the package for you.
Installation
This package is built for macOS (x86/arm), Windows 64/32, and Linux x86.
It can be installed from PyPI with: pip install rocksdict
Introduction
Below is a code example that shows how to do the following:
- Create Rdict
- Store something on disk
- Close Rdict
- Open Rdict again
- Check Rdict elements
- Iterate from Rdict
- Batch get
- Delete storage
from rocksdict import Rdict
import numpy as np
import pandas as pd
path = str("./test_dict")
# create a Rdict with default options at `path`
db = Rdict(path)
db[1.0] = 1
db[1] = 1.0
db["huge integer"] = 2343546543243564534233536434567543
db["good"] = True
db["bad"] = False
db["bytes"] = b"bytes"
db["this is a list"] = [1, 2, 3]
db["store a dict"] = {0: 1}
db[b"numpy"] = np.array([1, 2, 3])
db["a table"] = pd.DataFrame({"a": [1, 2], "b": [2, 1]})
# close Rdict
db.close()
# reopen Rdict from disk
db = Rdict(path)
assert db[1.0] == 1
assert db[1] == 1.0
assert db["huge integer"] == 2343546543243564534233536434567543
assert db["good"] == True
assert db["bad"] == False
assert db["bytes"] == b"bytes"
assert db["this is a list"] == [1, 2, 3]
assert db["store a dict"] == {0: 1}
assert np.all(db[b"numpy"] == np.array([1, 2, 3]))
assert np.all(db["a table"] == pd.DataFrame({"a": [1, 2], "b": [2, 1]}))
# iterate through all elements
for k, v in db.items():
    print(f"{k} -> {v}")
# batch get:
print(db[["good", "bad", 1.0]])
# [True, False, 1]
# delete Rdict from disk
db.close()
Rdict.destroy(path)
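Note that the example above stores 1.0 and 1 under two separate keys and reads back a different value for each. A built-in dict does not behave this way, because 1 == 1.0 and both hash alike. The sketch below shows only the built-in dict side, for contrast; the Rdict side is what the assertions above already demonstrate.

```python
# Built-in dict: 1 and 1.0 compare equal and share a hash, so they are one key.
plain = {}
plain[1.0] = 1
plain[1] = 1.0  # overwrites the value stored under the existing key 1.0

print(len(plain))  # 1
print(plain)       # {1.0: 1.0}
```

If code being ported from a plain dict relies on int and float keys being interchangeable, it is worth normalizing key types before writing them.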
Supported types:
- key: int, float, bool, str, bytes
- value: int, float, bool, str, bytes, and anything that supports pickle
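Since arbitrary values only need to support pickle, a quick way to tell whether a custom object qualifies is to try pickling it up front. The following is a minimal sketch using only the standard library; is_picklable is a hypothetical helper, not part of rocksdict.

```python
import pickle


def is_picklable(obj):
    """Hypothetical helper (not part of rocksdict): True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        # pickle raises different exception types depending on the object
        return False


print(is_picklable({"a": [1, 2, 3]}))  # True
print(is_picklable(lambda x: x))       # False
```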
RocksDB Options
Since the backend is implemented using RocksDB, most RocksDB options are supported:
Example of tuning
from rocksdict import Rdict, Options, SliceTransform, PlainTableFactoryOptions
import os
def db_options():
    opt = Options()
    # create the database if it does not exist yet
    opt.create_if_missing(True)
    # allow more background jobs (os.cpu_count() may return None, so fall back to 1)
    opt.set_max_background_jobs(os.cpu_count() or 1)
    # configure mem-table to a large value (256 MB)
    opt.set_write_buffer_size(0x10000000)
    opt.set_level_zero_file_num_compaction_trigger(4)
    # configure l0 and l1 size, let them have the same size (1 GB)
    opt.set_max_bytes_for_level_base(0x40000000)
    # 256 MB file size
    opt.set_target_file_size_base(0x10000000)
    # use a smaller compaction multiplier
    opt.set_max_bytes_for_level_multiplier(4.0)
    # use 8-byte prefix (2 ^ 64 is far enough for transaction counts)
    opt.set_prefix_extractor(SliceTransform.create_max_len_prefix(8))
    # set to plain-table for better performance
    opt.set_plain_table_factory(PlainTableFactoryOptions())
    return opt


db = Rdict(str("./some_path"), db_options())
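The hexadecimal sizes in the tuning example are easier to reason about once converted. The sketch below is plain arithmetic and does not call rocksdict. It rests on two assumptions: that each flushed mem-table produces one L0 file of roughly the write buffer size, and that level targets follow base * multiplier^(n-1), which is how RocksDB sizes levels when dynamic level sizing is disabled. The multiplier is written as an int here only to keep the arithmetic exact.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3

write_buffer_size = 0x10000000  # value passed to set_write_buffer_size
l0_trigger = 4                  # set_level_zero_file_num_compaction_trigger
level_base = 0x40000000         # set_max_bytes_for_level_base
multiplier = 4                  # set_max_bytes_for_level_multiplier

print(write_buffer_size // MiB)  # 256
print(level_base // GiB)         # 1

# Rough L0 size when compaction triggers: 4 files of about 256 MiB each.
l0_estimate = write_buffer_size * l0_trigger
print(l0_estimate == level_base)  # True

# Assumed static target sizes for L1..L4, in GiB.
targets = [level_base * multiplier ** n // GiB for n in range(4)]
print(targets)  # [1, 4, 16, 64]
```

This is why the comment in the example says L0 and L1 have the same size: four 256 MB mem-table flushes add up to the 1 GB level base.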
Example of Bulk Writing By SstFileWriter
from rocksdict import Rdict, Options, SstFileWriter
import random
# generate some random bytes (random.randbytes requires Python 3.9+)
rand_bytes1 = [random.randbytes(200) for _ in range(100000)]
rand_bytes1.sort()
rand_bytes2 = [random.randbytes(200) for _ in range(100000)]
rand_bytes2.sort()
# write to file1.sst
writer = SstFileWriter()
writer.open("file1.sst")
for k, v in zip(rand_bytes1, rand_bytes1):
    writer[k] = v
writer.finish()
# write to file2.sst
writer = SstFileWriter(Options())
writer.open("file2.sst")
for k, v in zip(rand_bytes2, rand_bytes2):
    writer[k] = v
writer.finish()
# Create a new Rdict with default options
d = Rdict("tmp")
d.ingest_external_file(["file1.sst", "file2.sst"])
d.close()
# reopen, check if all key-values are there
d = Rdict("tmp")
for k in rand_bytes2 + rand_bytes1:
    assert d[k] == k
d.close()
# delete tmp
Rdict.destroy("tmp")
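The example above sorts each list before writing because RocksDB's SST file writer expects keys to be added in ascending order. When the input is an arbitrary stream of bytes key-value pairs, it may help to deduplicate and sort first. The following is a minimal standard-library sketch; prepare_for_sst is a hypothetical helper, not part of rocksdict, and it assumes bytes keys compared byte-wise, as in the example above.

```python
import os


def prepare_for_sst(pairs):
    """Hypothetical helper: drop duplicate keys (last value wins), then sort by key."""
    deduped = dict(pairs)
    return sorted(deduped.items())


# Random keys plus one deliberate duplicate.
pairs = [(os.urandom(8), b"v") for _ in range(1000)]
pairs += [(b"dup", b"old"), (b"dup", b"new")]

ordered = prepare_for_sst(pairs)
keys = [k for k, _ in ordered]

# Keys are now strictly increasing, ready to be written one by one.
print(all(a < b for a, b in zip(keys, keys[1:])))  # True
```

The resulting list can then be fed to a writer loop like the ones above.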
Limitations
Currently, the package does not support column families (ColumnFamilies) because of a memory bug.
Contribution
This project is still in an early stage of development. People are welcome to add tests, benchmarks and new features.