Skip to main content

A simple kv store

Project description

JumpDB

JumpDB is a simple key-value store that exploits Sorted String Tables.

Here's a tutorial which goes a little more in-depth into how it works.

Install

pip3 install jumpDB

Usage

from jumpDB import DB

db = DB()
db["k1"] = "v1"
db["k2"] = "v2"
del db["k2"]
assert db["k1"] == "v1"
assert "k2" not in db

API

  • get(key) => Finds the corresponding value to the given key

  • set(key, value) => Insert the entry into the db

  • delete(key) => Deletes the key from the db

  • contains(key) => Checks if the given key is present in the db

Design & Implementation

The design is essentially a simplified version of levelDB.

Every write is initially inserted into an in-memory data structure (typically called "memtable") -- in this case, a red-black tree.

When the memtable's size exceeds a certain threshold, all entries are written out into a segment file. Exploiting the properties of a red-black BST, we can ensure all entries will be efficiently written in sorted order. The resulting file is immutable and called a sorted-string table (SST).

Whilst performing the above write, we also maintain a sparse index table, keeping track of the file offset of every in 1 in x entries.

When a read comes in, we first look into the memtable for the corresponding k-v pair; if it doesn't exist, we look at the closest entry (by key) in the sparse table. We jump to the file offset of that entry and then linearly scan forwards until we find the desired key-value pair. This is only possible because the SST is sorted by key.

Periodically, the segments are merged (also called "compaction"); this ensures a reduction in memory footprint as the resulting merged segments(s) would only hold the most recent entries.

An addition optimisation includes the use of bloom-filters to check if a key is not present in the DB. This saves us from performing heavy disk reads for keys that haven't been inserted into the db.

Tests

Run pytest -v

License

BSD 2-Clause License

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

jumpDB-0.0.6-py3-none-any.whl (11.1 kB view details)

Uploaded Python 3

File details

Details for the file jumpDB-0.0.6-py3-none-any.whl.

File metadata

  • Download URL: jumpDB-0.0.6-py3-none-any.whl
  • Upload date:
  • Size: 11.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.42.0 CPython/3.7.4

File hashes

Hashes for jumpDB-0.0.6-py3-none-any.whl
Algorithm Hash digest
SHA256 afdea4323c34f66360e94525f2aa7000dd284077cddb7557678d035ee61b78d1
MD5 e5ffcb5cb30141a77b0f25f6617c3bb0
BLAKE2b-256 a57114d5c3e82eb2b4d448fc0795e9b2c90030f2bd0144b46cbf7f15a65a5a95

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page