A python key-value file database
Introduction
Booklet is a pure Python key-value file database. It supports multiple serializers for both keys and values. The API is designed to provide the same dictionary methods that Python programmers are used to, in addition to the typical dbm methods.
Installation
Install via pip:
pip install booklet
Or conda:
conda install -c mullenkamp booklet
I’ll probably put it on conda-forge once I feel like it’s up to an appropriate standard…
Serialization
Both the keys and values stored in Booklet must be bytes when written to disk. This is the default when “open” is called. Booklet allows various serializers to be used to convert input keys and values to bytes. The built-in serializers include pickle, str, json, and orjson (if orjson is installed). If you want to serialize to json, it is highly recommended to use orjson, as it is substantially faster than the standard json python module. If the user has installed the dill python package, Booklet will use it instead of pickle. The dill package allows the serializers to be more independent of the original source of the serializer classes, because pickle only references classes and functions back to the source scripts rather than storing them directly. The user can also pass custom serializers to the key_serializer and value_serializer parameters. These must have “dumps” and “loads” static methods, which allows the user to chain a serializer and a compressor together if desired.
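As a minimal sketch of the chaining idea, the class below pairs the standard-library pickle and zlib modules behind the “dumps”/“loads” interface described above. The class name PickleZlib is hypothetical and not part of Booklet. The sketch only checks the round trip in memory, and it assumes that any class of this shape is acceptable as a custom serializer.

```python
import pickle
import zlib


class PickleZlib:
    """Hypothetical serializer: pickle the object, then compress the bytes."""

    @staticmethod
    def dumps(obj):
        # Serialize to bytes first, then compress those bytes.
        return zlib.compress(pickle.dumps(obj))

    @staticmethod
    def loads(data):
        # Reverse the order: decompress, then unpickle.
        return pickle.loads(zlib.decompress(data))


# Round trip in memory, without touching a Booklet file.
original = {'name': 'test', 'values': list(range(1000))}
encoded = PickleZlib.dumps(original)
decoded = PickleZlib.loads(encoded)
```

A class like this could then be passed as value_serializer=PickleZlib when calling booklet.open.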
Usage
The docstrings have a lot of info about the classes and methods. Files should be opened with the booklet.open function. Read the docstrings of the open function for more details.
Write data using the context manager
import booklet
with booklet.open('test.blt', 'n', value_serializer='pickle', key_serializer='str') as db:
    db['test_key'] = ['one', 2, 'three', 4]
Read data using the context manager
with booklet.open('test.blt', 'r') as db:
    test_data = db['test_key']
Notice that you don’t need to pass serializer parameters when reading. Booklet stores this info in the file when it is first created.
Write data without using the context manager
import booklet
db = booklet.open('test.blt', 'n', value_serializer='pickle', key_serializer='str')
db['test_key'] = ['one', 2, 'three', 4]
db['2nd_test_key'] = ['five', 6, 'seven', 8]
db.sync()
db.close()
Read data without using the context manager
db = booklet.open('test.blt', 'r')
test_data1 = db['test_key']
test_data2 = db['2nd_test_key']
db.close()
Recommendations
In most cases, the user should use Python’s context manager “with” when reading and writing data. This ensures that data is properly written and that (optional) locks on the file are released. If the context manager is not used, the user must call db.sync() at the end of a series of writes to ensure the data has been fully written to disk. As with other dbm-style APIs, db.close() must be called to close the file and release locks. Multithreading is safe for multiple readers and writers, but with multiprocessing only multiple readers are safe.
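When the context manager cannot be used, one way to make sure sync and close are not forgotten is a small helper built around try/finally. The sketch below is an assumption-laden illustration: write_batch and FakeDb are hypothetical names, and FakeDb is only a stand-in for an open Booklet file so that the example runs without one.

```python
def write_batch(db, items):
    """Write all items to an open, writable database, then sync and close it."""
    try:
        for key, value in items.items():
            db[key] = value
        # Flush pending writes to disk once the series of writes is done.
        db.sync()
    finally:
        # Always close, even if a write failed, so the file and locks are released.
        db.close()


class FakeDb(dict):
    """Hypothetical stand-in for an open Booklet file (not part of Booklet)."""
    synced = False
    closed = False

    def sync(self):
        self.synced = True

    def close(self):
        self.closed = True


db = FakeDb()
write_batch(db, {'test_key': ['one', 2, 'three', 4]})
```

With Booklet itself, the first argument would be the object returned by booklet.open('test.blt', 'n', value_serializer='pickle', key_serializer='str').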
Custom serializers
import booklet
import orjson

class Orjson:
    @staticmethod
    def dumps(obj):
        # Serialize to JSON bytes: allow non-str dict keys and numpy arrays, omit microseconds
        return orjson.dumps(obj, option=orjson.OPT_NON_STR_KEYS | orjson.OPT_OMIT_MICROSECONDS | orjson.OPT_SERIALIZE_NUMPY)

    @staticmethod
    def loads(obj):
        return orjson.loads(obj)

with booklet.open('test.blt', 'n', value_serializer=Orjson, key_serializer='str') as db:
    db['test_key'] = ['one', 2, 'three', 4]
The Orjson class is actually already built into the package. You can pass the string ‘orjson’ to either serializer parameter to use the above serializer. This is just an example of a serializer.
Here’s another example with compression.
import booklet
import orjson
import zstandard as zstd

class OrjsonZstd:
    @staticmethod
    def dumps(obj):
        # Serialize to JSON bytes with orjson, then compress with zstandard
        return zstd.compress(orjson.dumps(obj, option=orjson.OPT_NON_STR_KEYS | orjson.OPT_OMIT_MICROSECONDS | orjson.OPT_SERIALIZE_NUMPY))

    @staticmethod
    def loads(obj):
        # Decompress first, then parse the JSON bytes
        return orjson.loads(zstd.decompress(obj))

with booklet.open('test.blt', 'n', value_serializer=OrjsonZstd, key_serializer='str') as db:
    db['big_test'] = list(range(1000000))

with booklet.open('test.blt', 'r') as db:
    big_test_data = db['big_test']
The open flag follows the standard dbm options:
Value | Meaning |
---|---|
'r' | Open existing database for reading only (default) |
'w' | Open existing database for reading and writing |
'c' | Open database for reading and writing, creating it if it doesn’t exist |
'n' | Always create a new, empty database, open for reading and writing |
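Since these flags mirror the standard dbm options, their behaviour can be explored with Python’s own dbm.dumb module. The sketch below uses dbm.dumb as a stand-in rather than Booklet, so it runs with the standard library alone. It illustrates the flag semantics only and says nothing about Booklet’s file format or internals. The file name flags_demo is arbitrary.

```python
import dbm.dumb
import os
import tempfile

# Work in a throwaway directory so nothing is left behind in the project.
path = os.path.join(tempfile.mkdtemp(), 'flags_demo')

# 'n': always start from a new, empty database.
with dbm.dumb.open(path, 'n') as db:
    db[b'key'] = b'value'

# 'r': open the existing database for reading only.
with dbm.dumb.open(path, 'r') as db:
    value_read = db[b'key']

# 'c': open for reading and writing; existing data is kept.
with dbm.dumb.open(path, 'c') as db:
    kept = b'key' in db

# 'n' again: the previous contents are discarded.
with dbm.dumb.open(path, 'n') as db:
    emptied = len(db) == 0
```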
TODO
I need to write a lot more tests for the functionality. I also need to figure out why the prune function does not work… Currently, stale data cannot be removed from a book, but this will be possible in the future.
Benchmarks
Coming soon…