A simple lazily loaded key:value database
lazy_db
- Free software: MIT license
- Documentation: https://lazy-db.readthedocs.io.
A lazily loaded key:value db intended for use with large datasets that are too big to be loaded into memory. The database supports integers, strings, lists, and dictionaries. There may be future support for raw bytes as well. This database is meant to strike a good balance of retrieval/insertion speed and memory usage.
Example usage:
from lazy_db import LazyDb
db = LazyDb("test.lazy")
db.write("test_value", "value")
print(db.read("test_value")) # prints "value"
db.close()
Everything about this package currently works, but it is still in early stages of development. Here are changes planned in the future:
- Make content ints unsigned (space optimization)
- Attend to files generated by cookiecutter; release on pypi (release)
How it works
File layout
All text in database files is encoded in UTF-8. Each database file starts with a JSON string that holds the database's settings, whose end is marked with a NUL byte (00 in hex).
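As a rough illustration of the settings header described above, the sketch below reads bytes up to the first NUL and parses them as JSON. The function name `read_settings` and the settings keys used in the demo are assumptions for illustration; they are not taken from the lazy_db source.

```python
import io
import json


def read_settings(stream):
    """Read bytes up to the first NUL byte and parse them as UTF-8 JSON."""
    raw = bytearray()
    while True:
        byte = stream.read(1)
        if byte == b"":
            raise ValueError("no NUL terminator found after the settings JSON")
        if byte == b"\x00":
            break
        raw += byte
    return json.loads(raw.decode("utf-8"))


# Hypothetical settings keys, named after the sizes mentioned in this README.
header = json.dumps({"content_int_size": 4, "int_list_size": 4}).encode("utf-8") + b"\x00"
stream = io.BytesIO(header + b"entries would follow here")

settings = read_settings(stream)
offset_after_header = stream.tell()  # position of the first entry
```

Scanning for the terminator works because a JSON string never contains a raw NUL byte, so the first NUL in the file is the end of the settings.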
Each database entry is appended at the end of the file and is laid out in this format:
Name | Size (bytes) | Purpose |
---|---|---|
NUL | 1 | Marks the start of the entry. When the headers are first indexed, the starting byte of each entry is checked for this NUL byte to make sure the database hasn't been corrupted. This is the beginning of what is considered the "header" for the entry (a NUL byte has the hex value 0x00). |
Key type | 1 | Marks if the key is an integer or a string. |
Key | any | The key for the database entry. |
NUL | 1 | Marks the end of the key. This is necessary since string keys don't have a set size. |
Content length | content_int_size | A little-endian integer giving the length of the content (including the content type byte). Defaults to 4 bytes long. This is the end of what is considered the "header" for the entry. |
Content type | 1 | Marks if the content is a string, int, int list, dict, or bytes. |
Content | Content length | Stores the content |
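The layout in the table above can be sketched for a single string entry as follows. This is only an illustration under stated assumptions: the key type value `0x01` for strings is a guess (the table does not list key type values), integer keys are left out because their encoding is not described here, and `encode_string_entry` is a hypothetical helper rather than part of lazy_db.

```python
KEY_TYPE_STRING = 0x01      # assumption: the real key type value is not documented here
CONTENT_TYPE_STRING = 0x01  # "String" in the content type labels table


def encode_string_entry(key, value, content_int_size=4):
    """Build the bytes of one entry with a string key and string content."""
    key_bytes = key.encode("utf-8")
    if b"\x00" in key_bytes:
        raise ValueError("a key cannot contain NUL, since NUL ends the key")
    # The content length counts the content type byte as well as the content.
    content = bytes([CONTENT_TYPE_STRING]) + value.encode("utf-8")
    return (
        b"\x00"                                             # start of entry
        + bytes([KEY_TYPE_STRING])                          # key type
        + key_bytes                                         # key
        + b"\x00"                                           # end of key
        + len(content).to_bytes(content_int_size, "little") # content length
        + content                                           # content type + content
    )


entry = encode_string_entry("test_value", "value")
```

With the default 4-byte content length, this entry is 23 bytes long: 13 bytes up to and including the NUL that ends the key, 4 bytes of length, and 6 bytes of typed content.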
Content type labels
Name | Hex type value | Type description |
---|---|---|
String | 0x01 | A utf-8 string |
Int | 0x02 | An integer |
Dict | 0x03 | A dictionary (internally stored as a utf-8 json string) |
Int list | 0x04 | A list of integers. Max integer size is defined by int_list_size (default: 4 bytes) |
Bytes | 0x05 | A bytes object |
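To show how the type labels above could drive decoding, here is a minimal sketch that turns a typed payload (content type byte followed by content) back into a Python value. Several details are assumptions, not facts from this README: integers are treated as signed little-endian, list items are assumed to be fixed-width at `int_list_size` bytes each, and `decode_content` is a hypothetical name.

```python
import json


def decode_content(payload, int_list_size=4):
    """Decode a payload made of one content type byte plus the content."""
    type_byte, body = payload[0], payload[1:]
    if type_byte == 0x01:  # String
        return body.decode("utf-8")
    if type_byte == 0x02:  # Int (assumed signed, little endian)
        return int.from_bytes(body, "little", signed=True)
    if type_byte == 0x03:  # Dict, stored as a UTF-8 JSON string
        return json.loads(body.decode("utf-8"))
    if type_byte == 0x04:  # Int list (assumed fixed-width items)
        if len(body) % int_list_size != 0:
            raise ValueError("int list length is not a multiple of int_list_size")
        return [
            int.from_bytes(body[i:i + int_list_size], "little", signed=True)
            for i in range(0, len(body), int_list_size)
        ]
    if type_byte == 0x05:  # Bytes
        return bytes(body)
    raise ValueError(f"unknown content type: {type_byte:#04x}")


text = decode_content(b"\x01value")
numbers = decode_content(b"\x04" + b"".join(n.to_bytes(4, "little", signed=True) for n in (1, 2, 3)))
```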
The algorithm
When a database is loaded, every entry header is scanned for its key and content length. Values can then be retrieved quickly without loading the content of every entry, at the cost of keeping each key and content length in memory. This makes the database best suited to cases where each key stores a lot of data that you can't afford to hold in memory, but you can afford to hold the keys and content lengths of every entry in memory.
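The header scan described above might look roughly like the sketch below: it walks the entries once, records where each key's content starts and how long it is, and seeks past the content without reading it. The names `build_index`, `read_content`, and `make_entry` are hypothetical, only string keys are handled, and the key type value `0x01` is an assumption.

```python
import io


def build_index(stream, start, content_int_size=4):
    """Scan entry headers from `start`; return {key: (content_offset, content_length)}."""
    index = {}
    stream.seek(start)
    while True:
        marker = stream.read(1)
        if marker == b"":
            break  # clean end of file
        if marker != b"\x00":
            raise ValueError("entry does not start with NUL; file may be corrupted")
        stream.read(1)  # key type byte, ignored here (string keys only)
        key = bytearray()
        while True:
            byte = stream.read(1)
            if byte == b"":
                raise ValueError("file ended in the middle of a key")
            if byte == b"\x00":
                break
            key += byte
        length_bytes = stream.read(content_int_size)
        if len(length_bytes) != content_int_size:
            raise ValueError("file ended in the middle of a content length")
        length = int.from_bytes(length_bytes, "little")
        index[key.decode("utf-8")] = (stream.tell(), length)
        stream.seek(length, io.SEEK_CUR)  # skip the content without loading it
    return index


def read_content(stream, index, key):
    """Load one entry's typed content on demand."""
    offset, length = index[key]
    stream.seek(offset)
    return stream.read(length)


def make_entry(key, payload):
    """Build a test entry; 0x01 as the key type is an assumption."""
    return b"\x00\x01" + key.encode("utf-8") + b"\x00" + len(payload).to_bytes(4, "little") + payload


data = make_entry("a", b"\x01hello") + make_entry("b", b"\x01world!")
stream = io.BytesIO(data)
index = build_index(stream, start=0)
value_b = read_content(stream, index, "b")
```

Only the small `index` dictionary stays in memory; the content of an entry is read from the file when that key is requested.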