
Easy-to-use database using dicts


DictDataBase is a simple and fast database that uses json or compressed json files as its underlying storage mechanism. Features:

  • Multithreading- and multiprocessing-safe. Multiple processes on the same machine can simultaneously read and write to dicts without losing data.
  • ACID compliant. Unlike TinyDB, it is suited for concurrent environments.
  • No database server required. Simply import DictDataBase in your project and use it.
  • Compression. Configure whether files are stored as raw json or as json compressed with zlib.
  • Fast. A dict can be accessed partially without parsing the entire file, making reads and writes very efficient.
  • Tested. 100% coverage, over 1000 test cases.

Why use DictDataBase

  • You need concurrent access; for example, a webserver dispatching database reads and writes in parallel.
  • Spinning up a database server is overkill for your application, but you still need ACID guarantees.
  • You have a big database but only want to access single key-value pairs repeatedly. DictDataBase can do this efficiently.
  • Your use case is suited for working with json data, or you have to work with a lot of json data.

Why not DictDataBase

  • If your storage is slow.
  • If a relational database is better suited for your use case.

Install

pip install dictdatabase

Configuration

There are 5 configuration options:

Storage directory

Set storage_directory to the path of the directory that will contain your json files:

DDB.config.storage_directory = "./ddb_storage" # Default value

Compression

If you want to use compressed files, set use_compression to True. This will make the db files significantly smaller and might improve performance if your disk is slow. However, the files will not be human readable.

DDB.config.use_compression = False # Default value
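The compression is plain zlib. As a stdlib-only sketch (independent of DictDataBase's actual on-disk layout), you can estimate the savings for your own data like this:

```python
import json
import zlib

# A repetitive payload, roughly the shape of a typical DDB file
data = {f"user_{i}": {"age": 20 + i % 50, "job": "Engineer"} for i in range(1000)}
raw = json.dumps(data).encode("utf-8")
compressed = zlib.compress(raw)

print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
assert zlib.decompress(compressed) == raw  # lossless round trip
```

Repetitive json like this compresses very well; measure with your own data to decide if the trade-off against human readability is worth it.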

Indentation

Set how written json files should be indented. Behaves exactly like json.dumps(indent=...). It can be an int for the number of spaces, the tab character, or None if you don't want the files to be indented.

DDB.config.indent = "\t" # Default value

Note: if DDB.config.use_orjson = True, the value can only be 2 (spaces) or 0/None for no indentation.
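Since the option behaves exactly like json.dumps(indent=...), its effect can be previewed with the standard library alone:

```python
import json

doc = {"users": {"Ben": {"age": 30}}}

print(json.dumps(doc, indent="\t"))  # tab-indented (the DDB default)
print(json.dumps(doc, indent=2))     # two spaces per nesting level
print(json.dumps(doc, indent=None))  # single line, most compact
```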

Sort keys

Specify if you want the dict keys to be sorted when writing to a file. Behaves exactly like json.dumps(sort_keys=...).

DDB.config.sort_keys = True # Default value

Use orjson

Specify whether to use the orjson encoder and decoder. The standard library json module is sufficient most of the time, but orjson is considerably more performant in virtually all cases.

DDB.config.use_orjson = True # Default value
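Putting the five options together, a typical setup block at the top of a project might look like this; the values shown are the defaults, so this block changes nothing as written:

```python
import dictdatabase as DDB

DDB.config.storage_directory = "./ddb_storage"  # where the json files live
DDB.config.use_compression = False              # raw json, human readable
DDB.config.indent = "\t"                        # tab-indented output
DDB.config.sort_keys = True                     # deterministic key order
DDB.config.use_orjson = True                    # fast encoder/decoder
```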

Usage

Import

import dictdatabase as DDB

Create a file

This library is called DictDataBase, but you can actually use any json serializable object.

user_data_dict = {
    "users": {
        "Ben": { "age": 30, "job": "Software Engineer" },
        "Sue": { "age": 21, "job": "Architect" },
        "Joe": { "age": 50, "job": "Manager" }
    },
    "follows": [["Ben", "Sue"], ["Joe", "Ben"]]
}
DDB.at("users").create(user_data_dict)

# There is now a file called users.json (or users.ddb if you use compression)
# in your specified storage directory.

Check if file or sub-key exists

DDB.at("users").exists()  # True
DDB.at("users", key="none").exists()  # False
# Also works on nested keys
DDB.at("users", key="Ben").exists()  # True
DDB.at("users", key="Sam").exists()  # False

Read dicts

d = DDB.at("users").read()
# You now have a copy of the json file named "users"
d == user_data_dict # True

# Only partially read Joe
joe = DDB.at("users", key="Joe").read()
joe == user_data_dict["users"]["Joe"] # True

It is also possible to only read a subset of keys based on a filter callback:

DDB.at("numbers").create({"a": 1, "b": 2, "c": 3})

above_1 = DDB.at("numbers", where=lambda k, v: v > 1).read()
above_1 == {"b": 2, "c": 3} # True

Write dicts

with DDB.at("users").session() as (session, users):
    # You now have a copy of the json file "users" as the variable users.
    # Inside the with statement, the file is locked, and no other
    # processes will be able to interfere.
    users["follows"].append(["Sue", "Ben"])
    session.write()
    # session.write() must be called to save the changes!
print(DDB.at("users").read()["follows"])
>>> [["Ben", "Sue"], ["Joe", "Ben"], ["Sue", "Ben"]]

If you do not call session.write(), changes will not be written to disk!

Partial writing

Imagine you have a huge json file with many purchases. The json file looks like this: {<id>: <purchase>, <id>: <purchase>, ...}. Normally, you would have to read and parse the entire file to get a specific key. After modifying the purchase, you would also have to serialize and write the entire file again. With DDB, you can do it more efficiently:

with DDB.at("purchases", key="3244").session() as (session, purchase):
    purchase["status"] = "cancelled"
    session.write()

Afterwards, the status is updated in the json file. However, DDB only located the one purchase with id 3244, parsed its value, and serialized that value alone before writing again. This is several orders of magnitude faster than the naive approach when working with big files.
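DDB's real parser and index are more involved, but the core idea can be sketched in plain Python: find the byte span of one top-level value in the serialized file, deserialize only that span, and splice the re-serialized result back. The value_span helper and sample data below are a hypothetical illustration, not DDB's actual implementation, and only handle object values with no braces inside strings:

```python
import json

def value_span(raw: str, key: str) -> tuple[int, int]:
    # Find the [start, end) span of a top-level key's object value.
    # Naive sketch: assumes the value is an object and that no
    # braces appear inside strings.
    start = raw.index(f'"{key}":') + len(f'"{key}":')
    depth = 0
    for i in range(start, len(raw)):
        if raw[i] == "{":
            depth += 1
        elif raw[i] == "}":
            depth -= 1
            if depth == 0:
                return start, i + 1
    raise ValueError(f"no object value found for key {key!r}")

raw = json.dumps({"3244": {"status": "open"}, "9999": {"status": "open"}})
start, end = value_span(raw, "3244")

# Parse, modify and re-serialize only the selected value,
# then splice it back into the otherwise untouched file content.
purchase = json.loads(raw[start:end])
purchase["status"] = "cancelled"
raw = raw[:start] + json.dumps(purchase) + raw[end:]
```

The rest of the file is never parsed or re-serialized, which is where the speedup on large files comes from.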

Folders

You can also read and write to folders of files. Consider the same example as before, but now we have a folder called purchases that contains many files <id>.json. If you want to open a session or read a specific one, you can do:

DDB.at("purchases/<id>").read()
# Or equivalently:
DDB.at("purchases", "<id>").read()

To open a session or read all, do the following:

DDB.at("purchases/*").read()
# Or equivalently:
DDB.at("purchases", "*").read()

Select from folder

If you have a folder containing many json files, you can read them selectively based on a function. A file is included if the provided function returns True when given the file's dict as input:

for i in range(10):
    DDB.at("folder", i).create({"a": i})
# Now in the directory "folder", 10 files exist
res = DDB.at("folder/*", where=lambda x: x["a"] > 7).read() # .session() also possible
assert res == {"8": {"a": 8}, "9": {"a": 9}}

Performance

In preliminary testing, DictDataBase showed promising performance.

SQLite vs DictDataBase

In each case, 16 parallel processes were spawned to perform 128 increments of a counter in 4 tables/files. SQLite achieved 2435 operations/s, while DictDataBase achieved 3143 operations/s.
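The source doesn't publish the exact benchmark code, but the SQLite side of such a counter benchmark can be sketched with the standard library; the schema and one-transaction-per-increment loop below are illustrative assumptions:

```python
import sqlite3

# Illustrative schema: one counter row per table/file in the benchmark
con = sqlite3.connect(":memory:")  # the real benchmark would use a shared db file
con.execute("CREATE TABLE counter (id INTEGER PRIMARY KEY, value INTEGER)")
con.execute("INSERT INTO counter VALUES (1, 0)")
con.commit()

# Each of the 16 processes would run a loop like this: one transaction
# per increment, so concurrent writers never lose updates.
for _ in range(128):
    with con:  # the connection context manager commits on exit
        con.execute("UPDATE counter SET value = value + 1 WHERE id = 1")

(value,) = con.execute("SELECT value FROM counter WHERE id = 1").fetchone()
print(value)  # 128
```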

More tests

It remains to be tested how DictDataBase performs in different scenarios, for example when multiple processes want to perform full writes to one big file.

API Reference

at(path) -> DDBMethodChooser:

Select a file or folder to perform an operation on. If you want to select a specific key in a file, use the key parameter, e.g. DDB.at("file", key="subkey").

If you want to select an entire folder, use the * wildcard, e.g. DDB.at("folder", "*") or DDB.at("folder/*"). You can also use the where callback to select a subset of the file or folder.

If the callback returns True, the item will be selected. The callback needs to accept a key and value as arguments.

Args:

  • path: The path to the file or folder. Can be a string, a comma-separated list of strings, or a list.
  • key: The key to select from the file.
  • where: A function that takes a key and value and returns True if the key should be selected.

Beware: If you select a folder with the * wildcard, you can't use the key parameter. Also, you cannot use the key and where parameters at the same time.

DDBMethodChooser

exists() -> bool:

Returns True if the selected file (or the selected key within it) exists, otherwise False.

create(data=None, force_overwrite: bool = False):

Create a new file with the given data as its content. If the file already exists, a FileExistsError is raised unless force_overwrite is set to True.

Args:

  • data: The data to write to the file. If not specified, an empty dict will be written.
  • force_overwrite: If True, will overwrite the file if it already exists, defaults to False (optional).

delete()

Delete the file at the selected path.

read(self, as_type: T = None) -> dict | T | None:

Reads a file or folder depending on previous .at(...) selection.

Args:

  • as_type: If provided, return the value as the given type, e.g. as_type=str will return str(value).

session(self, as_type: T = None) -> DDBSession[T]:

Opens a session to the selected file(s) or folder, depending on previous .at(...) selection. Inside the with block, you have exclusive access to the file(s) or folder. Call session.write() to write the data to the file(s) or folder.

Args:

  • as_type: If provided, cast the value to the given type, e.g. as_type=str will return str(value).
