A zero-config, serverless JSON-based KV database. No schema, no setup, just data.
Reason this release was yanked:
change json_db -> json_dbx
Project description
✨ Introduction
json_db is a high-performance, embedded database engine designed for Python developers who need the speed of a Key-Value store with the querying power of a document database. Built for extreme throughput and thread-safety, json_db leverages modern serialization (orjson, msgpack, marshal, pickle) and compression to provide a storage layer that is often significantly faster than SQLite for JSON-heavy workloads. Whether you are building a local cache, a log aggregator, or a distributed microservice, json_db provides the tools to handle data at scale with “Zero-Config” simplicity.
Schema-LESS: Store complex, nested data without pre-defining tables.
Server-LESS: Direct disk access without the overhead of a database server.
SQL-LESS: Use native Python syntax, Regex, and Lambdas for data manipulation.
🚀 Features
Extreme Performance: Leverages orjson and ormsgpack for serialization. [refer to Supported Data Formats]
Concurrency Control: Optimized for Many-Read / Single-Write environments using a robust file-locking and Lock mechanism.
Advanced Compression: Supports LZ4 (speed-focused), Zstandard (balanced), and Brotli (size-focused) to minimize storage footprint. [refer to Supported Zip Formats]
Powerful Querying: Search using Regular Expressions (RE), Lambda filters, or modification timestamps (Time-Travel query).
Memory Caching: Adjustable cache_limit to balance RAM usage and I/O speed.
Network Mode (JNetFiles): Transform a local json_db instance into a networked service with a single command using run_files_server. [refer to Network Mode]
In-Memory Mode (JMemFiles): Run the entire database in RAM for extreme performance (ideal for real-time caches or volatile session storage). [refer to In-memory Mode]
Revertible: Unlike traditional NoSQL stores, json_db tracks internal states, allowing you to unwrite (roll back a modification) or undelete a record. This provides a safety net similar to a manual “Undo” or a lightweight ACID rollback. [refer to Rollback data]
Native CSV Support: Built-in hooks for DictReader and DictWriter allow you to import massive datasets from CSV files or export your json_db collections for analysis in Excel or Pandas.
Date-Based Lookups: Every record is timestamped, enabling queries like “Give me all users modified last Tuesday.” [refer to Date Lookups]
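Because JDb mirrors the standard dict API, the CSV import/export pattern can be sketched with only the standard library. A plain dict stands in for the database below (the file name and records are made up for illustration); with the real library, `JDb("users.jdb")` accepts the same `update()` call:

```python
import csv
import io

# Sample CSV (in practice: open("users.csv", newline=""))
csv_text = "id,name,role\n001,Ryan,Developer\n002,Joe,Senior Developer\n"

db = {}  # stand-in for JDb("users.jdb"), which mirrors the dict API

# Import: one record per CSV row, keyed like the Quick Start examples
reader = csv.DictReader(io.StringIO(csv_text))
db.update({f"user:{row['id']}": {"name": row["name"], "role": row["role"]}
           for row in reader})

# Export: flatten the records back into rows for Excel or Pandas
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "name", "role"])
writer.writeheader()
for key, rec in db.items():
    writer.writerow({"id": key.split(":", 1)[1], **rec})

print(db["user:002"]["role"])  # Senior Developer
```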
📌 Supported Python Versions
json_db has been tested with Python 3.7 - 3.13.
🛠️ Quick Start
Installation
pip install json_dbx
Basic usage
from json_db import JDb, JDbReader
# Initialize the database from file
# Key-Value is Json+Json without compression
jdb = JDb("example.jdb")
# Store data
jdb["user:001"] = {"name" : "Ryan", "role": "Developer"}
jdb.set("user:002", {"name" : "Joe", "role": "Developer"})
jdb.update({"user:002":{"name" : "Joe", "role": "Senior Developer"}})
# Retrieve data
user = jdb["user:001"]
print(user["name"]) # Output: Ryan
user = jdb.get("user:002")
print(user["name"]) # Output: Joe
# Remove data
user = jdb.pop("user:002", None)
print(user) # Output: {'name': 'Joe', 'role': 'Senior Developer'}
print(set(jdb)) # Output: {'user:001'}
# create 2nd JDb with same database file
jdb2 = JDb("example.jdb")
jdb2["user:002"] = {"name" : "Kathy", "role": "CEO"}
print(set(jdb2)) # Output: {'user:001', 'user:002'}
# create 3rd read-only JDb with same database file
jdb3 = JDbReader(jdb)
print(set(jdb3)) # Output: {'user:001', 'user:002'}
assert jdb == jdb2
assert jdb == jdb3
assert len(jdb) == 2
print(jdb["user:002"]["name"]) # Output: Kathy
print(set(jdb)) # Output: {'user:001', 'user:002'}
All standard dict methods work: keys(), values(), items(), pop(), setdefault(), update().
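For example, the following patterns behave the same on a `JDb` instance as on the plain dict used here as a stand-in (the `user:003`/`Ana` record is made up for illustration):

```python
db = {"user:001": {"name": "Ryan", "role": "Developer"}}  # stand-in for a JDb instance

# setdefault(): insert only if the key is absent
db.setdefault("user:003", {"name": "Ana", "role": "QA"})
db.setdefault("user:001", {"name": "ignored"})  # key exists: no change

# keys()/values()/items() iterate like any mapping
names = [rec["name"] for rec in db.values()]

# pop() with a default never raises on a missing key
gone = db.pop("user:999", None)

print(sorted(db))  # ['user:001', 'user:003']
print(names)       # ['Ryan', 'Ana']
print(gone)        # None
```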
In-memory Mode
from json_db import JDb
# Initialize the database in memory
# Key-Value is Json+Msgpack with Gzip compression
jdb = JDb(data_type="J+S", zip_type="gz")
# Store data
jdb += {"user:001" : {"name" : "Joe", "role": "Senior Developer"}}
# Retrieve data
user = jdb["user:001"]
print(user["name"]) # Output: Joe
# create 2nd JDb with same memory
jdb2 = JDb(jdb)
# Store data
jdb2["user:002"] = {"name" : "Kathy", "role": "CEO"}
assert jdb == jdb2
assert len(jdb) == 2
print(jdb["user:002"]["name"]) # Output: Kathy
print(set(jdb)) # Output: {'user:001', 'user:002'}
Rollback data
from json_db import JDb
# Initialize the database from file
# Key-Value is Json+Pickle with zstandard compression
jdb = JDb("fruit.jdb", data_type="J+P(zs)")
jdb["apple"] = "red"
jdb["apple"] = "blue" # modify
jdb.revert("apple") # unmodify
assert jdb["apple"] == 'red'
del jdb["apple"]
assert "apple" not in jdb
jdb.revert("apple") # unremove
assert jdb["apple"] == "red"
Query data
>>> from json_db import JDb
>>> # Initialize the database in memory
>>> # Key-Value is Json+Marshal with no compression
>>> jdb = JDb(data_type="J+M")
>>> # insert values without keys (auto-numbered)
>>> jdb.insert_vals([{'name': 'John', 'age': 22},
...                  {'name': 'John', 'age': 37},
...                  {'name': 'Bob', 'age': 42},
...                  {'name': 'Megan', 'age': 27}])
>>> jdb[:]  # show all records from jdb
{'0': {'name': 'John', 'age': 22},
 '1': {'name': 'John', 'age': 37},
 '2': {'name': 'Bob', 'age': 42},
 '3': {'name': 'Megan', 'age': 27}}
>>> jdb.find(FUNC=lambda k, v: v.get('name', '') == 'John')
{'0': {'name': 'John', 'age': 22}, '1': {'name': 'John', 'age': 37}}
>>> jdb.find(RE='John|Bob')
{'0': {'name': 'John', 'age': 22},
 '1': {'name': 'John', 'age': 37},
 '2': {'name': 'Bob', 'age': 42}}
Operator
from json_db import JDb
# Initialize the database in memory
# Key+Value is Msgpack+Msgpack with lz4 compression
jdb = JDb(data_type="S+S(lz)")
# [1] KEY+VAL operators
# <jdb += data> == jdb.update(data)
data = {f'key{v}':v for v in range(100)}
jdb += data
assert len(jdb) == 100
# <jdb == data>
assert jdb == data
# <jdb |= ..> == jdb.insert(..)
jdb |= {f'key{v}':v+1 for v in range(102)}
assert len(jdb) == 102
assert jdb['key100'] == 101
assert jdb[-2.:] == {'key100':101, 'key101':102} # get last two modified records
assert jdb[(f'key{v}' for v in range(100))] == data # same as jdb[data] == data
# <jdb -= ..> == jdb.remove(..)
jdb -= ['key100', 'key101', 'key102', 'key103']
assert len(jdb) == 100
assert jdb == data
# <jdb &= ..> == jdb.replace(..)
jdb &= {f'key{v}':v+1 for v in range(200)}
assert len(jdb) == 100
assert jdb == {f'key{v}':v+1 for v in range(100)}
# <jdb ^= ..> == jdb.unmodify(..)
jdb ^= {f'key{v}' for v in range(100)} # same as jdb ^= data
assert len(jdb) == 100
assert jdb == data
# <jdb[:] = ..> == jdb.update(..)
jdb[:] = 0 # set all records to zero
assert len(jdb) == 100
assert jdb == {f'key{v}':0 for v in range(100)}
assert jdb.find(NE=0) == {}
# remove all records
jdb -= jdb # same as del jdb[:]
assert len(jdb) == 0
# <jdb ^= ..> == jdb.unremove(..)
jdb ^= {f'key{v}' for v in range(100)} # same as jdb ^= data
assert len(jdb) == 100
assert all(val == 0 for key,val in jdb.items())
# lambda VALUE operation
jdb[:] = lambda key,val: int(key.replace('key', '')) + val
assert jdb == data
# <del jdb[..]> == jdb.remove_fast(..)
del jdb[data] # same as del jdb[:]
assert len(jdb) == 0
# unremove all data
jdb ^= data
assert jdb == data
# <jdb[..]> == jdb.get_n(..) or jdb.get_all()
matches = jdb[('key2', 'key22', 'key44', 'key111')]
assert matches == {'key2':2, 'key22':22, 'key44':44}
# lambda KEY operation
matches = jdb[lambda key:key.endswith('1')]
assert set(matches) == {'key1', 'key11', 'key21', 'key31', 'key41', 'key51', 'key61', 'key71', 'key81', 'key91'}
# set all matched records to -1
jdb[matches] = -1
matches_2 = jdb[lambda key,val: val == -1]
assert set(matches) == set(matches_2)
assert matches_2 == jdb.find(EQ=-1)
assert matches_2 == jdb.find(FUNC=lambda val: val == -1)
# RE search
matches_3 = jdb[::r'1$']
assert matches_2 == matches_3
# unmodify
jdb ^= matches
assert jdb == data
# [2] KEY operators
# <jdb & {..}> == jdb.intersection(..)
matches = jdb & {f'key{v}' for v in range(98, 120)}
assert matches == {'key98', 'key99'}
# <{..} & jdb> == {..}.intersection(jdb)
matches_2 = {f'key{v}' for v in range(98, 120)} & jdb
assert matches == matches_2
# <jdb | {..}> == jdb.union(..)
matches = jdb | {f'key{v}' for v in range(10, 120)}
assert len(matches) == 120
assert matches == {f'key{v}' for v in range(0, 120)}
# <{..} | jdb> == {..}.union(jdb)
matches_2 = {f'key{v}' for v in range(10, 120)} | jdb
assert matches == matches_2
# <jdb + {..}> == jdb.union(..)
matches = jdb + {f'key{v}' for v in range(10, 120)}
assert matches == matches_2
# <{..} + jdb> == {..}.union(jdb)
matches_2 = {f'key{v}' for v in range(10, 120)} + jdb
assert matches == matches_2
# <jdb - {..}> == jdb.difference(..)
matches = jdb - {f'key{v}' for v in range(0, 98)}
assert matches == {'key98', 'key99'}
# <{..} - jdb> == {..}.difference(jdb)
matches = {f'key{v}' for v in range(2, 102)} - jdb
assert matches == {'key100', 'key101'}
# <jdb ^ {..}> == jdb.non_intersection(..)
matches = jdb ^ {f'key{v}' for v in range(1, 101)}
assert matches == {'key0', 'key100'}
# <{..} ^ jdb> == {..}.non_intersection(jdb)
matches_2 = {f'key{v}' for v in range(1, 101)} ^ jdb
assert matches == matches_2
# <.. in jdb> == jdb.has_all(..)
assert 'key10' in jdb
assert {'key10', 'key90'} in jdb
assert {'key10', 'key90', 'key110', 'key190'} not in jdb
assert jdb.has('key10')
assert jdb.has_all('key10')
assert jdb.has_any('key10')
assert jdb.has_all({'key10', 'key90'})
assert jdb.has_any({'key10', 'key90', 'key110', 'key190'})
assert jdb.is_disjoint({'key110', 'key190'})
Date Lookups
from json_db import JDb
import datetime as dt
# Initialize the database in memory
# Key+Value is Json+Msgpack with Brotli compression
# using BTree as Key Table for better memory usage
jdb = JDb(data_type="J+S(br)", key_limit="bt")
# insert data
fruits = {'apple':'red', 'banana':'yellow', 'mango':'yellow', 'lemon':'yellow', 'tomato':'red'}
jdb += fruits
# datetime for create date, date for modify date
now = dt.datetime.now()
today = now.date()
# find create date: date == now
matches = jdb[now]
assert matches == fruits
# find create date: date >= now
matches = jdb[now:]
assert matches == fruits
# find create date: date < now
matches = jdb[:now]
assert len(matches) == 0
# find create date: now <= date <= now+1
next_date = now + dt.timedelta(days=1)
matches = jdb[now:next_date]
assert matches == fruits
prev_date = now - dt.timedelta(days=1)
prev_week = now - dt.timedelta(days=7)
# change key create date
jdb.keys['apple', 'tomato'] = prev_date
jdb.keys['mango'] = prev_week
assert jdb[prev_date] == {'apple':'red', 'tomato':'red'}
assert jdb[prev_week] == {'mango':'yellow'}
# find create date: date == now
matches = jdb[now]
assert set(matches) == {'banana', 'lemon'}
# find create date: date < now
matches = jdb[:now]
assert set(matches) == {'apple', 'mango', 'tomato'}
# find modify date: date == today
matches = jdb[today]
assert matches == fruits
# change key modify date + create date
new_modify_date = prev_date.date()
new_create_date = prev_week.date()
assert new_modify_date >= new_create_date
jdb.keys['lemon'] = f'{new_modify_date} {new_create_date}'
# find modify date: date == today
matches = jdb[today]
assert set(matches) == {'apple', 'banana', 'mango', 'tomato'}
# find modify date: date == prev_date
matches = jdb[prev_date.date()]
assert set(matches) == {'lemon'}
# change all keys create date
jdb.keys[:] = today
assert jdb[today] == fruits
Network Mode
Server side:
>>> from json_db import run_files_server
>>> run_files_server(host='0.0.0.0', port=59698, files='net_storage.jdb')
Client side:
>>> from json_db import JDb, JNetFiles
>>> jdb = JDb(JNetFiles(('0.0.0.0', 59698)))
📝 Specifications
Supported Data Formats
Configure data_type during initialization:
J+J: JSON Key + JSON Value (default)
J+S: JSON Key + MsgPack Value
J+M: JSON Key + Marshal Value
J+P: JSON Key + Pickle Value
S+J: MsgPack Key + JSON Value
S+S: MsgPack Key + MsgPack Value
S+M: MsgPack Key + Marshal Value
S+P: MsgPack Key + Pickle Value
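The size and speed differences in the benchmark below come from the serializers themselves. A stdlib-only sketch (msgpack is a third-party package, so it is omitted here) shows how the same record encodes under JSON, Marshal, and Pickle:

```python
import json
import marshal
import pickle

record = {"name": "Ryan", "role": "Developer", "scores": list(range(50))}

encoded = {
    "json":    json.dumps(record).encode(),
    "marshal": marshal.dumps(record),
    "pickle":  pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL),
}
# print formats from smallest to largest payload
for fmt, blob in sorted(encoded.items(), key=lambda kv: len(kv[1])):
    print(f"{fmt:8s}{len(blob):5d} bytes")

# All three round-trip back to the original record
assert json.loads(encoded["json"]) == record
assert marshal.loads(encoded["marshal"]) == record
assert pickle.loads(encoded["pickle"]) == record
```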
Data size = 70,840,580 bytes (1 MB = 1,000,000 B; no compression)

| data_type | size | ratio | read | write |
|---|---|---|---|---|
| J+J or S+J | 70,840,580 | 1.00 | 75.3 MB/s | 358.0 MB/s |
| J+S or S+S | 47,616,008 | 1.48 | 77.4 MB/s | 354.2 MB/s |
| J+M or S+M | 72,430,958 | 0.97 | 81.4 MB/s | 177.1 MB/s |
| J+P or S+P | 70,207,207 | 1.01 | 64.9 MB/s | 22.8 MB/s |
Supported Zip Formats
Configure zip_type during initialization:
no: no compression for Value (default)
gz: Gzip (mode=9) compression for Value
bz: Bzip2 (mode=9) compression for Value
xz: LZMA compression for Value
zs: Zstandard (mode=22) compression for Value
br: Brotli (mode=6) compression for Value (better than gz)
z1: Zstandard (mode=6) compression for Value (better than gz)
z2: Zstandard (mode=11) compression for Value
lz: LZ4 (mode=0) compression for Value
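Of these codecs, gz, bz, and xz map directly onto the standard library, so the size trade-off can be sketched without the third-party Zstandard, Brotli, or LZ4 bindings (the payload below is a made-up repetitive record set, typical of KV values):

```python
import bz2
import gzip
import json
import lzma

# A repetitive JSON payload, typical of KV values
payload = json.dumps(
    [{"name": f"user{i}", "role": "Developer"} for i in range(500)]
).encode()

sizes = {
    "no": len(payload),
    "gz": len(gzip.compress(payload, compresslevel=9)),
    "bz": len(bz2.compress(payload, compresslevel=9)),
    "xz": len(lzma.compress(payload)),
}
for codec, size in sizes.items():
    print(f"{codec}: {size:6d} bytes (ratio {sizes['no'] / size:.1f})")
```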
Data size = 70,840,580 bytes (1 MB = 1,000,000 B)

| zip_type | size | ratio | read | write |
|---|---|---|---|---|
| no | 70,840,580 | 1.00 | 75.3 MB/s | 358.0 MB/s |
| gz | 16,915,844 | 4.18 | 65.5 MB/s | 5.1 MB/s |
| bz | 11,394,042 | 6.21 | 26.4 MB/s | 10.8 MB/s |
| xz | 11,340,548 | 6.24 | 54.9 MB/s | 2.3 MB/s |
| zs | 11,119,665 | 6.37 | 73.0 MB/s | 1.7 MB/s |
| br | 13,700,696 | 5.17 | 65.8 MB/s | 25.3 MB/s |
| z1 | 14,738,859 | 4.80 | 73.6 MB/s | 70.8 MB/s |
| z2 | 13,799,407 | 5.13 | 72.7 MB/s | 23.6 MB/s |
| lz | 26,226,039 | 2.70 | 75.6 MB/s | 202.4 MB/s |
Supported Key Table Formats
Configure key_limit during initialization:
no: dict for key_table (default)
bt: BTree for key_table (save 44.3% vs dict)
l0 - l5: LiteKeyTable modes (save 60-75%+ vs dict)
Table size = 3,241,854 keys
| key_limit | memory | key search | get() on hit | get() on miss |
|---|---|---|---|---|
| no | 519 MB | 48.59 Mo/s | 29.28 Mo/s | 18.3 Mo/s |
| bt | 289 MB | 3.46 Mo/s | 3.07 Mo/s | 8.04 Mo/s |
| l3 | 85 MB | 2.01 Mo/s | 2.01 Mo/s | 1.59 Mo/s |
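The memory gap for the plain-dict case is easy to reproduce: a dict pays for both the hash table itself and one string object per key. A rough stdlib measurement (the `keyN` key format is just an example, and `sys.getsizeof` only approximates real heap usage):

```python
import sys

n = 100_000
table = {f"key{i}": i for i in range(n)}  # plain-dict key table

container = sys.getsizeof(table)             # the hash table itself
keys = sum(sys.getsizeof(k) for k in table)  # per-key string objects
print(f"~{(container + keys) / n:.0f} bytes per key")
```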
📊 Benchmarking
Testing
>>> from json_db import JDb
>>> size = 1_000_000
>>> jdb = JDb(data_type='J+J')
>>> data = {f'key{k}': k for k in range(size)}
>>> # Benchmarking operations
>>> jdb += data      # insert
>>> jdb[:]           # get_all
>>> jdb -= data      # remove
>>> jdb ^= data      # revert = unremove
>>> jdb[data] = -1   # replace
>>> jdb ^= data      # revert = unmodify
Results
| size | insert | get_all | remove | unremove | replace | unmodify |
|---|---|---|---|---|---|---|
| 1 | 132 μs | 89 μs | 111 μs | 96 μs | 91 μs | 83 μs |
| 10 | 136 μs | 93 μs | 142 μs | 145 μs | 183 μs | 177 μs |
| 100 | 442 μs | 319 μs | 594 μs | 680 μs | 876 μs | 976 μs |
| 1K | 3.37 ms | 2.71 ms | 5.24 ms | 5.9 ms | 7.61 ms | 9.12 ms |
| 10K | 32.2 ms | 26 ms | 54.3 ms | 55.8 ms | 77.5 ms | 91.1 ms |
| 100K | 358 ms | 262 ms | 626 ms | 583 ms | 774 ms | 930 ms |
| 1M | 3.87 s | 2.78 s | 7 s | 6.09 s | 8.15 s | 9.83 s |
👥 Contributing
Whether you're reporting bugs, discussing improvements and new ideas, or writing extensions, contributions to json_dbx are welcome! Here's how to get started:
Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.
Fork the repository on GitHub, create a new branch off the master branch, and start making your changes (known as GitHub Flow).
Write a test which shows that the bug was fixed or that the feature works as expected.
Send a pull request and bug the maintainer until it gets merged and published ☺
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file json_dbx-2.6.40.tar.gz.
File metadata
- Download URL: json_dbx-2.6.40.tar.gz
- Upload date:
- Size: 111.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `67cd554bd23dd1b2682a0de72444b1497421fab836745397fdf0ad143db41801` |
| MD5 | `e7de2b7e9c0dfcdf7da4a09db619199f` |
| BLAKE2b-256 | `97c45fa99e2f99a422af9bc2d892afa452823bdc958c1b88528b6430398abfe7` |
File details
Details for the file json_dbx-2.6.40-py3-none-any.whl.
File metadata
- Download URL: json_dbx-2.6.40-py3-none-any.whl
- Upload date:
- Size: 81.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `163e7619c14b94d6ac9e0265fb5290a87e12fbc9c2b0b702519834abe1fd5c22` |
| MD5 | `a899334b4181012a4e9b21c7606bb4ac` |
| BLAKE2b-256 | `5d7b875ae5e31bacd44741c2b130f9f2efb7e491decfb389232fda22b8de899c` |