
A zero-config, serverless JSON-based KV database. No schema, no setup, just data.

Reason this release was yanked: README typo


✨ Introduction

json_db is a high-performance, embedded database engine designed for Python developers who need the speed of a Key-Value store with the querying power of a document database. Built for extreme throughput and thread-safety, json_db leverages modern serialization (orjson, msgpack, marshal, pickle) and compression to provide a storage layer that is often significantly faster than SQLite for JSON-heavy workloads. Whether you are building a local cache, a log aggregator, or a distributed microservice, json_db provides the tools to handle data at scale with “Zero-Config” simplicity.

  • Schema-LESS: Store complex, nested data without pre-defining tables.

  • Server-LESS: Direct disk access without the overhead of a database server.

  • SQL-LESS: Use native Python syntax, Regex, and Lambdas for data manipulation.
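To make the "schema-less, server-less" model concrete, here is a minimal stdlib sketch of the same idea (not json_db's actual implementation): a plain dict persisted to a JSON file on every write, with no server process and no schema.

```python
import json
import os
import tempfile

class TinyKV:
    """Minimal schema-less, server-less KV sketch: a dict synced to a JSON file."""

    def __init__(self, path):
        self.path = path
        # Load existing data if the file is already there; otherwise start empty.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.data = json.load(f)
        else:
            self.data = {}

    def __setitem__(self, key, value):
        self.data[key] = value
        # Persist on every write: no server process, no schema migration.
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)

    def __getitem__(self, key):
        return self.data[key]

path = os.path.join(tempfile.mkdtemp(), "example.json")
kv = TinyKV(path)
kv["user:001"] = {"name": "Ryan", "role": "Developer"}
print(TinyKV(path)["user:001"]["name"])  # reopened from disk -> Ryan
```

json_db adds caching, locking, compression, and pluggable serializers on top of this basic direct-disk pattern.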

🚀 Features

  • Extreme Performance: Leverages orjson and ormsgpack for serialization. [refer to Supported Data Formats]

  • Concurrency Control: Optimized for many-reader / single-writer environments using a robust file-locking and in-process lock mechanism.

  • Advanced Compression: Supports LZ4 (speed-focused), Zstandard (balanced), and Brotli (size-focused) to minimize storage footprint. [refer to Supported Zip Formats]

  • Powerful Querying: Search using Regular Expressions (RE), Lambda filters, or modification timestamps (Time-Travel query).

  • Memory Caching: Adjustable cache_limit to balance RAM usage and I/O speed.

  • Network Mode (JNetFiles): Transform a local json_db instance into a networked service with a single command using run_files_server. [refer to Network Mode]

  • In-Memory Mode (JMemFiles): Run the entire database in RAM for extreme performance (ideal for real-time caches or volatile session storage). [refer to In-memory Mode]

  • Revertable: Unlike traditional NoSQL stores, json_db tracks internal states, allowing you to unwrite (roll back a modification) or undelete a record. This provides a safety net similar to a manual “Undo” or a lightweight ACID rollback. [refer to Rollback data]

  • Native CSV Support: Built-in hooks for DictReader and DictWriter allow you to import massive datasets from CSV files or export your json_db collections for analysis in Excel or Pandas.

  • Date-Based Lookups: Every record is timestamped, enabling queries like “Give me all users modified last Tuesday.” [refer to Date Lookups]
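As an illustration of the CSV workflow (the exact json_db hook API is not shown here, so this uses the stdlib `csv` classes the feature is built on), a round-trip from CSV rows to keyed records and back:

```python
import csv
import io

# A CSV export as it might come from Excel or Pandas (inlined for the example).
csv_text = "id,name,role\n1,Ryan,Developer\n2,Kathy,CEO\n"

# Import: each DictReader row becomes a dict value, keyed like a KV record.
records = {}
for row in csv.DictReader(io.StringIO(csv_text)):
    records[f"user:{row['id']}"] = {"name": row["name"], "role": row["role"]}

print(records["user:2"]["role"])  # CEO

# Export: DictWriter flattens the records back into a spreadsheet-friendly shape.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["key", "name", "role"])
writer.writeheader()
for key, val in records.items():
    writer.writerow({"key": key, **val})
```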

📌 Supported Python Versions

json_db has been tested with Python 3.7 - 3.13.

🛠️ Quick Start

Installation

pip install json_dbx

Basic usage

from json_db import JDb, JDbReader
# Initialize the database from file
# Key-Value is Json+Json without compression
jdb = JDb("example.jdb")

# Store data
jdb["user:001"] = {"name": "Ryan", "role": "Developer"}
jdb.set("user:002", {"name": "Joe", "role": "Developer"})
jdb.update({"user:002": {"name": "Joe", "role": "Senior Developer"}})

# Retrieve data
user = jdb["user:001"]
print(user["name"]) # Output: Ryan

user = jdb.get("user:002")
print(user["name"]) # Output: Joe

# Remove data
user = jdb.pop("user:002", None)
print(user) # Output: {'name': 'Joe', 'role': 'Senior Developer'}
print(set(jdb)) # Output: {'user:001'}

# create 2nd JDb with same database file
jdb2 = JDb("example.jdb")
jdb2["user:002"] = {"name" : "Kathy", "role": "CEO"}
print(set(jdb2)) # Output: {'user:001', 'user:002'}

# create 3rd read-only JDb with same database file
jdb3 = JDbReader(jdb)
print(set(jdb3)) # Output: {'user:001', 'user:002'}

assert jdb == jdb2
assert jdb == jdb3
assert len(jdb) == 2
print(jdb["user:002"]["name"]) # Output: Kathy
print(set(jdb)) # Output: {'user:001', 'user:002'}

All standard dict methods work: keys(), values(), items(), pop(), setdefault(), update().

In-memory Mode

from json_db import JDb
# Initialize the database in memory
# Key-Value is Json+Msgpack with Gzip compression
jdb = JDb(data_type="J+S", zip_type="gz")

# Store data
jdb += {"user:001" : {"name" : "Joe", "role": "Senior Developer"}}

# Retrieve data
user = jdb["user:001"]
print(user["name"]) # Output: Joe

# create 2nd JDb with same memory
jdb2 = JDb(jdb)

# Store data
jdb2["user:002"] = {"name": "Kathy", "role": "CEO"}

assert jdb == jdb2
assert len(jdb) == 2
print(jdb["user:002"]["name"]) # Output: Kathy
print(set(jdb)) # Output: {'user:001', 'user:002'}

Rollback data

from json_db import JDb
# Initialize the database from file
# Key-Value is Json+Pickle with zstandard compression
jdb = JDb("fruit.jdb", data_type="J+P(zs)")

jdb["apple"] = "red"
jdb["apple"] = "blue" # modify
jdb.revert("apple") # unmodify
assert jdb["apple"] == 'red'

del jdb["apple"]
assert "apple" not in jdb

jdb.revert("apple") # unremove
assert jdb["apple"] == "red"
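json_db's `revert` behavior can be pictured as keeping each key's previous state around. A stdlib sketch of that undo pattern (not the library's internals):

```python
class RevertableKV:
    """Sketch of undo-style rollback: remember each key's previous state."""

    _MISSING = object()  # sentinel: "key did not exist before"

    def __init__(self):
        self.data = {}
        self.history = {}

    def __setitem__(self, key, value):
        self.history[key] = self.data.get(key, self._MISSING)
        self.data[key] = value

    def __delitem__(self, key):
        self.history[key] = self.data.pop(key)

    def revert(self, key):
        prev = self.history.pop(key)
        if prev is self._MISSING:
            del self.data[key]      # undo an insert
        else:
            self.data[key] = prev   # undo a modify or a delete

    def __getitem__(self, key):
        return self.data[key]

kv = RevertableKV()
kv["apple"] = "red"
kv["apple"] = "blue"
kv.revert("apple")      # unmodify
assert kv["apple"] == "red"
del kv["apple"]
kv.revert("apple")      # unremove
assert kv["apple"] == "red"
```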

Query data

>> from json_db import JDb
>> # Initialize the database in memory
>> # Key-Value is Json+Marshal with no compression
>> jdb = JDb(data_type="J+M")

>> # insert value without key
>> jdb.insert_vals([{'name': 'John', 'age': 22},
                    {'name': 'John', 'age': 37},
                    {'name': 'Bob', 'age': 42},
                    {'name': 'Megan', 'age': 27}])

>> jdb[:] # show all records from jdb
{'0': {'name': 'John', 'age': 22},
'1': {'name': 'John', 'age': 37},
'2': {'name': 'Bob', 'age': 42},
'3': {'name': 'Megan', 'age': 27}}

>> jdb.find(FUNC=lambda k,v: v.get('name', '') == 'John')
{'0': {'name': 'John', 'age': 22}, '1': {'name': 'John', 'age': 37}}

>> jdb.find(RE='John|Bob')
{'0': {'name': 'John', 'age': 22},
'1': {'name': 'John', 'age': 37},
'2': {'name': 'Bob', 'age': 42}}
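Both query styles map directly onto plain Python. A sketch of the same filters over an ordinary dict (`find` here is a hypothetical helper, not the library's implementation):

```python
import re

records = {
    "0": {"name": "John", "age": 22},
    "1": {"name": "John", "age": 37},
    "2": {"name": "Bob", "age": 42},
    "3": {"name": "Megan", "age": 27},
}

def find(data, FUNC=None, RE=None):
    # FUNC filters on (key, value); RE matches anywhere in the stringified value.
    if FUNC is not None:
        return {k: v for k, v in data.items() if FUNC(k, v)}
    pattern = re.compile(RE)
    return {k: v for k, v in data.items() if pattern.search(str(v))}

johns = find(records, FUNC=lambda k, v: v.get("name") == "John")
assert set(johns) == {"0", "1"}

either = find(records, RE="John|Bob")
assert set(either) == {"0", "1", "2"}
```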

Operator

from json_db import JDb
# Initialize the database in memory
# Key+Value is Msgpack+Msgpack with lz4 compression
jdb = JDb(data_type="S+S(lz)")

# [1] KEY+VAL operators
# <jdb += data> == jdb.update(data)
data = {f'key{v}':v for v in range(100)}
jdb += data
assert len(jdb) == 100

# <jdb == data>
assert jdb == data

# <jdb |= ..> == jdb.insert(..)
jdb |= {f'key{v}':v+1 for v in range(102)}
assert len(jdb) == 102
assert jdb['key100'] == 101
assert jdb[-2:] == {'key100':101, 'key101':102} # get last two modified records
assert jdb[(f'key{v}' for v in range(100))] == data # same as jdb[data] == data

# <jdb -= ..> == jdb.remove(..)
jdb -= ['key100', 'key101', 'key102', 'key103']
assert len(jdb) == 100
assert jdb == data

# <jdb &= ..> == jdb.replace(..)
jdb &= {f'key{v}':v+1 for v in range(200)}
assert len(jdb) == 100
assert jdb == {f'key{v}':v+1 for v in range(100)}

# <jdb ^= ..> == jdb.unmodify(..)
jdb ^= {f'key{v}' for v in range(100)} # same as jdb ^= data
assert len(jdb) == 100
assert jdb == data

# <jdb[:] = ..> == jdb.update(..)
jdb[:] = 0 # set all records to zero
assert len(jdb) == 100
assert jdb == {f'key{v}':0 for v in range(100)}
assert jdb.find(NE=0) == {}

# remove all records
jdb -= jdb # same as del jdb[:]
assert len(jdb) == 0

# <jdb ^= ..> == jdb.unremove(..)
jdb ^= {f'key{v}' for v in range(100)} # same as jdb ^= data
assert len(jdb) == 100
assert all(val == 0 for key,val in jdb.items())

# lambda VALUE operation
jdb[:] = lambda key,val: int(key.replace('key', '')) + val
assert jdb == data

# <del jdb[..]> == jdb.remove_fast(..)
del jdb[data] # same as del jdb[:]
assert len(jdb) == 0

# unremove all data
jdb ^= data
assert jdb == data

# <jdb[..]> == jdb.get_n(..) or jdb.get_all()
matches = jdb[('key2', 'key22', 'key44', 'key111')]
assert matches == {'key2':2, 'key22':22, 'key44':44}

# lambda KEY operation
matches = jdb[lambda key:key.endswith('1')]
assert set(matches) == {'key1', 'key11', 'key21', 'key31', 'key41', 'key51', 'key61', 'key71', 'key81', 'key91'}

# set all matched records to -1
jdb[matches] = -1
matches_2 = jdb[lambda key,val: val == -1]
assert set(matches) == set(matches_2)
assert matches_2 == jdb.find(EQ=-1)
assert matches_2 == jdb.find(FUNC=lambda val: val == -1)

# RE search
matches_3 = jdb[::r'1$']
assert matches_2 == matches_3

# unmodify
jdb ^= matches
assert jdb == data

# [2] KEY operators
# <jdb & {..}> == jdb.intersection(..)
matches = jdb & {f'key{v}' for v in range(98, 120)}
assert matches == {'key98', 'key99'}

# <{..} & jdb> == {..}.intersection(jdb)
matches_2 = {f'key{v}' for v in range(98, 120)} & jdb
assert matches == matches_2

# <jdb | {..}> == jdb.union(..)
matches = jdb | {f'key{v}' for v in range(10, 120)}
assert len(matches) == 120
assert matches == {f'key{v}' for v in range(0, 120)}

# <{..} | jdb> == {..}.union(jdb)
matches_2 = {f'key{v}' for v in range(10, 120)} | jdb
assert matches == matches_2

# <jdb + {..}> == jdb.union(..)
matches = jdb + {f'key{v}' for v in range(10, 120)}
assert matches == matches_2

# <{..} + jdb> == {..}.union(jdb)
matches_2 = {f'key{v}' for v in range(10, 120)} + jdb
assert matches == matches_2

# <jdb - {..}> == jdb.difference(..)
matches = jdb - {f'key{v}' for v in range(0, 98)}
assert matches == {'key98', 'key99'}

# <{..} - jdb> == {..}.difference(jdb)
matches = {f'key{v}' for v in range(2, 102)} - jdb
assert matches == {'key100', 'key101'}

# <jdb ^ {..}> == jdb.non_intersection(..)
matches = jdb ^ {f'key{v}' for v in range(1, 101)}
assert matches == {'key0', 'key100'}

# <{..} ^ jdb> == {..}.non_intersection(jdb)
matches_2 = {f'key{v}' for v in range(1, 101)} ^ jdb
assert matches == matches_2

# <.. in jdb> == jdb.has_all(..)
assert 'key10' in jdb
assert {'key10', 'key90'} in jdb
assert {'key10', 'key90', 'key110', 'key190'} not in jdb
assert jdb.has('key10')
assert jdb.has_all('key10')
assert jdb.has_any('key10')
assert jdb.has_all({'key10', 'key90'})
assert jdb.has_any({'key10', 'key90', 'key110', 'key190'})
assert jdb.is_disjoint({'key110', 'key190'})
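The KEY operators mirror Python's own set algebra; a plain dict's `.keys()` view already supports the same expressions, which makes a handy sanity check for what each operator returns:

```python
data = {f"key{v}": v for v in range(100)}
other = {f"key{v}" for v in range(98, 120)}

# dict .keys() views behave like sets, matching the operator examples above.
assert data.keys() & other == {"key98", "key99"}                      # intersection
assert len(data.keys() | other) == 120                                # union
assert data.keys() - {f"key{v}" for v in range(0, 98)} == {"key98", "key99"}
assert data.keys() ^ {f"key{v}" for v in range(1, 101)} == {"key0", "key100"}

# has_all / has_any / is_disjoint analogues on a plain dict:
assert {"key10", "key90"} <= data.keys()                              # has_all
assert {"key10", "key190"} & data.keys()                              # has_any
assert not ({"key110", "key190"} & data.keys())                       # is_disjoint
```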

Date Lookups

from json_db import JDb
import datetime as dt

# Initialize the database in memory
# Key+Value is Json+Msgpack with Brotli compression
# using BTree as Key Table for better memory usage
jdb = JDb(data_type="J+S(br)", key_limit="bt")

# insert data
fruits = {'apple':'red', 'banana':'yellow', 'mango':'yellow', 'lemon':'yellow', 'tomato':'red'}
jdb += fruits

# datetime for create date, date for modify date
now = dt.datetime.now()
today = now.date()

# find create date: date == now
matches = jdb[now]
assert matches == fruits

# find create date: date >= now
matches = jdb[now:]
assert matches == fruits

# find create date: date < now
matches = jdb[:now]
assert len(matches) == 0

# find create date: now <= date <= now+1
next_date = now + dt.timedelta(days=1)
matches = jdb[now:next_date]
assert matches == fruits

prev_date = now - dt.timedelta(days=1)
prev_week = now - dt.timedelta(days=7)

# change key create date
jdb.keys['apple', 'tomato'] = prev_date
jdb.keys['mango'] = prev_week
assert jdb[prev_date] == {'apple':'red', 'tomato':'red'}
assert jdb[prev_week] == {'mango':'yellow'}

# find create date: date == now
matches = jdb[now]
assert set(matches) == {'banana', 'lemon'}

# find create date: date < now
matches = jdb[:now]
assert set(matches) == {'apple', 'mango', 'tomato'}

# find modify date: date == today
matches = jdb[today]
assert matches == fruits

# change key modify date + create date
new_modify_date = prev_date.date()
new_create_date = prev_week.date()
assert new_modify_date >= new_create_date
jdb.keys['lemon'] = f'{new_modify_date} {new_create_date}'

# find modify date: date == today
matches = jdb[today]
assert set(matches) == {'apple', 'banana', 'mango', 'tomato'}

# find modify date: date == prev_date
matches = jdb[prev_date.date()]
assert set(matches) == {'lemon'}

# change all keys create date
jdb.keys[:] = today
assert jdb[today] == fruits
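The date lookups above rest on a simple idea: every record carries its own create and modify timestamps, and a date query is then just a filter. A stdlib sketch of that pattern (`created_before` and `modified_on` are hypothetical helpers, not the library's API):

```python
import datetime as dt

now = dt.datetime.now()
week_ago = now - dt.timedelta(days=7)

# Each record carries its own timestamps alongside the value.
records = {
    "apple":  {"val": "red",    "created": week_ago, "modified": now},
    "banana": {"val": "yellow", "created": now,      "modified": now},
}

def created_before(data, when):
    return {k: r["val"] for k, r in data.items() if r["created"] < when}

def modified_on(data, day):
    return {k: r["val"] for k, r in data.items() if r["modified"].date() == day}

assert created_before(records, now) == {"apple": "red"}
assert set(modified_on(records, now.date())) == {"apple", "banana"}
```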

Network Mode

Server side:

>> from json_db import run_files_server
>> run_files_server(host='0.0.0.0', port=59698, files='net_storage.jdb')

Client side:

>> from json_db import JDb, JNetFiles
>> jdb = JDb(JNetFiles(('127.0.0.1', 59698)))
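JNetFiles' wire protocol is not documented here, so purely as an illustration of the client/server idea, here is a minimal newline-delimited JSON KV service over TCP (all names hypothetical, stdlib only):

```python
import json
import socket
import socketserver
import threading

class KVHandler(socketserver.StreamRequestHandler):
    """One newline-delimited JSON request per line; replies in kind."""

    def handle(self):
        for line in self.rfile:
            req = json.loads(line)
            if req["op"] == "set":
                self.server.store[req["key"]] = req["val"]
                resp = {"ok": True}
            else:  # "get"
                resp = {"ok": True, "val": self.server.store.get(req["key"])}
            self.wfile.write((json.dumps(resp) + "\n").encode())

# Port 0 lets the OS pick a free port; the store lives on the server object.
server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), KVHandler)
server.store = {}
threading.Thread(target=server.serve_forever, daemon=True).start()

def request(msg):
    with socket.create_connection(server.server_address) as conn:
        conn.sendall((json.dumps(msg) + "\n").encode())
        return json.loads(conn.makefile().readline())

request({"op": "set", "key": "user:001", "val": {"name": "Ryan"}})
resp = request({"op": "get", "key": "user:001"})
print(resp["val"]["name"])  # Ryan
server.shutdown()
server.server_close()
```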

📝 Specifications

Supported Data Formats

Configure data_type during initialization:

  • J+J: JSON Key + JSON Value (default)

  • J+S: JSON Key + MsgPack Value

  • J+M: JSON Key + Marshal Value

  • J+P: JSON Key + Pickle Value

  • S+J: MsgPack Key + JSON Value

  • S+S: MsgPack Key + MsgPack Value

  • S+M: MsgPack Key + Marshal Value

  • S+P: MsgPack Key + Pickle Value

Data size = 70,840,580 (MB = 1,000,000B, no zip)

| data_type  | size       | ratio | read     | write     | GOODs                                  | BADs                                  |
|------------|------------|-------|----------|-----------|----------------------------------------|---------------------------------------|
| J+J or S+J | 70,840,580 | 1.00  | 75.3MB/s | 358.0MB/s | fastest write, faster read, readable   | no set(), no tuple(), weak bytes      |
| J+S or S+S | 47,616,008 | 1.48  | 77.4MB/s | 354.2MB/s | smallest size, faster read, faster write | no tuple(), unreadable              |
| J+M or S+M | 72,430,958 | 0.97  | 81.4MB/s | 177.1MB/s | all types [1], fastest read            | biggest size, unreadable              |
| J+P or S+P | 70,207,207 | 1.01  | 64.9MB/s | 22.8MB/s  | —                                      | slowest read, slowest write, unreadable |
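The trade-offs behind the table (readability, type support) can be checked with the stdlib serializers alone; orjson and msgpack behave analogously to `json` here:

```python
import json
import marshal
import pickle

record = {"name": "Ryan", "scores": [1, 2, 3]}

as_json = json.dumps(record).encode()

# JSON is human-readable; marshal and pickle are binary.
print(as_json)  # b'{"name": "Ryan", "scores": [1, 2, 3]}'
assert json.loads(as_json) == record

# JSON cannot round-trip tuples: they come back as lists...
assert json.loads(json.dumps({"t": (1, 2)})) == {"t": [1, 2]}

# ...and sets are rejected outright (the "no set()" column above).
raised = False
try:
    json.dumps({"s": {1, 2}})
except TypeError:
    raised = True
assert raised

# marshal and pickle preserve tuples and sets exactly ("all types").
assert marshal.loads(marshal.dumps((1, {2, 3}))) == (1, {2, 3})
assert pickle.loads(pickle.dumps((1, {2, 3}))) == (1, {2, 3})
```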

Supported Zip Formats

Configure zip_type during initialization:

  • no: no compression for Value (default)

  • gz: Gzip (mode=9) compression for Value

  • bz: Bzip2 (mode=9) compression for Value

  • xz: LZMA compression for Value

  • zs: Zstandard (mode=22) compression for Value

  • br: Brotli (mode=6) compression for Value (better than gz)

  • z1: Zstandard (mode=6) compression for Value (better than gz)

  • z2: Zstandard (mode=11) compression for Value

  • lz: LZ4 (mode=0) compression for Value

Data size = 70,840,580 (MB = 1,000,000B)

| zip_type | size       | ratio | read     | write     | GOODs                       | BADs                     |
|----------|------------|-------|----------|-----------|-----------------------------|--------------------------|
| no       | 70,840,580 | 1.00  | 75.3MB/s | 358.0MB/s | fastest speed               | biggest size             |
| gz       | 16,915,844 | 4.18  | 65.5MB/s | 5.1MB/s   | —                           | slower zip               |
| bz       | 11,394,042 | 6.21  | 26.4MB/s | 10.8MB/s  | better ratio                | slowest unzip            |
| xz       | 11,340,548 | 6.24  | 54.9MB/s | 2.3MB/s   | better ratio                | slower zip, slower unzip |
| zs       | 11,119,665 | 6.37  | 73.0MB/s | 1.7MB/s   | best ratio, faster unzip    | slowest zip              |
| br       | 13,700,696 | 5.17  | 65.8MB/s | 25.3MB/s  | better than gz              | —                        |
| z1       | 14,738,859 | 4.80  | 73.6MB/s | 70.8MB/s  | faster zip, faster unzip    | —                        |
| z2       | 13,799,407 | 5.13  | 72.7MB/s | 23.6MB/s  | faster unzip                | —                        |
| lz       | 26,226,039 | 2.70  | 75.6MB/s | 202.4MB/s | fastest zip, fastest unzip  | worst ratio              |
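The general pattern the table shows (better ratios cost zip speed) can be reproduced with the stdlib compressors alone; Zstandard, Brotli, and LZ4 are third-party packages and behave analogously:

```python
import bz2
import gzip
import json
import lzma

# Repetitive JSON compresses very well, as in the table above.
raw = json.dumps([{"name": "Ryan", "role": "Developer"}] * 1000).encode()

sizes = {
    "no": len(raw),
    "gz": len(gzip.compress(raw, compresslevel=9)),
    "bz": len(bz2.compress(raw, compresslevel=9)),
    "xz": len(lzma.compress(raw)),
}

# Every codec shrinks the payload substantially relative to "no".
assert sizes["gz"] < sizes["no"]
assert sizes["bz"] < sizes["no"]
assert sizes["xz"] < sizes["no"]
print(sizes)
```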

Supported Key Table Formats

Configure key_limit during initialization:

  • no: dict for key_table (default)

  • bt: BTree for key_table (save 44.3% vs dict)

  • l0 - l5: LiteKeyTable modes (save 60-75%+ vs dict)

Table size = 3,241,854 keys

| key_limit | memory | key search | HIT > get() | MISS > get() |
|-----------|--------|------------|-------------|--------------|
| no        | 519MB  | 48.59Mo/s  | 29.28Mo/s   | 18.3Mo/s     |
| bt        | 289MB  | 3.46Mo/s   | 3.07Mo/s    | 8.04Mo/s     |
| l3        | 85MB   | 2.01Mo/s   | 2.01Mo/s    | 1.59Mo/s     |
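The memory-versus-speed trade-off in the table can be sketched with the stdlib: a dict key table gives O(1) lookups but pays for the hash table, while a sorted container (the BTree-like option) is denser at the cost of O(log n) searches. This is an analogy for the trade-off, not json_db's actual key-table code:

```python
import bisect
import sys

keys = [f"key{v:06d}" for v in range(100_000)]

# dict key table ("no"): O(1) lookups, extra memory for the hash table.
as_dict = dict.fromkeys(keys)

# Sorted-list key table (BTree-like trade-off): denser, O(log n) lookups.
as_sorted = sorted(keys)

def sorted_contains(table, key):
    i = bisect.bisect_left(table, key)
    return i < len(table) and table[i] == key

assert "key050000" in as_dict
assert sorted_contains(as_sorted, "key050000")
assert not sorted_contains(as_sorted, "key100000")

# Container overhead only (the key strings themselves are shared):
print(f"dict: {sys.getsizeof(as_dict):,} B, sorted list: {sys.getsizeof(as_sorted):,} B")
assert sys.getsizeof(as_dict) > sys.getsizeof(as_sorted)
```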

📊 Benchmarking

Testing

>> from json_db import JDb
>> size = 1_000_000
>> jdb = JDb(data_type='J+J')
>> data = {f'key{k}':k for k in range(size)}

>> # Benchmarking operations
>> jdb += data        # insert
>> jdb[:]             # get_all
>> jdb -= data        # remove
>> jdb ^= data        # revert=unremove
>> jdb[data] = -1     # replace
>> jdb ^= data        # revert=unmodify
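For measurements like these, a small `time.perf_counter` harness is enough; here it times the analogous operations on a plain dict (a generic sketch of the methodology, not the benchmark that produced the numbers in Results):

```python
import time

def bench(label, fn):
    start = time.perf_counter()
    fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed * 1e3:.2f} ms")
    return elapsed

size = 100_000
data = {f"key{k}": k for k in range(size)}
store = {}

t_insert = bench("insert", lambda: store.update(data))             # jdb += data
t_get = bench("get_all", lambda: dict(store))                      # jdb[:]
t_remove = bench("remove", lambda: [store.pop(k) for k in data])   # jdb -= data

assert len(store) == 0
```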

Results

| size | insert  | get_all | remove  | unremove | replace | unmodify |
|------|---------|---------|---------|----------|---------|----------|
| 1    | 132 μs  | 89 μs   | 111 μs  | 96 μs    | 91 μs   | 83 μs    |
| 10   | 136 μs  | 93 μs   | 142 μs  | 145 μs   | 183 μs  | 177 μs   |
| 100  | 442 μs  | 319 μs  | 594 μs  | 680 μs   | 876 μs  | 976 μs   |
| 1K   | 3.37 ms | 2.71 ms | 5.24 ms | 5.9 ms   | 7.61 ms | 9.12 ms  |
| 10K  | 32.2 ms | 26 ms   | 54.3 ms | 55.8 ms  | 77.5 ms | 91.1 ms  |
| 100K | 358 ms  | 262 ms  | 626 ms  | 583 ms   | 774 ms  | 930 ms   |
| 1M   | 3.87 s  | 2.78 s  | 7 s     | 6.09 s   | 8.15 s  | 9.83 s   |

👥 Contributing

Whether you are reporting bugs, discussing improvements and new ideas, or writing extensions: contributions to json_db are welcome! Here’s how to get started:

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.

  2. Fork the repository on GitHub, create a new branch off the master branch, and start making your changes (known as GitHub Flow).

  3. Write a test which shows that the bug was fixed or that the feature works as expected.

  4. Send a pull request and bug the maintainer until it gets merged and published ☺
