SQLite-backed KV store (Rust+PyO3) with vector search, auto-packing, and efficient array storage

KohakuVault

SQLite-backed storage with Rust performance. It works like a Python dict or list, but the data is persisted to disk and access stays fast.

Installation

pip install kohakuvault

KVault - Works Like a Dict

from kohakuvault import KVault
import numpy as np

kv = KVault("app.db")

# Just use it like a dict - store anything
kv["user:123"] = {"name": "Alice", "email": "alice@example.com"}
kv["embeddings"] = np.random.randn(768).astype(np.float32)
kv["settings"] = {"theme": "dark", "notifications": True}
kv["scores"] = [95.5, 87.3, 92.1]

# Get back actual Python objects
user = kv["user:123"]  # dict
embeddings = kv["embeddings"]  # numpy array
settings = kv["settings"]  # dict

# Standard dict operations
del kv["user:123"]
if "settings" in kv:
    print("Settings exist")

# Bulk operations with cache for speed
with kv.cache():
    for i in range(100000):
        kv[f"item:{i}"] = {"id": i, "data": f"value_{i}"}

That's it. No manual pickle/msgpack/json calls: values are packed on write and unpacked on read.
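To give a sense of what "works like a dict" means on top of SQLite, here is a minimal sketch that uses only the Python standard library. It is not KohakuVault's implementation: the `TinyVault` class, the `kv` table layout, and the JSON encoding are all assumptions chosen for illustration, and the real library auto-packs more types (such as numpy arrays).

```python
import json
import sqlite3
from collections.abc import MutableMapping


class TinyVault(MutableMapping):
    """Hypothetical dict-like store over SQLite (illustration only)."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value BLOB)"
        )

    def __setitem__(self, key, value):
        # Serialize on write so callers never handle bytes themselves.
        blob = json.dumps(value).encode("utf-8")
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, blob),
            )

    def __getitem__(self, key):
        row = self._db.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])  # deserialize on read

    def __delitem__(self, key):
        with self._db:
            cur = self._db.execute("DELETE FROM kv WHERE key = ?", (key,))
        if cur.rowcount == 0:
            raise KeyError(key)

    def __iter__(self):
        for (key,) in self._db.execute("SELECT key FROM kv ORDER BY key"):
            yield key

    def __len__(self):
        return self._db.execute("SELECT COUNT(*) FROM kv").fetchone()[0]


vault = TinyVault()
vault["settings"] = {"theme": "dark", "notifications": True}
vault["scores"] = [95.5, 87.3, 92.1]
print(vault["settings"], "settings" in vault, len(vault))
```

Subclassing `MutableMapping` supplies `in`, `get`, `update`, and similar methods from the five methods defined here, which is one common way to get dict-like behavior with little code.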

ColumnVault - Works Like Typed Lists

from kohakuvault import ColumnVault

cv = ColumnVault("data.db")

# Create typed columns
user_ids = cv.create_column("user_ids", "i64")
scores = cv.create_column("scores", "f64")
comments = cv.create_column("comments", "str:utf8")
metadata = cv.create_column("metadata", "msgpack")

# Use like lists
user_ids.append(1)
user_ids.extend([2, 3, 4, 5])

scores.append(95.5)
scores.extend([87.3, 92.1, 88.0])

# Indexing and slicing (the indices below assume the column already holds at least 20 values)
first_score = scores[0]
batch = scores[10:20]  # Fast batch read

# Updates
scores[5] = 99.0
scores[10:15] = [100.0, 101.0, 102.0, 103.0, 104.0]

# Iterate
for score in scores[:10]:
    print(score)

Built-in types: i64, f64, str:utf8, bytes, msgpack, cbor
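One reason a typed column can serve slices quickly is that fixed-width elements sit at predictable byte offsets, so a slice is one contiguous read. The sketch below illustrates that idea in memory with the standard `struct` module. The `I64Column` class is hypothetical and says nothing about how KohakuVault lays out data on disk.

```python
import struct

ELEMENT_SIZE = struct.calcsize("<q")  # 8 bytes per little-endian i64


class I64Column:
    """Hypothetical in-memory fixed-width column (illustration only)."""

    def __init__(self):
        self._buf = bytearray()

    def __len__(self):
        return len(self._buf) // ELEMENT_SIZE

    def append(self, value):
        self._buf += struct.pack("<q", value)

    def extend(self, values):
        values = list(values)
        # One pack call for the whole batch instead of one per element.
        self._buf += struct.pack(f"<{len(values)}q", *values)

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise ValueError("this sketch only supports step 1")
            count = max(0, stop - start)
            # Element i starts at byte i * ELEMENT_SIZE, so the slice is
            # a single contiguous byte range.
            first = start * ELEMENT_SIZE
            chunk = bytes(self._buf[first : first + count * ELEMENT_SIZE])
            return list(struct.unpack(f"<{count}q", chunk))
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("column index out of range")
        return struct.unpack_from("<q", self._buf, index * ELEMENT_SIZE)[0]


col = I64Column()
col.append(1)
col.extend([2, 3, 4, 5])
print(col[0], col[1:4], col[-1])  # 1 [2, 3, 4] 5
```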

Vector Storage

Store arrays/tensors efficiently:

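# Continues the ColumnVault example above (reuses `cv`).
# `model` and `documents` stand in for your own embedding model and texts.
import numpy as np

# Embeddings (768-dim vectors)
embeddings = cv.create_column("embeddings", "vec:f32:768")
embeddings.append(np.random.randn(768).astype(np.float32))
embeddings.extend([model.encode(text) for text in documents])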

# Images (28x28 grayscale)
images = cv.create_column("mnist", "vec:u8:28:28")
images.append(np.random.randint(0, 256, (28, 28), dtype=np.uint8))

# RGB images (3x224x224, channels first)
photos = cv.create_column("photos", "vec:u8:3:224:224")

# Matrices
matrices = cv.create_column("correlations", "vec:f64:100:100")

Vector types: vec:f32:768 (fixed), vec:i64:10:20 (2D), vec:u8:3:224:224 (3D), vec:f32 (arbitrary)
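The type strings above follow a `vec:<dtype>[:dim...]` pattern. As a rough way to reason about shapes and raw payload sizes, the sketch below parses such a string. `parse_vec_spec` and `ITEM_SIZES` are hypothetical helpers written for this README, not part of the kohakuvault API, and the byte count covers element data only (no headers or per-row overhead).

```python
from math import prod

# Byte widths for the element types that appear in this README.
ITEM_SIZES = {"u8": 1, "i64": 8, "f32": 4, "f64": 8}


def parse_vec_spec(spec):
    """Split 'vec:<dtype>[:dim...]' into (dtype, shape, raw_nbytes).

    shape and raw_nbytes are None when no dimensions are given,
    which corresponds to the arbitrary-shape form such as 'vec:f32'.
    """
    parts = spec.split(":")
    if len(parts) < 2 or parts[0] != "vec":
        raise ValueError(f"not a vector spec: {spec!r}")
    dtype = parts[1]
    if dtype not in ITEM_SIZES:
        raise ValueError(f"unknown element type in this sketch: {dtype!r}")
    shape = tuple(int(dim) for dim in parts[2:])
    if not shape:
        return dtype, None, None
    return dtype, shape, prod(shape) * ITEM_SIZES[dtype]


print(parse_vec_spec("vec:f32:768"))       # ('f32', (768,), 3072)
print(parse_vec_spec("vec:u8:3:224:224"))  # ('u8', (3, 224, 224), 150528)
print(parse_vec_spec("vec:f32"))           # ('f32', None, None)
```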

Vector Search

Find similar vectors:

from kohakuvault import VectorKVault

# Create search index
search = VectorKVault("search.db", dimensions=384, metric="cosine")

# Add vectors
for doc, embedding in zip(documents, embeddings):
    search.insert(embedding, doc.encode())

# Search for similar
results = search.search(query_embedding, k=10)
for vec_id, distance, doc in results:  # avoid shadowing the built-in id()
    print(f"{distance:.3f}: {doc.decode()}")

# Get single closest
closest = search.get(query_embedding)

Metrics: cosine (text), l2 (images), l1, hamming
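For readers unfamiliar with the four metric names, the sketch below spells each one out in plain Python and runs a brute-force k-NN over a small list. It is only a reference for what the distances mean. The function names are made up for this example, the hamming definition (count of differing positions) is an assumption, and none of this reflects how VectorKVault computes or indexes results.

```python
import heapq
import math


def cosine_distance(a, b):
    # 1 - cosine similarity; assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming_distance(a, b):
    # Assumption: number of positions where the elements differ.
    return sum(x != y for x, y in zip(a, b))


METRICS = {
    "cosine": cosine_distance,
    "l2": l2_distance,
    "l1": l1_distance,
    "hamming": hamming_distance,
}


def knn(query, vectors, k, metric="cosine"):
    """Return the k nearest (distance, index) pairs, closest first."""
    dist = METRICS[metric]
    scored = ((dist(query, vec), idx) for idx, vec in enumerate(vectors))
    return heapq.nsmallest(k, scored)


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for distance, idx in knn([1.0, 0.0], vectors, k=2, metric="cosine"):
    print(f"{distance:.3f}: vector {idx}")
```

With cosine distance defined this way, `1 - distance` recovers the cosine similarity.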

Performance Tips

Use Cache for Bulk Writes

# 10-100x faster with cache
with kv.cache():
    for i in range(100000):
        kv[f"key:{i}"] = data

# Or enable the cache manually and flush it yourself when done
kv.enable_cache()
# ... write operations ...
kv.flush_cache()
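The general idea behind a write-back cache is to buffer writes in memory and commit them to SQLite in one transaction, rather than paying for a transaction per write. The sketch below shows that pattern with the standard `sqlite3` module. The `WriteBackCache` class, its `flush_threshold` parameter, and the `kv` table are hypothetical; KohakuVault's actual cache may behave differently.

```python
import sqlite3


class WriteBackCache:
    """Hypothetical write buffer over a SQLite table (illustration only)."""

    def __init__(self, conn, flush_threshold=1000):
        self._conn = conn
        self._pending = {}
        self._flush_threshold = flush_threshold

    def put(self, key, value):
        self._pending[key] = value  # a later write to the same key wins
        if len(self._pending) >= self._flush_threshold:
            self.flush()  # auto-flush once the buffer is large enough

    def flush(self):
        if not self._pending:
            return
        with self._conn:  # one transaction for the whole batch
            self._conn.executemany(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                self._pending.items(),
            )
        self._pending.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.flush()  # nothing is left unwritten when the block ends
        return False


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")

with WriteBackCache(conn, flush_threshold=10_000) as cache:
    for i in range(2500):
        cache.put(f"key:{i}", f"value_{i}")
    rows_before_flush = count_rows(conn)  # still buffered in memory

rows_after_flush = count_rows(conn)
print(rows_before_flush, rows_after_flush)  # 0 2500
```

The trade-off of this pattern is that buffered writes are lost if the process dies before a flush, which is why an explicit flush (or a context manager) matters.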

Batch Operations Beat Loops

# Slow: loop
for i in range(1000):
    column.append(value)

# Fast: bulk extend (450x faster!)
column.extend([value] * 1000)

# Slow: loop read
for i in range(100):
    item = column[i]

# Fast: slice read (200x faster!)
batch = column[0:100]
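To check on your own machine why one batched read tends to beat many single reads, the sketch below times both patterns against a plain SQLite table using `sqlite3` and `timeit`. The `col` table and both helper functions are made up for this example, and the timings it prints describe raw SQLite on your hardware, not KohakuVault's figures quoted above.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE col (idx INTEGER PRIMARY KEY, value REAL)")
with conn:
    conn.executemany(
        "INSERT INTO col (idx, value) VALUES (?, ?)",
        ((i, i * 0.5) for i in range(1000)),
    )


def read_loop(start, stop):
    # One query per element: per-call overhead is paid (stop - start) times.
    return [
        conn.execute("SELECT value FROM col WHERE idx = ?", (i,)).fetchone()[0]
        for i in range(start, stop)
    ]


def read_slice(start, stop):
    # One range query returns the whole slice.
    rows = conn.execute(
        "SELECT value FROM col WHERE idx >= ? AND idx < ? ORDER BY idx",
        (start, stop),
    )
    return [value for (value,) in rows]


loop_s = timeit.timeit(lambda: read_loop(0, 100), number=200)
slice_s = timeit.timeit(lambda: read_slice(0, 100), number=200)
print(f"loop: {loop_s:.4f}s  slice: {slice_s:.4f}s")
```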

Use DataPacker for Custom Serialization

from kohakuvault import DataPacker

# Primitives
packer_i64 = DataPacker("i64")
packed = packer_i64.pack(42)

# MessagePack for dicts/lists
packer_msg = DataPacker("msgpack")
packed = packer_msg.pack({"key": "value"})

# Bulk operations (3-35x faster!)
values = list(range(100000))
packed = packer_i64.pack_many(values)
unpacked = packer_i64.unpack_many(packed, count=100000)
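The gain from bulk packing comes from crossing the Python call boundary once instead of once per value. The sketch below shows the same contrast with the standard `struct` module. `pack_loop`, `pack_bulk`, and `unpack_bulk` are stand-in functions for illustration, and the little-endian i64 layout is an assumption, not a statement about DataPacker's wire format.

```python
import struct


def pack_loop(values):
    # One struct.pack call per value, then a join.
    return b"".join(struct.pack("<q", v) for v in values)


def pack_bulk(values):
    # A single struct.pack call for the whole sequence.
    return struct.pack(f"<{len(values)}q", *values)


def unpack_bulk(data, count):
    # A single struct.unpack call; count must match the payload length.
    return list(struct.unpack(f"<{count}q", data))


values = list(range(-5, 5))
packed = pack_bulk(values)
print(len(packed))  # 80 (10 values x 8 bytes)
print(unpack_bulk(packed, count=len(values)) == values)  # True
```

Both packing functions produce identical bytes; only the number of calls differs.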

Real-World Example

from kohakuvault import KVault, ColumnVault, VectorKVault
import numpy as np

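# Placeholders assumed to exist: documents, titles_list, embeddings_list, model
# All components share one database file
db = "app.db"

# 1. Store documents
# Each doc_id must have the form "doc:<index>" so step 4 can look it up
kv = KVault(db, table="docs")
for doc_id, content in documents:
    kv[doc_id] = content  # Auto-packs to MessagePack or keeps bytes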

# 2. Store metadata in columns
cv = ColumnVault(db)
ids = cv.create_column("doc_ids", "i64")
titles = cv.create_column("titles", "str:utf8")
vectors = cv.create_column("embeddings", "vec:f32:384")

for i, (title, embedding) in enumerate(zip(titles_list, embeddings_list)):
    ids.append(i)
    titles.append(title)
    vectors.append(embedding)

# 3. Build search index
search = VectorKVault(db, table="search", dimensions=384, metric="cosine")
for i, embedding in enumerate(embeddings_list):
    search.insert(embedding, str(i).encode())

# 4. Search and retrieve
query = model.encode("search query")
results = search.search(query, k=5)

for rank, (vec_id, distance, doc_idx_bytes) in enumerate(results, 1):
    idx = int(doc_idx_bytes.decode())
    title = titles[idx]
    content = kv[f"doc:{idx}"]
    print(f"{rank}. {title} (similarity: {1-distance:.2f})")

Performance

Benchmarks on an Apple M1 Max:

  • KVault: 24K writes/s, 63K reads/s (with cache)
  • Column extend: 12.5M ops/s (i64)
  • Column slice: 2.3M slices/s (100 items per slice)
  • Vector unpack: bulk is 35x faster than a loop
  • MessagePack: 42% smaller than JSON

See examples/benchmark.py for the benchmark script.

Features

  • Auto-packing: Store numpy, dict, list, int, float, str automatically
  • Vector search: k-NN similarity search (cosine, L2, L1, hamming)
  • Vector storage: Arrays/tensors in columns (1-byte overhead)
  • Caching: Write-back cache with auto-flush
  • Batch operations: Slice read/write, pack_many/unpack_many
  • Single file: All components share one SQLite database
  • Ordered containers: CSBTree (B+Tree), SkipList (lock-free)
  • Type wrappers: MsgPack(data), Json(data), Cbor(data), Pickle(data)

Storage Types

KVault (dict-like):

  • Any Python object (auto-packed)
  • Works with cache for performance

ColumnVault (list-like):

  • i64, f64 - Integers, floats
  • str:utf8 - Strings
  • bytes, bytes:N - Raw bytes
  • msgpack, cbor - Structured data
  • vec:f32:768, vec:u8:28:28 - Arrays/vectors

VectorKVault (search):

  • k-NN similarity search
  • Multiple metrics
  • Numpy arrays

Development

pip install -e ".[dev]"
pytest
cargo test
cargo clippy

Documentation

  • examples/basic_usage.py - Quick examples
  • examples/all_usage.py - Complete feature tour
  • docs/ - Detailed guides

License

Apache 2.0
