Local-first, privacy-by-default embedded vector database. Your data never leaves your machine.

Maintainers

davidobot

LodeDB

A fast, exact embedded vector database for local RAG: in-process, on-disk, no server.

Built by Egoist Machines, Inc. - efficient full-stack infrastructure for reliable AI systems.

Most embedded vector databases stop at the CPU. LodeDB runs the same on-disk index on the GPU when you have one: batched search hits 24k queries/sec on an A10 and 50k qps on an L40S, 2.8× to 4.8× the all-CPU ceiling, with recall unchanged. It also persists changed rows incrementally, so a commit stays sub-millisecond even at 1M vectors.

Fast on a laptop. Faster on a GPU. Exact every time. Never phones home.

GPU-resident batch search: an fp16 copy of the index lives on the GPU, scored with a tiled GEMM plus a streaming top-k ([gpu], Linux/CUDA).
O(changed) persistence: commits only the rows that changed, 173× to 1,308× faster than a full rewrite.
Compact storage: the MIT TurboVec core packs vectors into 2/4-bit codes and scans them with SIMD CPU kernels.
In-process, on-disk (.tvim/.tvd/.jsd): no daemon, no account, no API key.
Safe concurrency: one writer and many lock-free readers per path; every commit is crash-atomic and rolls back to the last committed state on failure, never a torn store.
Private by default: text, ids, and vectors stay local; telemetry is metrics-only (counts, bytes, latency), never raw payloads.
Local embeddings: sentence-transformers on CUDA, MPS, or CPU.
Batteries included: a lodedb CLI, a loopback dev server, an MCP server, and a LangChain VectorStore adapter.
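
To make the compact-storage bullet concrete, here is a minimal NumPy sketch of packing vectors into 4-bit codes, two per byte. It assumes plain uniform scalar quantization over a fixed range; TurboVec's actual quantizer, calibration, and memory layout are not described on this page, and the names pack_4bit and unpack_4bit are hypothetical.

```python
import numpy as np

def pack_4bit(vectors, lo=-1.0, hi=1.0):
    """Quantize floats in [lo, hi] to 16 levels and pack two codes per byte."""
    if vectors.shape[1] % 2:
        raise ValueError("dimension must be even to pack two codes per byte")
    scaled = (np.clip(vectors, lo, hi) - lo) / (hi - lo)    # map to [0, 1]
    codes = np.rint(scaled * 15).astype(np.uint8)           # 16 levels: 0..15
    return (codes[:, 0::2] << 4) | codes[:, 1::2]           # high nibble | low nibble

def unpack_4bit(packed, lo=-1.0, hi=1.0):
    """Reconstruct approximate float32 vectors from packed 4-bit codes."""
    codes = np.empty((packed.shape[0], packed.shape[1] * 2), dtype=np.uint8)
    codes[:, 0::2] = packed >> 4
    codes[:, 1::2] = packed & 0x0F
    return (codes.astype(np.float32) / 15) * (hi - lo) + lo

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(4, 8)).astype(np.float32)
packed = pack_4bit(x)
restored = unpack_4bit(packed)
max_err = float(np.abs(restored - x).max())
print(packed.nbytes, "bytes packed vs", x.nbytes, "bytes as float32")
print("max abs error:", max_err)
```

With 16 levels over a range of width 2, the rounding error per component is bounded by half a step (1/15), which is the kind of quantization error the recall figures later on this page are measuring against.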

🏢 Enterprise: The LodeDB core is Apache-2.0 and free to use. Enterprise licensing is available for commercial support, managed and at-scale serving, and on-prem / BYOC deployment. Contact sales@egoistmachines.com.

Install

pip install lodedb

That's it. Prebuilt wheels cover Linux, macOS (Apple Silicon and Intel), and Windows on Python 3.11+, and bundle the TurboVec (Rust) core, so there's nothing to compile. Confirm the install with lodedb doctor. Optional extras:

pip install "lodedb[gpu]"            # GPU-resident scan (Linux/CUDA)
pip install "lodedb[mcp,langchain]"  # MCP server + LangChain adapter

Build from source (contributors, or a platform without a wheel)

Needs a Rust toolchain and a CBLAS provider (Accelerate on macOS, libopenblas-dev on Linux). uv builds and bundles the core for you:

git clone https://github.com/Egoist-Machines/LodeDB && cd LodeDB
uv sync                                 # builds + bundles the TurboVec core via maturin
uv sync --extra mcp --extra langchain   # + MCP server, LangChain adapter
uv sync --extra gpu                     # + GPU-resident scan (Linux/CUDA)

Run with uv run (e.g. uv run lodedb doctor).

Quickstart

from lodedb import LodeDB

db = LodeDB(path="./data", model="minilm")   # "minilm" (fast) | "bge" (quality)

fox = db.add("the quick brown fox jumps", metadata={"topic": "animals"})
db.add("a lazy dog sleeps all day", metadata={"topic": "animals"})

for score, doc_id, meta in db.search("fox", k=5):
    print(score, doc_id, meta)

for hits in db.search_many(["fox", "dog"], k=5):   # batched; the GPU can serve this
    print([(h.score, h.id, h.metadata) for h in hits])

# filter by metadata: exact match, plus $gt/$gte/$lt/$lte/$in/$nin/$exists and $and/$or/$not
db.search("fox", k=5, filter={"topic": "animals"})                      # bare scalar = exact
db.search("fox", k=5, filter={"$or": [{"topic": "animals"}, {"year": {"$gte": 2020}}]})

db.get(fox)     # -> "the quick brown fox jumps"  (text retained by default)
db.persist()    # durable .tvim/.tvd/.jsd snapshot; replays on reopen

Reopen with LodeDB(path="./data"); no migration step. Original text is kept in a .tvtext sidecar for db.get; pass store_text=False to keep none. Presets are minilm (384-dim) and bge (768-dim), with weights pulled from Hugging Face on first use. More in examples/.
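
The filter operators listed in the Quickstart can be read as a small predicate language over the metadata dict. The sketch below is one plausible evaluator, not LodeDB's implementation: the operator names come from the Quickstart comment, while the helper names (matches, _field_ok) and the edge-case semantics (top-level keys are ANDed, comparisons fail on a missing field, $nin passes on a missing field) are assumptions.

```python
import operator

_MISSING = object()
_COMPARE = {"$gt": operator.gt, "$gte": operator.ge,
            "$lt": operator.lt, "$lte": operator.le}

def _field_ok(meta, field, cond):
    value = meta.get(field, _MISSING)
    if not isinstance(cond, dict):            # bare scalar = exact match
        return value == cond
    for op, arg in cond.items():
        if op == "$exists":
            ok = (value is not _MISSING) == bool(arg)
        elif op == "$nin":
            ok = value is _MISSING or value not in arg
        elif value is _MISSING:               # assumed: comparisons fail on a missing field
            ok = False
        elif op == "$in":
            ok = value in arg
        elif op in _COMPARE:
            ok = _COMPARE[op](value, arg)
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True

def matches(meta, flt):
    """True if metadata dict `meta` satisfies filter `flt` (keys are ANDed)."""
    for key, cond in flt.items():
        if key == "$and":
            ok = all(matches(meta, sub) for sub in cond)
        elif key == "$or":
            ok = any(matches(meta, sub) for sub in cond)
        elif key == "$not":
            ok = not matches(meta, cond)
        else:
            ok = _field_ok(meta, key, cond)
        if not ok:
            return False
    return True

docs = [{"topic": "animals"},
        {"topic": "cars", "year": 2021},
        {"topic": "cars", "year": 2010}]
flt = {"$or": [{"topic": "animals"}, {"year": {"$gte": 2020}}]}
results = [matches(d, flt) for d in docs]
print(results)   # [True, True, False]
```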

Need to read a store another process is writing to? Open it read-only. It takes no writer lock, so it never blocks on (or is blocked by) the writer:

reader = LodeDB.open_readonly("./data")   # or LodeDB(path="./data", read_only=True)
reader.search("fox", k=5)                 # reads a committed snapshot
reader.add("nope")                        # raises ReadOnlyError

GPU-resident index

With the [gpu] extra on a CUDA host, LodeDB reconstructs the compact index into an fp16 matrix resident on the GPU and scores batched search_many with a tiled GEMM plus a streaming top-k. It is opt-in and lazy: single queries, non-CUDA hosts, and GPU-memory rejection fall back to the CPU scan, which stays the source of truth.
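
The scoring scheme described above (a GEMM over one tile of the index at a time, with a running top-k merged across tiles) can be sketched on the CPU with NumPy. This is an illustration of the technique under stated assumptions, not LodeDB's CUDA kernel: the tile size, inner-product scoring, and the function name tiled_topk are all hypothetical.

```python
import numpy as np

def tiled_topk(queries, index_fp16, k, tile_rows=1024):
    """Exact inner-product top-k, scoring one tile of index rows at a time."""
    q = queries.astype(np.float32)
    best_scores = np.full((q.shape[0], k), -np.inf, dtype=np.float32)
    best_ids = np.full((q.shape[0], k), -1, dtype=np.int64)
    for start in range(0, index_fp16.shape[0], tile_rows):
        tile = index_fp16[start:start + tile_rows].astype(np.float32)
        scores = q @ tile.T                                # one GEMM per tile
        ids = np.arange(start, start + tile.shape[0], dtype=np.int64)
        # Streaming top-k: merge the running winners with this tile, keep the best k.
        cand_scores = np.concatenate([best_scores, scores], axis=1)
        cand_ids = np.concatenate(
            [best_ids, np.broadcast_to(ids, scores.shape)], axis=1)
        keep = np.argpartition(-cand_scores, k - 1, axis=1)[:, :k]
        best_scores = np.take_along_axis(cand_scores, keep, axis=1)
        best_ids = np.take_along_axis(cand_ids, keep, axis=1)
    order = np.argsort(-best_scores, axis=1)               # final sort, best first
    return (np.take_along_axis(best_scores, order, axis=1),
            np.take_along_axis(best_ids, order, axis=1))

rng = np.random.default_rng(0)
index_fp16 = rng.standard_normal((5000, 32)).astype(np.float16)
queries = rng.standard_normal((8, 32)).astype(np.float32)
scores, ids = tiled_topk(queries, index_fp16, k=5)
print(ids[0], scores[0])
```

The point of the streaming merge is that the full queries-by-rows score matrix never has to exist at once: only one tile of scores plus k running candidates per query is live, which is what keeps memory bounded as the batch grows.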

GPU throughput climbs with batch size while the CPU scan is flat. Same 4-bit index (d=1536, 100K), same host, only the scoring step differs. Crossover is around batch 50:

query batch	A10 GPU (q/s)	L40S GPU (q/s)
1	261	432
16	3,531	5,562
64	11,463	18,175
256	19,998	39,449
1024	24,037	50,326

Vanilla TurboVec CPU (all threads) on the same boxes: 8,497 q/s (A10 host), 10,420 q/s (L40S host). At batch 1024 the GPU is 2.8× / 4.8× that, and it scales with GPU class.
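
The ratios quoted here follow directly from the numbers on this page. A short check, using only the table values and the CPU baselines above (the variable names are illustrative):

```python
cpu_qps = {"A10": 8_497, "L40S": 10_420}
gpu_qps = {
    "A10":  {1: 261, 16: 3_531, 64: 11_463, 256: 19_998, 1024: 24_037},
    "L40S": {1: 432, 16: 5_562, 64: 18_175, 256: 39_449, 1024: 50_326},
}

# GPU throughput at the largest batch divided by the all-threads CPU scan.
speedup = {h: gpu_qps[h][1024] / cpu_qps[h] for h in cpu_qps}
# Smallest measured batch at which the GPU is ahead of the CPU scan.
first_win = {h: min(b for b, q in gpu_qps[h].items() if q > cpu_qps[h])
             for h in cpu_qps}

for host in cpu_qps:
    print(f"{host}: {speedup[host]:.1f}x at batch 1024, "
          f"GPU first ahead at measured batch {first_win[host]}")
# A10: 2.8x at batch 1024, GPU first ahead at measured batch 64
# L40S: 4.8x at batch 1024, GPU first ahead at measured batch 64
```

On both hosts the GPU trails the CPU at batch 16 and leads at batch 64, which brackets the stated crossover of around batch 50.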

Figure: GPU throughput vs batch size, A10 and L40S vs the vanilla CPU scan.

Recall is unchanged: the GPU scores the exact 4-bit reconstruction, so R@1 tracks the CPU scan across datasets and bit-widths, and edges ahead on GloVe-200 where quantization error is largest.
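
For readers who want to reproduce this kind of comparison on their own data, R@1 can be measured as the fraction of queries whose true nearest neighbour (from a float32 exact scan) is also the top hit on the quantized index. The sketch below assumes inner-product scoring, uniform 4-bit quantization, and synthetic Gaussian data; none of that is taken from LodeDB's benchmark harness, and the helper names are hypothetical.

```python
import numpy as np

def top1(queries, base):
    """Index of the highest inner-product row for each query (exact scan)."""
    return np.argmax(queries @ base.T, axis=1)

def recall_at_1(truth, found):
    """Fraction of queries where the returned top hit equals the true one."""
    return float(np.mean(truth == found))

def quantize(x, bits):
    """Uniform scalar quantization to 2**bits levels, returned as float32."""
    levels = 2 ** bits - 1
    lo, hi = float(x.min()), float(x.max())
    codes = np.rint((x - lo) / (hi - lo) * levels)
    return (codes / levels * (hi - lo) + lo).astype(np.float32)

rng = np.random.default_rng(0)
base = rng.standard_normal((2000, 64)).astype(np.float32)
queries = rng.standard_normal((100, 64)).astype(np.float32)

truth = top1(queries, base)
recon = quantize(base, bits=4)                       # what the CPU scan would score
recon_fp16 = recon.astype(np.float16).astype(np.float32)   # fp16 copy of the same rows

r_4bit = recall_at_1(truth, top1(queries, recon))
r_fp16 = recall_at_1(truth, top1(queries, recon_fp16))
print(f"R@1 4-bit: {r_4bit:.3f}   R@1 4-bit via fp16: {r_fp16:.3f}")
```

Comparing the two printed values shows how much the fp16 round-trip changes the ranking relative to the quantization step itself.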

Figure: Recall, vanilla CPU scan vs GPU fp16 reconstruction.

Other in-process vector databases stay CPU-bound. Alibaba's zvec reports about 8.4k q/s (VectorDBBench, 16 vCPUs, Cohere 768-dim). That is the same class as the TurboVec CPU scan but a different benchmark regime from ours, so read it as a CPU-class baseline rather than a like-for-like comparison. The GPU-resident path is what clears it.

Scope. GPU search is Linux/CUDA-only and opt-in ([gpu]). macOS scans on the CPU by default; a first-class opt-in MPS exact scan exists (LODEDB_MPS_DIRECT_TURBOVEC) but NEON stays the default. On the measured M1 it was slower than NEON at every batch size; newer Apple GPUs should be re-measured before any default change. See docs/benchmarks.md and docs/architecture.md.

Delta persistence

Most embedded indexes rewrite the whole file on every change (O(N)). LodeDB writes only the rows that changed (O(changed)), so a 1,000-row commit stays sub-millisecond at every measured corpus size, from 100K to 1M:

corpus	full rewrite	delta export	speedup
100K	42.4 ms	0.25 ms	173×
500K	190.4 ms	0.24 ms	782×
1M	404.9 ms	0.31 ms	1,308×

Figure: Persist time, full rewrite vs delta export.

The GPU path makes reads fast; the delta makes writes cheap. The on-disk format stays a plain snapshot that replays on reopen.
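
One simple way to get O(changed) writes is to track which rows are dirty and append only those to a log that is replayed on reopen. The sketch below illustrates that idea with a fixed-size record format; it is not LodeDB's .tvim/.tvd/.jsd layout (which this page does not specify), and the class name DeltaLog, the record format, and ROW_BYTES are assumptions.

```python
import os
import struct
import tempfile

ROW_BYTES = 16                               # fixed encoded-row size, assumed for the sketch
RECORD = struct.Struct(f"<Q{ROW_BYTES}s")    # 8-byte row id + encoded row

class DeltaLog:
    def __init__(self, path):
        self.path = path
        self.rows = {}
        self.dirty = set()

    def upsert(self, row_id, payload):
        if len(payload) != ROW_BYTES:
            raise ValueError("payload must be ROW_BYTES long")
        self.rows[row_id] = payload
        self.dirty.add(row_id)

    def persist(self):
        """Append only the rows changed since the last persist; return bytes written."""
        written = 0
        with open(self.path, "ab") as f:
            for row_id in sorted(self.dirty):
                written += f.write(RECORD.pack(row_id, self.rows[row_id]))
        self.dirty.clear()
        return written

    @classmethod
    def load(cls, path):
        """Rebuild state by replaying records in order; later records win."""
        store = cls(path)
        with open(path, "rb") as f:
            while len(chunk := f.read(RECORD.size)) == RECORD.size:
                row_id, payload = RECORD.unpack(chunk)
                store.rows[row_id] = payload
        return store

with tempfile.TemporaryDirectory() as d:
    store = DeltaLog(os.path.join(d, "rows.delta"))
    for i in range(10_000):
        store.upsert(i, i.to_bytes(ROW_BYTES, "little"))
    first = store.persist()                  # initial load: every row is new
    for i in (7, 42, 9_999):
        store.upsert(i, b"\xff" * ROW_BYTES)
    second = store.persist()                 # only the three changed rows
    reloaded = DeltaLog.load(store.path)

print(first, second)   # 240000 72
```

The second commit writes 72 bytes regardless of how many rows the store holds. A real format would also need periodic compaction so the log does not grow without bound, which this sketch leaves out.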

The raw-text store (store_text=True) is journaled the same way: an incremental commit appends a small .txd text delta instead of rewriting the whole document_id -> text map, so keeping text retrieval enabled keeps commits O(changed) too. Isolated, the per-commit text write drops from a full-map rewrite (~57 ms at 20K docs, ~244 ms at 80K) to a flat ~0.7 ms regardless of corpus size.

And the rest of an incremental add() is O(changed) too: a single-doc update no longer rebuilds the whole index layout or rewrites the full text map on the commit path, so write latency stays flat as the corpus grows instead of climbing with it.

Benchmarks

All artifacts are metrics-only (counts, bytes, latency), never payloads. Full methodology and the complete figure set are in docs/benchmarks.md; each benchmarks/ folder has a README and a one-line reproduction command.

Local is the common case. On an Apple M1 (MiniLM, 20K docs) the CPU scan is ~0.25 ms p50, and end-to-end single-query latency is 5.7 ms p50.
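
A p50 figure like the ones above is the median of many timed calls after a warm-up. A minimal stdlib sketch for measuring it on your own workload follows; the helper name p50_ms and the run counts are illustrative, and the timed function here is a stand-in rather than a LodeDB query.

```python
import statistics
import time

def p50_ms(fn, runs=200, warmup=20):
    """Median wall-clock latency of fn() in milliseconds, after a warm-up."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)

# Stand-in workload; replace with e.g. a single-query search on your own store.
latency = p50_ms(lambda: sum(range(10_000)))
print(f"p50: {latency:.3f} ms")
```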

Figure: Single-query latency on a laptop.

CLI

lodedb doctor      # capability report: embedding / GPU / TurboVec backend
lodedb index ...   # build / add to an on-disk index
lodedb query ...   # search
lodedb serve       # loopback dev server (127.0.0.1, no auth)
lodedb mcp         # stdio MCP server for agent memory
lodedb benchmark   # local, metrics-only benchmark

Concurrency & durability

Single writer, many readers, per path. One handle holds the path open for writing at a time (an exclusive OS advisory lock). A second writer waits up to LODEDB_PERSIST_LOCK_TIMEOUT (default 30s) for the first to close, then raises ConcurrentWriterError. Read-only handles (LodeDB.open_readonly(path) or read_only=True; used by lodedb query/get) take no lock, so they read one consistent committed snapshot while a writer is open, but they do not see the writer's in-flight changes (no live cross-process refresh). Within one process the engine serializes operations under an in-process lock, so the threaded lodedb serve safely shares one handle.
Crash-atomic commits. A commit spans several files, but it is sealed by atomically swapping one <key>.commit.json root pointer over generation-addressed artifacts, so a crash mid-commit rolls back to the last committed generation on reopen (never a torn, half-applied store) and readers always load one consistent generation.
Durability is fast by default. Commits are atomic but not fsync'd. Pass durability="fsync" (or --durability fsync / LODEDB_DURABILITY=fsync) to fsync each file and its directory on commit for power-loss durability, at some commit-throughput cost.
Local filesystems only. The OS advisory lock is unreliable on NFS/SMB.
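
The crash-atomic commit described above (write generation-addressed artifacts first, then atomically swap a single root pointer) can be sketched with the standard library. Only the <key>.commit.json pointer name comes from this page; the artifact naming, the JSON schema, and the function names are assumptions, and the fsync flag mirrors the durability="fsync" option only in spirit.

```python
import json
import os
import tempfile

def _write(path, data, fsync):
    with open(path, "wb") as f:
        f.write(data)
        if fsync:
            f.flush()
            os.fsync(f.fileno())

def read_generation(store_dir, key):
    """Return the committed generation number, or 0 if nothing is committed."""
    try:
        with open(os.path.join(store_dir, f"{key}.commit.json"), encoding="utf-8") as f:
            return json.load(f)["generation"]
    except FileNotFoundError:
        return 0

def artifact_path(store_dir, key, generation):
    return os.path.join(store_dir, f"{key}.gen{generation}.bin")   # hypothetical naming

def commit(store_dir, key, payload, fsync=False):
    """Write a new generation, then publish it by swapping the root pointer."""
    generation = read_generation(store_dir, key) + 1
    _write(artifact_path(store_dir, key, generation), payload, fsync)
    pointer = os.path.join(store_dir, f"{key}.commit.json")
    _write(pointer + ".tmp", json.dumps({"generation": generation}).encode(), fsync)
    os.replace(pointer + ".tmp", pointer)      # the commit point: readers see old or new
    if fsync and os.name == "posix":           # persist the rename itself
        fd = os.open(store_dir, os.O_RDONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
    return generation

def load(store_dir, key):
    """Read the artifact named by the committed pointer."""
    generation = read_generation(store_dir, key)
    with open(artifact_path(store_dir, key, generation), "rb") as f:
        return f.read()

with tempfile.TemporaryDirectory() as d:
    commit(d, "idx", b"v1")
    commit(d, "idx", b"v2", fsync=True)
    # Simulated crash: generation 3 is written but the pointer is never swapped.
    _write(artifact_path(d, "idx", 3), b"torn", fsync=False)
    recovered = load(d, "idx")

print(recovered)   # b'v2'
```

Because readers only ever follow the pointer, a half-written generation is invisible to them, which is the property the bullet calls "never a torn, half-applied store".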

Limitations

Exact scan, no ANN. Built for small-to-mid corpora where exact recall matters, not billion-scale.
GPU-resident scan is Linux/CUDA-only and opt-in ([gpu]). macOS has a first-class, opt-in Metal (MPS) exact scan (LODEDB_MPS_DIRECT_TURBOVEC=auto); NEON is the default and was faster on the measured M1, so the MPS scan stays off by default until newer Apple GPUs are re-measured.
Single queries run on the CPU; the GPU serves batched search_many.
Single writer per path. One writer at a time (many concurrent readers), with no live cross-process refresh, on local filesystems only. See Concurrency & durability.
Model weights download from Hugging Face on first use, then cache locally.

TurboVec

The compact core is the upstream MIT TurboVec project (© Ryan Codrai), vendored under third_party/turbovec/ with its license preserved. LodeDB's lifecycle patches (encoded-row export/import, upsert_with_ids, calibration) are Apache-2.0. See NOTICE.

License

Apache-2.0 (LICENSE). The bundled TurboVec core is MIT (NOTICE, third_party/turbovec/LICENSE). "LodeDB" and "Egoist Machines" are trademarks; Apache-2.0 grants no trademark rights (§6).

Enterprise licensing and commercial support are available from Egoist Machines, Inc.: contact sales@egoistmachines.com.

Contributing & security

PRs welcome; see CONTRIBUTING.md. Report security issues privately per SECURITY.md, not in public issues. Other bugs and requests go to the issue tracker.

Release history

version	released
0.4.0	Jun 25, 2026
0.3.0	Jun 24, 2026
0.2.1	Jun 23, 2026
0.2.0	Jun 23, 2026
0.1.2 (this version)	Jun 22, 2026
0.1.1	Jun 20, 2026
0.1.0	Jun 19, 2026
0.0.1	Jun 15, 2026

Download files

Source Distribution

file	size	uploaded	tags
lodedb-0.1.2.tar.gz	305.6 kB	Jun 22, 2026	Source

Built Distributions

file	size	uploaded	tags
lodedb-0.1.2-cp39-abi3-win_amd64.whl	826.5 kB	Jun 22, 2026	CPython 3.9+, Windows x86-64
lodedb-0.1.2-cp39-abi3-manylinux_2_28_x86_64.whl	13.3 MB	Jun 22, 2026	CPython 3.9+, manylinux: glibc 2.28+ x86-64
lodedb-0.1.2-cp39-abi3-macosx_11_0_arm64.whl	960.7 kB	Jun 22, 2026	CPython 3.9+, macOS 11.0+ ARM64
lodedb-0.1.2-cp39-abi3-macosx_10_12_x86_64.whl	891.3 kB	Jun 22, 2026	CPython 3.9+, macOS 10.12+ x86-64

File details

Details for the file lodedb-0.1.2.tar.gz.

File metadata

Download URL: lodedb-0.1.2.tar.gz
Upload date: Jun 22, 2026
Size: 305.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lodedb-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`33a297748899370891d36f06189016dfa70bc91289906597508a5050cfdeb12c`
MD5	`a39158676e768387e6c5056a1264eae9`
BLAKE2b-256	`4d8b8f97b2fc3b76d59f3278539d4f43a6d900578cc7aa083dddc1a168930cb2`
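
A downloaded file can be checked against the SHA256 digest listed above with the standard library. The sketch below streams the file in chunks so large wheels do not need to fit in memory; the helper names are illustrative, and the demo hashes a small temporary file rather than the real sdist, which it does not download.

```python
import hashlib
import hmac
import os
import tempfile

def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected_hex):
    """True if the file's SHA256 matches the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())

# Digest published on this page for lodedb-0.1.2.tar.gz.
SDIST_SHA256 = "33a297748899370891d36f06189016dfa70bc91289906597508a5050cfdeb12c"

with tempfile.TemporaryDirectory() as d:
    sample = os.path.join(d, "sample.bin")
    data = b"not the real sdist"
    with open(sample, "wb") as f:
        f.write(data)
    ok_self = verify(sample, hashlib.sha256(data).hexdigest())
    ok_sdist = verify(sample, SDIST_SHA256)

print(ok_self, ok_sdist)   # True False
```

To check a real download, call verify("lodedb-0.1.2.tar.gz", SDIST_SHA256) on the file you fetched.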

Provenance

The following attestation bundles were made for lodedb-0.1.2.tar.gz:

Publisher: release.yml on Egoist-Machines/LodeDB

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: lodedb-0.1.2.tar.gz
- Subject digest: 33a297748899370891d36f06189016dfa70bc91289906597508a5050cfdeb12c
- Sigstore transparency entry: 1915703110
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: Egoist-Machines/LodeDB@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/Egoist-Machines
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Trigger Event: push

File details

Details for the file lodedb-0.1.2-cp39-abi3-win_amd64.whl.

File metadata

Download URL: lodedb-0.1.2-cp39-abi3-win_amd64.whl
Upload date: Jun 22, 2026
Size: 826.5 kB
Tags: CPython 3.9+, Windows x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lodedb-0.1.2-cp39-abi3-win_amd64.whl
Algorithm	Hash digest
SHA256	`09e4181914f4a2e105bbd06b4625cec9f3d587be6659d52cc321a222a7b077a0`
MD5	`f5eb84ae2aabf4bbab8caed667d7d16b`
BLAKE2b-256	`55319b05f400dcd5f851a8e28f52c930f238a61bceff44b8eb4d00c53cca0f62`

Provenance

The following attestation bundles were made for lodedb-0.1.2-cp39-abi3-win_amd64.whl:

Publisher: release.yml on Egoist-Machines/LodeDB

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: lodedb-0.1.2-cp39-abi3-win_amd64.whl
- Subject digest: 09e4181914f4a2e105bbd06b4625cec9f3d587be6659d52cc321a222a7b077a0
- Sigstore transparency entry: 1915703228
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: Egoist-Machines/LodeDB@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/Egoist-Machines
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Trigger Event: push

File details

Details for the file lodedb-0.1.2-cp39-abi3-manylinux_2_28_x86_64.whl.

File metadata

Download URL: lodedb-0.1.2-cp39-abi3-manylinux_2_28_x86_64.whl
Upload date: Jun 22, 2026
Size: 13.3 MB
Tags: CPython 3.9+, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lodedb-0.1.2-cp39-abi3-manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`747a9fefe4ef6c98420660808b636c06d989a9f8f89ebfc8e0b54534f798b929`
MD5	`e120789b3f427e50578eb5733553c49c`
BLAKE2b-256	`d12781d1650672735f478ffeef98feff45ecab082703e0c9e19f213dd309258d`

Provenance

The following attestation bundles were made for lodedb-0.1.2-cp39-abi3-manylinux_2_28_x86_64.whl:

Publisher: release.yml on Egoist-Machines/LodeDB

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: lodedb-0.1.2-cp39-abi3-manylinux_2_28_x86_64.whl
- Subject digest: 747a9fefe4ef6c98420660808b636c06d989a9f8f89ebfc8e0b54534f798b929
- Sigstore transparency entry: 1915703377
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: Egoist-Machines/LodeDB@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/Egoist-Machines
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Trigger Event: push

File details

Details for the file lodedb-0.1.2-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: lodedb-0.1.2-cp39-abi3-macosx_11_0_arm64.whl
Upload date: Jun 22, 2026
Size: 960.7 kB
Tags: CPython 3.9+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lodedb-0.1.2-cp39-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`a657980787cdce200be575cb55d1b7f3425cd2fc7e3d54cf265ab4f25a49b542`
MD5	`53461e55688faa7c3448b5f09e559f86`
BLAKE2b-256	`df27ea86c8f8f44f4afbcc6cd1235294d8e25d91d4a40f4d27d04a6ec41c09ae`

Provenance

The following attestation bundles were made for lodedb-0.1.2-cp39-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on Egoist-Machines/LodeDB

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: lodedb-0.1.2-cp39-abi3-macosx_11_0_arm64.whl
- Subject digest: a657980787cdce200be575cb55d1b7f3425cd2fc7e3d54cf265ab4f25a49b542
- Sigstore transparency entry: 1915703171
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: Egoist-Machines/LodeDB@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/Egoist-Machines
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Trigger Event: push

File details

Details for the file lodedb-0.1.2-cp39-abi3-macosx_10_12_x86_64.whl.

File metadata

Download URL: lodedb-0.1.2-cp39-abi3-macosx_10_12_x86_64.whl
Upload date: Jun 22, 2026
Size: 891.3 kB
Tags: CPython 3.9+, macOS 10.12+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lodedb-0.1.2-cp39-abi3-macosx_10_12_x86_64.whl
Algorithm	Hash digest
SHA256	`acb205d973c2d18c35046c4bfb0106f53d6fe02d217998c11307d11cf220f172`
MD5	`e28770eb2d50b309e2956857e2854269`
BLAKE2b-256	`3ee138b6b5348ce465dad5279b6a21bc68e338a0e199d5f816332dd84b0dd574`

Provenance

The following attestation bundles were made for lodedb-0.1.2-cp39-abi3-macosx_10_12_x86_64.whl:

Publisher: release.yml on Egoist-Machines/LodeDB

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: lodedb-0.1.2-cp39-abi3-macosx_10_12_x86_64.whl
- Subject digest: acb205d973c2d18c35046c4bfb0106f53d6fe02d217998c11307d11cf220f172
- Sigstore transparency entry: 1915703311
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: Egoist-Machines/LodeDB@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/Egoist-Machines
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@66257c2909ea9f76f852efdf3e6cf4891f4f8371
- Trigger Event: push
