MongoDB-compatible Python client for KeraDB - a lightweight embedded NoSQL database with vector search

KeraDB Python SDK

A MongoDB-compatible Python client for KeraDB - a lightweight, embedded NoSQL document database with advanced vector search capabilities.

Features

MongoDB-Compatible API: Familiar API for easy migration from MongoDB
Embedded Database: No server required, runs directly in your application
Vector Search: Built-in HNSW-based vector similarity search
Multiple Distance Metrics: Cosine, Euclidean, Dot Product, Manhattan
Vector Compression: Delta and quantized compression for efficient storage
Zero Python Dependencies: No third-party Python packages needed for document operations; only the native KeraDB library is required
High Performance: Written in Rust with Python bindings via FFI
ACID Transactions: Transaction support for data integrity (the MongoDB-style transaction API is still planned; see MongoDB Compatibility)

Installation

From PyPI

pip install keradb

From Source

First, build the native KeraDB library:

cd ../../../  # Navigate to the project root (adjust the relative path to match your checkout)
cargo build --release

Install the Python package:

cd sdks/python
pip install -e .

For Development

pip install -e ".[dev]"

Quick Start

Basic Document Operations

import keradb

# Connect to the database (the file is created if it doesn't exist)
client = keradb.connect("mydb.ndb")
db = client.database()
users = db.collection("users")

# Insert documents
result = users.insert_one({"name": "Alice", "age": 30, "email": "alice@example.com"})
print(f"Inserted ID: {result.inserted_id}")

# Find documents
user = users.find_one({"_id": result.inserted_id})
all_users = users.find().all()

# Update documents
users.update_one(
    {"_id": result.inserted_id},
    {"$set": {"age": 31}}
)

# Delete documents
users.delete_one({"_id": result.inserted_id})

# Close connection
client.close()

Using Context Manager

import keradb

with keradb.connect("mydb.ndb") as client:
    db = client.database()
    users = db.collection("users")
    
    users.insert_one({"name": "Bob", "age": 25})
    count = users.count_documents({})
    print(f"Total users: {count}")

Vector Search

import keradb
import random
import math

# Generate a random normalized embedding
def generate_embedding(dimensions):
    vec = [random.random() * 2 - 1 for _ in range(dimensions)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Connect and create vector collection
client = keradb.connect("vectors.ndb")

config = keradb.VectorConfig(
    dimensions=128,
    distance=keradb.Distance.COSINE,
    m=16,
    ef_construction=200,
    ef_search=50,
).with_delta_compression()

client.create_vector_collection("articles", config)

# Insert vectors with metadata
embedding = generate_embedding(128)
vector_id = client.insert_vector(
    "articles",
    embedding,
    {"title": "Machine Learning Basics", "category": "tech"}
)

# Search for similar vectors
query = generate_embedding(128)
results = client.vector_search("articles", query, k=10)

for result in results:
    print(f"[{result.rank}] {result.document.metadata['title']}")
    print(f"    Score: {result.score:.4f}")

# Get statistics
stats = client.vector_stats("articles")
print(f"Vectors: {stats.vector_count}, Memory: {stats.memory_usage:,} bytes")

client.close()

API Reference

Client

`keradb.connect(path: str) -> Client`

Create or open a KeraDB database.

Parameters:

path: Path to the database file

Returns: Client instance

Database

`client.database(name: Optional[str] = None) -> Database`

Get a database instance. The name parameter is optional and kept for MongoDB compatibility.

`database.collection(name: str) -> Collection`

Get a collection by name.

`database.list_collection_names() -> List[str]`

Get a list of all collection names.

Collection

Document Operations

insert_one(document: Dict) -> InsertOneResult
insert_many(documents: List[Dict]) -> InsertManyResult
find_one(filter: Optional[Dict] = None) -> Optional[Dict]
find(filter: Optional[Dict] = None) -> Cursor
update_one(filter: Dict, update: Dict) -> UpdateResult
update_many(filter: Dict, update: Dict) -> UpdateResult
delete_one(filter: Dict) -> DeleteResult
delete_many(filter: Dict) -> DeleteResult
count_documents(filter: Optional[Dict] = None) -> int

Supported MongoDB Operators

Update Operators:

$set: Set field values
$unset: Remove fields
$inc: Increment numeric values
$push: Append to arrays
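
To make the update operators above concrete, here is a minimal pure-Python sketch of how such an update document could be applied to a single dict. `apply_update` is a hypothetical helper written for this illustration; it is not part of the keradb API and does not describe how KeraDB applies updates internally. It assumes top-level field names only (no dotted paths).

```python
import copy


def apply_update(document, update):
    """Return a copy of `document` with a MongoDB-style update applied.

    Hypothetical helper for illustration only; handles top-level fields.
    """
    result = copy.deepcopy(document)
    for operator_name, fields in update.items():
        if operator_name == "$set":
            # Overwrite or create each listed field.
            result.update(fields)
        elif operator_name == "$unset":
            # Remove each listed field; missing fields are ignored.
            for name in fields:
                result.pop(name, None)
        elif operator_name == "$inc":
            # Add to a numeric field, treating a missing field as 0.
            for name, amount in fields.items():
                result[name] = result.get(name, 0) + amount
        elif operator_name == "$push":
            # Append to a list field, creating the list if needed.
            for name, value in fields.items():
                result.setdefault(name, []).append(value)
        else:
            raise ValueError(f"unsupported update operator: {operator_name}")
    return result


user = {"name": "Alice", "age": 30, "email": "alice@example.com"}
updated = apply_update(
    user,
    {"$set": {"age": 31}, "$unset": {"email": ""}, "$push": {"tags": "admin"}},
)
print(updated)
```

The original dict is left untouched, which makes the effect of each operator easy to compare before and after.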

Query Operators:

$eq, $ne: Equality, inequality
$gt, $gte, $lt, $lte: Comparison
$in, $nin: Array membership
$and, $or: Logical operators
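
The query operators above follow MongoDB's filter semantics. The sketch below shows one way those semantics can be expressed in plain Python. `matches` is a hypothetical function for illustration, not the keradb implementation, and it simplifies in two ways: only top-level fields are supported, and any dict value is treated as an operator expression.

```python
import operator

# Each operator takes (actual field value, expected value from the query).
COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda actual, expected: actual in expected,
    "$nin": lambda actual, expected: actual not in expected,
}


def _compare(op, actual, expected):
    if op not in COMPARATORS:
        raise ValueError(f"unsupported query operator: {op}")
    try:
        return COMPARATORS[op](actual, expected)
    except TypeError:
        # Incomparable types (for example a missing field vs. a number).
        return False


def matches(document, query):
    """Return True if `document` satisfies a MongoDB-style `query`."""
    for key, condition in query.items():
        if key == "$and":
            if not all(matches(document, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(document, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gte": 18, "$lt": 65}}.
            value = document.get(key)
            if not all(_compare(op, value, exp) for op, exp in condition.items()):
                return False
        elif document.get(key) != condition:
            # Shorthand equality, e.g. {"name": "Alice"}.
            return False
    return True


people = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Carol", "age": 41},
]
adults_30_plus = [p["name"] for p in people if matches(p, {"age": {"$gte": 30}})]
print(adults_30_plus)
```

An empty query matches every document, which mirrors calling `find()` with no filter.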

Cursor

limit(n: int) -> Cursor: Limit number of results
skip(n: int) -> Cursor: Skip n results
all() -> List[Dict]: Return all documents as a list
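
The cursor methods return a cursor again, which is what allows calls such as `find().skip(20).limit(10).all()`. The toy class below sketches that chaining pattern over an in-memory list. `ListCursor` is hypothetical and exists only for this example. It applies `skip` before `limit` regardless of call order, as MongoDB's cursors do; whether KeraDB behaves the same way is an assumption worth checking.

```python
from itertools import islice


class ListCursor:
    """Toy chainable cursor over an in-memory list (illustration only)."""

    def __init__(self, documents):
        self._documents = documents
        self._skip = 0
        self._limit = None  # None means "no limit"

    def skip(self, n):
        self._skip = n
        return self  # returning self is what makes chaining possible

    def limit(self, n):
        self._limit = n
        return self

    def all(self):
        # Skip first, then take at most `limit` documents.
        stop = None if self._limit is None else self._skip + self._limit
        return list(islice(self._documents, self._skip, stop))


docs = [{"n": i} for i in range(10)]
page = ListCursor(docs).skip(4).limit(3).all()
print(page)
```

For page-based access, page number `p` (starting at 0) with page size `s` corresponds to `skip(p * s).limit(s)`.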

Vector Operations

Creating Vector Collections

config = keradb.VectorConfig(
    dimensions=128,
    distance=keradb.Distance.COSINE,
    m=16,  # HNSW connections per node
    ef_construction=200,  # Build quality
    ef_search=50,  # Query quality
)

client.create_vector_collection("my_vectors", config)

Vector Configuration Options

Distance Metrics:

Distance.COSINE: Cosine similarity (default)
Distance.EUCLIDEAN: L2 distance
Distance.DOT_PRODUCT: Dot product
Distance.MANHATTAN: L1 distance
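
For reference, the four metrics above have the following textbook definitions. This is a plain-Python sketch of the math only; the function names are made up for this example, and whether KeraDB reports a similarity or a distance in `result.score` is not specified here.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    # L2: square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    # L1: sum of the absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))


a = [1.0, 0.0]
b = [0.6, 0.8]  # unit length, since 0.36 + 0.64 = 1

print(f"cosine    {cosine_similarity(a, b):.4f}")
print(f"dot       {dot_product(a, b):.4f}")
print(f"euclidean {euclidean_distance(a, b):.4f}")
print(f"manhattan {manhattan_distance(a, b):.4f}")
```

For unit-length vectors, cosine similarity and dot product give the same value, which is why normalizing embeddings before insertion makes the two metrics interchangeable.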

Compression:

with_delta_compression(): Store sparse differences
with_quantized_compression(): Aggressive quantization
No compression (default): Store full vectors
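
The two compression modes are described only briefly above, so the following sketch shows the general ideas behind them rather than KeraDB's actual encoding. It assumes 8-bit scalar quantization for the quantized case and a per-position difference against a reference vector for the delta case. All function names are hypothetical.

```python
def quantize(vector, levels=255):
    """Map floats to integer codes in [0, levels] using the vector's min/max."""
    low, high = min(vector), max(vector)
    scale = (high - low) / levels or 1.0  # fall back to 1.0 for constant vectors
    codes = [round((x - low) / scale) for x in vector]
    return codes, low, scale


def dequantize(codes, low, scale):
    """Approximate the original floats from the integer codes."""
    return [low + code * scale for code in codes]


def delta_encode(vector, reference, tolerance=0.0):
    """Keep only the positions where `vector` differs from `reference`."""
    return {
        index: x - r
        for index, (x, r) in enumerate(zip(vector, reference))
        if abs(x - r) > tolerance
    }


def delta_decode(delta, reference):
    """Rebuild the vector by adding the stored differences to the reference."""
    return [r + delta.get(index, 0.0) for index, r in enumerate(reference)]


vec = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, low, scale = quantize(vec)
restored = dequantize(codes, low, scale)
max_error = max(abs(x - y) for x, y in zip(vec, restored))

reference = [0.1, 0.2, 0.3, 0.4]
similar = [0.1, 0.25, 0.3, 0.4]
delta = delta_encode(similar, reference)

print(codes, f"max error {max_error:.5f}")
print(delta)
```

Quantization is lossy: each value is recovered to within half a quantization step. Delta encoding only saves space when vectors are close to their reference, since unchanged positions are the ones that are not stored.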

Vector CRUD

# Insert
vector_id = client.insert_vector(collection, embedding, metadata)

# Search
results = client.vector_search(collection, query_vector, k=10)

# Get by ID
doc = client.get_vector(collection, vector_id)

# Delete
client.delete_vector(collection, vector_id)

# Statistics
stats = client.vector_stats(collection)
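
HNSW is an approximate index, so a search may occasionally miss a true nearest neighbor; raising `ef_search` generally trades speed for recall. A common way to check result quality is to compare against an exact brute-force search on a small sample. The sketch below is pure Python and independent of keradb; `exact_top_k` and `recall_at_k` are hypothetical helpers, and mapping their indices to KeraDB vector IDs is left out because ID semantics are not described here.

```python
import math
import random


def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def exact_top_k(vectors, query, k):
    """Return [(index, score)] for the k highest cosine similarities, best first.

    Assumes every vector and the query are unit length, so the dot product
    equals the cosine similarity.
    """
    scored = [
        (index, sum(x * y for x, y in zip(vec, query)))
        for index, vec in enumerate(vectors)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result also found."""
    exact = set(exact_ids)
    return len(exact & set(approximate_ids)) / len(exact)


rng = random.Random(42)  # fixed seed so the example is reproducible
vectors = [normalize([rng.uniform(-1, 1) for _ in range(16)]) for _ in range(200)]
query = vectors[7]  # querying with a stored vector: it should rank first

top = exact_top_k(vectors, query, k=5)
for rank, (index, score) in enumerate(top, start=1):
    print(f"[{rank}] vector {index}  score {score:.4f}")
```

Brute force is linear in the number of vectors, so it is only practical as a correctness baseline on a sample, not as a replacement for the index.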

Examples

See the examples directory for complete examples:

basic.py - Basic document operations
vector_search.py - Vector search demo

Run examples:

python examples/basic.py
python examples/vector_search.py

Benchmarks

Run benchmarks to compare performance:

# Install benchmark dependencies
pip install -e ".[benchmark]"

# Run all benchmarks
pytest benchmarks/ -v

# Run specific benchmarks
pytest benchmarks/benchmark_documents.py -v
pytest benchmarks/benchmark_vectors.py -v

See benchmarks/README.md for more details.

Testing

Run the test suite:

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=keradb --cov-report=html

Requirements

Python 3.8 or higher
KeraDB native library (libkeradb.so / libkeradb.dylib / keradb.dll)
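
The three file names above correspond to Linux, macOS, and Windows respectively. If you need to check which file a given platform expects (for example when packaging), a small helper like the one below can be used. `native_library_filename` is a hypothetical function for illustration; how the SDK itself locates the library is not covered here.

```python
import sys


def native_library_filename(platform=None):
    """Return the KeraDB native library file name expected on a platform.

    `platform` takes the same values as `sys.platform` and defaults to the
    platform of the running interpreter.
    """
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        return "keradb.dll"
    if platform == "darwin":
        return "libkeradb.dylib"
    # Assumption: Linux and other Unix-like systems use the .so name.
    return "libkeradb.so"


print(native_library_filename())
```

A path built from this name could then be passed to `ctypes.CDLL`, which is the standard-library way to load a shared library.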

Optional Dependencies

For benchmarks and development:

pytest >= 7.0.0
pytest-benchmark >= 4.0.0
numpy >= 1.20.0 (for faster vector generation in benchmarks)

Platform Support

Linux (x86_64, ARM64)
macOS (Intel, Apple Silicon)
Windows (x86_64)

Performance

KeraDB is designed for high performance:

Document Operations: 10,000+ inserts/sec, sub-millisecond reads
Vector Search: Sub-millisecond similarity search on millions of vectors
Memory Efficient: Delta and quantized compression reduce memory usage by 60-80%
Zero-Copy: Efficient FFI layer with minimal overhead
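
To put the 60-80% figure in context, the arithmetic below estimates raw and compressed storage for a vector collection. It assumes 4 bytes per value (32-bit floats), which is an assumption rather than a documented KeraDB detail, and it ignores the HNSW graph and metadata overhead.

```python
def raw_vector_bytes(count, dimensions, bytes_per_value=4):
    """Uncompressed size of `count` vectors, assuming fixed-width values."""
    return count * dimensions * bytes_per_value


def compressed_range(raw_bytes, min_reduction_pct=60, max_reduction_pct=80):
    """Return (best case, worst case) sizes for a reduction given in percent."""
    best = raw_bytes * (100 - max_reduction_pct) // 100
    worst = raw_bytes * (100 - min_reduction_pct) // 100
    return best, worst


raw = raw_vector_bytes(count=1_000_000, dimensions=128)
best, worst = compressed_range(raw)

print(f"raw:        {raw:,} bytes")
print(f"compressed: {best:,} to {worst:,} bytes")
```

Under these assumptions, one million 128-dimensional vectors take 512,000,000 bytes uncompressed and roughly 102 to 205 MB after a 60-80% reduction. Comparing this estimate with `stats.memory_usage` from `client.vector_stats()` is a quick way to see what a given configuration actually achieves.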

Architecture

┌─────────────────────────────┐
│    Python Application       │
└──────────┬──────────────────┘
           │
           ├─ keradb.connect()
           ├─ Collection API (MongoDB-compatible)
           └─ Vector Search API
           │
┌──────────▼──────────────────┐
│    Python FFI Layer         │
│  (ctypes bindings)          │
└──────────┬──────────────────┘
           │
┌──────────▼──────────────────┐
│   Rust Core Library         │
│  - Document storage (LSM)   │
│  - Vector search (HNSW)     │
│  - Compression              │
└─────────────────────────────┘

MongoDB Compatibility

The SDK aims to be compatible with MongoDB's API where practical:

✅ Supported:

Basic CRUD operations
Query operators ($eq, $gt, $in, etc.)
Update operators ($set, $inc, $push, etc.)
Cursor operations (limit, skip)
find_one, find, insert_one, insert_many
update_one, update_many, delete_one, delete_many

⚠️ Partial Support:

Aggregation pipeline (limited)
Indexes (automatic for performance)

❌ Not Supported:

GridFS
Transactions (planned)
Replication
Sharding

Benchmark Results

The results below compare KeraDB with SQLite for common document operations, measured on Windows with Python 3.13:

KeraDB vs SQLite Performance Comparison

| Operation | KeraDB (μs) | SQLite (μs) | Speedup | KeraDB (ops/sec) | SQLite (ops/sec) |
| --- | --- | --- | --- | --- | --- |
| Count | 1.7 | 116.0 | 68x faster | 579,980 | 8,622 |
| Find by ID | 10.7 | 131.7 | 12x faster | 93,510 | 7,595 |
| Update | 82.9 | 159.4 | 2x faster | 12,059 | 6,272 |
| Insert | 99.3 | 5,234 | 53x faster | 10,066 | 191 |
| Find All | 461.3 | 390.5 | 1.2x slower | 2,168 | 2,561 |
| Delete | 161.1 | 4,801 | 30x faster | 6,207 | 208 |
| Batch Insert (100) | 11,165 | - | - | 90 | - |
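
The Speedup column is the ratio of the two latencies, and operations per second is the inverse of the latency. The short script below reproduces that arithmetic from the latencies in the table. The ops/sec values it prints differ slightly from the published ones, presumably because the latencies shown are rounded.

```python
# Mean latencies in microseconds, copied from the table: (KeraDB, SQLite).
LATENCIES_US = {
    "Count": (1.7, 116.0),
    "Find by ID": (10.7, 131.7),
    "Update": (82.9, 159.4),
    "Insert": (99.3, 5234.0),
    "Find All": (461.3, 390.5),
    "Delete": (161.1, 4801.0),
}


def speedup(keradb_us, sqlite_us):
    """How many times faster KeraDB is; a value below 1 means it is slower."""
    return sqlite_us / keradb_us


def ops_per_second(latency_us):
    """Convert a per-operation latency in microseconds to operations per second."""
    return 1_000_000 / latency_us


for name, (keradb_us, sqlite_us) in LATENCIES_US.items():
    print(
        f"{name:<12} speedup {speedup(keradb_us, sqlite_us):6.2f}x  "
        f"KeraDB ~{ops_per_second(keradb_us):,.0f} ops/sec"
    )
```

The same conversion applied to the batch row (11,165 μs per batch of 100) gives about 90 batches per second, or roughly 9,000 documents per second.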

Key Performance Insights

🚀 KeraDB Advantages:

68x faster document counting
53x faster single document inserts
30x faster document deletion
12x faster lookups by ID
2x faster document updates
Batch insert: about 90 batches of 100 documents per second (roughly 9,000 documents/sec)

⚡ Why KeraDB is Faster:

Direct memory-mapped B-tree access
Rust-based native implementation with zero-copy operations
Optimized for document-oriented workloads
No SQL parsing overhead

📊 Use Cases:

Embedded applications requiring high-speed document operations
Real-time data processing
High-throughput logging and event storage
Applications needing MongoDB-like API with SQLite-level simplicity

Run Benchmarks Yourself

pip install -e ".[dev]"
python -m pytest benchmarks/ -v --benchmark-only --benchmark-sort=name

Full benchmark details: See dev-docs/BENCHMARK_RESULTS.md

Contributing

Contributions are welcome! Please see the main project repository for guidelines.

License

MIT License - See LICENSE file for details

Changelog

Version 0.1.0 (Jan 18, 2026)

Initial release
MongoDB-compatible document operations
Vector search with HNSW
Multiple distance metrics
Vector compression (delta, quantized)
Python 3.8+ support
Comprehensive test suite and benchmarks
