A lightweight local vector-aware database for Python
menteedb
menteedb is a lightweight local Python library that combines table-like records with optional vector similarity search, a fluent query API, and optional encryption.
Features
- Define tables with a schema.
- Insert structured records.
- Fluent Query Builder - no SQL, pure Python with field selection, filtering, and conditions.
- Optional AES-256-GCM encryption with automatic key derivation.
- Binary MessagePack format - ~50% smaller files than JSON, automatic fallback to JSON.
- Enable vector search on one text field per table.
- Fast substring ("contains") text search per table.
- Query by field filters and/or semantic similarity.
- Persist data locally with append-only files for speed.
Quick Start
Basic Usage
from menteedb import MenteeDB
db = MenteeDB(base_path="./data")
db.create_table(
    table_name="notes",
    fields={"title": "str", "body": "str", "tag": "str"},
    vector_field="body",
)
db.insert("notes", {"title": "First", "body": "Vector databases are useful.", "tag": "ml"})
db.insert("notes", {"title": "Second", "body": "I enjoy local-first tools.", "tag": "dev"})
# Fluent query API
results = db.find("notes").where("tag", "==", "ml").select("title", "body").execute()
print(results)
Encrypted Storage (Optional)
from menteedb import MenteeDB
# Enable encryption
db = MenteeDB(
    base_path="./secure_data",
    use_encryption=True,
    encryption_key="my_secure_password",
)
db.create_table("secrets", fields={"key": "str", "value": "str"})
db.insert("secrets", {"key": "api_token", "value": "sk_live_..."})
# Query encrypted data transparently
results = db.find("secrets").where("key", "==", "api_token").execute()
Query API (No SQL!)
Instead of SQL syntax, use Python method chaining:
# SELECT name, email FROM users WHERE age > 25 AND city = 'NYC'
results = (
    db.find('users')
    .where('age', '>', 25)
    .where('city', '==', 'NYC')
    .select('name', 'email')
    .execute()
)
Supported Operators
- Comparison: ==, !=, >, <, >=, <=
- Collection: in
- String: contains
See QUERY_GUIDE.md for complete examples.
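To make the operator semantics concrete, here is a minimal sketch of how such conditions could be evaluated against plain dictionaries. It is not menteedb's implementation: `OPERATORS` and `matches` are hypothetical names, and treating a missing field as a non-match is an assumption.

```python
import operator

# Hypothetical mapping from operator strings to Python callables.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "in": lambda value, options: value in options,
    "contains": lambda value, needle: needle in value,
}


def matches(record, conditions):
    """Return True if the record satisfies every (field, op, value) condition."""
    for field, op, expected in conditions:
        if field not in record:
            return False  # assumption: a missing field never matches
        if not OPERATORS[op](record[field], expected):
            return False
    return True


users = [
    {"name": "Ada", "email": "ada@example.com", "age": 36, "city": "NYC"},
    {"name": "Bob", "email": "bob@example.com", "age": 22, "city": "NYC"},
    {"name": "Cy", "email": "cy@example.com", "age": 41, "city": "LA"},
]

# Chained conditions are combined with AND, as in the fluent example above.
conditions = [("age", ">", 25), ("city", "in", ["NYC", "Boston"])]
selected = [
    {"name": u["name"], "email": u["email"]}
    for u in users
    if matches(u, conditions)
]
print(selected)
```

A lookup table keeps each operator a one-line entry, so adding another one would not require touching the matching loop.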
Legacy Query Modes
The original db.query() method still works:
- Filter-only:
db.query("notes", conditions={"tag": "ml"})
- Text contains search:
db.query("notes", text_query="vector", text_fields=["title", "body"])
- Vector-only:
db.query("notes", vector_query="your text")
- Hybrid (filter + vector):
db.query("notes", conditions={"tag": "dev"}, vector_query="local tools")
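One way to picture the hybrid mode is "filter first, then rank the survivors by similarity". The sketch below illustrates that idea only. `hybrid_search` and `cosine` are hypothetical helpers, the vectors are written by hand rather than produced by menteedb's embedder, and the order in which the library actually applies the two steps is an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_search(rows, conditions, query_vector, top_k=3):
    """Keep rows whose fields equal `conditions`, then rank them by similarity."""
    candidates = [
        row for row in rows
        if all(row["record"].get(k) == v for k, v in conditions.items())
    ]
    candidates.sort(key=lambda row: cosine(row["vector"], query_vector), reverse=True)
    return [row["record"] for row in candidates[:top_k]]


rows = [
    {"record": {"title": "First", "tag": "ml"}, "vector": [1.0, 0.0]},
    {"record": {"title": "Second", "tag": "dev"}, "vector": [0.0, 1.0]},
    {"record": {"title": "Third", "tag": "dev"}, "vector": [0.6, 0.8]},
]
hits = hybrid_search(rows, {"tag": "dev"}, query_vector=[1.0, 0.0])
print([hit["title"] for hit in hits])
```

Filtering before ranking means similarity is only computed for rows that can appear in the result.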
Storage Layout
For base_path="./data" and table notes, menteedb stores:
- ./data/notes/schema.json
- ./data/notes/records.jsonl - Binary MessagePack format (compact, ~50% smaller than JSON)
- ./data/notes/vector_ids.jsonl
- ./data/notes/vectors.f32
Storage Features
- MessagePack Binary Format: Compact and fast serialization (~50% size reduction vs JSON)
- Optional Encryption: Enable AES-256-GCM encryption to protect sensitive data on disk
- Automatic Format Detection: Seamlessly reads legacy JSON data and writes new data as MessagePack
- Append-Only Design: Fast sequential writes with minimal overhead
This is local file-based storage. It is not publicly exposed over the network, but anyone with local filesystem access to this folder can read it. Enable encryption for sensitive data.
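The append-only idea can be sketched with the standard library alone. This example deliberately swaps in JSON lines for MessagePack so that it needs no third-party package. `append_record` and `read_records` are hypothetical names and do not reflect menteedb's internal file format.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one record as a single JSON line; earlier lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def read_records(path):
    """Replay the log from the start and return every record in insertion order."""
    with open(path, "r", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "records.jsonl")
append_record(log_path, {"title": "First", "tag": "ml"})
append_record(log_path, {"title": "Second", "tag": "dev"})
records = read_records(log_path)
print(records)
```

Because each insert only writes at the end of the file, the cost of a write does not grow with the size of the table; the trade-off is that reads replay the whole log.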
Encryption
Protect sensitive data with AES-256-GCM encryption:
from menteedb import MenteeDB
db = MenteeDB(
    base_path="./secure",
    use_encryption=True,
    encryption_key="your_secure_password",
)
Benefits:
- ✅ AES-256-GCM authenticated encryption
- ✅ Automatic key derivation (PBKDF2-HMAC-SHA256)
- ✅ Transparent to your code
- ✅ ~50% disk savings with MessagePack
See ENCRYPTION_GUIDE.md for security best practices and examples.
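For readers unfamiliar with password-based key derivation, the following sketch shows the PBKDF2-HMAC-SHA256 step using only Python's standard library. It illustrates the general technique, not menteedb's code: `derive_key` is a hypothetical helper, and the iteration count, salt length, and where the salt is stored are all assumptions. The AES-256-GCM step itself is left out because the standard library has no AES implementation.

```python
import hashlib
import os


def derive_key(password, salt, iterations=200_000):
    """Derive a 32-byte key (the size AES-256 needs) from a password."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = os.urandom(16)  # assumption: a random salt kept alongside the data
key = derive_key("your_secure_password", salt)
print(len(key))  # 32
```

The same password and salt always yield the same key, which is what lets a database be reopened later with only the password.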
Privacy and Permissions
- By default, MenteeDB(..., secure_permissions=True) applies best-effort private permissions (700 for table folders, 600 for files).
- On Windows, real privacy is controlled by NTFS ACLs; chmod behavior is limited.
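As a rough illustration of what "best-effort private permissions" can mean on a POSIX system, the sketch below applies 700 to a folder and 600 to the files inside it. `make_private` is a hypothetical helper, not part of menteedb's API, and the file name is only an example.

```python
import os
import stat
import tempfile


def make_private(table_dir):
    """Best-effort: 700 on the folder, 600 on each file directly inside it."""
    os.chmod(table_dir, 0o700)
    for name in os.listdir(table_dir):
        path = os.path.join(table_dir, name)
        if os.path.isfile(path):
            os.chmod(path, 0o600)


table_dir = tempfile.mkdtemp()
record_path = os.path.join(table_dir, "records.jsonl")
with open(record_path, "w", encoding="utf-8") as handle:
    handle.write("")

make_private(table_dir)
dir_mode = stat.S_IMODE(os.stat(table_dir).st_mode)
file_mode = stat.S_IMODE(os.stat(record_path).st_mode)
print(oct(dir_mode), oct(file_mode))
```

On Windows, `os.chmod` can only toggle the read-only flag, so the printed modes will not reflect real access control there.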
Testing
Run locally:
pip install .[dev]
pytest -q
CI/CD to PyPI
Workflow file: .github/workflows/pypi-publish.yml
- Runs tests on pushes to main, tags (v*), and releases.
- Publishes to PyPI on tag push (v*) or GitHub Release publish.
- Uses trusted publishing via GitHub OIDC.
Notes
- This initial version supports one vector field per table.
- Default embeddings use a deterministic local hashing embedder with no external model download.
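To give a feel for what a deterministic hashing embedder does, here is a small sketch of the general "hashing trick". It is an illustration under stated assumptions, not menteedb's embedder: `hash_embed` and `similarity` are hypothetical names, and the tokenizer, the choice of SHA-256, and the 64-dimension size are all made up for the example.

```python
import hashlib
import math
import re


def hash_embed(text, dim=64):
    """Map text to a unit-length vector by hashing each token into a bucket."""
    vector = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0  # signed buckets reduce collision bias
        vector[bucket] += sign
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else vector


def similarity(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


query = hash_embed("local tools")
print(similarity(query, hash_embed("I enjoy local-first tools.")))
print(similarity(query, hash_embed("Vector databases are useful.")))
```

Because the vector depends only on the tokens and a fixed hash function, the same text always produces the same embedding and nothing has to be downloaded. The price is that similarity reflects shared words rather than meaning.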
File details
Details for the file menteedb-0.2.0.tar.gz.
File metadata
- Download URL: menteedb-0.2.0.tar.gz
- Upload date:
- Size: 12.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a178b722920f78a01820e77b39e2e0aefdf214dbcc052bf49412613274bfddf |
| MD5 | ac212e143d9fba10a6cd257cee29dcab |
| BLAKE2b-256 | b57720084a4790c4df3412e829f7569233466770d43b6672e60f20856aa3d320 |
Provenance
The following attestation bundles were made for menteedb-0.2.0.tar.gz:
Publisher: pypi-publish.yml on SyabAhmad/menteedb
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: menteedb-0.2.0.tar.gz
- Subject digest: 4a178b722920f78a01820e77b39e2e0aefdf214dbcc052bf49412613274bfddf
- Sigstore transparency entry: 1108416358
- Sigstore integration time:
- Permalink: SyabAhmad/menteedb@6243317331fff4bb5371038ad375a76a7e980652
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/SyabAhmad
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@6243317331fff4bb5371038ad375a76a7e980652
- Trigger Event: push
File details
Details for the file menteedb-0.2.0-py3-none-any.whl.
File metadata
- Download URL: menteedb-0.2.0-py3-none-any.whl
- Upload date:
- Size: 10.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7815712d45e85220f33056adf2221b3a38cc23e1fbc4a669aa34b02b63a7cba7 |
| MD5 | 5e82d5adce62ae778fe740ee3127176f |
| BLAKE2b-256 | f19c93f3720f6924f23baa3a184aafe1b665fb8cc0d362439276d373d2562017 |
Provenance
The following attestation bundles were made for menteedb-0.2.0-py3-none-any.whl:
Publisher: pypi-publish.yml on SyabAhmad/menteedb
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: menteedb-0.2.0-py3-none-any.whl
- Subject digest: 7815712d45e85220f33056adf2221b3a38cc23e1fbc4a669aa34b02b63a7cba7
- Sigstore transparency entry: 1108416368
- Sigstore integration time:
- Permalink: SyabAhmad/menteedb@6243317331fff4bb5371038ad375a76a7e980652
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/SyabAhmad
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@6243317331fff4bb5371038ad375a76a7e980652
- Trigger Event: push