Lightweight semantic storage engine with compressed embedding archives
TurboMemory ⚡
TurboMemory is a lightweight semantic storage engine for compressed embedding archives.
It combines:
- SQLite metadata indexing
- append-only transcript logging
- quantized embedding storage (4-bit / 6-bit / 8-bit packed format)
- topic-based partitioning + centroid prefiltering
- background consolidation (merge / prune / deduplicate)
- optional confidence decay + contradiction detection
TurboMemory is designed for local-first semantic search, offline RAG, and edge deployments.
Goal: deliver "SQLite simplicity" for semantic memory + compressed vector storage.
Why TurboMemory?
Embedding storage is expensive:
- float32 vectors take 4 bytes per dimension, so a single 384-dimension embedding occupies about 1.5 KB on disk
- most vector DBs are heavy to deploy
- local-first apps need portable storage formats
TurboMemory addresses this with TurboQuant-style packing: embeddings are stored as 4-bit, 6-bit, or 8-bit codes instead of float32 values, which shrinks them by roughly 4x to 8x while still enabling fast retrieval.
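To make the packing idea concrete, here is a minimal sketch of 4-bit storage using plain uniform min-max quantization. This is an illustration only: the function names are hypothetical, and the README does not specify the TurboQuant algorithm or the on-disk layout, so the real implementation may differ.

```python
import numpy as np


def quantize_pack_4bit(vec: np.ndarray) -> tuple[bytes, float, float]:
    """Quantize a float vector to 4-bit codes and pack two codes per byte.

    Assumption: simple min-max uniform quantization, not TurboMemory's format.
    """
    lo, hi = float(vec.min()), float(vec.max())
    scale = (hi - lo) / 15 or 1.0  # 15 = 2**4 - 1; fall back to 1.0 for constant vectors
    codes = np.round((vec - lo) / scale).astype(np.uint8)
    if codes.size % 2:
        codes = np.append(codes, np.uint8(0))  # pad so codes pair up evenly
    packed = (codes[0::2] << 4) | codes[1::2]  # high nibble, low nibble
    return packed.tobytes(), lo, scale


def unpack_dequantize_4bit(blob: bytes, lo: float, scale: float, dim: int) -> np.ndarray:
    """Reverse quantize_pack_4bit, returning an approximate float32 vector."""
    packed = np.frombuffer(blob, dtype=np.uint8)
    codes = np.empty(packed.size * 2, dtype=np.uint8)
    codes[0::2] = packed >> 4
    codes[1::2] = packed & 0x0F
    return codes[:dim].astype(np.float32) * scale + lo
```

Under these assumptions a 384-dimension vector packs into 192 bytes plus two floats (offset and scale), compared with 1,536 bytes as float32.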
Features
Storage
- Append-only transcript/event log (durable ingestion)
- Topic-based storage files (load-on-demand)
- SQLite index for metadata + fast filtering
- Packed embedding formats: 4-bit / 6-bit / 8-bit
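The storage pieces listed above can be sketched with the standard library alone. The example below appends each event to a log file before indexing its metadata in SQLite. The file names (`events.log`, `index.db`), the table schema, and the function names are assumptions for illustration and are not taken from TurboMemory's source.

```python
import json
import os
import sqlite3
import tempfile
import time
from pathlib import Path


def open_store(root: str) -> sqlite3.Connection:
    """Create (or open) a SQLite metadata index under `root` (hypothetical schema)."""
    Path(root).mkdir(parents=True, exist_ok=True)
    db = sqlite3.connect(str(Path(root) / "index.db"))
    db.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        " id INTEGER PRIMARY KEY, topic TEXT NOT NULL,"
        " text TEXT NOT NULL, created_at REAL NOT NULL)"
    )
    db.execute("CREATE INDEX IF NOT EXISTS idx_chunks_topic ON chunks(topic)")
    return db


def ingest(db: sqlite3.Connection, root: str, topic: str, text: str) -> int:
    """Append the event to the log first, then index it."""
    event = {"ts": time.time(), "topic": topic, "text": text}
    with open(Path(root) / "events.log", "a", encoding="utf-8") as log:
        log.write(json.dumps(event) + "\n")
        log.flush()
        os.fsync(log.fileno())  # make the append durable before indexing
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT INTO chunks (topic, text, created_at) VALUES (?, ?, ?)",
            (topic, text, event["ts"]),
        )
    return cur.lastrowid


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    db = open_store(root)
    row_id = ingest(db, root, "notes", "hello")
    topics = [r[0] for r in db.execute("SELECT topic FROM chunks")]
    log_lines = (Path(root) / "events.log").read_text(encoding="utf-8").splitlines()
    db.close()
```

Writing the log before the index means the index can, in principle, be rebuilt by replaying the log if the database file is lost.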
Retrieval
- centroid/topic prefilter to reduce search space
- configurable scoring pipeline
- optional verification filtering
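A centroid prefilter can be sketched as a two-stage search: rank topics by how close their mean vector is to the query, then score only the chunks in the best few topics. The sketch below assumes unit-normalized vectors held in memory and a hypothetical `prefiltered_search` function; it is not TurboMemory's scoring pipeline.

```python
import numpy as np


def normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def prefiltered_search(query, topics, n_topics=2, k=5):
    """Return the top-k (score, topic, row) hits.

    `topics` maps a topic name to an (n, d) array of unit-normalized vectors.
    """
    q = normalize(np.asarray(query, dtype=np.float32))
    names = list(topics)
    centroids = normalize(np.stack([topics[n].mean(axis=0) for n in names]))

    # Stage 1: keep only the topics whose centroid is most similar to the query.
    best = np.argsort(centroids @ q)[::-1][:n_topics]

    # Stage 2: exact cosine scoring inside the selected topics only.
    hits = []
    for i in best:
        name = names[i]
        scores = topics[name] @ q
        hits.extend((float(s), name, j) for j, s in enumerate(scores))
    hits.sort(key=lambda h: h[0], reverse=True)
    return hits[:k]
```

The trade-off is that a relevant chunk stored in a topic whose centroid scores poorly is never examined, so `n_topics` controls how much recall is given up for speed.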
Maintenance / Self-Healing
- background consolidation daemon
- deduplication and merging of similar chunks
- TTL expiration + confidence decay
- experimental contradiction detection
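As an illustration of how TTL expiration and confidence decay might interact, the sketch below uses an exponential half-life. The README does not state which decay function TurboMemory applies, so the half-life model, the default of 90 days, and both function names are assumptions.

```python
DAY = 86_400.0  # seconds per day


def decayed_confidence(confidence: float, created_at: float, now: float,
                       half_life_days: float = 90.0) -> float:
    """Halve the confidence every `half_life_days` (assumed exponential model)."""
    age_days = max(0.0, (now - created_at) / DAY)
    return confidence * 0.5 ** (age_days / half_life_days)


def is_expired(created_at: float, ttl_days, now: float) -> bool:
    """A chunk with no TTL (None) never expires; otherwise compare its age to the TTL."""
    return ttl_days is not None and (now - created_at) > ttl_days * DAY
```

A maintenance pass could then drop chunks for which `is_expired` is true and rank the remainder by similarity weighted by the decayed confidence.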
Installation
From PyPI (recommended)
pip install turbomemory
From source
git clone https://github.com/Kubenew/TurboMemory.git
cd TurboMemory
pip install -e .
With all features
pip install turbomemory[all]
Requirements
- Python 3.9+
- numpy >= 1.24.0
- sentence-transformers >= 2.2.0
Quickstart
CLI Usage
# Add memory
python -m turbomemory add_memory --topic notes --text "TurboMemory stores semantic chunks efficiently."
# Query
python -m turbomemory query --query "semantic storage" --k 5
# Get stats
python -m turbomemory stats
Python Usage
from turbomemory import TurboMemory

tm = TurboMemory(root="./tm_data")

# Add memory
tm.add_memory(
    topic="notes",
    text="TurboMemory stores semantic chunks efficiently.",
    ttl_days=365,
)

# Query
results = tm.query("semantic storage", k=5)
for score, topic, chunk in results:
    print(f"[{score:.3f}] {chunk['text']}")
Example output:
[0.892] TurboMemory stores semantic chunks efficiently.
[0.756] Semantic search with compression
[0.723] Vector storage made simple
CLI Command Reference
| Command | Description |
|---|---|
| add_memory | Add a memory chunk |
| add_turn | Add conversation turn |
| query | Search memories |
| stats | Show statistics |
| backup | Create backup |
| restore | Restore from backup |
| export | Export topics |
| import | Bulk import |
| merge | Merge topics |
| sync | Sync with remote |
| hybrid | Hybrid search |
See python -m turbomemory --help for full options.
Architecture
┌──────────────────────────────────────────────────────────────┐
│                         TurboMemory                          │
├──────────────────────────────────────────────────────────────┤
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│   │  CLI / API   │    │  Python SDK  │    │ Integrations │   │
│   └──────┬───────┘    └──────┬───────┘    └──────┬───────┘   │
│          │                   │                   │           │
│ ┌────────▼───────────────────▼───────────────────▼─────────┐ │
│ │                       Core Engine                        │ │
│ │ ┌──────────────┐    ┌──────────────┐    ┌──────────────┐ │ │
│ │ │ Quantization │    │    Search    │    │ Consolidation│ │ │
│ │ │ (4/6/8-bit)  │    │  (BM25+Vec)  │    │    Daemon    │ │ │
│ │ └──────┬───────┘    └──────┬───────┘    └──────┬───────┘ │ │
│ └────────┼───────────────────┼───────────────────┼─────────┘ │
│          │                   │                   │           │
│ ┌────────▼───────────────────▼───────────────────▼─────────┐ │
│ │                      Storage Layer                       │ │
│ │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  │ │
│ │  │  SQLite  │  │   TMF    │  │  .tmlog  │  │   Sync   │  │ │
│ │  │  Index   │  │ Vectors  │  │   Log    │  │ Protocol │  │ │
│ │  └──────────┘  └──────────┘  └──────────┘  └──────────┘  │ │
│ └──────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
Benchmarks
Compression Ratios
| Format | Size (10K vectors, 384 dims) | Compression |
|---|---|---|
| float32 | 14.6 MB | 1x |
| 8-bit | 3.7 MB | 4x |
| 6-bit | 2.8 MB | 5.2x |
| 4-bit | 1.8 MB | 8x |
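The sizes in this table can be sanity-checked with simple arithmetic. The sketch below computes the raw payload only, assuming each vector is packed independently and rounded up to whole bytes. It ignores headers, per-vector scale factors, and the SQLite index, so real files will be somewhat larger.

```python
def packed_size_bytes(n_vectors: int, dims: int, bits: int) -> int:
    """Raw payload size when every component is stored in `bits` bits."""
    bytes_per_vector = -(-dims * bits // 8)  # ceiling division
    return n_vectors * bytes_per_vector


for bits in (32, 8, 6, 4):
    size = packed_size_bytes(10_000, 384, bits)
    print(f"{bits:>2}-bit: {size / 2**20:.2f} MiB, {32 / bits:.2f}x")

# 32-bit: 14.65 MiB, 1.00x
#  8-bit: 3.66 MiB, 4.00x
#  6-bit: 2.75 MiB, 5.33x
#  4-bit: 1.83 MiB, 8.00x
```

These figures line up with the table once rounded. The ideal 6-bit ratio is 5.33x, slightly above the 5.2x listed, which is what dividing the rounded sizes (14.6 / 2.8) gives.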
Query Latency
| Dataset Size | Latency (P95) |
|---|---|
| 1,000 chunks | 12ms |
| 10,000 chunks | 45ms |
| 100,000 chunks | 180ms |
Recall Quality
| Bit Depth | Avg Cosine Similarity |
|---|---|
| 8-bit | 0.997 |
| 6-bit | 0.968 |
| 4-bit | 0.912 |
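One way to measure this kind of fidelity is to quantize vectors, reconstruct them, and average the cosine similarity between each original and its reconstruction. The sketch below does that with a per-vector min-max quantizer on random Gaussian data. Both choices are assumptions, so its numbers will not match the table, which presumably comes from real embeddings and TurboMemory's own quantizer.

```python
import numpy as np


def quantize_roundtrip(vectors: np.ndarray, bits: int) -> np.ndarray:
    """Quantize each row to `bits` bits (uniform min-max) and reconstruct it."""
    levels = 2**bits - 1
    lo = vectors.min(axis=1, keepdims=True)
    scale = (vectors.max(axis=1, keepdims=True) - lo) / levels
    scale[scale == 0] = 1.0  # guard against constant rows
    codes = np.round((vectors - lo) / scale)
    return codes * scale + lo


def mean_cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Average row-wise cosine similarity between two equally shaped arrays."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(num / den))


rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 384)).astype(np.float32)  # stand-in for real embeddings
sims = {bits: mean_cosine(x, quantize_roundtrip(x, bits)) for bits in (8, 6, 4)}
for bits, sim in sims.items():
    print(f"{bits}-bit: {sim:.4f}")
```

Note that reconstruction similarity is a proxy: it says how well vectors survive quantization, not how often the top-k results of a search stay the same.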
Run benchmarks yourself:
python -m turbomemory.benchmark
Comparison
| Feature | TurboMemory | Chroma | sqlite-vector | LanceDB |
|---|---|---|---|---|
| Compression | 4-8x | None | None | None |
| Local-first | ✅ | ❌ | ✅ | ✅ |
| SQLite backend | ✅ | ❌ | ✅ | ❌ |
| Topic partitioning | ✅ | ❌ | ❌ | ❌ |
| Self-healing | ✅ | ❌ | ❌ | ❌ |
| Replication | ✅ | ❌ | ❌ | ✅ |
| Hybrid search | ✅ | ✅ | ❌ | ✅ |
| No server needed | ✅ | ✅ (embedded mode) | ✅ | ✅ |
Integrations
LangChain
from turbomemory.integrations import TurboMemoryVectorStore
vectorstore = TurboMemoryVectorStore(root="./data", topic="docs")
vectorstore.add_texts(["doc1", "doc2"])
docs = vectorstore.similarity_search("query")
LlamaIndex
from turbomemory.integrations import getTurboMemoryIndex
index = getTurboMemoryIndex(root="./data")
query_engine = index.as_query_engine()
response = query_engine.query("your question")
Limitations
- No distributed clustering - Designed for single-node deployment
- No real-time multi-writer - Single-writer with eventual consistency via sync
- HNSW/IVF not default - Uses centroid prefilter; optional HNSW available
- Model pinned at ingest - All vectors must use same embedding model
Glossary
- Centroid prefilter: Pre-selects relevant topics using centroid similarity before full search
- Confidence decay: Reduces confidence of older memories over time
- Contradiction detection: Detects conflicting information and adjusts confidence
- Consolidation: Background process to merge/prune/optimize storage
- TurboQuant: 4/6/8-bit packed quantization for embeddings
- TMF: TurboMemory Format - portable storage format specification
Roadmap
See ROADMAP.md
| Version | Milestone |
|---|---|
| v0.3 | Stability + CI + packaging |
| v0.4 | Benchmarks + profiling |
| v0.5 | TMF v1 stable format |
| v0.6 | Hybrid search (BM25 + vector) |
| v0.7 | FastAPI server mode |
| v0.8 | Replication / edge sync |
Docker
# Build
docker build -t turbomemory .
# Run
docker run -p 8000:8000 turbomemory
# Or use docker-compose
docker compose up
Contributing
Contributions are welcome!
- Fork the repo
- Create a feature branch
- Run tests: pytest tests/
- Run linters: ruff check . && black .
- Submit a PR
See CONTRIBUTING.md for details.
License
MIT License - see LICENSE