Zero-LLM memory for AI agents — semantic search, cross-encoder reranking, and deterministic heuristics
CoreMem
Zero-LLM memory retrieval for AI agents. CoreMem gives agents instant access to conversation history — semantic search plus deterministic retrieval heuristics, all without a single API call. Scores 98.0% R@5 on LongMemEval (500 questions) in the Executive Assistant retrieval stack — no LLM, no tuning, no cloud.
Embedded. Local. Open source. No external APIs, no vector DB services, no internet connection required. Runs entirely on-device with ChromaDB or HybridDB + sentence-transformers. Ships as a single Python package with zero infrastructure dependencies.
Dual-backend architecture. Drop-in backends (ChromaDB baseline, HybridDB enhanced) with the same API. Ranking pipeline: backend retrieval → deterministic heuristics → recency-aware rescoring → session-aware retrieval.
from coremem import MemoryCore
from coremem.backends.chroma import ChromaBackend
core = MemoryCore(backend=ChromaBackend(path="./memory"))
# Ingest conversation turns
core.ingest("user", "I visited the Museum of Modern Art today")
core.ingest("assistant", "That sounds wonderful! How was it?")
core.ingest("user", "I went to an Ancient Civilizations exhibition at the Natural History Museum")
# Search with deterministic heuristic reranking
results = core.search("When did I visit art museums?")
for r in results:
    print(f"[{r.memory.ts}] [{r.memory.role}] {r.memory.content}")
Why CoreMem?
Every AI agent needs memory. But cloud-based vector search is expensive, slow, and doesn't work offline. Pure embedding similarity misses keyword matches and temporal context. LLM-based memory systems cost tokens per query.
CoreMem solves all three:
| Component | What it does |
|---|---|
| Semantic search | Embedding similarity via ChromaDB or HybridDB |
| Deterministic heuristics | Keyword overlap, temporal recency, person-name boost, quoted-phrase matching |
| Session deduplication | One result per conversation, with full context retrieval |
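As an illustration of the session deduplication row, the following is a minimal sketch that keeps only the best-scoring hit per conversation. The `Hit` dataclass and `dedupe_by_session` function are hypothetical names used for this example and are not part of the CoreMem API; the actual implementation may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hit:
    # Hypothetical result record, not a CoreMem type.
    session_id: str
    content: str
    score: float


def dedupe_by_session(hits: List[Hit]) -> List[Hit]:
    """Keep the highest-scoring hit per session, ordered by descending score."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.session_id)
        if current is None or hit.score > current.score:
            best[hit.session_id] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


hits = [
    Hit("conv_001", "I built a Spitfire model kit", 0.91),
    Hit("conv_001", "The kit took two weekends", 0.74),
    Hit("conv_002", "I visited the Museum of Modern Art today", 0.62),
]
top = dedupe_by_session(hits)
for hit in top:
    print(hit.session_id, hit.score, hit.content)
```

Collapsing to one hit per session keeps the top of the result list from being filled by several turns of the same conversation; the surviving hit's session ID can then be used to fetch the surrounding context.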
LongMemEval Results (500 questions, no LLM, no tuning)
| Metric | Score |
|---|---|
| R@5 | 98.0% |
| R@10 | 98.4% |
| MRR | 0.944 |
| P@5 | 0.592 |
| F1@5 | 0.684 |
| Selectivity | 11.5% haystack scanned |
| Rank distribution | #1: 91.8%, #2-3: 5.0%, #4-5: 1.2%, #6-10: 0.4%, >10: 1.6% |
Outperforms MemPalace raw (96.6%) and matches their hybrid v4 held-out (98.4%) — with zero tuning, zero dev-set peeking.
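For readers unfamiliar with the metrics in the table, here is a small sketch of one common way to compute R@k and MRR from the rank of the first relevant result per question. The function names and the toy `ranks` list are assumptions for illustration; the LongMemEval harness may define recall differently (for example, over all gold sessions rather than the first hit).

```python
from typing import List, Optional


def recall_at_k(ranks: List[Optional[int]], k: int) -> float:
    """Fraction of questions whose first relevant hit has rank <= k.

    A rank of None means no relevant result was retrieved at all.
    """
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: List[Optional[int]]) -> float:
    """Average of 1/rank, counting questions with no hit as 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Toy data: 1-based rank of the first relevant result for five questions.
ranks = [1, 1, 3, None, 7]
r5 = recall_at_k(ranks, 5)
r10 = recall_at_k(ranks, 10)
mrr = mean_reciprocal_rank(ranks)
print(f"R@5={r5:.3f} R@10={r10:.3f} MRR={mrr:.3f}")
```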
Installation
pip install coremem
With HybridDB backend for enhanced FTS5 + vector hybrid search:
pip install coremem[hybrid]
Note on model downloads. ChromaDB downloads a bundled MiniLM embedding model (~80MB) on first PersistentClient() init. The cross-encoder downloads cross-encoder/ms-marco-MiniLM-L-6-v2 (~500MB) on first search_enhanced() call. Both cache locally after download. Call core.warmup() at startup to pre-load models predictably.
Core Concepts
Backends
# ChromaDB baseline — pure vector search
from coremem import MemoryCore
from coremem.backends.chroma import ChromaBackend
core = MemoryCore(backend=ChromaBackend(path="./data"))
# HybridDB enhanced — FTS5 + vector hybrid search
from coremem.backends.hybrid import HybridBackend
core = MemoryCore(backend=HybridBackend(path="./data"))
Ingestion
# Simple ingestion
core.ingest("user", "I built a Spitfire model kit", session_id="conv_001")
# Batch ingestion
core.ingest_many([
    {"role": "user", "content": "What's the weather today?"},
    {"role": "assistant", "content": "Sunny with a high of 72°F"},
], session_id="conv_001")
Search
# Basic search — fast path with deterministic heuristics
results = core.search("How many model kits?", limit=10)
# Enhanced search — multi-query expansion + cross-encoder reranking
results = core.search_enhanced("What did I build recently?", limit=10)
# Cross-encoder loads on first use (~500MB download).
# Disable with DISABLE_CROSS_ENCODER=1 for eval scripts.
Heuristics
Deterministic, zero-LLM scoring boosts applied to every result:
| Heuristic | What it catches |
|---|---|
| keyword_overlap | Exact word matches between query and content |
| temporal_boost | Queries with "latest", "current", "recently" |
| recency_decay | Unconditional exponential decay (30-day half-life) |
| person_name_boost | Proper name mentions in content |
| quoted_phrase_boost | Exact phrase matches in quotes |
from coremem import SearchHeuristics
# Apply all heuristics to a single result
score = SearchHeuristics.apply_all(
    query="latest project",
    content="Just finished the Q3 project report",
    score=0.75,
    ts="2026-05-28T10:00:00Z",
)
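To make two of the heuristics concrete, the sketch below shows how an exponential decay with a 30-day half-life and a simple keyword-overlap ratio could be computed. Only the 30-day half-life comes from the table above; the function names, the tokenization, and the way the values would be combined with the base score are assumptions, not CoreMem's actual formulas.

```python
import re
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0  # half-life stated in the heuristics table


def recency_factor(ts: str, now: datetime) -> float:
    """Exponential decay: 1.0 for a new memory, 0.5 after one half-life."""
    created = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    age_days = max((now - created).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def keyword_overlap(query: str, content: str) -> float:
    """Fraction of distinct query words that also appear in the content."""
    query_words = set(re.findall(r"\w+", query.lower()))
    content_words = set(re.findall(r"\w+", content.lower()))
    if not query_words:
        return 0.0
    return len(query_words & content_words) / len(query_words)


# A memory that is exactly 30 days old at query time.
now = datetime(2026, 6, 27, 10, 0, tzinfo=timezone.utc)
decay = recency_factor("2026-05-28T10:00:00Z", now)
overlap = keyword_overlap("latest project", "Just finished the Q3 project report")
print(decay, overlap)
```

Both functions are pure and depend only on their inputs, which is what makes this style of scoring deterministic: the same query, content, and timestamps always produce the same values.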
Enhanced Search
search_enhanced() adds multi-query expansion and cross-encoder reranking:
results = core.search_enhanced("model kits", limit=10)
Multi-query expansion. Generates search variants for better recall. Regex expansion is always active. LLM-based expansion is opt-in; to enable it, pass an llm_provider to MemoryCore:
core = MemoryCore(backend=..., llm_provider=my_chat_model)
Or set MEMORY_EXPANSION_MODEL=ollama:llama3.2 in your environment when using MemoryCore from this project's ecosystem.
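The README does not describe the regex rules themselves. As a rough idea of what regex-based expansion can look like, here is a sketch with two hypothetical rules: strip a leading question word, and pull out quoted phrases as standalone variants. The rules and the `expand_query` name are invented for this example and should not be read as CoreMem's behavior.

```python
import re
from typing import List

# Hypothetical rewrite rules, chosen only for illustration.
_LEADING_QUESTION = re.compile(
    r"^(what|when|where|who|how many|how much|how|did|do|does)\b\s*",
    re.IGNORECASE,
)
_QUOTED = re.compile(r'"([^"]+)"')


def expand_query(query: str) -> List[str]:
    """Return the original query plus regex-derived variants, without duplicates."""
    variants = [query]
    stripped = _LEADING_QUESTION.sub("", query).rstrip("?").strip()
    if stripped and stripped != query:
        variants.append(stripped)
    variants.extend(_QUOTED.findall(query))
    # dict.fromkeys drops duplicates while preserving order.
    return list(dict.fromkeys(variants))


variants = expand_query('How many "model kits" did I build?')
for v in variants:
    print(v)
```

Each variant would be sent to the backend separately and the result lists merged before reranking.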
Cross-encoder reranking. A cross-encoder/ms-marco-MiniLM-L-6-v2 model reranks the top results for better relevance. Loads lazily on first search_enhanced() call (~500MB download). Pre-load at startup with core.warmup() to avoid the delay during first search. Disable with:
DISABLE_CROSS_ENCODER=1 python my_script.py
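The reranking step can be sketched independently of any model: score each (query, document) pair, then sort by that score. In the sketch below, `rerank` and `toy_scorer` are hypothetical helpers, and the scorer is a stand-in so the example runs without a model download; it is not how CoreMem wires its cross-encoder internally.

```python
from typing import Callable, List, Sequence, Tuple

# A pair scorer takes (query, document) pairs and returns one score per pair.
PairScorer = Callable[[List[Tuple[str, str]]], Sequence[float]]


def rerank(query: str, docs: List[str], score_pairs: PairScorer, top_k: int = 10) -> List[str]:
    """Reorder docs by the relevance score of each (query, doc) pair."""
    if not docs:
        return []
    scores = score_pairs([(query, doc) for doc in docs])
    ranked = sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]


def toy_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    """Stand-in scorer: counts shared lowercase words. Not a real cross-encoder."""
    return [float(len(set(q.lower().split()) & set(d.lower().split()))) for q, d in pairs]


docs = [
    "Sunny with a high of 72°F",
    "I built a Spitfire model kit",
    "model kits are fun",
]
ordered = rerank("model kits", docs, toy_scorer, top_k=2)
print(ordered)
```

With sentence-transformers installed, the `predict` method of `CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")` accepts a list of text pairs and returns one score per pair, so it can be passed as `score_pairs` in place of the toy scorer.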
Wake-Up Context
Give the agent instant situational awareness:
context = core.wake_up(user_id="alice")
# Returns a compact string with L0 identity and L1 recent context.
License
MIT — see LICENSE.
Author
Eddy Vinck
CoreMem is the retrieval engine behind the Executive Assistant agent system. Pairs with HybridDB for storage and ConnectKit for real-time sync.