Zero-LLM memory for AI agents — semantic search, cross-encoder reranking, and deterministic heuristics

CoreMem

Zero-LLM memory retrieval for AI agents. CoreMem gives agents instant access to conversation history — semantic search plus deterministic retrieval heuristics, all without a single API call. Scores 98.0% R@5 on LongMemEval (500 questions) in the Executive Assistant retrieval stack — no LLM, no tuning, no cloud.

Embedded. Local. Open source. No external APIs, no vector DB services, no internet connection required. Runs entirely on-device with ChromaDB or HybridDB + sentence-transformers. Ships as a single Python package with zero infrastructure dependencies.

Dual-backend architecture. Drop-in backends (ChromaDB baseline, HybridDB enhanced) with the same API. Ranking pipeline: backend retrieval → deterministic heuristics → recency-aware rescoring → session-aware retrieval.

from coremem import MemoryCore
from coremem.backends.chroma import ChromaBackend

core = MemoryCore(backend=ChromaBackend(path="./memory"))

# Ingest conversation turns
core.ingest("user", "I visited the Museum of Modern Art today")
core.ingest("assistant", "That sounds wonderful! How was it?")
core.ingest("user", "I went to an Ancient Civilizations exhibition at the Natural History Museum")

# Search with deterministic heuristic reranking
results = core.search("When did I visit art museums?")

for r in results:
    print(f"[{r.memory.ts}] [{r.memory.role}] {r.memory.content}")

Why CoreMem?

Every AI agent needs memory. But cloud-based vector search is expensive, slow, and doesn't work offline. Pure embedding similarity misses keyword matches and temporal context. LLM-based memory systems cost tokens per query.

CoreMem solves all three:

| Component | What it does |
| --- | --- |
| Semantic search | Embedding similarity via ChromaDB or HybridDB |
| Deterministic heuristics | Keyword overlap, temporal recency, person-name boost, quoted-phrase matching |
| Session deduplication | One result per conversation, with full context retrieval |
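
Session deduplication can be pictured with a minimal sketch, assuming each hit carries a session ID and a score. The `Hit` class and `dedupe_by_session` function are hypothetical names for illustration and are not part of CoreMem's API; the real implementation may choose the surviving hit differently.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical search hit; not CoreMem's actual result type."""
    session_id: str
    content: str
    score: float


def dedupe_by_session(hits: list[Hit]) -> list[Hit]:
    """Keep the highest-scoring hit per session, ordered by score descending."""
    best: dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.session_id)
        if current is None or hit.score > current.score:
            best[hit.session_id] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


hits = [
    Hit("conv_001", "I built a Spitfire model kit", 0.91),
    Hit("conv_001", "The kit took two weekends", 0.74),
    Hit("conv_002", "I visited the Museum of Modern Art today", 0.42),
]
deduped = dedupe_by_session(hits)
for hit in deduped:
    print(hit.session_id, hit.score, hit.content)
```

With one hit per conversation, the top results cover more distinct sessions instead of being filled by several turns from the same one.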

LongMemEval Results (500 questions, no LLM, no tuning)

| Metric | Score |
| --- | --- |
| R@5 | 98.0% |
| R@10 | 98.4% |
| MRR | 0.944 |
| P@5 | 0.592 |
| F1@5 | 0.684 |
| Selectivity | 11.5% haystack scanned |
| Rank distribution | #1: 91.8%, #2-3: 5.0%, #4-5: 1.2%, #6-10: 0.4%, >10: 1.6% |

Outperforms MemPalace raw (96.6%) and comes close to their hybrid v4 held-out result (98.4%; CoreMem scores 98.0% at R@5 and 98.4% at R@10), with zero tuning and zero dev-set peeking.
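
For readers unfamiliar with the metrics, here is a minimal sketch of how R@k and MRR can be computed from per-question ranks. It assumes a question counts as a hit at k when its first relevant result appears within the top k; the benchmark's exact scoring rules may differ. Under that reading the rank distribution above is self-consistent: 91.8 + 5.0 + 1.2 = 98.0 (R@5), and adding 0.4 gives 98.4 (R@10). The `ranks` list is toy data, not LongMemEval output.

```python
from typing import Optional, Sequence


def recall_at_k(ranks: Sequence[Optional[int]], k: int) -> float:
    """Share of questions whose first relevant result sits at rank <= k."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank; a question with no relevant result contributes 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Toy data: 1-based rank of the first relevant result per question,
# or None when nothing relevant was retrieved.
ranks = [1, 1, 3, None, 2]

r_at_1 = recall_at_k(ranks, 1)      # 2 of 5 questions
r_at_5 = recall_at_k(ranks, 5)      # 4 of 5 questions
mrr = mean_reciprocal_rank(ranks)   # (1 + 1 + 1/3 + 0 + 1/2) / 5
```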

Installation

pip install coremem

With HybridDB backend for enhanced FTS5 + vector hybrid search:

pip install coremem[hybrid]

Note on model downloads. ChromaDB downloads a bundled MiniLM embedding model (~80MB) on first PersistentClient() init. The cross-encoder downloads cross-encoder/ms-marco-MiniLM-L-6-v2 (~500MB) on first search_enhanced() call. Both cache locally after download. Call core.warmup() at startup to pre-load models predictably.

Core Concepts

Backends

from coremem import MemoryCore

# ChromaDB baseline: pure vector search
from coremem.backends.chroma import ChromaBackend
core = MemoryCore(backend=ChromaBackend(path="./data"))

# HybridDB enhanced — FTS5 + vector hybrid search
from coremem.backends.hybrid import HybridBackend
core = MemoryCore(backend=HybridBackend(path="./data"))

Ingestion

# Simple ingestion
core.ingest("user", "I built a Spitfire model kit", session_id="conv_001")

# Batch ingestion
core.ingest_many([
    {"role": "user", "content": "What's the weather today?"},
    {"role": "assistant", "content": "Sunny with a high of 72°F"},
], session_id="conv_001")

Search

# Basic search — fast path with deterministic heuristics
results = core.search("How many model kits?", limit=10)

# Enhanced search — multi-query expansion + cross-encoder reranking
results = core.search_enhanced("What did I build recently?", limit=10)

# Cross-encoder loads on first use (~500MB download).
# Disable with DISABLE_CROSS_ENCODER=1 for eval scripts.

Heuristics

Deterministic, zero-LLM scoring boosts applied to every result:

| Heuristic | What it catches |
| --- | --- |
| keyword_overlap | Exact word matches between query and content |
| temporal_boost | Queries with "latest", "current", "recently" |
| recency_decay | Unconditional exponential decay (30-day half-life) |
| person_name_boost | Proper name mentions in content |
| quoted_phrase_boost | Exact phrase matches in quotes |

from coremem import SearchHeuristics

# Apply all heuristics to a single result
score = SearchHeuristics.apply_all(
    query="latest project",
    content="Just finished the Q3 project report",
    score=0.75,
    ts="2026-05-28T10:00:00Z",
)
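
To make two of these heuristics concrete, here is a minimal sketch of a 30-day half-life decay and a word-overlap ratio. The function names, the whitespace tokenization, and the multiplicative way the decay is applied to the score are assumptions for illustration; they are not taken from CoreMem's source.

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0  # half-life stated in the heuristics table


def recency_factor(ts: str, now: datetime) -> float:
    """Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life."""
    when = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    age_days = max((now - when).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def keyword_overlap(query: str, content: str) -> float:
    """Fraction of query words that also appear in the content (case-insensitive)."""
    query_words = set(query.lower().split())
    content_words = set(content.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & content_words) / len(query_words)


now = datetime(2026, 6, 27, 10, 0, tzinfo=timezone.utc)  # exactly 30 days later
factor = recency_factor("2026-05-28T10:00:00Z", now)
overlap = keyword_overlap("latest project", "Just finished the Q3 project report")

# Assumption: the decay multiplies the backend score; CoreMem may combine them differently.
rescored = 0.75 * factor
```

A memory that is exactly one half-life old keeps half of its weight, so under the multiplicative assumption a backend score of 0.75 drops to 0.375.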

Enhanced Search

search_enhanced() adds multi-query expansion and cross-encoder reranking:

results = core.search_enhanced("model kits", limit=10)

Multi-query expansion. Generates search variants for better recall. Regex-based expansion is always active. LLM-based expansion is opt-in; to enable it, pass an llm_provider to MemoryCore:

core = MemoryCore(backend=..., llm_provider=my_chat_model)

Alternatively, set MEMORY_EXPANSION_MODEL=ollama:llama3.2 in your environment when MemoryCore runs inside this project's ecosystem.
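
As an illustration of what regex-based expansion can look like, the sketch below derives a few variants from one query: the query without its question mark, a keyword-only form, and any quoted phrases. The patterns and the `expand_query` name are hypothetical; CoreMem's actual expansion rules are not documented here and may differ.

```python
import re

# Hypothetical patterns: a leading question word, and common filler words.
_LEADING_QUESTION = re.compile(
    r"^(what|when|where|who|which|how many|how much|how)\b\s*", re.IGNORECASE
)
_FILLER = re.compile(r"\b(did|do|does|is|are|was|were|i|my|the|a|an)\b", re.IGNORECASE)
_QUOTED = re.compile(r'"([^"]+)"')


def expand_query(query: str) -> list[str]:
    """Return the original query followed by de-duplicated regex-derived variants."""
    original = query.strip()
    stripped = original.rstrip("?").strip()
    without_lead = _LEADING_QUESTION.sub("", stripped)
    keywords = re.sub(r"\s+", " ", _FILLER.sub(" ", without_lead)).strip()

    variants = [original]
    for candidate in (stripped, keywords, *_QUOTED.findall(query)):
        if candidate and candidate not in variants:
            variants.append(candidate)
    return variants


variants = expand_query("How many model kits?")
for v in variants:
    print(v)
```

Each variant would be sent to the backend separately and the results merged, which helps when the literal question phrasing does not appear in the stored turns.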

Cross-encoder reranking. A cross-encoder/ms-marco-MiniLM-L-6-v2 model reranks the top results for better relevance. The model loads lazily on the first search_enhanced() call (~500MB download). Pre-load it at startup with core.warmup() to avoid a delay on the first search. Disable it with:

DISABLE_CROSS_ENCODER=1 python my_script.py
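
The lazy-loading behaviour described above can be sketched as follows. `LazyReranker` is a hypothetical wrapper, not CoreMem's implementation, and treating only the exact value "1" as "disabled" is an assumption. The loader is injected so the pattern can be shown without downloading a model.

```python
import os
from typing import Callable, Optional


class LazyReranker:
    """Hypothetical wrapper: load the model on first use unless disabled by env var."""

    def __init__(self, loader: Callable[[], object]) -> None:
        self._loader = loader
        self._model: Optional[object] = None

    @property
    def enabled(self) -> bool:
        # Assumption: only the exact value "1" disables the cross-encoder.
        return os.environ.get("DISABLE_CROSS_ENCODER") != "1"

    def get(self) -> Optional[object]:
        """Return the model, loading it once; return None when disabled."""
        if not self.enabled:
            return None
        if self._model is None:
            self._model = self._loader()
        return self._model

    def warmup(self) -> None:
        """Trigger the load ahead of the first search."""
        self.get()


def fake_loader() -> object:
    # Stand-in for an expensive model load.
    return "cross-encoder-model"


reranker = LazyReranker(fake_loader)
reranker.warmup()  # pays the load cost at startup instead of on the first query
```

In a real setup the loader might construct a `CrossEncoder` from sentence-transformers with the model name given above; that wiring is left out here so the sketch stays runnable offline.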

Wake-Up Context

Give the agent instant situational awareness:

context = core.wake_up(user_id="alice")
# Returns a compact string with L0 identity and L1 recent context.

License

MIT — see LICENSE.

Author

Eddy Vinck

CoreMem is the retrieval engine behind the Executive Assistant agent system. Pairs with HybridDB for storage and ConnectKit for real-time sync.
