Zero-LLM memory for AI agents — semantic search, cross-encoder reranking, and deterministic heuristics

CoreMem

Zero-LLM memory retrieval for AI agents. CoreMem gives agents instant access to conversation history — semantic search plus deterministic retrieval heuristics, all without a single API call. Scores 98.0% R@5 on LongMemEval (500 questions) in the Executive Assistant retrieval stack — no LLM, no tuning, no cloud.

Embedded. Local. Open source. No external APIs, no vector DB services, no internet connection required. Runs entirely on-device with ChromaDB or HybridDB + sentence-transformers. Ships as a single Python package with zero infrastructure dependencies.

Dual-backend architecture. Drop-in backends (ChromaDB baseline, HybridDB enhanced) with the same API. Ranking pipeline: backend retrieval → deterministic heuristics → recency-aware rescoring → session-aware retrieval.

from coremem import MemoryCore
from coremem.backends.chroma import ChromaBackend

core = MemoryCore(backend=ChromaBackend(path="./memory"))

# Ingest conversation turns
core.ingest("user", "I visited the Museum of Modern Art today")
core.ingest("assistant", "That sounds wonderful! How was it?")
core.ingest("user", "I went to an Ancient Civilizations exhibition at the Natural History Museum")

# Search with deterministic heuristic reranking
results = core.search("When did I visit art museums?")

for r in results:
    print(f"[{r.memory.ts}] [{r.memory.role}] {r.memory.content}")

Why CoreMem?

Every AI agent needs memory. But cloud-based vector search is expensive, slow, and doesn't work offline. Pure embedding similarity misses keyword matches and temporal context. LLM-based memory systems cost tokens per query.

CoreMem solves all three:

Component | What it does
Semantic search | Embedding similarity via ChromaDB or HybridDB
Deterministic heuristics | Keyword overlap, temporal recency, person-name boost, quoted-phrase matching
Session deduplication | One result per conversation, with full context retrieval
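
As an illustration of the session deduplication row, the sketch below keeps only the best-scoring hit per session. It is not CoreMem's implementation: the `Hit` type and the `dedupe_by_session` function are hypothetical names introduced for this example, and the scores are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical search hit; not a CoreMem type."""
    session_id: str
    content: str
    score: float


def dedupe_by_session(hits: list[Hit]) -> list[Hit]:
    """Keep the highest-scoring hit per session, ordered by score (descending)."""
    best: dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.session_id)
        if current is None or hit.score > current.score:
            best[hit.session_id] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)


# Illustrative data only.
hits = [
    Hit("conv_001", "I built a Spitfire model kit", 0.91),
    Hit("conv_001", "The kit took two weekends", 0.74),
    Hit("conv_002", "I bought a Mustang model kit", 0.83),
]
deduped = dedupe_by_session(hits)
```

Collapsing to one hit per session keeps the top-k list from filling up with near-duplicate turns from a single conversation.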

LongMemEval Results (500 questions, no LLM, no tuning)

Metric | Score
R@5 | 98.0%
R@10 | 98.4%
MRR | 0.944
P@5 | 0.592
F1@5 | 0.684
Selectivity | 11.5% haystack scanned
Rank distribution | #1: 91.8%, #2-3: 5.0%, #4-5: 1.2%, #6-10: 0.4%, >10: 1.6%

Outperforms MemPalace raw (96.6%) and matches their hybrid v4 held-out (98.4%) — with zero tuning, zero dev-set peeking.
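
For readers unfamiliar with the metrics, here is a minimal sketch of how R@k and MRR can be computed. It assumes a question counts as a hit when its first relevant result appears at rank k or better. The function names and toy ranks are illustrative and are not the evaluation script. The last lines check that the reported rank distribution adds up to the reported R@5 and R@10.

```python
def recall_at_k(first_hit_ranks, k):
    """Fraction of questions whose first relevant result is at rank <= k.

    A rank of None means no relevant result was retrieved.
    """
    hits = sum(1 for r in first_hit_ranks if r is not None and r <= k)
    return hits / len(first_hit_ranks)


def mean_reciprocal_rank(first_hit_ranks):
    """Average of 1/rank, counting questions with no relevant result as 0."""
    total = sum(1.0 / r for r in first_hit_ranks if r is not None)
    return total / len(first_hit_ranks)


# Toy data (not LongMemEval): rank of the first relevant hit for five questions.
ranks = [1, 1, 2, 4, None]
r_at_3 = recall_at_k(ranks, 3)
mrr = mean_reciprocal_rank(ranks)

# Consistency check on the reported rank distribution (percent of questions).
distribution = {"#1": 91.8, "#2-3": 5.0, "#4-5": 1.2, "#6-10": 0.4, ">10": 1.6}
r_at_5 = distribution["#1"] + distribution["#2-3"] + distribution["#4-5"]
r_at_10 = r_at_5 + distribution["#6-10"]
```

Rounded to one decimal place, `r_at_5` and `r_at_10` reproduce the 98.0% and 98.4% figures in the table.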

Installation

pip install coremem

With HybridDB backend for enhanced FTS5 + vector hybrid search:

pip install coremem[hybrid]

Note on model downloads. ChromaDB downloads a bundled MiniLM embedding model (~80MB) on first PersistentClient() init. The cross-encoder downloads cross-encoder/ms-marco-MiniLM-L-6-v2 (~500MB) on first search_enhanced() call. Both cache locally after download. Call core.warmup() at startup to pre-load models predictably.

Core Concepts

Backends

# ChromaDB baseline — pure vector search
from coremem.backends.chroma import ChromaBackend
core = MemoryCore(backend=ChromaBackend(path="./data"))

# HybridDB enhanced — FTS5 + vector hybrid search
from coremem.backends.hybrid import HybridBackend
core = MemoryCore(backend=HybridBackend(path="./data"))

Ingestion

# Simple ingestion
core.ingest("user", "I built a Spitfire model kit", session_id="conv_001")

# Batch ingestion
core.ingest_many([
    {"role": "user", "content": "What's the weather today?"},
    {"role": "assistant", "content": "Sunny with a high of 72°F"},
], session_id="conv_001")

Search

# Basic search — fast path with deterministic heuristics
results = core.search("How many model kits?", limit=10)

# Enhanced search — multi-query expansion + cross-encoder reranking
results = core.search_enhanced("What did I build recently?", limit=10)

# Cross-encoder loads on first use (~500MB download).
# Disable with DISABLE_CROSS_ENCODER=1 for eval scripts.

Heuristics

Deterministic, zero-LLM scoring boosts applied to every result:

Heuristic | What it catches
keyword_overlap | Exact word matches between query and content
temporal_boost | Queries with "latest", "current", "recently"
recency_decay | Unconditional exponential decay (30-day half-life)
person_name_boost | Proper name mentions in content
quoted_phrase_boost | Exact phrase matches in quotes

from coremem import SearchHeuristics

# Apply all heuristics to a single result
score = SearchHeuristics.apply_all(
    query="latest project",
    content="Just finished the Q3 project report",
    score=0.75,
    ts="2026-05-28T10:00:00Z",
)
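
To show what signals of this kind might look like, the sketch below implements three of them in plain Python. These are assumptions for illustration, not the bodies of SearchHeuristics: the function names are hypothetical, and only the 30-day half-life comes from the table above. How CoreMem weights and combines the signals is not shown here.

```python
import re
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0  # half-life stated in the heuristics table


def keyword_overlap(query: str, content: str) -> float:
    """Fraction of query words that also appear in the content."""
    q = set(re.findall(r"\w+", query.lower()))
    c = set(re.findall(r"\w+", content.lower()))
    return len(q & c) / len(q) if q else 0.0


def recency_decay(ts: str, now: datetime) -> float:
    """Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life."""
    then = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    age_days = max((now - then).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def quoted_phrases_matched(query: str, content: str) -> int:
    """Count double-quoted phrases in the query that occur verbatim in the content."""
    phrases = re.findall(r'"([^"]+)"', query)
    return sum(1 for p in phrases if p.lower() in content.lower())


# A fixed "now" keeps the example deterministic: exactly 30 days after the memory.
now = datetime(2026, 6, 27, 10, 0, tzinfo=timezone.utc)
overlap = keyword_overlap("latest project", "Just finished the Q3 project report")
decay = recency_decay("2026-05-28T10:00:00Z", now)
matched = quoted_phrases_matched('who said "model kit"?', "I built a Spitfire model kit")
```

Because every function is a pure computation over the query, the content, and a timestamp, the same inputs always yield the same score, which is what makes the ranking deterministic.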

Enhanced Search

search_enhanced() adds multi-query expansion and cross-encoder reranking:

results = core.search_enhanced("model kits", limit=10)

Multi-query expansion. Generates search variants for better recall. Regex expansion is always active. LLM-based expansion is opt-in; to enable it, pass an llm_provider to MemoryCore:

core = MemoryCore(backend=..., llm_provider=my_chat_model)

Alternatively, when MemoryCore runs inside this project's ecosystem, set MEMORY_EXPANSION_MODEL=ollama:llama3.2 in your environment.
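
To make regex-based expansion concrete, here is a minimal sketch that derives query variants from rewrite rules. The rules and the `expand_query` function are hypothetical and are not CoreMem's actual patterns. They only illustrate how variants can be produced without an LLM.

```python
import re

# Hypothetical rewrite rules as (pattern, replacement) pairs; not CoreMem's rules.
RULES = [
    # Drop a leading question stem such as "What did " or "How many are ".
    (re.compile(
        r"^(what|when|where|who|how many|how much|how)\s+"
        r"(did|do|does|is|are|was|were)\s+",
        re.I,
    ), ""),
    # Drop temporal filler words.
    (re.compile(r"\b(recently|latest|currently)\b", re.I), ""),
]


def expand_query(query: str) -> list[str]:
    """Return the original query plus de-duplicated regex-derived variants."""
    variants = [query]
    for pattern, replacement in RULES:
        candidate = pattern.sub(replacement, query)
        # Normalize whitespace and trim trailing spaces or question marks.
        candidate = re.sub(r"\s+", " ", candidate).strip(" ?")
        if candidate and candidate not in variants:
            variants.append(candidate)
    return variants


variants = expand_query("What did I build recently?")
# variants == ["What did I build recently?", "I build recently", "What did I build"]
```

Each variant would then be searched separately and the results merged, which is one way such expansion can improve recall.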

Cross-encoder reranking. A cross-encoder/ms-marco-MiniLM-L-6-v2 model reranks the top results for better relevance. It loads lazily on the first search_enhanced() call (~500MB download); call core.warmup() at startup to avoid that delay during the first search. Disable it with:

DISABLE_CROSS_ENCODER=1 python my_script.py
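
The lazy-load, warm-up, and disable behavior can be pictured with the sketch below. It is an assumption-laden illustration, not CoreMem's code: `LazyReranker` and `load_toy_model` are hypothetical, a word-overlap scorer stands in for the real cross-encoder so nothing is downloaded, and only the value `1` is assumed to disable reranking.

```python
import os


class LazyReranker:
    """Hypothetical wrapper: loads a model on first use unless disabled by env var."""

    def __init__(self, loader):
        self._loader = loader  # callable that builds the (expensive) model
        self._model = None

    @property
    def enabled(self) -> bool:
        # Assumption: only the exact value "1" disables reranking.
        return os.environ.get("DISABLE_CROSS_ENCODER") != "1"

    def warmup(self) -> None:
        """Load the model now so the first search does not pay the cost."""
        if self.enabled and self._model is None:
            self._model = self._loader()

    def rerank(self, query, candidates):
        """Sort candidates by model score, or return them unchanged when disabled."""
        if not self.enabled:
            return list(candidates)
        self.warmup()
        return sorted(candidates, key=lambda c: self._model(query, c), reverse=True)


def load_toy_model():
    """Stand-in scorer: number of words shared by query and candidate."""
    return lambda q, c: len(set(q.lower().split()) & set(c.lower().split()))


reranker = LazyReranker(load_toy_model)
docs = ["the weather is sunny", "I built a model kit"]

os.environ.pop("DISABLE_CROSS_ENCODER", None)
ranked = reranker.rerank("model kit", docs)    # model loads here, on first use

os.environ["DISABLE_CROSS_ENCODER"] = "1"
unranked = reranker.rerank("model kit", docs)  # reranking skipped, order unchanged

os.environ.pop("DISABLE_CROSS_ENCODER", None)  # leave the environment clean
```

Checking the variable on every call, rather than once at import time, lets an eval script toggle reranking without restarting the process.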

Wake-Up Context

Give the agent instant situational awareness:

context = core.wake_up(user_id="alice")
# Returns a compact string with L0 identity and L1 recent context.

License

MIT — see LICENSE.

Author

Eddy Vinck

CoreMem is the retrieval engine behind the Executive Assistant agent system. Pairs with HybridDB for storage and ConnectKit for real-time sync.
