Zero-LLM memory for AI agents — semantic search, cross-encoder reranking, and deterministic heuristics

These details have not been verified by PyPI

Project description

CoreMem

Zero-LLM memory retrieval for AI agents. CoreMem gives agents instant access to conversation history through semantic search plus deterministic retrieval heuristics, all without a single API call. On LongMemEval (500 questions) it scores 93.0% R@5 with search_enhanced() and 75.4% R@5 with the zero-LLM search() path.

Embedded. Local. Open source. No external APIs and no vector DB services. Runs entirely on-device with ChromaDB or HybridDB + sentence-transformers; an internet connection is needed only for the one-time model downloads described under Installation. Ships as a single Python package with zero infrastructure dependencies.

Dual-backend architecture. Drop-in backends (ChromaDB baseline, HybridDB enhanced) with the same API. Ranking pipeline: backend retrieval → deterministic heuristics → MMR session diversity → recency-aware rescoring → session-deduplicated retrieval.

from coremem import MemoryCore
from coremem.backends.chroma import ChromaBackend

core = MemoryCore(backend=ChromaBackend(path="./memory"))

# Ingest conversation turns
core.ingest("user", "I visited the Museum of Modern Art today")
core.ingest("assistant", "That sounds wonderful! How was it?")
core.ingest("user", "I went to an Ancient Civilizations exhibition at the Natural History Museum")

# Search with deterministic heuristic reranking
results = core.search("When did I visit art museums?")

for r in results:
    print(f"[{r.memory.ts}] [{r.memory.role}] {r.memory.content}")

Why CoreMem?

Every AI agent needs memory. But cloud-based vector search is expensive, slow, and doesn't work offline. Pure embedding similarity misses keyword matches and temporal context. LLM-based memory systems cost tokens per query.

CoreMem solves all three:

Component	What it does
Semantic search	Embedding similarity via ChromaDB or HybridDB
Deterministic heuristics	Keyword overlap (fuzzy + bigram), temporal recency, person-name boost, quoted-phrase matching
MMR session diversity	One result per session, preventing cross-encoder overfit
Score normalization	Per-sub-query normalization in enhanced search for balanced merging
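The "MMR session diversity" row can be illustrated with a simplified sketch. This is not CoreMem's implementation: the `Hit` tuple, the `diversify_by_session` name, and the `lambda_` weight are hypothetical, and session identity stands in for the similarity term of classic MMR (a candidate from an already selected session is treated as fully redundant). It is a soft variant that penalizes repeated sessions; a strict one-result-per-session rule would skip them instead.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    session_id: str
    content: str
    score: float  # relevance in [0, 1], higher is better


def diversify_by_session(hits: list[Hit], limit: int, lambda_: float = 0.7) -> list[Hit]:
    """Greedy MMR-style selection that penalizes hits from sessions already picked.

    lambda_ trades relevance against diversity (hypothetical default).
    """
    remaining = sorted(hits, key=lambda h: h.score, reverse=True)
    selected: list[Hit] = []
    seen_sessions: set[str] = set()

    while remaining and len(selected) < limit:
        # Redundancy is 1.0 when the session was already selected, else 0.0.
        best = max(
            remaining,
            key=lambda h: lambda_ * h.score
            - (1.0 - lambda_) * (1.0 if h.session_id in seen_sessions else 0.0),
        )
        remaining.remove(best)
        selected.append(best)
        seen_sessions.add(best.session_id)
    return selected


hits = [
    Hit("s1", "a", 0.90),
    Hit("s1", "b", 0.85),
    Hit("s2", "c", 0.50),
    Hit("s3", "d", 0.40),
]
top = diversify_by_session(hits, limit=3)
```

With these inputs the second slot goes to session s2 even though s1 holds the second-highest raw score, which is the behavior the table describes.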

LongMemEval Results (500 questions, zero LLM tuning)

Mode	R@5	MRR	Rank@1
`search()`	75.4%	0.562	45.8%
`search_enhanced()`	93.0%	0.892	86.6%

Question type	`search()` R@5	`search_enhanced()` R@5
multi-session	76.7%	96.2%
knowledge-update	82.1%	97.4%
single-session-user	72.9%	94.3%
temporal-reasoning	74.4%	91.7%
single-session-assistant	76.8%	89.3%
single-session-preference	60.0%	76.7%
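For readers unfamiliar with the column names, the following sketch shows how R@5, MRR, and Rank@1 are conventionally computed from ranked result lists. It assumes R@k means "at least one gold item appears in the top k"; the project's evaluation script may define the metrics differently, and the function names here are hypothetical.

```python
def hit_at_k(ranked_ids: list[str], gold_ids: set[str], k: int) -> bool:
    """True if any gold item appears among the first k results."""
    return any(doc_id in gold_ids for doc_id in ranked_ids[:k])


def reciprocal_rank(ranked_ids: list[str], gold_ids: set[str]) -> float:
    """1/rank of the first gold item, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in gold_ids:
            return 1.0 / rank
    return 0.0


def evaluate(runs: list[tuple[list[str], set[str]]], k: int = 5) -> dict[str, float]:
    """Average the per-question metrics over all (ranking, gold set) pairs."""
    n = len(runs)
    return {
        f"R@{k}": sum(hit_at_k(r, g, k) for r, g in runs) / n,
        "MRR": sum(reciprocal_rank(r, g) for r, g in runs) / n,
        "Rank@1": sum(hit_at_k(r, g, 1) for r, g in runs) / n,
    }


# Three toy questions: gold at rank 1, gold at rank 3, gold never retrieved.
runs = [
    (["a", "b", "c"], {"a"}),
    (["x", "y", "z"], {"z"}),
    (["p", "q"], {"m"}),
]
metrics = evaluate(runs)
```

Under these definitions MRR can never be lower than Rank@1, which is consistent with the figures reported above.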

Installation

pip install coremem

HybridBackend (HybridDB: SQLite + FTS5 + ChromaDB) is the default since 0.5.0.

Note on model downloads. ChromaDB downloads a bundled MiniLM embedding model (~80MB) on first PersistentClient() init. The cross-encoder downloads cross-encoder/ms-marco-MiniLM-L-6-v2 (~500MB) on first search_enhanced() call. Both cache locally after download. Call core.warmup() at startup to pre-load models predictably.

Core Concepts

Backends

from coremem import MemoryCore

# ChromaDB baseline: pure vector search
from coremem.backends.chroma import ChromaBackend
core = MemoryCore(backend=ChromaBackend(path="./data"))

# HybridDB enhanced: FTS5 + vector hybrid search
from coremem.backends.hybrid import HybridBackend
core = MemoryCore(backend=HybridBackend(path="./data"))

Ingestion

# Simple ingestion
core.ingest("user", "I built a Spitfire model kit", session_id="conv_001")

# Batch ingestion
core.ingest_many([
    {"role": "user", "content": "What's the weather today?"},
    {"role": "assistant", "content": "Sunny with a high of 72°F"},
], session_id="conv_001")

Search

# Basic search — fast path with deterministic heuristics
results = core.search("How many model kits?", limit=10)

# Enhanced search — multi-query expansion + cross-encoder reranking
results = core.search_enhanced("What did I build recently?", limit=10)

# Cross-encoder loads on first use (~500MB download).
# Disable with DISABLE_CROSS_ENCODER=1 for eval scripts.

Heuristics

Deterministic, zero-LLM scoring boosts applied to every result:

Heuristic	What it catches
`keyword_overlap`	Exact + fuzzy (difflib) + bigram matches between query and content
`temporal_boost`	Queries with "latest", "current", "recently"
`recency_decay`	Unconditional exponential decay (30-day half-life)
`person_name_boost`	Proper name mentions in content
`quoted_phrase_boost`	Exact phrase matches in quotes

from coremem import SearchHeuristics

# Apply all heuristics to a single result
score = SearchHeuristics.apply_all(
    query="latest project",
    content="Just finished the Q3 project report",
    score=0.75,
    ts="2026-05-28T10:00:00Z",
)
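To make two rows of the heuristics table concrete, here is a standalone sketch of a 30-day half-life decay and a keyword overlap score built on difflib. It only illustrates the ideas and is not the code behind SearchHeuristics: the function names, the 0.8 fuzzy cutoff, and the equal weighting of word and bigram matches are all assumptions.

```python
import difflib
import re
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0  # half-life quoted in the heuristics table


def recency_factor(ts: str, now: datetime) -> float:
    """Exponential decay: 1.0 for a new memory, 0.5 after one half-life."""
    then = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    age_days = max((now - then).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def tokens(text: str) -> list[str]:
    """Lowercased alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_overlap(query: str, content: str, fuzzy_cutoff: float = 0.8) -> float:
    """Share of query words (exact or fuzzy) and query bigrams found in content."""
    q, c = tokens(query), tokens(content)
    if not q:
        return 0.0
    # A query word counts if it appears verbatim or has a close difflib match.
    matched = sum(
        1
        for word in q
        if word in c or difflib.get_close_matches(word, c, n=1, cutoff=fuzzy_cutoff)
    )
    word_score = matched / len(q)
    q_bigrams = set(zip(q, q[1:]))
    c_bigrams = set(zip(c, c[1:]))
    bigram_score = len(q_bigrams & c_bigrams) / len(q_bigrams) if q_bigrams else 0.0
    return 0.5 * word_score + 0.5 * bigram_score  # hypothetical weighting


now = datetime(2026, 6, 27, 10, 0, tzinfo=timezone.utc)
decay = recency_factor("2026-05-28T10:00:00Z", now)  # exactly 30 days old
overlap = keyword_overlap("model kits", "I built a Spitfire model kit")
```

In the example, "kits" matches "kit" only through the fuzzy path, so the word score is full while the bigram score stays at zero.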

Enhanced Search

search_enhanced() adds multi-query expansion and cross-encoder reranking:

results = core.search_enhanced("model kits", limit=10)

Multi-query expansion. Generates search variants for better recall. Regex expansion is always active. LLM-based expansion is opt-in; to enable it, pass an llm_provider to MemoryCore:

core = MemoryCore(backend=..., llm_provider=my_chat_model)

Or set MEMORY_EXPANSION_MODEL=ollama:llama3.2 in your environment when using MemoryCore from this project's ecosystem.
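When several query variants are searched separately, their raw scores are not directly comparable, which is the motivation for the per-sub-query score normalization listed under "Why CoreMem?". The sketch below shows one common way to do this: min-max normalization per variant, followed by a max merge. The function names and the choice of min-max and max are assumptions, not a description of CoreMem's internals.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale one sub-query's scores to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every hit as equally strong.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def merge_sub_queries(results_per_query: list[dict[str, float]]) -> list[tuple[str, float]]:
    """Normalize each sub-query separately, then keep each memory's best score."""
    merged: dict[str, float] = {}
    for scores in results_per_query:
        if not scores:
            continue
        for doc_id, s in min_max(scores).items():
            merged[doc_id] = max(merged.get(doc_id, 0.0), s)
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


# Two variants whose raw scores live on very different scales.
variant_a = {"m1": 0.9, "m2": 0.5, "m3": 0.1}
variant_b = {"m2": 12.0, "m4": 4.0}
merged = merge_sub_queries([variant_a, variant_b])
```

Without the normalization step, every hit from variant_b would outrank every hit from variant_a simply because its scores use a larger scale.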

Cross-encoder reranking. A cross-encoder/ms-marco-MiniLM-L-6-v2 model reranks the top results for better relevance. It loads lazily on the first search_enhanced() call (~500MB download); call core.warmup() at startup to avoid that delay on the first search. Disable it with:

DISABLE_CROSS_ENCODER=1 python my_script.py
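The lazy-load and opt-out behavior described above could look roughly like the following. This is a sketch under assumptions, not CoreMem's code: `CrossEncoder` and its `predict()` method come from the sentence-transformers library, while the helper names and the exact way the environment variable is parsed (only the value "1" disables reranking) are hypothetical.

```python
import os
from functools import lru_cache

MODEL_NAME = "cross-encoder/ms-marco-MiniLM-L-6-v2"


def cross_encoder_enabled() -> bool:
    """Assumption: only DISABLE_CROSS_ENCODER=1 switches reranking off."""
    return os.environ.get("DISABLE_CROSS_ENCODER") != "1"


@lru_cache(maxsize=1)
def load_cross_encoder():
    """Load the model once, on first use, and cache it for later calls."""
    # Imported here so the heavy dependency is only touched when needed.
    from sentence_transformers import CrossEncoder

    return CrossEncoder(MODEL_NAME)


def rerank(query: str, candidates: list[str], limit: int = 10) -> list[str]:
    """Reorder candidates by cross-encoder score, or pass them through if disabled."""
    if not cross_encoder_enabled() or not candidates:
        return candidates[:limit]
    model = load_cross_encoder()
    scores = model.predict([(query, text) for text in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:limit]]
```

Calling `load_cross_encoder()` once at startup would play the role that core.warmup() plays in CoreMem: the download and model load happen before the first query rather than during it.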

Observer Pipeline

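The ObserverPipeline (v0.5.0+) extracts structured observations from conversations (identity facts, events, preferences, plans, stances) and stores them with source-quote alignment, so that each stored observation is tied to a quote from the conversation. The project describes this as guaranteeing 0% hallucination. Unlike search(), this component calls an LLM: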

from coremem.memory_store import MemoryStore
from coremem.observer import ObserverPipeline

store = MemoryStore(path="./memory")
pipeline = ObserverPipeline(
    core=core, store=store, session_id="main",
    token_threshold=100, min_turns=1,
    enable_classification=True,
    enable_dedup=True,
)
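# after_turn() is awaited, so call it from inside an async function
# (or use asyncio.run(pipeline.after_turn()) from a synchronous script)
await pipeline.after_turn()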

Seven labeling functions (LFs) run in parallel to extract entities, actions, preferences, temporal facts, sentiment, possessions, and stances. All LFs are LLM-based, a deliberate choice:

Approach	Cost	Languages	Recall	Hallucination gate
LLM LFs (current)	~7 API calls/turn	Any language	97.5%	✅ Source-quote verified
Non-LLM (spaCy/VADER)	~free	English only	~95% (unverified)	❌ None

Non-LLM approaches like OpenIE dependency parsing can replace entities, temporal, possessions, and actions LFs with zero API cost, but are restricted to the languages the NLP model supports (primarily English). LLM LFs handle any language out of the box — Mandarin, Arabic, Spanish, code-switching — without model swaps or quality degradation. The 2.5% miss rate (third-party events, contextual asides) is the measured cost of the hallucination gate.
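The combination of parallel labeling functions and a source-quote gate can be sketched as follows. Everything here is hypothetical (the `Observation` shape, the function names, and the matching rule of a case- and whitespace-insensitive substring check); the real pipeline may align quotes differently. The stub LFs stand in for LLM calls.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Observation:
    kind: str   # e.g. "event", "preference"
    fact: str   # the extracted statement
    quote: str  # the span of the turn that supposedly supports it


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def passes_quote_gate(obs: Observation, source_text: str) -> bool:
    """Keep an observation only if its quote really occurs in the source turn."""
    return bool(obs.quote.strip()) and normalize(obs.quote) in normalize(source_text)


async def run_labeling_functions(turn_text, labeling_functions) -> list[Observation]:
    """Run all LFs concurrently, then drop observations that fail the gate."""
    batches = await asyncio.gather(*(lf(turn_text) for lf in labeling_functions))
    return [obs for batch in batches for obs in batch if passes_quote_gate(obs, turn_text)]


# Stub LFs standing in for LLM calls: one grounded, one invented.
async def grounded_lf(text: str) -> list[Observation]:
    return [Observation("event", "User built a model kit", "I built a Spitfire model kit")]


async def invented_lf(text: str) -> list[Observation]:
    return [Observation("possession", "User owns a boat", "I own a boat")]


kept = asyncio.run(
    run_labeling_functions("I built a Spitfire model kit", [grounded_lf, invented_lf])
)
```

A gate of this kind trades recall for precision: an observation whose supporting quote cannot be located is discarded even if it happens to be true, which matches the miss rate discussed above.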

Wake-Up Context

Give the agent instant situational awareness:

context = core.wake_up(user_id="alice")
# Returns a compact string with L0 identity and L1 recent context.

License

MIT — see LICENSE.

Author

Eddy Vinck

CoreMem is the retrieval engine behind the Executive Assistant agent system. Pairs with HybridDB for storage and ConnectKit for real-time sync.

Project details

Release history

Version	Released
0.9.1 (this version)	Jun 15, 2026
0.9.0	Jun 11, 2026
0.8.0	Jun 8, 2026
0.7.1	Jun 7, 2026
0.7.0	Jun 7, 2026
0.6.2	Jun 7, 2026
0.6.1	Jun 7, 2026
0.6.0	Jun 5, 2026
0.5.1	Jun 5, 2026
0.3.0	Jun 1, 2026
0.2.1	May 31, 2026
0.2.0	May 31, 2026
0.1.0	May 31, 2026

Download files

Source Distribution

coremem-0.9.1.tar.gz (82.5 MB)

Uploaded Jun 15, 2026 Source

Built Distribution

coremem-0.9.1-py3-none-any.whl (50.5 kB)

Uploaded Jun 15, 2026 Python 3

File details

Details for the file coremem-0.9.1.tar.gz.

File metadata

Download URL: coremem-0.9.1.tar.gz
Upload date: Jun 15, 2026
Size: 82.5 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.15

File hashes

Hashes for coremem-0.9.1.tar.gz
Algorithm	Hash digest
SHA256	`03a2442e1ffdaee4819682f8844392641f959a29c1defcc8c96982be85134f9a`
MD5	`bba97353e065d27b5f7a6a287f8d1270`
BLAKE2b-256	`851867919fcccb3925d769db75a5866522e9dfff97756b53d8bb727b7416c37a`

File details

Details for the file coremem-0.9.1-py3-none-any.whl.

File metadata

Download URL: coremem-0.9.1-py3-none-any.whl
Upload date: Jun 15, 2026
Size: 50.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.15

File hashes

Hashes for coremem-0.9.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`58672b718c917636d01ea7a5240549923d1818b74c1b7b7d22f53920d7410428`
MD5	`bd6669a455fd35b6298f18387d3a5dfc`
BLAKE2b-256	`b80b7113eca0b27409d55328ccd0c450203e8f68cb718607542be743cec48100`

