High-recall conversational memory retrieval. 98% R@5 on LongMemEval, 94% on LoCoMo — no LLM required. Local-first, cloud-ready.

Engram

High-recall conversational memory retrieval. Local-first, cloud-ready.

Benchmark Results

Tested on two major benchmarks, with no LLM required and no per-query API cost.

LongMemEval (500 questions)

Metric	Score
R@5	98.4% (492/500)
R@10	99.4%
NDCG@5	0.934

Question Type	R@5
knowledge-update	98.7%
multi-session	99.2%
single-session-assistant	100.0%
single-session-user	100.0%
temporal-reasoning	97.0%
single-session-preference	93.3%

LoCoMo (1982 questions, 10 conversations)

Metric	Score
R@5	93.9% (1862/1982)
R@10	95.0%
NDCG@5	0.894

Category	R@5	R@10
Single-hop (factual)	90.4%	93.3%
Temporal (dates)	93.1%	94.7%
Multi-hop (inference)	75.0%	78.3%
Contextual (details)	97.1%	97.5%
Adversarial (speaker)	94.6%	94.8%

The LoCoMo results above are reported with --mode rerank (chunking, cross-encoder reranking, and speaker-name injection).
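For readers unfamiliar with the metrics, here is a minimal sketch of how R@k and NDCG@k are commonly computed per question and then averaged. It assumes binary relevance and counts a question as a hit when any gold item appears in the top k, which is consistent with the hit counts shown (492/500 = 98.4%). The benchmark scripts may define the metrics differently, and the function names and document ids below are illustrative only.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked_ids: Sequence[str], gold_ids: Set[str], k: int) -> float:
    """Return 1.0 if any gold item appears in the top-k results, else 0.0."""
    return 1.0 if any(doc_id in gold_ids for doc_id in ranked_ids[:k]) else 0.0


def ndcg_at_k(ranked_ids: Sequence[str], gold_ids: Set[str], k: int) -> float:
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so the first position gets 1/log2(2) = 1
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in gold_ids
    )
    ideal_hits = min(len(gold_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example with made-up document ids (not benchmark data).
ranking = ["s3", "s7", "s1", "s9", "s4"]
gold = {"s1"}
hit = recall_at_k(ranking, gold, k=5)
ndcg = ndcg_at_k(ranking, gold, k=5)
print(hit, round(ndcg, 3))  # prints: 1.0 0.5
```

The gold item sits at position 3, so it counts as a hit for R@5 but contributes only 1/log2(4) = 0.5 to NDCG@5. This is why NDCG is lower than recall whenever the right answer is retrieved but not ranked first.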

What It Does

Engram stores conversation history and retrieves it with a hybrid pipeline: dense embeddings and BM25 keyword matching, combined by reciprocal rank fusion (RRF), followed by cross-encoder reranking. On LoCoMo and LongMemEval, this achieves higher recall than the compared systems that rely on LLM-based extraction or summarization.

Nothing is summarized. Nothing is paraphrased. Your exact words are stored and returned.
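The comparison table and architecture diagram in this README name RRF fusion as the step that merges the dense and sparse rankings. As a rough illustration of the general technique (not Engram's actual code), reciprocal rank fusion can be sketched as follows. The function name, the document ids, and the constant k = 60 (a commonly used default) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked id lists: each list adds 1 / (k + rank) per document."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a dense retriever and a sparse (keyword) retriever.
dense_ranking = ["turn_3", "turn_1", "turn_2"]
sparse_ranking = ["turn_1", "turn_4", "turn_3"]
fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)  # prints: ['turn_1', 'turn_3', 'turn_4', 'turn_2']
```

Because RRF uses only rank positions, it needs no score normalization between the embedding similarities and the keyword scores. Documents found by both retrievers (turn_1, turn_3) rise above those found by only one.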

How It Compares

LoCoMo — R@5 Leaderboard

System	LoCoMo R@5	LLM Required	Source
Engram	93.9%	No	This repo (reproducible)
EverMemOS	92.3%	Yes (cloud)	arXiv:2601.02163
Hindsight	89.6%	Yes (cloud)	arXiv:2512.12818
Letta / MemGPT	83.2%	Yes (cloud)	Letta blog
SLM V3	74.8%	No	arXiv:2603.14588

Among the systems listed, Engram has the highest LoCoMo R@5, and it is the only one above 80% that makes no LLM calls at query time.

LongMemEval

Feature	Engram	MemPalace	Mem0
R@5 (LongMemEval)	98.4%	96.6%	—
Embedding model	bge-large (1024d)	all-MiniLM (384d)	Varies
Sparse retrieval	BM25 + RRF fusion	Ad-hoc keyword overlap	N/A
Reranking	Cross-encoder (free)	LLM call ($0.001/q)	N/A
Indexing	User + assistant + preference docs	User turns only	LLM-extracted facts
Cloud deployment	Qdrant backend	No	Yes
LLM required	No	No (optional rerank)	Yes

Install

pip install engram-search

Optional extras:

# With cloud backend (Qdrant)
# (extras are quoted so shells such as zsh do not expand the brackets)
pip install "engram-search[cloud]"

# With cross-encoder reranker
pip install "engram-search[rerank]"

# Everything (dev + cloud + rerank)
pip install "engram-search[all]"

Quickstart — CLI

# Initialize a memory store
engram init ./my_memories

# Ingest conversations
engram ingest conversations.json --store ./my_memories

# Search
engram search "why did we switch to GraphQL" --store ./my_memories

Quickstart — Python API

from engram.backends.faiss_backend import FaissBackend
from engram.backends.base import Document
from engram.ingestion.parser import session_to_documents
from engram.retrieval.embedder import Embedder
from engram.retrieval.pipeline import RetrievalPipeline

# Initialize
embedder = Embedder("bge-large")
backend = FaissBackend(path="./my_memories", dimension=1024)
pipeline = RetrievalPipeline(embedder=embedder)

# Ingest a conversation
turns = [
    {"role": "user", "content": "I'm switching our API from REST to GraphQL."},
    {"role": "assistant", "content": "What's driving the switch?"},
    {"role": "user", "content": "Too many round trips. Our mobile app makes 12 calls per screen."},
]
docs = session_to_documents(turns, session_id="session_1", timestamp="2025-01-15")
texts = [d["text"] for d in docs]
embeddings = embedder.encode_documents(texts)

documents = [
    Document(id=d["id"], text=d["text"], embedding=e.tolist(), metadata=d["metadata"])
    for d, e in zip(docs, embeddings)
]
backend.add(documents)

# Search
results = pipeline.search("why did we switch to GraphQL", documents=documents, top_k=3)
for r in results:
    print(r.text)

Quickstart — Cloud Mode

# Set up Qdrant (managed or self-hosted)
export ENGRAM_BACKEND=qdrant
export ENGRAM_QDRANT_URL=https://your-cluster.qdrant.io:6333
export ENGRAM_QDRANT_API_KEY=your-api-key

# Start the API server
pip install fastapi uvicorn
uvicorn engram.server:app --host 0.0.0.0 --port 8000

API Endpoints

Method	Endpoint	Description
`POST`	`/ingest`	Add conversations
`POST`	`/search`	Search memories
`GET`	`/health`	Health check
`GET`	`/stats`	Store statistics
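As a rough sketch of calling these endpoints from Python, the snippet below builds a POST /search request using only the standard library. The JSON field names ("query", "top_k") are assumptions borrowed from the Python API quickstart, not a documented schema, and build_search_request is a hypothetical helper. Since the server is a FastAPI app, the actual request schema is normally viewable at /docs on a running instance unless that page has been disabled.

```python
import json
import urllib.request


def build_search_request(base_url: str, query: str, top_k: int = 5) -> urllib.request.Request:
    """Build (but do not send) a POST /search request with a JSON body."""
    # ASSUMPTION: the body fields "query" and "top_k" are illustrative guesses.
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("http://localhost:8000", "why did we switch to GraphQL", top_k=3)
print(request.get_method(), request.full_url)  # prints: POST http://localhost:8000/search

# To send it against a running server:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.loads(response.read()))
```

Keeping request construction separate from sending makes the payload easy to inspect and adjust once the real schema is confirmed.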

Examples

Check out the interactive notebooks in examples/:

Notebook	Description
Getting Started	Ingest conversations, search memories, understand hybrid retrieval
Customer Support	Build a support agent with full customer history recall
Personal Assistant	AI assistant with long-term memory across conversations

Docker

# Local mode
docker compose up

# Or build and run directly
docker build -t engram .
docker run -p 8000:8000 -v engram_data:/data engram

Architecture

┌─────────────────────────────────────────────────────────────┐
│                        Engram                               │
│                                                             │
│  ┌────────────┐  ┌─────────────┐  ┌───────────────────┐    │
│  │ Ingestion  │  │   Index     │  │    Retrieval      │    │
│  │            │→ │             │→ │                   │    │
│  │ user+asst  │  │ FAISS       │  │ 1. Dense (bi-enc) │    │
│  │ turns      │  │ (local) or  │  │ 2. BM25 (sparse)  │    │
│  │ preference │  │ Qdrant      │  │ 3. RRF fusion     │    │
│  │ extraction │  │ (cloud)     │  │ 4. Cross-encoder  │    │
│  └────────────┘  └─────────────┘  └───────────────────┘    │
│                                                             │
│  Local: FAISS + SQLite    Cloud: Qdrant + REST API          │
└─────────────────────────────────────────────────────────────┘
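To make the sparse stage of the diagram concrete, here is a minimal, self-contained sketch of Okapi BM25 scoring. It is a textbook formulation, not Engram's implementation: the regex tokenizer, the parameters k1 = 1.5 and b = 0.75, and the function names are all assumptions.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on anything that is not a letter, digit, or apostrophe."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            tf = term_freq[term]
            if tf == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1.0) / norm
        scores.append(score)
    return scores


turns = [
    "I'm switching our API from REST to GraphQL.",
    "What's driving the switch?",
    "Too many round trips. Our mobile app makes 12 calls per screen.",
]
scores = bm25_scores("why did we switch to GraphQL", turns)
best = max(range(len(turns)), key=scores.__getitem__)
print(best, turns[best])
```

Note that the third turn, which contains the actual reason for the switch, shares no words with the query and scores zero here. Exact keyword matching misses that kind of result, which is the usual motivation for pairing it with dense retrieval.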

Run Benchmarks

LongMemEval

# Download dataset (create the target directory first so curl can write to it)
mkdir -p data
curl -fsSL -o data/longmemeval_s_cleaned.json \
  https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_s_cleaned.json

pip install "engram-search[all]"

python benchmarks/longmemeval_bench.py data/longmemeval_s_cleaned.json --mode hybrid

LoCoMo

# Download dataset (from Snap Research)
curl -fsSL -o data/locomo10.json \
  https://raw.githubusercontent.com/snap-research/locomo/main/data/locomo10.json

python benchmarks/locomo_bench.py data/locomo10.json --mode rerank

Requirements

Python 3.9+
~1.3 GB disk for bge-large embedding model (downloaded on first use)
No API keys required for local mode

License

MIT

