SOTA-level agent memory at zero infrastructure cost.
Project description
a sauron company
One SQLite file. Zero cloud. One command to set up.
🏆 93.0% on LoCoMo (Pro) · 🥇 SOTA on BEAM-1M (75.0%) · 📦 One SQLite File · ☁️ Zero Cloud · 💰 Zero Infrastructure Cost
Benchmark · Highlights · Edge / Base / Pro · Install · First Run · API
🔬 Benchmark
Tested on LoCoMo, the standard benchmark for conversational memory. 1,540 questions across 10 conversations. All 8 systems share the same answer model, judge, scoring, top-k, and byte-identical answer prompt — only retrieval differs.
All charts reflect the current 3-run mean scores (89.6 / 92.0 / 93.0%). Source HTML files in
assets/charts/.
TrueMemory achieves state-of-the-art accuracy for fully-local memory systems at zero ongoing infrastructure cost. Edge and Base run entirely offline with no API keys. Pro adds one small LLM call per query for HyDE query expansion.
All scores use the same evaluation pipeline: GPT-4.1-mini answer generation, GPT-4o-mini judge (3x majority vote), temperature=0. Zero errors across 12,320 total answers. Scores use a lenient semantic-match judge; rankings are valid across all systems but absolute values are higher than published LoCoMo baselines using strict exact-match. Full methodology and reproduction scripts in benchmarks/.
⚡ Research Highlights
- 30+ percentage points more accurate than Mem0 on LoCoMo (93.0% Pro vs 61.4%)
- 2x more cost-efficient per correct answer than Mem0
- Runs offline on any device with Python 3.10+ and 512MB RAM (Edge tier)
- One SQLite file, zero API keys for Edge and Base tiers. The entire 6-layer system runs offline.
- Within 1.5pp of EverMemOS, the only higher-scoring system — and EverMemOS uses pre-computed retrieval rather than live search at query time.
TrueMemory Pro nearly matches EverMemOS across all 4 question categories. Mem0 collapses on multi-hop reasoning (37.7% vs 90.0%).
🏗️ Edge / Base / Pro
Same features, same 6-layer pipeline. Three tiers trade off install size, hardware, and LoCoMo accuracy. All three use the same retrieval architecture (FTS5 + dense + RRF + cross-encoder reranker); the differences are the embedder, the reranker, and whether HyDE query expansion is used. Base and Pro share an embedder and reranker — only HyDE differs.
| Edge | Base | Pro | |
|---|---|---|---|
| LoCoMo (3-run mean ± std) | 89.6% ±0.19 | 92.0% ±0.22 | 93.0% ±0.14 |
| BEAM 1M | — | — | 75.0% |
| Embedder | Model2Vec potion-base-8M (8M params, 256d) | Qwen3-Embedding-0.6B @ 256d Matryoshka (600M params) | Qwen3-Embedding-0.6B @ 256d Matryoshka (600M params) |
| Reranker | ms-marco-MiniLM-L-6-v2 (22M) | gte-reranker-modernbert-base (149M) | gte-reranker-modernbert-base (149M) |
| HyDE | off | off | on (requires an LLM API key) |
| Runs on | Any machine, CPU only | 4GB+ RAM, CPU or GPU | 4GB+ RAM, CPU or GPU + LLM API key |
| First install | ~30MB | ~1.5GB one-time download | ~1.5GB one-time download |
| Speed | Ultra-fast | Fast | Fast + 1 LLM call/query |
Edge works everywhere. Base is the strongest fully-offline tier. Pro adds HyDE for the highest LoCoMo score.
Encoding Gate
Before storing a fact, the ingestion pipeline passes it through an encoding gate — a three-signal filter inspired by hippocampal novelty detection. Each candidate fact is scored by:
- Compression novelty (AUC 0.788) — gzip-based information gain against existing memories
- Speech-act salience (AUC 0.733) — rule-based scorer for short messages + L3's learned salience for longer text
- Embedding pair-diff PE (AUC 0.730) — detects when a message says something different about the same topic
The weighted sum (default: 0.25·novelty + 0.20·salience + 0.30·PE, normalized) must exceed a threshold (default: 0.30) to be stored. A salience floor (default: 0.10) rejects pure noise regardless of novelty. Per-category overrides lower the bar for corrections and decisions. Gate AUC: 0.810. Tune via TRUEMEMORY_GATE_* environment variables (see Configuration).
🚀 Quickstart
Claude Code / Claude Desktop
One command. Works on any Mac or Linux box, even if your system Python is old or missing entirely.
Step 1. Open Terminal:
- Mac: press
Cmd + Space, typeTerminal, pressEnter - Linux: press
Ctrl + Alt + T(or open your distro's terminal app)
Step 2. Paste this one line and press Enter:
curl -LsSf https://raw.githubusercontent.com/buildingjoshbetter/TrueMemory/main/install.sh | sh
Step 3. Wait ~1-2 minutes while it downloads and installs. You'll see progress messages scroll by — that's normal.
Step 4. If Claude Desktop was already open, quit it with Cmd+Q and reopen it (a new chat window is not enough — the config is only read at launch). Then start a new Claude session and TrueMemory walks you through choosing Edge, Base, or Pro on first run.
What this actually does: installs uv (Astral's Python tool manager) if needed, fetches a managed Python 3.12 into
~/.local/share/uv/, installs TrueMemory into an isolated tool environment, and auto-configures Claude Code and Claude Desktop. Your system Python is never touched. No sudo, no venvs, no pip struggle. Uninstall cleanly withuv tool uninstall truememory.
Want to audit the script first? It's ~170 lines of shell, no sudo, stays entirely under
$HOME. Read the source atinstall.sh, or download and inspect locally:curl -LsSf https://raw.githubusercontent.com/buildingjoshbetter/TrueMemory/main/install.sh -o install.sh && less install.sh && sh install.sh.
Want Base or Pro (adds Qwen3 embeddings + gte-reranker + sentence-transformers, ~1.5-2.5GB depending on OS)?
curl -LsSf https://raw.githubusercontent.com/buildingjoshbetter/TrueMemory/main/install.sh | TRUEMEMORY_EXTRAS="gpu" shThe default install is Edge (~30MB, the CPU-only tier). If you pick Base or Pro during first-run setup, TrueMemory will prompt you to install the extra models. Pro additionally requires an LLM API key at runtime for HyDE. (Linux CPU-only boxes will pull PyTorch's default CUDA wheel, which is larger — ~2.5GB total. Mac installs are closer to ~1.5GB.)
Python library (for developers)
If you're embedding TrueMemory in your own Python project (requires Python 3.10+):
pip install truememory
from truememory import Memory
m = Memory()
m.add("Prefers dark mode and TypeScript", user_id="alex")
m.add("Allergic to peanuts", user_id="alex")
results = m.search("What are Alex's preferences?", user_id="alex")
print(results[0]["content"])
# → "Prefers dark mode and TypeScript"
The database is created automatically at ~/.truememory/memories.db.
🤖 What happens on first run?
Claude forgets you between sessions. TrueMemory fixes that.
The installer (install.sh) wires up four lifecycle hooks (SessionStart, Stop, UserPromptSubmit, PreCompact) and merges instructions into your ~/.claude/CLAUDE.md. On your first session after installing, the SessionStart hook injects an ASCII banner and guided setup:
- Welcome banner — confirms TrueMemory is installed
- Tier selection — Edge, Base, or Pro (Claude walks you through it)
- API key (Pro only) — required for HyDE query expansion via Anthropic, OpenRouter, or OpenAI
- Example prompts — try "Remember that I prefer dark mode" then verify in a new session
On subsequent sessions, the SessionStart hook searches TrueMemory and injects up to 25 relevant memories as context so Claude knows who you are from the start. The Stop hook processes the conversation transcript in the background to extract and store new facts.
Manual setup
If auto-setup doesn't detect your Claude installation, you can configure manually.
Claude Code (if you used the installer above):
claude mcp add truememory -- truememory-mcp
Claude Desktop: add to claude_desktop_config.json (Settings > Developer > Edit Config):
{
"mcpServers": {
"truememory": {
"command": "/Users/YOU/.local/bin/truememory-mcp"
}
}
}
Use the absolute path to
truememory-mcp— runwhich truememory-mcpto find it. Claude Desktop (and most non-Claude-Code MCP clients) don't inherit your shell's PATH, so relative commands will silently fail.
No-install alternative (uvx): skip installing TrueMemory entirely and let Claude run it ephemerally. Requires uv to be installed.
{
"mcpServers": {
"truememory": {
"command": "/Users/YOU/.local/bin/uvx",
"args": ["--python", "3.12", "--from", "truememory", "truememory-mcp"]
}
}
}
uvx creates a cached environment on first run; subsequent spawns are fast. Good if you want TrueMemory to always be latest-on-PyPI without managing an install.
Configuration
All configuration is via environment variables. Defaults work out of the box — only set these to tune behavior.
| Variable | Default | Description |
|---|---|---|
TRUEMEMORY_EMBED_MODEL |
edge |
Active tier: edge, base, or pro |
TRUEMEMORY_RECALL_LIMIT |
25 |
Memories injected at session start |
TRUEMEMORY_GATE_ENABLED |
1 |
Enable/disable the encoding gate (0 to disable) |
TRUEMEMORY_GATE_THRESHOLD |
0.30 |
Gate threshold (0.0–1.0). Lower = stores more |
TRUEMEMORY_GATE_W_NOVELTY |
0.25 |
Weight for compression novelty signal |
TRUEMEMORY_GATE_W_SALIENCE |
0.20 |
Weight for speech-act salience signal |
TRUEMEMORY_GATE_W_PE |
0.30 |
Weight for embedding pair-diff prediction error |
TRUEMEMORY_GATE_SALIENCE_FLOOR |
0.10 |
Minimum salience to consider encoding |
TRUEMEMORY_MIN_MESSAGES |
5 |
Minimum messages in a session before extraction runs |
TRUEMEMORY_INGEST_SPAWN_CAP |
2 |
Max concurrent background ingestion processes |
TRUEMEMORY_ENTITY_SHEETS |
(off) | Set to 1 to re-enable legacy L4 entity profiles |
TRUEMEMORY_ALPHA_SURPRISE |
0.2 |
L5 surprise rerank boost alpha (set to 0 to disable) |
📖 API
| Method | What it does |
|---|---|
m.add(content, user_id=None) |
Store a memory |
m.search(query, user_id=None, limit=10) |
Search memories |
m.search_deep(query, user_id=None, limit=10) |
Agentic multi-round search (higher latency + LLM cost; best for ambiguous queries) |
m.get(memory_id) |
Get one memory |
m.get_all(user_id=None, limit=100) |
List all memories |
m.update(memory_id, content) |
Update a memory |
m.delete(memory_id) |
Delete a memory |
m.delete_all(user_id=None) |
Delete all |
📊 Full Benchmark Details
Every benchmark script is self-contained and runs on Modal.
- Leaderboard & Reproduction: run any system yourself
- Full Technical Report: per-category breakdowns, latency, cost, hardware
- Evaluation Config: exact models, prompts, parameters
📝 Citation
@software{truememory2026,
title = {TrueMemory: State-of-the-Art Local-First Agent Memory},
author = {@Building\_Josh},
organization = {Sauron},
year = {2026},
url = {https://github.com/buildingjoshbetter/TrueMemory},
version = {0.6.0}
}
⚖️ License
Licensed under AGPL-3.0. Free for personal and research use. Commercial use requires a separate license — contact josh@sauronlabs.ai.
TrueMemory, a sauron company · sauronlabs.ai
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file truememory-0.6.0.tar.gz.
File metadata
- Download URL: truememory-0.6.0.tar.gz
- Upload date:
- Size: 276.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2dfdeed16b515ff2c9e281082dd2cbb51c277517ea66d18eadbbf382061f55bf
|
|
| MD5 |
891b672e370381f10024b1b273f4c1a3
|
|
| BLAKE2b-256 |
b65008df1793ebffc61e3ff40baa2c9be0395e155bb0a46a17f33c401c401b4d
|
Provenance
The following attestation bundles were made for truememory-0.6.0.tar.gz:
Publisher:
publish.yml on buildingjoshbetter/TrueMemory
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
truememory-0.6.0.tar.gz -
Subject digest:
2dfdeed16b515ff2c9e281082dd2cbb51c277517ea66d18eadbbf382061f55bf - Sigstore transparency entry: 1429880738
- Sigstore integration time:
-
Permalink:
buildingjoshbetter/TrueMemory@bac33d4533450f34c0063b2322cd1539d73a6971 -
Branch / Tag:
refs/tags/v0.6.0 - Owner: https://github.com/buildingjoshbetter
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@bac33d4533450f34c0063b2322cd1539d73a6971 -
Trigger Event:
release
-
Statement type:
File details
Details for the file truememory-0.6.0-py3-none-any.whl.
File metadata
- Download URL: truememory-0.6.0-py3-none-any.whl
- Upload date:
- Size: 209.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ee9dc8480c6645e054c2dda3ef94c09a29f93b4a5f17590022c2d5e22c54ce6e
|
|
| MD5 |
a176c78a04657f8f09bc9e18a4a8de6d
|
|
| BLAKE2b-256 |
961f9f2ad60a3b99b2f68f00e4180f0235e814928fd07ddc0710bab6c0a62509
|
Provenance
The following attestation bundles were made for truememory-0.6.0-py3-none-any.whl:
Publisher:
publish.yml on buildingjoshbetter/TrueMemory
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
truememory-0.6.0-py3-none-any.whl -
Subject digest:
ee9dc8480c6645e054c2dda3ef94c09a29f93b4a5f17590022c2d5e22c54ce6e - Sigstore transparency entry: 1429880746
- Sigstore integration time:
-
Permalink:
buildingjoshbetter/TrueMemory@bac33d4533450f34c0063b2322cd1539d73a6971 -
Branch / Tag:
refs/tags/v0.6.0 - Owner: https://github.com/buildingjoshbetter
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@bac33d4533450f34c0063b2322cd1539d73a6971 -
Trigger Event:
release
-
Statement type: