Persistent graph-based memory for Claude Code, Cursor and Codex — cross-session recall via MCP

memri

Persistent, graph-based memory for Claude Code, Cursor, and Codex — in a pip install.


The problem

Every time you start a new session with Claude Code, Cursor, or Codex, it forgets everything. The architecture you designed last week. The library you chose. The bug you already fixed. You repeat yourself. The agent repeats its mistakes.

memri fixes this. It gives your AI coding agent a persistent memory that survives across sessions — so it always knows who you are, what you're building, and how you like to work.


Results

Evaluated on LongMemEval-S — 500 QA pairs designed to test AI long-term memory:

Setup                                   Score
Full context (no memory, 115K tokens)   70.6%
memri v1.0 graph memory                 83%

Better recall, from about 500 retrieved tokens per query instead of the full ~115K-token history.


Install

pip install "memri[graph]"

One command to wire it into Claude Code:

memri init --claude-code

That's it. Open a new Claude Code session — memri starts working immediately.


What it does

Every conversation you have with your coding agent gets ingested into a 3-layer memory graph:

Your conversation
      │
  [Graph Engine]
      │
      ├── Layer 2  raw episode archive  (SQLite, zero data loss)
      │
      ├── Layer 1  fact/entity/reflection graph  (NetworkX)
      │            causal chains · temporal edges · entity linking
      │
      └── Layer 0  always-in-context routing index  (~500 tokens)
                   entity index · topic clusters · user summary
      │
  [Retrieval]   Layer 0 → BFS traversal → RRF ranking
      │          returns only the relevant facts (~500 tokens)
      │
  Injected at the top of your next session

When you ask a question, memri doesn't dump your entire history into the prompt. It finds the specific facts, entities, and context that matter for that query — and injects only those.
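
The retrieval path in the diagram (Layer 0 lookup, then BFS traversal over the fact graph) can be sketched roughly as follows. This is a minimal illustration, not memri's actual implementation: the `entity_index` and `graph` dictionaries, the `retrieve_facts` function, and the `max_hops` and `max_facts` parameters are all hypothetical, and a plain adjacency dict stands in for the NetworkX graph.

```python
from collections import deque

# Hypothetical Layer 0: maps lowercase entity names to graph node ids.
entity_index = {"postgresql": "e:postgres", "auth": "e:auth"}

# Hypothetical Layer 1: adjacency list of entity ("e:") and fact ("f:") nodes.
graph = {
    "e:postgres": ["f:chose-postgres"],
    "f:chose-postgres": ["e:postgres", "f:sqlite-too-slow"],
    "f:sqlite-too-slow": ["f:chose-postgres"],
    "e:auth": ["f:jwt-sessions"],
    "f:jwt-sessions": ["e:auth"],
}


def retrieve_facts(query, max_hops=2, max_facts=5):
    """Return fact nodes within max_hops of the entities named in the query."""
    # Step 1: use the routing index to pick seed nodes for the query.
    lowered = query.lower()
    seeds = [node for name, node in entity_index.items() if name in lowered]

    # Step 2: breadth-first traversal outward from the seeds.
    visited = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    facts = []
    while queue:
        node, depth = queue.popleft()
        if node.startswith("f:"):
            facts.append(node)
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1))

    # Step 3: cap the result so the injected context stays small.
    return facts[:max_facts]


print(retrieve_facts("Why did we pick PostgreSQL?"))
```

Because BFS visits nearer nodes first, facts directly attached to a mentioned entity come before facts reached through a causal or temporal edge. A real system would then rank these candidates rather than simply truncating the list.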


Features

  • Graph-based memory — entities, facts, causal chains, and higher-level reflections stored in a queryable graph
  • Entity tracking — people, projects, and concepts linked across all sessions
  • Three-layer architecture — always-in-context index (Layer 0), fact graph (Layer 1), raw episode archive (Layer 2)
  • RRF ranking — Reciprocal Rank Fusion across vector, BM25, importance, and recency signals
  • Automatic compression — conversations beyond 30K tokens compressed 5–40× into timestamped observations
  • Cross-session recall — memory injected at the start of every session automatically
  • Semantic search — find anything from past sessions (memri search "auth pattern we chose")
  • Procedural memory — learns how to work with you over time, not just what happened
  • Frustration detection — detects when you're frustrated, permanently stores what went wrong
  • Works with any LLM — Anthropic, Gemini, OpenAI, or any OpenAI-compatible endpoint
  • 100% local — all data on your machine, no cloud, no accounts
  • Visual dashboard — interactive graph visualization, Layer 0 index, episode browser
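
To make the RRF ranking feature concrete, here is a minimal sketch of Reciprocal Rank Fusion over four ranked lists. It is an illustration under assumptions, not memri's code: the function name, the fact ids, and the per-signal rankings are made up, and `k=60` is a commonly used constant rather than a value the project documents.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a stable result.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical per-signal rankings of fact ids (best match first).
vector = ["f3", "f1", "f2"]
bm25 = ["f1", "f3", "f4"]
importance = ["f1", "f2", "f3"]
recency = ["f4", "f1", "f3"]

fused = reciprocal_rank_fusion([vector, bm25, importance, recency])
print(fused)  # ['f1', 'f3', 'f4', 'f2']
```

RRF only looks at rank positions, so signals on different scales (cosine similarity, BM25 scores, timestamps) can be combined without normalizing them first.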

Quick start

# Install
pip install "memri[graph]"

# Wire into Claude Code (one command)
memri init --claude-code

# Open a new Claude Code session — memri is already running

Don't have an API key?

Use your existing Claude subscription — if you've run claude in your terminal, memri detects your credentials automatically. No API key needed.

memri init --claude-code   # auto-detects Claude login

Use your Google account — if you've run gcloud auth application-default login:

memri init --claude-code   # auto-detects gcloud credentials

Free Gemini API (get a key in 1 minute, no credit card):

# ~/.memri/.env
GEMINI_API_KEY=your-key-here

Passive mode — no API key, no compression, still works:

{ "llm_provider": "passive" }

How it works

Three layers of memory

Layer     What it stores                                                   Size
Layer 0   Entity index, topic clusters, user summary (always in context)   ~500 tokens
Layer 1   Fact/entity/reflection graph with causal and temporal edges      grows with sessions
Layer 2   Raw episode archive (zero data loss, full session text)          cold storage

Three types of memory

Type         What it stores                           Example
Episodic     What happened in past sessions           "Chose PostgreSQL over SQLite on 2026-04-10"
Procedural   How to work better with this user        "Always confirm before running destructive commands"
Graph        Entity relationships and causal chains   "Deadline stress caused repeated tool failures"
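
One way to picture the three memory types is as tagged records. The sketch below is purely illustrative: `MemoryType`, `MemoryRecord`, and `by_type` are hypothetical names, not part of memri's API, and the record texts are the examples from the table above.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    EPISODIC = "episodic"      # what happened in past sessions
    PROCEDURAL = "procedural"  # how to work better with this user
    GRAPH = "graph"            # entity relationships and causal chains


@dataclass(frozen=True)
class MemoryRecord:
    kind: MemoryType
    text: str


records = [
    MemoryRecord(MemoryType.EPISODIC, "Chose PostgreSQL over SQLite on 2026-04-10"),
    MemoryRecord(MemoryType.PROCEDURAL, "Always confirm before running destructive commands"),
    MemoryRecord(MemoryType.GRAPH, "Deadline stress caused repeated tool failures"),
]


def by_type(items, kind):
    """Return the text of every record of the given memory type."""
    return [record.text for record in items if record.kind is kind]


print(by_type(records, MemoryType.PROCEDURAL))
```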

MCP tools

memri exposes 7 tools to your coding agent via MCP:

Tool            When to call
memri_recall    Start of every session: restores compressed context
memri_store     User shares something important to remember
memri_search    Looking for context from a different project or thread
memri_ingest    Manually process a session into memory
memri_distill   End of session: extract generalizable strategies
memri_status    Check token savings, cost, session stats
memri_forget    Delete memories for a specific thread
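
Your coding agent's MCP client normally issues these calls for you. For orientation, the sketch below builds the JSON-RPC 2.0 message that MCP uses for a `tools/call` request. The `build_tool_call` helper is hypothetical, and the `query` argument name is an assumption: the real input schema for `memri_search` is whatever the server reports via `tools/list`.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# The "query" argument name is an assumption; check the tool's input schema.
request = build_tool_call(1, "memri_search", {"query": "auth pattern we chose"})
print(json.dumps(request, indent=2))
```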

Configuration

Config at ~/.memri/config.json:

{
  "llm_provider": "gemini",
  "llm_model": "gemini-2.5-flash",
  "memory_engine": "graph",
  "observe_threshold": 30000
}

Supported providers: anthropic, claude-code-auth, gemini, gemini-adc, openai, openai-compatible (Groq, Ollama, Together, Mistral), passive.
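
If you want to inspect or sanity-check the config from a script, something like the following would do. It is a sketch under assumptions: `EXAMPLE_CONFIG` simply mirrors the example file above (these are not confirmed defaults), `merge_config`, `load_config`, and `needs_compression` are hypothetical helpers, and treating `observe_threshold` as the token count that triggers compression is an inference from the feature list.

```python
import json
from pathlib import Path

# Values copied from the example config above; not confirmed defaults.
EXAMPLE_CONFIG = {
    "llm_provider": "gemini",
    "llm_model": "gemini-2.5-flash",
    "memory_engine": "graph",
    "observe_threshold": 30000,
}
# Provider names from the "Supported providers" list.
PROVIDERS = {
    "anthropic", "claude-code-auth", "gemini", "gemini-adc",
    "openai", "openai-compatible", "passive",
}


def merge_config(overrides):
    """Overlay user settings on the example values and validate the provider."""
    config = {**EXAMPLE_CONFIG, **overrides}
    if config["llm_provider"] not in PROVIDERS:
        raise ValueError(f"unknown llm_provider: {config['llm_provider']!r}")
    return config


def load_config(path=Path.home() / ".memri" / "config.json"):
    """Read the config file if it exists; otherwise use the example values."""
    path = Path(path)
    overrides = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    return merge_config(overrides)


def needs_compression(token_count, config):
    """Assumed rule: compress past the threshold, never in passive mode."""
    return config["llm_provider"] != "passive" and token_count > config["observe_threshold"]


passive = merge_config({"llm_provider": "passive"})
print(needs_compression(45000, passive))         # False: passive mode never compresses
print(needs_compression(45000, EXAMPLE_CONFIG))  # True: above the 30000-token threshold
```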


CLI

memri init --claude-code   # First-time setup
memri status               # Token savings, cost, session count
memri watch                # Auto-ingest new sessions in real time
memri ingest               # Ingest existing session history
memri observe              # Run Observer on all threads
memri embed                # Build semantic search index
memri dashboard            # Web dashboard at http://localhost:8050
memri config               # View / edit config

Benchmarks

Evaluated on LongMemEval-S — 500 QA pairs across 6 question types designed to test AI assistant long-term memory.

Question type                Raw baseline   memri v1.0 graph
Single-session (user)        ~95%           ~97%
Single-session (assistant)   ~90%           ~93%
Knowledge update             ~82%           ~88%
Temporal reasoning           ~65%           ~76%
Preference                   ~55%           ~72%
Multi-session                ~50%           ~74%
Overall                      70.6%          83%

Raw baseline: the full ~115K-token conversation passed directly to Gemini 2.5 Flash. memri v1.0 graph: sessions ingested into the 3-layer graph, with the top-k facts retrieved per query (~500 tokens). The result is better accuracy with roughly 200× fewer tokens.
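
As a quick check on the headline numbers, the arithmetic can be worked through from the figures quoted above. Both token counts are approximate, so the ratio is too; the variable names are just for this example.

```python
full_context_tokens = 115_000  # raw baseline prompt size (approximate)
retrieved_tokens = 500         # memri retrieval size per query (approximate)
baseline_accuracy = 70.6       # percent, overall
memri_accuracy = 83.0          # percent, overall

token_ratio = full_context_tokens / retrieved_tokens
accuracy_gain = memri_accuracy - baseline_accuracy

print(f"token reduction: {token_ratio:.0f}x")        # token reduction: 230x
print(f"accuracy gain: {accuracy_gain:.1f} points")  # accuracy gain: 12.4 points
```

With these inputs the reduction comes out near 230×, consistent with the rounded "200×" figure, and the overall gain is 12.4 percentage points.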


Comparison

memri Mastra OM mem0 Full context
Language Python TypeScript Python
Install pip install framework lock-in pip install
Works with Claude Code, Cursor, Codex Mastra only any any
Storage local SQLite + graph cloud cloud none
Graph-based memory ✅ v1.0
Entity tracking ✅ v1.0 partial
Causal chains ✅ v1.0
Procedural memory ✅ v0.2
Frustration detection ✅ v0.2
Semantic search ✅ local ✅ cloud
Dashboard
Token compression 200× (graph retrieval) 5–40× varies
LongMemEval-S accuracy 83% 70.6%
Privacy 100% local cloud cloud local

Privacy

Your data never leaves your machine.

  • Conversation history and memory live in ~/.memri/ — local files only you can read
  • No servers, no telemetry, no accounts
  • The only external calls are to your LLM provider (the same one your coding agent already uses)
  • API keys are read from environment variables and never written to the database

memri status               # see exactly what's stored
memri forget <thread_id>   # delete a specific thread
rm -rf ~/.memri/           # delete everything

Development

git clone https://github.com/SarthakK337/memri
cd memri
pip install -e ".[dev,graph,embeddings]"
pytest

Contributions welcome. Open an issue before starting large changes.


License

MIT © 2026
