Persistent graph-based memory for Claude Code, Cursor and Codex — cross-session recall via MCP

memri

Persistent, graph-based memory for Claude Code, Cursor, and Codex — in a pip install.

The problem

Every time you start a new session with Claude Code, Cursor, or Codex, it forgets everything. The architecture you designed last week. The library you chose. The bug you already fixed. You repeat yourself. The agent repeats mistakes.

memri fixes this. It gives your AI coding agent a persistent memory that survives across sessions — so it always knows who you are, what you're building, and how you like to work.

Results

Evaluated on LongMemEval-S — 500 QA pairs designed to test AI long-term memory:

Setup	Score
Full context (no memory, 115K tokens)	70.6%
memri v1.0 graph memory	83%

Better recall at a fraction of the tokens: about 500 retrieved tokens per query instead of the full 115K-token history.

Install

pip install "memri[graph]"

One command to wire it into Claude Code:

memri init --claude-code

That's it. Open a new Claude Code session — memri starts working immediately.

What it does

Every conversation you have with your coding agent gets ingested into a 3-layer memory graph:

Your conversation
      │
  [Graph Engine]
      │
      ├── Layer 2  raw episode archive  (SQLite, zero data loss)
      │
      ├── Layer 1  fact/entity/reflection graph  (NetworkX)
      │            causal chains · temporal edges · entity linking
      │
      └── Layer 0  always-in-context routing index  (~500 tokens)
                   entity index · topic clusters · user summary
      │
  [Retrieval]   Layer 0 → BFS traversal → RRF ranking
      │          returns only the relevant facts (~500 tokens)
      │
  Injected at the top of your next session

When you ask a question, memri doesn't dump your entire history into the prompt. It finds the specific facts, entities, and context that matter for that query — and injects only those.
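The retrieval path in the diagram (Layer 0 lookup, then BFS traversal) can be sketched roughly as follows. This is an illustration only, not memri's implementation: the toy graph, the node naming scheme, the `ENTITY_INDEX` mapping, and the `max_depth` parameter are all assumptions made for the example, and a plain dict stands in for the NetworkX graph.

```python
from collections import deque

# Hypothetical toy graph: node id -> neighbouring node ids.
GRAPH = {
    "entity:postgres": ["fact:chose-postgres"],
    "fact:chose-postgres": ["entity:postgres", "fact:sqlite-locking"],
    "fact:sqlite-locking": ["fact:chose-postgres", "entity:sqlite"],
    "entity:sqlite": ["fact:sqlite-locking"],
    "entity:auth": ["fact:jwt-sessions"],
    "fact:jwt-sessions": ["entity:auth"],
}

# Hypothetical Layer 0 entity index: keyword -> entry node in the graph.
ENTITY_INDEX = {
    "postgres": "entity:postgres",
    "sqlite": "entity:sqlite",
    "auth": "entity:auth",
}


def seed_nodes(query, entity_index):
    """Return the entry nodes whose keyword appears as a word in the query."""
    words = query.lower().split()
    return [node for key, node in entity_index.items() if key in words]


def bfs_facts(graph, seeds, max_depth=2):
    """Breadth-first traversal from the seeds, collecting fact nodes in visit order."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    facts = []
    while queue:
        node, depth = queue.popleft()
        if node.startswith("fact:"):
            facts.append(node)
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return facts


facts = bfs_facts(GRAPH, seed_nodes("why did we pick postgres", ENTITY_INDEX))
print(facts)
```

Only facts reachable from the matched entity within the depth limit are returned, so the unrelated auth fact never reaches the prompt.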

Features

Graph-based memory — entities, facts, causal chains, and higher-level reflections stored in a queryable graph
Entity tracking — people, projects, and concepts linked across all sessions
Three-layer architecture — always-in-context index (Layer 0), fact graph (Layer 1), raw episode archive (Layer 2)
RRF ranking — Reciprocal Rank Fusion across vector, BM25, importance, and recency signals
Automatic compression — conversations beyond 30K tokens compressed 5–40× into timestamped observations
Cross-session recall — memory injected at the start of every session automatically
Semantic search — find anything from past sessions (memri search "auth pattern we chose")
Procedural memory — learns how to work with you over time, not just what happened
Frustration detection — detects when you're frustrated, permanently stores what went wrong
Works with any LLM — Anthropic, Gemini, OpenAI, or any OpenAI-compatible endpoint
100% local — all data on your machine, no cloud, no accounts
Visual dashboard — interactive graph visualization, Layer 0 index, episode browser
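For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the general technique. Each signal produces its own ranked list, and an item's fused score is the sum of 1 / (k + rank) over the lists it appears in. The constant k = 60 is the conventional default from the RRF literature, and the fact names and per-signal rankings are made up; whether memri uses the same constant or weighting is not stated in this README.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists: score(item) = sum of 1 / (k + rank), with rank starting at 1."""
    scores = defaultdict(float)
    for ranked in rankings.values():
        for rank, item in enumerate(ranked, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical per-signal rankings for one query.
rankings = {
    "vector": ["fact_a", "fact_b", "fact_c"],
    "bm25": ["fact_b", "fact_a", "fact_d"],
    "importance": ["fact_b", "fact_c", "fact_a"],
    "recency": ["fact_d", "fact_b", "fact_a"],
}
fused = rrf_fuse(rankings)
print(fused)
```

Because RRF works on ranks rather than raw scores, signals with very different scales (cosine similarity, BM25, recency) can be combined without normalisation.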

Quick start

# Install
pip install "memri[graph]"

# Wire into Claude Code (one command)
memri init --claude-code

# Open a new Claude Code session — memri is already running

Don't have an API key?

Use your existing Claude subscription — if you've run claude in your terminal, memri detects your credentials automatically. No API key needed.

memri init --claude-code   # auto-detects Claude login

Use your Google account — if you've run gcloud auth application-default login:

memri init --claude-code   # auto-detects gcloud credentials

Free Gemini API — get a key in 1 minute, no credit card:

# ~/.memri/.env
GEMINI_API_KEY=your-key-here

Passive mode — no API key, no compression, still works:

{ "llm_provider": "passive" }

How it works

Three layers of memory

Layer	What it stores	Size
Layer 0	Entity index, topic clusters, user summary — always in context	~500 tokens
Layer 1	Fact/entity/reflection graph with causal and temporal edges	grows with sessions
Layer 2	Raw episode archive — zero data loss, full session text	cold storage
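To make the Layer 2 idea concrete, the sketch below stores raw conversation turns in SQLite and reads a thread back verbatim. The table name, columns, and sample turns are hypothetical and chosen for illustration; memri's actual schema is not documented here.

```python
import sqlite3

# In-memory database for the example; a real archive would live in a file on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE episodes ("
    "id INTEGER PRIMARY KEY, "
    "thread_id TEXT NOT NULL, "
    "role TEXT NOT NULL, "
    "content TEXT NOT NULL)"
)

# Hypothetical raw turns, stored without any summarisation.
turns = [
    ("thread-1", "user", "Let's use PostgreSQL for the billing service."),
    ("thread-1", "assistant", "Noted. I'll set up the schema with PostgreSQL."),
]
conn.executemany(
    "INSERT INTO episodes (thread_id, role, content) VALUES (?, ?, ?)", turns
)
conn.commit()

# Read the thread back in insertion order.
rows = conn.execute(
    "SELECT role, content FROM episodes WHERE thread_id = ? ORDER BY id",
    ("thread-1",),
).fetchall()
print(rows)
```

Keeping the untouched text in cold storage means the higher layers can be rebuilt or re-compressed later without losing the original wording.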

Three types of memory

Type	What it stores	Example
Episodic	What happened in past sessions	"Chose PostgreSQL over SQLite on 2026-04-10"
Procedural	How to work better with this user	"Always confirm before running destructive commands"
Graph	Entity relationships and causal chains	"Deadline stress caused repeated tool failures"
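One way to picture the three memory types is as a tagged record. The sketch below is an assumption for illustration: the `MemoryType` and `MemoryRecord` names and the `recorded_on` field are hypothetical, and the sample texts are taken from the table above.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    EPISODIC = "episodic"      # what happened in past sessions
    PROCEDURAL = "procedural"  # how to work better with this user
    GRAPH = "graph"            # entity relationships and causal chains


@dataclass(frozen=True)
class MemoryRecord:
    kind: MemoryType
    text: str
    recorded_on: str = ""  # ISO date; hypothetical field for this sketch


records = [
    MemoryRecord(MemoryType.EPISODIC, "Chose PostgreSQL over SQLite", "2026-04-10"),
    MemoryRecord(MemoryType.PROCEDURAL, "Always confirm before running destructive commands"),
    MemoryRecord(MemoryType.GRAPH, "Deadline stress caused repeated tool failures"),
]

# Select only the guidance on how to work with the user.
procedural = [r.text for r in records if r.kind is MemoryType.PROCEDURAL]
print(procedural)
```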

MCP tools

memri exposes 7 tools to your coding agent via MCP:

Tool	When to call
`memri_recall`	Start of every session — restores compressed context
`memri_store`	User shares something important to remember
`memri_search`	Looking for context from a different project or thread
`memri_ingest`	Manually process a session into memory
`memri_distill`	End of session — extract generalizable strategies
`memri_status`	Check token savings, cost, session stats
`memri_forget`	Delete memories for a specific thread
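Under MCP, a client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for `memri_search`. The `query` argument name is a guess for illustration; the real argument schema is whatever the server advertises in its `tools/list` response.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" is a hypothetical argument name, not confirmed by this README.
request = build_tool_call(1, "memri_search", {"query": "auth pattern we chose"})
print(json.dumps(request, indent=2))
```

In practice the coding agent's MCP client builds and sends these requests itself, so this is only useful for understanding or debugging the traffic.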

Configuration

Config at ~/.memri/config.json:

{
  "llm_provider": "gemini",
  "llm_model": "gemini-2.5-flash",
  "memory_engine": "graph",
  "observe_threshold": 30000
}

Supported providers: anthropic, claude-code-auth, gemini, gemini-adc, openai, openai-compatible (Groq, Ollama, Together, Mistral), passive.
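Since the config is plain JSON, it can be inspected from a script. The sketch below loads the file if it exists and checks a token count against `observe_threshold`. The `DEFAULTS` values are placeholders for this example rather than memri's real defaults, and reading `observe_threshold` as the compression trigger is an assumption based on the 30K-token figure in the feature list.

```python
import json
import os
from pathlib import Path

# Placeholder defaults for this sketch only, not memri's documented defaults.
DEFAULTS = {
    "llm_provider": "passive",
    "memory_engine": "graph",
    "observe_threshold": 30000,
}


def load_config(path):
    """Read a JSON config file, if present, and overlay it on the sketch defaults."""
    config = dict(DEFAULTS)
    path = Path(path)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config


def needs_compression(token_count, config):
    """True when a conversation has grown past observe_threshold tokens (assumed meaning)."""
    return token_count > config["observe_threshold"]


config = load_config(os.path.expanduser("~/.memri/config.json"))
print(config["llm_provider"], needs_compression(45_000, config))
```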

CLI

memri init --claude-code   # First-time setup
memri status               # Token savings, cost, session count
memri watch                # Auto-ingest new sessions in real time
memri ingest               # Ingest existing session history
memri observe              # Run Observer on all threads
memri embed                # Build semantic search index
memri dashboard            # Web dashboard at http://localhost:8050
memri config               # View / edit config

Benchmarks

Evaluated on LongMemEval-S — 500 QA pairs across 6 question types designed to test AI assistant long-term memory.

Question type	Raw baseline	memri v1.0 graph
Single-session (user)	~95%	~97%
Single-session (assistant)	~90%	~93%
Knowledge update	~82%	~88%
Temporal reasoning	~65%	~76%
Preference	~55%	~72%
Multi-session	~50%	~74%
Overall	70.6%	83%

Raw baseline: the full ~115K-token conversation is passed directly to Gemini 2.5 Flash. memri v1.0 graph: sessions are ingested into the 3-layer graph and the top-k facts are retrieved per query (~500 tokens). The result is higher accuracy with roughly 200× fewer tokens in the prompt.
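The headline numbers follow from simple arithmetic on the approximate figures quoted for the benchmark. The token counts are the rounded values from the text, so the ratio is an estimate, not a measurement.

```python
full_context_tokens = 115_000  # approximate size of the raw baseline prompt
retrieved_tokens = 500         # approximate retrieved context per query

ratio = full_context_tokens / retrieved_tokens
accuracy_gain = round(83.0 - 70.6, 1)  # percentage points, from the Overall row

print(f"{ratio:.0f}x fewer tokens, +{accuracy_gain} points")
```

The computed ratio is about 230, which is consistent with the rounded "200×" claim.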

Comparison

Feature	memri	Mastra OM	mem0	Full context
Language	Python	TypeScript	Python	—
Install	`pip install`	framework lock-in	`pip install`	—
Works with	Claude Code, Cursor, Codex	Mastra only	any	any
Storage	local SQLite + graph	cloud	cloud	none
Graph-based memory	✅ v1.0	❌	❌	❌
Entity tracking	✅ v1.0	❌	partial	❌
Causal chains	✅ v1.0	❌	❌	❌
Procedural memory	✅ v0.2	❌	❌	❌
Frustration detection	✅ v0.2	❌	❌	❌
Semantic search	✅ local	❌	✅ cloud	❌
Dashboard	✅	❌	✅	❌
Token compression	200× (graph retrieval)	5–40×	varies	1×
LongMemEval-S accuracy	83%	—	—	70.6%
Privacy	100% local	cloud	cloud	local

Privacy

Your data never leaves your machine.

Conversation history and memory live in ~/.memri/ — local files only you can read
No servers, no telemetry, no accounts
The only external calls are to your LLM provider (the same one your coding agent already uses)
API keys are read from environment variables and never written to the database

memri status               # see exactly what's stored
memri forget <thread_id>   # delete a specific thread
rm -rf ~/.memri/           # delete everything

Development

git clone https://github.com/SarthakK337/memri
cd memri
pip install -e ".[dev,graph,embeddings]"
pytest

Contributions welcome. Open an issue before starting large changes.

License

Release history

Version	Released
1.0.1 (this version)	Apr 25, 2026
1.0.0	Apr 25, 2026
0.2.0	Apr 24, 2026
0.0.1	Apr 17, 2026
