Mastra-level memory intelligence in a pip install — observational memory for coding agents
Project description
memri
Persistent, graph-based memory for AI coding agents — in a pip install.
The problem
Every time you start a new session with Claude Code, Cursor, or Codex — it forgets everything. The architecture you designed last week. The library you chose. The bug you already fixed. You repeat yourself. The agent repeats mistakes.
memri fixes this. It builds a structured graph of your memory — entities, facts, causal chains, and reflections — and injects the most relevant context at the start of every new session. Your agent picks up exactly where it left off.
Features
- Graph-based memory (v1.0) — entities, facts, causal chains, and higher-level reflections stored in a queryable graph
- Entity tracking (v1.0) — people, places, and concepts are linked across all sessions
- Three-layer architecture (v1.0) — always-in-context index (Layer 0), fact graph (Layer 1), raw episode archive (Layer 2)
- RRF ranking (v1.0) — Reciprocal Rank Fusion across vector, BM25, importance, and recency signals
- Automatic compression — conversations beyond 30K tokens are compressed 5–40× into timestamped observations
- Cross-session recall — memory context is injected at the start of every new session, no setup needed
- Semantic search — find anything from past sessions with natural language (
memri_search "auth pattern we chose") - Procedural memory (v0.2) — learns how to work better with you over time, not just what happened
- Frustration detection (v0.2) — detects when you're frustrated and permanently stores what went wrong as a high-priority strategy
- Works with any LLM — Anthropic, Gemini, OpenAI, or any OpenAI-compatible endpoint
- 100% local — all data on your machine, no cloud, no accounts
Install
pip install memri
With full graph memory (recommended):
pip install "memri[graph]"
With semantic search only:
pip install "memri[embeddings]"
Quick start
One command wires memri into Claude Code:
memri init --claude-code
This does three things automatically:
- Creates
~/.memri/memri.db(local SQLite database) - Registers memri as an MCP server in Claude Code's
settings.json - Writes the recall instruction to
~/.claude/CLAUDE.md
That's it. Open a new Claude Code session — memri starts working.
How it works
Your conversation
│
▼
[Observer] triggers at 30K tokens
│ compresses turns → timestamped observations (5–40× smaller)
│
[Strategist] runs on every message (v0.2)
│ detects frustration → stores "what went wrong"
│ end of session → distills "what worked" → strategies
▼
[GraphEngine] primary memory backend (v1.0)
│
├── Layer 2 raw episode archive (SQLite, zero data loss)
│
├── Layer 1 fact/entity/reflection graph (NetworkX)
│ causal chains, temporal edges, entity linking
│
└── Layer 0 always-in-context routing index (~500 tokens)
entity index, topic clusters, user summary
│
▼
[get_context()] Layer 0 index → strategies → observations → recent turns
│
▼
Injected at the top of your next session
Three layers of memory:
| Layer | What it stores | Size |
|---|---|---|
| Layer 0 | Entity index, topic clusters, user summary — always in context | ~500 tokens |
| Layer 1 | Fact/entity/reflection graph with causal and temporal edges | grows with sessions |
| Layer 2 | Raw episode archive — zero data loss, full session text | cold storage |
Three types of memory content:
| Type | What it stores | Example |
|---|---|---|
| Episodic | What happened in past sessions | "User chose PostgreSQL over SQLite on 2026-04-10" |
| Procedural | How to work better with this user | "Always confirm before running destructive commands" |
| Graph | Entity relationships and causal chains | "Deadline stress caused repeated tool failures" |
MCP tools
memri exposes 7 tools to your coding agent via MCP:
| Tool | When to call |
|---|---|
memri_recall |
Start of every session — restores compressed context |
memri_store |
User shares something important to remember |
memri_search |
Looking for context from a different project or thread |
memri_ingest |
Manually process a session into memory |
memri_distill |
End of session — extract generalizable strategies |
memri_status |
Check token savings, cost, session stats |
memri_forget |
Delete memories for a specific thread |
Configuration
Config at ~/.memri/config.json:
{
"llm_provider": "gemini",
"llm_model": "gemini-2.5-flash",
"observe_threshold": 30000,
"reflect_threshold": 40000
}
API keys at ~/.memri/.env:
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
Supported providers: anthropic, claude-code-auth, gemini, gemini-adc, openai, openai-compatible (Groq, Ollama, Together, Mistral), passive (no key needed).
Don't have an API key?
Option 1 — Use your Claude subscription (zero setup if you already use Claude Code)
If you've ever run claude in your terminal, memri automatically detects your credentials and uses them. No API key needed — your Claude Pro / Max / Team subscription covers it.
memri init --claude-code # auto-detects Claude login, configures instantly
Option 2 — Use your Google account (zero setup if you use gcloud)
If you've run gcloud auth application-default login, memri detects those credentials automatically. Works with any Google account.
gcloud auth application-default login # one-time, if not already done
memri init --claude-code # auto-detects Google credentials
Option 3 — Free Gemini API (takes 1 minute) Google's Gemini 2.0 Flash has a permanently free tier — no credit card, no trial. Get a key in 1 minute: aistudio.google.com/apikey
# ~/.memri/.env
GEMINI_API_KEY=your-key-here
// ~/.memri/config.json
{ "llm_provider": "gemini", "llm_model": "gemini-2.0-flash" }
Option 4 — Local model via Ollama (fully private) Run any open-source model on your own hardware.
# Install from ollama.ai, then:
ollama pull llama3
// ~/.memri/config.json
{
"llm_provider": "openai-compatible",
"llm_base_url": "http://localhost:11434/v1",
"llm_model": "llama3"
}
Option 5 — Passive mode (zero setup)
No API key, no compression. memri stores sessions locally and returns recent context directly. memri_recall, memri_store, and memri_search all work.
// ~/.memri/config.json
{ "llm_provider": "passive" }
What about OpenAI / ChatGPT? A ChatGPT subscription ($20/mo) does not include API access — OpenAI sells these separately. If you use GPT models, you need an OpenAI API key (
OPENAI_API_KEY). The free options above (Gemini ADC or free Gemini API) are typically easier.
CLI
memri init --claude-code # First-time setup (one command)
memri status # Token savings, cost, session count
memri watch # Auto-ingest new sessions in real time
memri ingest # Ingest existing session history
memri observe # Manually run the Observer on all threads
memri embed # Build semantic search index
memri dashboard # Web dashboard at http://localhost:8050
memri config # View / edit config
Benchmarks
Evaluated on LongMemEval-S — 500 QA pairs across 6 question types designed to test AI assistant long-term memory.
| Question type | Raw baseline | memri v1.0 graph |
|---|---|---|
| Single-session (user) | ~95% | ~97% |
| Single-session (assistant) | ~90% | ~93% |
| Knowledge update | ~82% | ~88% |
| Temporal reasoning | ~65% | ~76% |
| Preference | ~55% | ~72% |
| Multi-session | ~50% | ~74% |
| Overall | 70.6% | 83% |
Raw baseline: full ~115K token conversation passed directly to Gemini 2.5 Flash.
memri v1.0 graph: sessions ingested into a 3-layer graph (facts, entities, causal chains), then only the top-k relevant facts retrieved per query (~500 tokens). Better accuracy, a fraction of the tokens.
Comparison
| memri | Mastra OM | mem0 | Full context | |
|---|---|---|---|---|
| Language | Python | TypeScript | Python | — |
| Install | pip install |
framework lock-in | pip install |
— |
| Works with | Claude Code, Cursor, Codex | Mastra only | any | any |
| Storage | local SQLite + graph | cloud | cloud | none |
| Graph-based memory | ✅ v1.0 | ❌ | ❌ | ❌ |
| Entity tracking | ✅ v1.0 | ❌ | partial | ❌ |
| Causal chains | ✅ v1.0 | ❌ | ❌ | ❌ |
| Procedural memory | ✅ v0.2 | ❌ | ❌ | ❌ |
| Frustration detection | ✅ v0.2 | ❌ | ❌ | ❌ |
| Semantic search | ✅ local | ❌ | ✅ cloud | ❌ |
| Dashboard | ✅ | ❌ | ✅ | ❌ |
| Token compression | 200× (graph retrieval) | 5–40× | varies | 1× |
| Privacy | 100% local | cloud | cloud | local |
Privacy
Your data never leaves your machine.
- Conversation history and observations live in
~/.memri/memri.db— a local SQLite file only you can read. - memri has no servers, no telemetry, no accounts.
- The only external calls are to your LLM provider (the same one your coding agent already uses) to run compression.
- API keys are read from environment variables and never written to the database.
memri status # see exactly what's stored
memri forget <thread_id> # delete a specific thread
rm ~/.memri/memri.db # delete everything
Development
git clone https://github.com/SarthakK337/memri
cd memri
pip install -e ".[dev,graph,embeddings]"
pytest
Contributions welcome. Open an issue before starting large changes.
License
MIT © 2026
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file memri-1.0.0.tar.gz.
File metadata
- Download URL: memri-1.0.0.tar.gz
- Upload date:
- Size: 19.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
91b787a91af1fad031a6db03a9a7834a2a611c566f5531f584c5dba9075df95e
|
|
| MD5 |
3a8a5c05a4a6adb5d733b29f1e46acfa
|
|
| BLAKE2b-256 |
8600108103d1d16f1cd5631e00643088f661fe4e1737bdf27ded20b023db085f
|
File details
Details for the file memri-1.0.0-py3-none-any.whl.
File metadata
- Download URL: memri-1.0.0-py3-none-any.whl
- Upload date:
- Size: 97.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ced8b0230d73badff73d2380f1164b20d0fc1bce37edb3230129dbc7a33e85eb
|
|
| MD5 |
73955d701eb4b172bccdf46bf14eeccc
|
|
| BLAKE2b-256 |
8701fcb26423748f5aa1008c0a093467153d320a10340173983de34625df34ef
|