Skip to main content

Mastra-level memory intelligence in a pip install — observational memory for coding agents

Project description

memri

Persistent, graph-based memory for AI coding agents — in a pip install.

PyPI Python CI License: MIT


The problem

Every time you start a new session with Claude Code, Cursor, or Codex — it forgets everything. The architecture you designed last week. The library you chose. The bug you already fixed. You repeat yourself. The agent repeats mistakes.

memri fixes this. It builds a structured graph of your memory — entities, facts, causal chains, and reflections — and injects the most relevant context at the start of every new session. Your agent picks up exactly where it left off.


Features

  • Graph-based memory (v1.0) — entities, facts, causal chains, and higher-level reflections stored in a queryable graph
  • Entity tracking (v1.0) — people, places, and concepts are linked across all sessions
  • Three-layer architecture (v1.0) — always-in-context index (Layer 0), fact graph (Layer 1), raw episode archive (Layer 2)
  • RRF ranking (v1.0) — Reciprocal Rank Fusion across vector, BM25, importance, and recency signals
  • Automatic compression — conversations beyond 30K tokens are compressed 5–40× into timestamped observations
  • Cross-session recall — memory context is injected at the start of every new session, no setup needed
  • Semantic search — find anything from past sessions with natural language (memri_search "auth pattern we chose")
  • Procedural memory (v0.2) — learns how to work better with you over time, not just what happened
  • Frustration detection (v0.2) — detects when you're frustrated and permanently stores what went wrong as a high-priority strategy
  • Works with any LLM — Anthropic, Gemini, OpenAI, or any OpenAI-compatible endpoint
  • 100% local — all data on your machine, no cloud, no accounts

Install

pip install memri

With full graph memory (recommended):

pip install "memri[graph]"

With semantic search only:

pip install "memri[embeddings]"

Quick start

One command wires memri into Claude Code:

memri init --claude-code

This does three things automatically:

  1. Creates ~/.memri/memri.db (local SQLite database)
  2. Registers memri as an MCP server in Claude Code's settings.json
  3. Writes the recall instruction to ~/.claude/CLAUDE.md

That's it. Open a new Claude Code session — memri starts working.


How it works

Your conversation
      │
      ▼
  [Observer]      triggers at 30K tokens
      │           compresses turns → timestamped observations (5–40× smaller)
      │
  [Strategist]    runs on every message                           (v0.2)
      │           detects frustration → stores "what went wrong"
      │           end of session → distills "what worked" → strategies
      ▼
 [GraphEngine]    primary memory backend                          (v1.0)
      │
      ├── Layer 2  raw episode archive  (SQLite, zero data loss)
      │
      ├── Layer 1  fact/entity/reflection graph  (NetworkX)
      │            causal chains, temporal edges, entity linking
      │
      └── Layer 0  always-in-context routing index  (~500 tokens)
                   entity index, topic clusters, user summary
      │
      ▼
 [get_context()]  Layer 0 index → strategies → observations → recent turns
      │
      ▼
 Injected at the top of your next session

Three layers of memory:

Layer What it stores Size
Layer 0 Entity index, topic clusters, user summary — always in context ~500 tokens
Layer 1 Fact/entity/reflection graph with causal and temporal edges grows with sessions
Layer 2 Raw episode archive — zero data loss, full session text cold storage

Three types of memory content:

Type What it stores Example
Episodic What happened in past sessions "User chose PostgreSQL over SQLite on 2026-04-10"
Procedural How to work better with this user "Always confirm before running destructive commands"
Graph Entity relationships and causal chains "Deadline stress caused repeated tool failures"

MCP tools

memri exposes 7 tools to your coding agent via MCP:

Tool When to call
memri_recall Start of every session — restores compressed context
memri_store User shares something important to remember
memri_search Looking for context from a different project or thread
memri_ingest Manually process a session into memory
memri_distill End of session — extract generalizable strategies
memri_status Check token savings, cost, session stats
memri_forget Delete memories for a specific thread

Configuration

Config at ~/.memri/config.json:

{
  "llm_provider": "gemini",
  "llm_model": "gemini-2.5-flash",
  "observe_threshold": 30000,
  "reflect_threshold": 40000
}

API keys at ~/.memri/.env:

GEMINI_API_KEY=...
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...

Supported providers: anthropic, claude-code-auth, gemini, gemini-adc, openai, openai-compatible (Groq, Ollama, Together, Mistral), passive (no key needed).

Don't have an API key?

Option 1 — Use your Claude subscription (zero setup if you already use Claude Code) If you've ever run claude in your terminal, memri automatically detects your credentials and uses them. No API key needed — your Claude Pro / Max / Team subscription covers it.

memri init --claude-code   # auto-detects Claude login, configures instantly

Option 2 — Use your Google account (zero setup if you use gcloud) If you've run gcloud auth application-default login, memri detects those credentials automatically. Works with any Google account.

gcloud auth application-default login   # one-time, if not already done
memri init --claude-code                # auto-detects Google credentials

Option 3 — Free Gemini API (takes 1 minute) Google's Gemini 2.0 Flash has a permanently free tier — no credit card, no trial. Get a key in 1 minute: aistudio.google.com/apikey

# ~/.memri/.env
GEMINI_API_KEY=your-key-here
// ~/.memri/config.json
{ "llm_provider": "gemini", "llm_model": "gemini-2.0-flash" }

Option 4 — Local model via Ollama (fully private) Run any open-source model on your own hardware.

# Install from ollama.ai, then:
ollama pull llama3
// ~/.memri/config.json
{
  "llm_provider": "openai-compatible",
  "llm_base_url": "http://localhost:11434/v1",
  "llm_model": "llama3"
}

Option 5 — Passive mode (zero setup) No API key, no compression. memri stores sessions locally and returns recent context directly. memri_recall, memri_store, and memri_search all work.

// ~/.memri/config.json
{ "llm_provider": "passive" }

What about OpenAI / ChatGPT? A ChatGPT subscription ($20/mo) does not include API access — OpenAI sells these separately. If you use GPT models, you need an OpenAI API key (OPENAI_API_KEY). The free options above (Gemini ADC or free Gemini API) are typically easier.


CLI

memri init --claude-code   # First-time setup (one command)
memri status               # Token savings, cost, session count
memri watch                # Auto-ingest new sessions in real time
memri ingest               # Ingest existing session history
memri observe              # Manually run the Observer on all threads
memri embed                # Build semantic search index
memri dashboard            # Web dashboard at http://localhost:8050
memri config               # View / edit config

Benchmarks

Evaluated on LongMemEval-S — 500 QA pairs across 6 question types designed to test AI assistant long-term memory.

Question type Raw baseline memri v1.0 graph
Single-session (user) ~95% ~97%
Single-session (assistant) ~90% ~93%
Knowledge update ~82% ~88%
Temporal reasoning ~65% ~76%
Preference ~55% ~72%
Multi-session ~50% ~74%
Overall 70.6% 83%

Raw baseline: full ~115K token conversation passed directly to Gemini 2.5 Flash.

memri v1.0 graph: sessions ingested into a 3-layer graph (facts, entities, causal chains), then only the top-k relevant facts retrieved per query (~500 tokens). Better accuracy, a fraction of the tokens.


Comparison

memri Mastra OM mem0 Full context
Language Python TypeScript Python
Install pip install framework lock-in pip install
Works with Claude Code, Cursor, Codex Mastra only any any
Storage local SQLite + graph cloud cloud none
Graph-based memory ✅ v1.0
Entity tracking ✅ v1.0 partial
Causal chains ✅ v1.0
Procedural memory ✅ v0.2
Frustration detection ✅ v0.2
Semantic search ✅ local ✅ cloud
Dashboard
Token compression 200× (graph retrieval) 5–40× varies
Privacy 100% local cloud cloud local

Privacy

Your data never leaves your machine.

  • Conversation history and observations live in ~/.memri/memri.db — a local SQLite file only you can read.
  • memri has no servers, no telemetry, no accounts.
  • The only external calls are to your LLM provider (the same one your coding agent already uses) to run compression.
  • API keys are read from environment variables and never written to the database.
memri status               # see exactly what's stored
memri forget <thread_id>   # delete a specific thread
rm ~/.memri/memri.db       # delete everything

Development

git clone https://github.com/SarthakK337/memri
cd memri
pip install -e ".[dev,graph,embeddings]"
pytest

Contributions welcome. Open an issue before starting large changes.


License

MIT © 2026

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

memri-1.0.0.tar.gz (19.0 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

memri-1.0.0-py3-none-any.whl (97.0 kB view details)

Uploaded Python 3

File details

Details for the file memri-1.0.0.tar.gz.

File metadata

  • Download URL: memri-1.0.0.tar.gz
  • Upload date:
  • Size: 19.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for memri-1.0.0.tar.gz
Algorithm Hash digest
SHA256 91b787a91af1fad031a6db03a9a7834a2a611c566f5531f584c5dba9075df95e
MD5 3a8a5c05a4a6adb5d733b29f1e46acfa
BLAKE2b-256 8600108103d1d16f1cd5631e00643088f661fe4e1737bdf27ded20b023db085f

See more details on using hashes here.

File details

Details for the file memri-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: memri-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 97.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for memri-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 ced8b0230d73badff73d2380f1164b20d0fc1bce37edb3230129dbc7a33e85eb
MD5 73955d701eb4b172bccdf46bf14eeccc
BLAKE2b-256 8701fcb26423748f5aa1008c0a093467153d320a10340173983de34625df34ef

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page