memri

Mastra-level memory intelligence in a pip install.

Observational memory for coding agents (Claude Code, Cursor, Codex) — a standalone Python package and MCP server that keeps your AI assistant from forgetting context across sessions.


What it does

Every time your conversation grows large, memri silently compresses it into dense, timestamped observations and injects them at the start of your next session. Your AI coding assistant remembers what you worked on — without burning tokens re-reading the full history.

Three background agents run as you code:

  • Observer: when your conversation exceeds 30K tokens, compresses it into observations (5–40× compression ratio)
  • Reflector: when observations exceed 40K tokens, garbage-collects stale or redundant ones
  • Strategist (v0.2): extracts generalizable reasoning strategies from session trajectories, detects user frustration in real time, and stores recovery tactics as permanent procedural memory

The result: a compact, prompt-cacheable memory block at the top of every session — plus a growing library of how to work better with this specific user.
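
To make the thresholds concrete, here is a minimal sketch of the trigger logic and of what a 5–40× ratio means in tokens. The function and constant names are hypothetical and are not memri's internal API. Only the 30K and 40K thresholds and the compression range come from the description above.

```python
# Hypothetical sketch: these names are illustrative, not memri's actual API.
OBSERVE_THRESHOLD = 30_000   # conversation tokens that trigger the Observer
REFLECT_THRESHOLD = 40_000   # observation tokens that trigger the Reflector


def agents_to_run(conversation_tokens: int, observation_tokens: int) -> list[str]:
    """Return which background agents would fire for the given token counts."""
    due = []
    if conversation_tokens > OBSERVE_THRESHOLD:
        due.append("observer")
    if observation_tokens > REFLECT_THRESHOLD:
        due.append("reflector")
    return due


def compressed_size_range(tokens: int, low: float = 5.0, high: float = 40.0) -> tuple[int, int]:
    """Token range after compression at the stated ratios, smallest first."""
    return round(tokens / high), round(tokens / low)


print(agents_to_run(32_000, 12_000))   # ['observer']
print(compressed_size_range(30_000))   # (750, 6000)
```

At the stated ratios, a 30K-token conversation would shrink to somewhere between 750 and 6,000 tokens of observations.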


Install

pip install memri

For semantic search across past sessions:

pip install "memri[embeddings]"

Quick start

1. Initialize

memri init

This creates ~/.memri/config.json and prompts for your API key. memri supports Gemini, Anthropic, and any OpenAI-compatible endpoint.

2. Connect to your coding agent

Claude Code:

memri init --claude-code

Adds memri as an MCP server to your Claude Code config automatically.

Cursor / VS Code:

memri init --cursor

Manual MCP config:

{
  "mcpServers": {
    "memri": {
      "command": "memri",
      "args": ["mcp-server"]
    }
  }
}
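
If you prefer to script the manual step, the sketch below merges the entry above into an existing MCP config file without touching other servers. The helper name is hypothetical, and the location of the config file depends on your client, so the demo writes to a throwaway temp file rather than assuming a real path.

```python
import json
import tempfile
from pathlib import Path

# The server entry shown in the manual config above.
MEMRI_ENTRY = {"command": "memri", "args": ["mcp-server"]}


def add_memri_server(config_path: Path) -> dict:
    """Merge the memri entry into an MCP config file, keeping other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["memri"] = dict(MEMRI_ENTRY)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a temp file; the "other" server is a made-up existing entry.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp.json"
    demo_path.write_text(json.dumps({"mcpServers": {"other": {"command": "other-tool"}}}))
    merged = add_memri_server(demo_path)
    print(sorted(merged["mcpServers"]))   # ['memri', 'other']
```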

3. Start the MCP server

memri mcp-server

Configuration

Config lives at ~/.memri/config.json:

{
  "llm_provider": "gemini",
  "llm_model": "gemini-2.5-flash",
  "observe_threshold": 30000,
  "reflect_threshold": 40000
}
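
For scripts that need to read the same settings, a minimal sketch follows. It assumes the values shown above are the defaults and that missing keys should fall back to them. That fallback behavior and the `load_config` helper are assumptions for illustration, not memri's documented API.

```python
import json
import os
import tempfile
from pathlib import Path

# Assumed defaults, copied from the example config above.
DEFAULTS = {
    "llm_provider": "gemini",
    "llm_model": "gemini-2.5-flash",
    "observe_threshold": 30000,
    "reflect_threshold": 40000,
}
CONFIG_PATH = Path(os.path.expanduser("~/.memri/config.json"))


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Read the config file and fill in any missing keys from DEFAULTS."""
    user = json.loads(path.read_text()) if path.exists() else {}
    return {**DEFAULTS, **user}


# Demo with a temp file that overrides only the provider and model.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "config.json"
    demo.write_text(json.dumps({"llm_provider": "anthropic",
                                "llm_model": "claude-haiku-4-5-20251001"}))
    cfg = load_config(demo)
    print(cfg["llm_provider"], cfg["observe_threshold"])   # anthropic 30000
```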

API keys in ~/.memri/.env:

GEMINI_API_KEY=...
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...

Override model via env or CLI:

memri mcp-server --model gemini-2.5-flash
memri mcp-server --model claude-haiku-4-5-20251001
memri mcp-server --model gpt-4o-mini

MCP Tools

Tool                   What it does
memri_recall           Restore compressed context at session start
memri_store            Explicitly save a note or decision
memri_search           Semantic search across all past sessions
memri_status           Show token savings, cost savings, session stats
memri_forget           Delete all memories for a thread
memri_ingest           Manually ingest a session file
memri_distill (v0.2)   Extract generalizable strategies from this session

Add this to your Claude Code system prompt to activate auto-recall:

At the start of each session, call memri_recall to restore context.
After significant decisions or discoveries, call memri_store to save them.

Dashboard

memri dashboard
# Open http://localhost:8050

Shows: total tokens saved, cost savings per model, observation counts, and session history.


CLI reference

memri init                  # First-time setup
memri init --claude-code    # Setup + wire into Claude Code
memri init --cursor         # Setup + wire into Cursor / VS Code
memri status                # Quick stats
memri mcp-server            # Start MCP server
memri dashboard             # Start web dashboard
memri observe <thread_id>   # Manually trigger Observer
memri embed                 # Build/update semantic search index
memri watch                 # Start file watcher (auto-ingest sessions)
memri ingest <file>         # Ingest a session JSONL file
memri config                # Show current config

How it compares

                        memri                        Mastra OM            Full context
Language                Python                       TypeScript           any
Install                 pip install                  framework lock-in
Coding agents           Claude Code, Cursor, Codex   Mastra agents only   any
Cross-session search    yes (semantic)               no                   no
Procedural memory       yes (v0.2)                   no                   no
Frustration detection   yes (v0.2)                   no                   no
Dashboard               yes                          no                   no
Token compression       5–40x                        5–40x                1x
LongMemEval-S (raw baseline) 70.6% 70.6%

Architecture

Your conversation
      |
      v
  [Observer]    — triggers at 30K tokens
      |            compresses turns → episodic observations
      |
  [Strategist]  — on every message
      |            detects frustration → 🔴 CRITICAL strategy
      |            on session end → distills trajectory → 🟡/🔵 strategies
      v
 [SQLiteStore]  — stores observations + strategies + embeddings
      |
      v
  [Reflector]   — triggers at 40K observation tokens
      |            removes stale/redundant episodic observations
      v
 [get_context()] — prepends strategies, then episodic observations
      |
      v
 Injected at top of next session

Storage: ~/.memri/memory.db (SQLite, 5 tables).
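
The ordering in the last step (strategies first, then episodic observations) can be sketched with an in-memory SQLite database. The single `memories` table, its columns, and the sample rows are made up for illustration. The real store uses 5 tables whose schema is not described here, and this `get_context` is a stand-in rather than memri's own function.

```python
import sqlite3


def get_context(conn: sqlite3.Connection, thread_id: str) -> str:
    """Build a context block: strategies first, then observations by time."""
    rows = conn.execute(
        "SELECT kind, body FROM memories WHERE thread_id = ? "
        "ORDER BY CASE kind WHEN 'strategy' THEN 0 ELSE 1 END, created_at",
        (thread_id,),
    ).fetchall()
    return "\n".join(f"[{kind}] {body}" for kind, body in rows)


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (thread_id TEXT, kind TEXT, body TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO memories VALUES (?, ?, ?, ?)",
    [
        ("t1", "observation", "Refactored auth module", "2025-01-01T10:00"),
        ("t1", "strategy", "Run tests before proposing a fix", "2025-01-01T11:00"),
        ("t1", "observation", "Switched CI to pytest", "2025-01-02T09:00"),
    ],
)
context = get_context(conn, "t1")
print(context)
```

The strategy row is emitted first even though it was stored after the first observation, which mirrors the "prepends strategies" step in the diagram.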


Benchmarks

Evaluated on LongMemEval-S: 500 QA pairs across 6 question types that test AI assistant memory.

Raw baseline (full context → Gemini 2.5 Flash, no compression):

Question type              Accuracy
single-session-user        ~95%
single-session-assistant   ~90%
knowledge-update           ~82%
temporal-reasoning         ~65%
preference                 ~55%
multi-session              ~50%
Overall                    70.6%

The raw baseline establishes the upper bound for memri's compressed-context path. Smriti integration (planned) targets 80%+.


Development

git clone https://github.com/your-org/memri
cd memri
pip install -e ".[dev,embeddings]"
pytest

License

MIT
