
Information-geometric agent memory with mathematical guarantees


SuperLocalMemory V3.2

The first local-only AI memory to break 74% retrieval on LoCoMo.
No cloud. No APIs. No data leaves your machine.

+10.6pp vs Mem0 (zero cloud)  ·  85% Open-Domain (best of any system)  ·  EU AI Act Ready



Why SuperLocalMemory?

Every major AI memory system — Mem0, Zep, Letta, EverMemOS — sends your data to cloud LLMs for core operations. That means latency on every query, cost on every interaction, and after August 2, 2026, a compliance problem under the EU AI Act.

SuperLocalMemory V3 takes a different approach: mathematics instead of cloud compute. Three techniques from differential geometry, algebraic topology, and stochastic analysis replace the work that other systems need LLMs to do — similarity scoring, contradiction detection, and lifecycle management. The result is an agent memory that runs entirely on your machine, on CPU, with no API keys, and still outperforms funded alternatives.

The numbers (evaluated on LoCoMo, the standard long-conversation memory benchmark):

| System | Score | Cloud Required | Open Source | Funding |
|---|---|---|---|---|
| EverMemOS | 92.3% | Yes | No | — |
| Hindsight | 89.6% | Yes | No | — |
| SLM V3 Mode C | 87.7% | Optional | Yes (MIT) | $0 |
| Zep v3 | 85.2% | Yes | Deprecated | $35M |
| SLM V3 Mode A | 74.8% | No | Yes (MIT) | $0 |
| Mem0 | 64.2% | Yes | Partial | $24M |

Mode A scores 74.8% with zero cloud dependency — outperforming Mem0 by 10.6 percentage points without a single API call. On open-domain questions, Mode A scores 85.0% — the highest of any system in the evaluation, including cloud-powered ones. Mode C reaches 87.7%, matching enterprise cloud systems.

Mathematical layers contribute +12.7 percentage points on average across 6 conversations (n=832 questions), with up to +19.9pp on the most challenging dialogues. This isn't more compute — it's better math.

Upgrading from V2 (2.8.6)? V3 is a complete architectural reinvention — new mathematical engine, new retrieval pipeline, new storage schema. Your existing data is preserved but requires migration: read the Migration Guide, then run slm migrate after installing V3. A backup is created automatically.


What's New in V3.2 — The Living Brain

Your AI agent now remembers the way humans do: associatively, temporally, and with consolidation during idle time. V3.2 transforms SLM from a retrieval engine into a living memory system that surfaces what you need before you ask for it.

Headline Features

100x Faster Recall — Retrieval latency drops from ~500ms to <10ms at 10K facts. Vector KNN search replaces full-table scan. You feel the difference on the first query.
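
To see why this matters, here is a minimal sketch of vector KNN over an in-memory matrix. It illustrates the idea only; the names and shapes are assumptions, not SLM's actual index:

import numpy as np

def knn(query_vec, index_vecs, k=10):
    # Cosine similarity reduces to a dot product after normalization;
    # argpartition finds the top-k without sorting the whole table.
    q = query_vec / np.linalg.norm(query_vec)
    m = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    sims = m @ q
    top = np.argpartition(-sims, k)[:k]
    return top[np.argsort(-sims[top])]   # best-first row indices

Even this naive version turns scoring 10K facts into a single matrix-vector product instead of 10K row fetches; a real vector index avoids the linear pass entirely.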

Automatic Memory Surfacing — Memories now come to you. A multi-signal scoring engine (similarity + recency + frequency + trust) proactively injects relevant context at session start and during conversations. No more "I forgot we decided that last week."
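
The exact weights are internal to SLM, but a scorer combining those four signals could look roughly like this (the weights and half-life here are illustrative assumptions):

import math

def surface_score(similarity, days_since_access, access_count, trust,
                  weights=(0.5, 0.2, 0.15, 0.15), half_life_days=14.0):
    # Blend the four signals named above into one ranking score.
    recency = math.exp(-math.log(2) * days_since_access / half_life_days)
    frequency = access_count / (access_count + 5)   # saturating count
    return (weights[0] * similarity + weights[1] * recency
            + weights[2] * frequency + weights[3] * trust)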

Associative Retrieval (5th Channel) — V3 had 4 retrieval channels. V3.2 adds a 5th: multi-hop spreading activation across your knowledge graph. Ask about "deployment" and it surfaces the related database migration decision three hops away.
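
Spreading activation is a classic graph technique. A toy version over a networkx graph might look like the following (the decay factor is an assumption; the 3-hop default matches the entity channel described later):

import networkx as nx

def spreading_activation(graph: nx.Graph, seeds, hops=3, decay=0.5):
    # Seed nodes start fully active; activation attenuates at each hop.
    activation = {n: 1.0 for n in seeds}
    frontier = dict(activation)
    for _ in range(hops):
        next_frontier = {}
        for node, act in frontier.items():
            for neighbor in graph.neighbors(node):
                spread = act * decay
                if spread > activation.get(neighbor, 0.0):
                    activation[neighbor] = spread
                    next_frontier[neighbor] = spread
        frontier = next_frontier
    return activation   # "deployment" can light up a migration decision 3 hops out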

Temporal Intelligence — Facts now carry time-awareness. Bi-temporal validity tracks when something was true vs. when it was recorded. Contradictions are detected automatically: "We use Postgres" + "We migrated to MySQL" triggers a conflict resolution flow.
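
Bi-temporal modeling keeps two clocks per fact: world time and record time. A minimal sketch of the record shape and the supersede step (field names are assumptions, not SLM's schema):

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: datetime                 # when it became true in the world
    valid_to: Optional[datetime] = None  # None = still believed true
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

def supersedes(old: Fact, new: Fact) -> bool:
    # "We use Postgres" + "We migrated to MySQL": same subject and predicate,
    # different value, old interval still open -> close it and flag a conflict.
    if ((old.subject, old.predicate) == (new.subject, new.predicate)
            and old.valid_to is None and old.value != new.value):
        old.valid_to = new.valid_from
        return True
    return False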

Sleep-Time Consolidation — During idle periods, SLM compresses, deduplicates, and reorganizes your memory store. Redundant facts merge. Clusters tighten. Important memories get promoted to Core Memory blocks that stay permanently in context (inspired by Letta's core memory, but fully local).
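
At its simplest, consolidation clusters near-duplicates and keeps one representative. A cartoon of the dedupe step (the threshold and the access_count/text fields are assumptions):

import numpy as np

def dedupe(facts, embed, threshold=0.92):
    # Keep the most-accessed copy of each near-duplicate cluster.
    kept, kept_vecs = [], []
    for fact in sorted(facts, key=lambda f: -f.access_count):
        vec = embed(fact.text)
        vec = vec / np.linalg.norm(vec)
        if all(float(vec @ kv) < threshold for kv in kept_vecs):
            kept.append(fact)
            kept_vecs.append(vec)
    return kept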

Core Memory Blocks — Pin your most critical context (architecture decisions, team conventions, project constraints) into always-available working memory. These blocks are injected into every session automatically — your agent never starts cold.

By the Numbers

| Metric | V3.0 | V3.2 | Change |
|---|---|---|---|
| Recall latency (10K facts) | ~500ms | <10ms | 100x faster |
| Retrieval channels | 4 | 5 | +spreading activation |
| MCP tools | 24 | 29 | +5 new |
| CLI commands | 16 | 21 | +5 new |
| Dashboard tabs | 17 | 20 | +3 new |
| API endpoints | — | 9 | +9 (configuration & status) |
| DB tables | 9 | 18 | +9 for temporal, consolidation, core memory |

Enable V3.2 Features

All new features default OFF. Zero breaking changes. Opt in when ready:

# Turn on automatic memory surfacing
slm config set auto_invoke.enabled true

# Turn on sleep-time consolidation
slm config set consolidation.enabled true

# Turn on temporal intelligence
slm config set temporal.enabled true

# Turn on associative retrieval (5th channel)
slm config set retrieval.synapse.enabled true

Or enable everything at once:

slm config set v32_features.all true

Fully backward compatible. All 29 MCP tools and 21 CLI commands work the same. Existing data is untouched. New features activate only when you flip the switch.

V3.2 Paper — Technical details, formal guarantees, and benchmark results in the upcoming companion paper. Watch the arXiv page for updates.


Quick Start

Install via npm (recommended)

npm install -g superlocalmemory
slm setup     # Choose mode (A/B/C)
slm doctor    # Verify everything is working
slm warmup    # Pre-download embedding model (~500MB, optional)

Install via pip

pip install superlocalmemory

First Use

slm remember "Alice works at Google as a Staff Engineer"
slm recall "What does Alice do?"
slm status

MCP Integration (Claude, Cursor, Windsurf, VS Code, etc.)

{
  "mcpServers": {
    "superlocalmemory": {
      "command": "slm",
      "args": ["mcp"]
    }
  }
}

29 MCP tools + 7 resources available. Works with Claude Code, Cursor, Windsurf, VS Code Copilot, Continue, Cody, ChatGPT Desktop, Gemini CLI, JetBrains, Zed, and 17+ AI tools. V3.1: Active Memory tools auto-learn your patterns.

Dual Interface: MCP + CLI

SLM works everywhere: from IDEs to CI pipelines to Docker containers. It is the only AI memory system with both MCP and an agent-native CLI.

| Need | Use | Example |
|---|---|---|
| IDE integration | MCP | Auto-configured for 17+ IDEs via slm connect |
| Shell scripts | CLI + --json | slm recall "auth" --json \| jq '.data.results[0]' |
| CI/CD pipelines | CLI + --json | slm remember "deployed v2.1" --json in GitHub Actions |
| Agent frameworks | CLI + --json | OpenClaw, Codex, Goose, nanobot |
| Human use | CLI | slm recall "auth" (readable text output) |

Agent-native JSON output on every command:

# Human-readable (default)
slm recall "database schema"
#   1. [0.87] Database uses PostgreSQL 16 on port 5432...

# Agent-native JSON
slm recall "database schema" --json
# {"success": true, "command": "recall", "version": "3.0.22", "data": {"results": [...]}}

All --json responses follow a consistent envelope with success, command, version, data, and next_actions for agent guidance.
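
For example, an agent script can shell out to the CLI and rely on that envelope. A sketch, assuming only the fields documented above:

import json
import subprocess

result = subprocess.run(
    ["slm", "recall", "database schema", "--json"],
    capture_output=True, text=True, check=True,
)
envelope = json.loads(result.stdout)
if envelope["success"]:
    for hit in envelope["data"]["results"]:
        print(hit)
    print("suggested follow-ups:", envelope.get("next_actions"))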


Three Operating Modes

| Mode | What | Cloud? | EU AI Act | Best For |
|---|---|---|---|---|
| A | Local Guardian | None | Compliant | Privacy-first, air-gapped, enterprise |
| B | Smart Local | Local only (Ollama) | Compliant | Better answers, data stays local |
| C | Full Power | Cloud LLM | Partial | Maximum accuracy, research |

slm mode a   # Zero-cloud (default)
slm mode b   # Local Ollama
slm mode c   # Cloud LLM

Mode A is the only agent memory that operates with zero cloud dependency while achieving competitive retrieval accuracy on a standard benchmark. All data stays on your device. No API keys. No GPU. Runs on 2 vCPUs + 4GB RAM.


Architecture

Query  ──►  Strategy Classifier  ──►  4 Parallel Channels (5 with V3.2's associative channel):
                                       ├── Semantic (Fisher-Rao geodesic distance)
                                       ├── BM25 (keyword matching)
                                       ├── Entity Graph (spreading activation, 3 hops)
                                       └── Temporal (date-aware retrieval)
                                                    │
                                       RRF Fusion (k=60)
                                                    │
                                       Scene Expansion + Bridge Discovery
                                                    │
                                       Cross-Encoder Reranking
                                                    │
                                       ◄── Top-K Results with channel scores
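
Reciprocal Rank Fusion is a standard way to merge ranked lists. With k=60 as shown above, a faithful sketch is:

def rrf_fuse(rankings, k=60):
    # score(d) = sum over channels of 1 / (k + rank_of_d_in_that_channel)
    scores = {}
    for ranking in rankings:                      # each list is best-first doc ids
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# fused = rrf_fuse([semantic_ids, bm25_ids, graph_ids, temporal_ids])

The large k keeps any single channel from dominating: a document ranked first everywhere beats one ranked first in only one list, but not by much.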

Mathematical Foundations

Three novel contributions replace cloud LLM dependency with mathematical guarantees:

  1. Fisher-Rao Retrieval Metric — Similarity scoring derived from the Fisher information structure of diagonal Gaussian families. Graduated ramp from cosine to geodesic distance over the first 10 accesses. The first application of information geometry to agent memory retrieval.

  2. Sheaf Cohomology for Consistency — Algebraic topology detects contradictions by computing coboundary norms on the knowledge graph. The first algebraic guarantee for contradiction detection in agent memory (see the sketch after this list).

  3. Riemannian Langevin Lifecycle — Memory positions evolve on the Poincaré ball via a discretized Langevin SDE. Frequently accessed memories stay active; neglected memories self-archive. No hardcoded thresholds.
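
On the sheaf layer, the core computation is small. Treating node beliefs as a 0-cochain, the coboundary on an edge is just the difference of its endpoints. A toy version, assuming vector-valued beliefs (not SLM's actual sheaf structure):

import numpy as np
import networkx as nx

def inconsistent_edges(graph: nx.Graph, belief: dict, tol: float = 1.0):
    # (delta f)(u, v) = f(v) - f(u); a large norm means the two endpoints
    # disagree more than the consistency structure allows.
    return [(u, v) for u, v in graph.edges
            if np.linalg.norm(belief[v] - belief[u]) > tol]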

These three layers collectively yield +12.7pp average improvement over the engineering-only baseline, with the Fisher metric alone contributing +10.8pp on the hardest conversations.
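
As an illustration of the graduated ramp in the Fisher metric (item 1 above): the closed form below is the standard Fisher-Rao distance for univariate Gaussians, while the linear interpolation schedule is an assumption about how such a ramp could be wired, not SLM's exact rule:

import numpy as np

def fisher_rao_1d(mu1, sigma1, mu2, sigma2):
    # Fisher-Rao distance between N(mu1, sigma1^2) and N(mu2, sigma2^2):
    # hyperbolic distance in the (mu/sqrt(2), sigma) half-plane, scaled by sqrt(2).
    # For diagonal Gaussians, sum the squared per-dimension distances.
    d2 = (mu1 - mu2) ** 2 / 2 + (sigma1 - sigma2) ** 2
    return np.sqrt(2) * np.arccosh(1 + d2 / (2 * sigma1 * sigma2))

def ramped_distance(cosine_dist, geodesic_dist, access_count, ramp=10):
    # New memories score by cosine; by the tenth access the geodesic dominates.
    t = min(access_count / ramp, 1.0)
    return (1 - t) * cosine_dist + t * geodesic_dist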


Benchmarks

Evaluated on LoCoMo — 10 multi-session conversations, 1,986 total questions, 4 scored categories.

Mode A (Zero-Cloud, 10 Conversations, 1,276 Questions)

| Category | Score | vs. Mem0 (64.2% aggregate) |
|---|---|---|
| Single-Hop | 72.0% | +3.0pp |
| Multi-Hop | 70.3% | +8.6pp |
| Temporal | 80.0% | +21.7pp |
| Open-Domain | 85.0% | +35.0pp |
| Aggregate | 74.8% | +10.6pp |

Mode A achieves 85.0% on open-domain questions — the highest of any system in the evaluation, including cloud-powered ones.

Math Layer Impact (6 Conversations, n=832)

| Conversation | With Math | Without | Delta |
|---|---|---|---|
| Easiest | 78.5% | 71.2% | +7.3pp |
| Hardest | 64.2% | 44.3% | +19.9pp |
| Average | 71.7% | 58.9% | +12.7pp |

Mathematical layers help most where heuristic methods struggle — the harder the conversation, the bigger the improvement.

Ablation (What Each Component Contributes)

| Component Removed | Impact |
|---|---|
| Cross-encoder reranking | -30.7pp |
| Fisher-Rao metric | -10.8pp |
| All math layers | -7.6pp |
| BM25 channel | -6.5pp |
| Sheaf consistency | -1.7pp |
| Entity graph | -1.0pp |

Full ablation details in the Wiki.


EU AI Act Compliance

The EU AI Act (Regulation 2024/1689) takes full effect August 2, 2026. Every AI memory system that sends personal data to cloud LLMs for core operations has a compliance question to answer.

| Requirement | Mode A | Mode B | Mode C |
|---|---|---|---|
| Data sovereignty (Art. 10) | Pass | Pass | Requires DPA |
| Right to erasure (GDPR Art. 17) | Pass | Pass | Pass |
| Transparency (Art. 13) | Pass | Pass | Pass |
| No network calls during memory ops | Yes | Yes | No |

To the best of our knowledge, no existing agent memory system addresses EU AI Act compliance. Modes A and B pass all checks by architectural design — no personal data leaves the device during any memory operation.

Built-in compliance tools: GDPR Article 15/17 export + complete erasure, tamper-proof SHA-256 audit chain, data provenance tracking, ABAC policy enforcement.
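
The audit chain follows the usual hash-linking pattern: each entry commits to its predecessor's digest, so tampering with any historical record invalidates every later one. A minimal sketch (the entry fields are assumptions):

import hashlib
import json
import time

def append_audit(chain: list, event: dict) -> list:
    # Link each entry to the previous digest; the genesis entry uses a zero hash.
    prev = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps({"ts": time.time(), "event": event, "prev": prev},
                         sort_keys=True)
    chain.append({"payload": payload,
                  "hash": hashlib.sha256(payload.encode()).hexdigest()})
    return chain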


Web Dashboard

slm dashboard    # Opens at http://localhost:8765

20 tabs: Dashboard, Recall Lab, Knowledge Graph, Memories, Trust Scores, Math Health, Compliance, Learning, IDE Connections, Settings, and more. Runs locally — no data leaves your machine.


Active Memory (V3.1) — Memory That Learns

Most AI memory systems are passive databases — you store, you search, you get results. SuperLocalMemory learns.

Every recall you make generates learning signals. Over time, the system adapts to your patterns:

| Phase | Signals | What Happens |
|---|---|---|
| Baseline | 0-19 | Cross-encoder ranking (default behavior) |
| Rule-Based | 20+ | Heuristic boosts: recency, access count, trust score |
| ML Model | 200+ | LightGBM model trained on YOUR usage patterns |

Zero-Cost Learning Signals

No LLM tokens spent. Four mathematical signals computed locally:

  • Co-Retrieval — memories retrieved together strengthen their connections (sketch after this list)
  • Confidence Lifecycle — accessed facts get boosted, unused facts decay
  • Channel Performance — tracks which retrieval channel works best for your queries
  • Entropy Gap — surprising content gets prioritized for deeper indexing
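
The co-retrieval signal, for instance, is a Hebbian-style counter that needs no model at all. A sketch (the learning rate and saturation form are assumptions):

from itertools import combinations

def record_co_retrieval(weights: dict, retrieved_ids, lr=0.1):
    # Every pair retrieved together moves its link weight toward 1.0.
    for a, b in combinations(sorted(retrieved_ids), 2):
        w = weights.get((a, b), 0.0)
        weights[(a, b)] = w + lr * (1.0 - w)
    return weights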

Auto-Capture & Auto-Recall

slm hooks install     # Install Claude Code hooks for invisible injection
slm observe "We decided to use PostgreSQL"  # Auto-detects decisions, bugs, preferences
slm session-context   # Get relevant context at session start

MCP Active Memory Tools

Three new tools for AI assistants:

  • session_init — call at session start, get relevant project context automatically
  • observe — send conversation content, auto-captures decisions/bugs/preferences
  • report_feedback — explicit feedback for faster learning

No competitor learns at zero token cost. Mem0, Zep, and Letta all require cloud LLM calls for their learning loops. SLM learns through mathematics.


Features

Retrieval

  • 5-channel hybrid: Semantic (Fisher-Rao) + BM25 + Entity Graph + Temporal + Associative (V3.2)
  • RRF fusion + cross-encoder reranking
  • Agentic sufficiency verification (auto-retry on weak results)
  • Adaptive ranking with LightGBM (learns from usage)

Intelligence

  • 11-step ingestion pipeline (entity resolution, fact extraction, emotional tagging, scene building)
  • Automatic contradiction detection via sheaf cohomology
  • Self-organizing memory lifecycle (no hardcoded thresholds)
  • Behavioral pattern detection and outcome tracking

Trust & Security

  • Bayesian Beta-distribution trust scoring (per-agent, per-fact; sketch after this list)
  • Trust gates (block low-trust agents from writing/deleting)
  • ABAC (Attribute-Based Access Control) with DB-persisted policies
  • Tamper-proof hash-chain audit trail (SHA-256 linked entries)
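
Beta-distribution trust is a conjugate-update scheme: confirmations bump alpha, contradictions bump beta, and the posterior mean is the score. A minimal sketch (the gate threshold is an assumption):

class BetaTrust:
    """Beta(alpha, beta) posterior over an agent's reliability."""

    def __init__(self, alpha=1.0, beta=1.0):   # uniform prior
        self.alpha, self.beta = alpha, beta

    def update(self, confirmed: bool):
        if confirmed:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def score(self) -> float:                  # posterior mean
        return self.alpha / (self.alpha + self.beta)

# A trust gate could then refuse writes when score < 0.3, say.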

Infrastructure

  • 20-tab web dashboard with real-time visualization
  • 17+ IDE integrations (Claude, Cursor, Windsurf, VS Code, JetBrains, Zed, etc.)
  • 29 MCP tools + 7 MCP resources
  • Profile isolation (independent memory spaces)
  • 1400+ tests, MIT license, cross-platform (Mac/Linux/Windows)
  • CPU-only — no GPU required

CLI Reference

| Command | What It Does |
|---|---|
| slm remember "..." | Store a memory |
| slm recall "..." | Search memories |
| slm forget "..." | Delete matching memories |
| slm trace "..." | Recall with per-channel score breakdown |
| slm status | System status |
| slm health | Math layer health (Fisher, Sheaf, Langevin) |
| slm doctor | Pre-flight check (deps, worker, Ollama, database) |
| slm mode a/b/c | Switch operating mode |
| slm setup | Interactive first-time wizard |
| slm warmup | Pre-download embedding model |
| slm migrate | V2 to V3 migration |
| slm dashboard | Launch the 20-tab web dashboard |
| slm mcp | Start MCP server (for IDE integration) |
| slm connect | Configure IDE integrations |
| slm hooks install | Wire auto-memory into Claude Code hooks |
| slm profile list/create/switch | Profile management |

Research Papers

V3: Information-Geometric Foundations

SuperLocalMemory V3: Information-Geometric Foundations for Zero-LLM Enterprise Agent Memory. Varun Pratap Bhardwaj (2026). arXiv:2603.14588 · Zenodo DOI: 10.5281/zenodo.19038659.

V2: Architecture & Engineering

SuperLocalMemory: A Structured Local Memory Architecture for Persistent AI Agent Context. Varun Pratap Bhardwaj (2026). arXiv:2603.02240 · Zenodo DOI: 10.5281/zenodo.18709670.

Cite This Work

@article{bhardwaj2026slmv3,
  title={Information-Geometric Foundations for Zero-LLM Enterprise Agent Memory},
  author={Bhardwaj, Varun Pratap},
  journal={arXiv preprint arXiv:2603.14588},
  year={2026},
  url={https://arxiv.org/abs/2603.14588}
}

Prerequisites

| Requirement | Version | Why |
|---|---|---|
| Node.js | 14+ | npm package manager |
| Python | 3.11+ | V3 engine runtime |

All Python dependencies install automatically during npm install — core math, dashboard server, learning engine, and performance optimizations. If anything fails, the installer shows exact fix commands. Run slm doctor after install to verify everything works. BM25 keyword search works even without embeddings — you're never fully blocked.

| Component | Size | When |
|---|---|---|
| Core libraries (numpy, scipy, networkx) | ~50MB | During install |
| Dashboard & MCP server (fastapi, uvicorn) | ~20MB | During install |
| Learning engine (lightgbm) | ~10MB | During install |
| Search engine (sentence-transformers, torch) | ~200MB | During install |
| Embedding model (nomic-embed-text-v1.5, 768d) | ~500MB | First use or slm warmup |
| Ollama + a model, Mode B only (ollama pull llama3.2) | ~2GB | Manual |

Contributing

See CONTRIBUTING.md for guidelines. Wiki for detailed documentation.

License

MIT License. See LICENSE.

Attribution

Part of Qualixar · Author: Varun Pratap Bhardwaj


Built with mathematical rigor. Not in the race — here to help everyone build better AI memory systems.
