
A neuro-inspired memory architecture for AI agents — combines a Semantic Palace graph, capacity-bounded Working Memory, and asynchronous consolidation.


SMRITI Memory

A neuro-inspired long-term memory architecture for AI agents.

SMRITI combines a capacity-bounded Working Memory, a graph-based Semantic Palace, and asynchronous background consolidation to give LLM agents persistent, scalable memory — without blocking real-time interactions.

📄 Paper: SMRITI: A Scalable, Neuro-Inspired Architecture for Long-Term Event Memory in LLM Agents — Shivam Tyagi, 2025 — DOI: 10.13140/RG.2.2.25477.82407

Python 3.9+ · License: MIT


Architecture

                           ┌─────────────────────────────────┐
                           │    Asynchronous Consolidation   │
                           │      (8 Background Processes)   │
                           │  • Chunking      • Cross-Ref.   │
                           │  • Conflict Res. • Skill Ext.   │
                           │  • Forgetting    • Spaced Rep.  │
                           │  • Reflection    • Defragment.  │
                           └────────────────┬────────────────┘
                                            │ background
  ┌──────────┐   ┌──────────┐   ┌───────────▼─────────┐   ┌──────────┐
  │  Input   │──▶│ Attention │──▶│   Episode Buffer    │──▶│ Semantic │
  │  Text    │   │   Gate    │   │  (append-only log)  │   │  Palace  │
  └──────────┘   │ (salience │   └─────────────────────┘   │  Graph   │
                 │  filter)  │                              │ G=(V,E)  │
                 └──────────┘                              └────┬─────┘
                                                                │
  ┌──────────┐   ┌──────────┐   ┌───────────────────┐           │
  │  Query   │──▶│ Retrieval│──▶│  Working Memory   │◀──────────┘
  │          │   │  Engine  │   │   (7 ± 2 slots)   │
  └──────────┘   │ Q(v) =   │   └───────────────────┘
                 │ β₁cos +  │
                 │ β₂decay+ │   ┌───────────────────┐
                 │ β₃freq + │──▶│    Meta-Memory    │
                 │ β₄sal    │   │ (confidence map)  │
                 └──────────┘   └───────────────────┘

Core idea: Inspired by human Dual-Process Theory (Daniel Kahneman's Thinking, Fast and Slow), SMRITI decouples memory operations into two pathways:

  • System 1 (Fast & Heuristic): Real-time ingestion. Routes interactions to the short-term Episode Buffer in milliseconds without blocking the agent.
  • System 2 (Slow & Analytical): Background consolidation. Uses LLM reasoning to chunk, organize, and abstract semantic knowledge asynchronously while the agent is idle.
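The split between the two pathways can be illustrated with a small, self-contained sketch. This is not SMRITI's implementation: the class `DualPathMemory`, its method names, and the lowercase "consolidation" step are all hypothetical stand-ins, and a plain thread plus `queue.Queue` is assumed in place of whatever scheduling the library actually uses.

```python
import queue
import threading


class DualPathMemory:
    """Toy model: a fast ingest path plus a slow background consolidator."""

    def __init__(self):
        self.episode_buffer = queue.Queue()  # System 1 appends here
        self.semantic_store = []             # System 2 writes here
        worker = threading.Thread(target=self._consolidate_loop, daemon=True)
        worker.start()

    def encode(self, text):
        # Fast path: enqueue and return immediately, never waiting on the worker.
        self.episode_buffer.put(text)

    def _consolidate_loop(self):
        # Slow path: drain the buffer in the background.
        while True:
            text = self.episode_buffer.get()
            # Stand-in for LLM-based chunking/abstraction (hypothetical).
            self.semantic_store.append(text.strip().lower())
            self.episode_buffer.task_done()

    def wait_until_idle(self):
        # Block until every queued episode has been consolidated.
        self.episode_buffer.join()


memory = DualPathMemory()
memory.encode("User prefers Python.")
memory.encode("User is allergic to shellfish.")
memory.wait_until_idle()
print(memory.semantic_store)
```

The point of the design is that `encode` only pays the cost of a queue append, while the expensive work happens off the caller's thread.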

Quick Start — Claude Code (MCP)

The fastest way to use SMRITI is as a persistent memory layer for Claude Code. One command, and your AI remembers you across every session.

Run the install script:

bash <(curl -s https://raw.githubusercontent.com/shivamtyagi18/smriti-memory/main/install_smriti_mcp.sh)

The script will:

  • Create a dedicated venv at ~/.smriti/venv
  • Install smriti-memory into it
  • Prompt for your LLM choice and API key
  • Register the MCP server in ~/.claude.json
  • Optionally configure automatic memory hooks

Then restart Claude Code. Verify with /mcp: smriti should appear as connected.

Available tools (11):

| Tool | Description |
|------|-------------|
| smriti_encode | Store information in long-term memory |
| smriti_recall | Retrieve memories by natural-language query |
| smriti_get_context | Inject working memory into the current prompt |
| smriti_how_well_do_i_know | Confidence check on a topic |
| smriti_knowledge_gaps | List topics SMRITI knows it doesn't know |
| smriti_pin | Mark a memory as permanent (never decayed) |
| smriti_forget | Archive a memory |
| smriti_consolidate | Run a consolidation cycle |
| smriti_stats | System-wide statistics |
| smriti_get_suggestions | Proactive insights from background consolidation |
| smriti_open_ui | Launch the visual Memory Browser in the default web browser |

LLM options — set during install or via environment variables:

| Model | Provider | Requires |
|-------|----------|----------|
| mistral (default) | Local Ollama | ollama pull mistral |
| claude-* | Anthropic | SMRITI_LLM_API_KEY |
| gpt-* | OpenAI | SMRITI_LLM_API_KEY |
| gemini* | Google | SMRITI_LLM_API_KEY |

Installation (Python Library)

pip install smriti-memory

With optional FAISS-accelerated vector search (the quotes keep shells such as zsh from expanding the brackets):

pip install "smriti-memory[faiss]"

Or install from source:

git clone https://github.com/shivamtyagi18/smriti-memory.git
cd smriti-memory
pip install -e .

Prerequisites

SMRITI uses an LLM for reasoning tasks (consolidation, reflection, skill extraction). By default it connects to a local Ollama instance:

ollama pull mistral

Alternatively, you can use OpenAI, Anthropic, or Google Gemini — see Using Cloud LLM Providers below.


Using Cloud LLM Providers

SMRITI is provider-agnostic. Just change the llm_model and pass your API key:

from smriti import SMRITI, SmritiConfig

# Pick ONE of the configurations below; each assignment replaces the previous one.

# ── OpenAI ──────────────────────────────────────────────
config = SmritiConfig(
    llm_model="gpt-4o",
    openai_api_key="sk-...",
)

# ── Anthropic ───────────────────────────────────────────
config = SmritiConfig(
    llm_model="claude-3-5-sonnet-20241022",
    anthropic_api_key="sk-ant-...",
)

# ── Google Gemini ───────────────────────────────────────
config = SmritiConfig(
    llm_model="gemini-1.5-flash",
    gemini_api_key="AIza...",
)

# ── Local Ollama (default) ──────────────────────────────
config = SmritiConfig(
    llm_model="mistral",  # or llama3, codellama, phi3, etc.
)

memory = SMRITI(config=config)

Routing is automatic based on the model name prefix: gpt-* → OpenAI, claude* → Anthropic, gemini* → Gemini, everything else → Ollama.
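The prefix rule above can be sketched in a few lines. The function name `route_provider` and the returned provider labels are hypothetical; only the prefixes themselves come from the description above, and the library's real dispatch code may differ.

```python
def route_provider(model_name: str) -> str:
    """Map a model name to a provider label by prefix (illustrative only)."""
    if model_name.startswith("gpt-"):
        return "openai"
    if model_name.startswith("claude"):
        return "anthropic"
    if model_name.startswith("gemini"):
        return "gemini"
    # Anything unrecognised falls through to the local Ollama instance.
    return "ollama"


for model in ("gpt-4o", "claude-3-5-sonnet-20241022", "gemini-1.5-flash", "mistral"):
    print(model, "->", route_provider(model))
```

One consequence of a fall-through rule like this is that a mistyped cloud model name would be sent to Ollama rather than rejected, so it is worth double-checking the spelling of llm_model.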


Quick Start

from smriti import SMRITI, SmritiConfig

# Initialize
config = SmritiConfig(
    storage_path="./my_agent_memory",
    llm_model="mistral",
)
memory = SMRITI(config=config)

# Encode information
memory.encode("User prefers Python for backend development.")
memory.encode("User is allergic to shellfish.", context="medical")

# Recall by natural-language query
results = memory.recall("What language does the user prefer?")
for mem in results:
    print(f"  [{mem.strength:.2f}] {mem.content}")

# Check what you know (and don't know)
confidence = memory.how_well_do_i_know("programming languages")
print(f"Confidence: {confidence.overall:.0%}")

# Run background consolidation
memory.consolidate()

# Persist to disk
memory.save()

Framework Integrations

SMRITI can be used natively inside standard agent frameworks.

LangChain

Use SmritiLangChainMemory to replace ConversationBufferMemory. This gives your agent the cost-savings of a capacity-bounded Working Memory while asynchronously archiving the conversation into the Semantic Palace.

from langchain.chains import ConversationChain
from smriti.integrations.langchain_memory import SmritiLangChainMemory
from smriti import SMRITI

# 1. Initialize SMRITI
smriti_engine = SMRITI(storage_path="./langchain_smriti_db")

# 2. Wrap it for LangChain
smriti_memory = SmritiLangChainMemory(smriti_client=smriti_engine, top_k=3)

# 3. Plug it into standard chains
conversation = ConversationChain(
    llm=my_llm,  # any LangChain-compatible LLM you have already constructed (not defined in this snippet)
    memory=smriti_memory,
)

conversation.predict(input="I prefer using PyTorch.")

See examples/langchain_agent.py or examples/quickstart.py for complete working code.

Claude Code (MCP Server)

See Quick Start — Claude Code (MCP) above for one-command setup.

Memory Browser UI

SMRITI ships with a native, zero-dependency visualizer for traversing the Semantic Palace graph.

smriti_ui --storage ~/.smriti/global --port 7799

Features:

  • Zero dependencies: Built entirely with Python's standard http.server and D3.js — no Node.js/NPM needed.
  • Backwards Compatible: Instantly works with your existing palace.json created by older versions of SMRITI. Just point --storage to your existing directory.
  • Interactive Graph: Navigate the Semantic Palace using a force-directed network view or clustered room topology.
  • Searchable Dashboard: Instantly filter your stored knowledge by content, room, and system state.
  • Real-time Statistics: Track average memory strength, composite salience, and architectural distribution.

If you are running from source without a pip installation, run python -m smriti_memcore.ui from the source root.


Key API

| Method | Description |
|--------|-------------|
| encode(content, context, source) | Ingest new information through the Attention Gate |
| recall(query, top_k) | Retrieve relevant memories via graph traversal |
| how_well_do_i_know(topic) | Meta-memory confidence check |
| consolidate(depth) | Run background consolidation ("full", "light", "defer") |
| save() | Persist all state to disk |
| pin(memory_id) | Mark a memory as permanent |
| forget(memory_id) | Gracefully forget a memory (leaves a tombstone) |
| stats() | System-wide statistics |
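To make the pin and forget semantics concrete, here is a toy model of what "permanent" and "leaves a tombstone" could mean. Everything in it is hypothetical (`MemoryRecord`, `TinyStore`, the field names, and the multiplicative decay); it does not use or describe SMRITI's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class MemoryRecord:
    content: str
    strength: float = 1.0
    pinned: bool = False
    forgotten: bool = False  # True means this record is a tombstone


class TinyStore:
    """Illustrative store: pinned records skip decay, forgotten ones keep their id."""

    def __init__(self):
        self.records = {}

    def add(self, memory_id, content):
        self.records[memory_id] = MemoryRecord(content)

    def pin(self, memory_id):
        self.records[memory_id].pinned = True

    def forget(self, memory_id):
        record = self.records[memory_id]
        record.content = ""      # drop the payload
        record.forgotten = True  # keep the entry so the id still resolves

    def decay(self, rate=0.99):
        for record in self.records.values():
            if not record.pinned and not record.forgotten:
                record.strength *= rate

    def live_contents(self):
        return [r.content for r in self.records.values() if not r.forgotten]


store = TinyStore()
store.add("m1", "User is allergic to shellfish.")
store.add("m2", "User mentioned the weather.")
store.pin("m1")
store.decay()
store.forget("m2")
print(store.live_contents())
```

Keeping a tombstone rather than deleting the entry means later references to the id can be recognised as "deliberately forgotten" instead of "never existed".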

Configuration

All parameters are optional and have sensible defaults:

from smriti import SmritiConfig

config = SmritiConfig(
    # Working Memory
    working_memory_slots=7,          # Miller's Law: 7 ± 2

    # Retrieval scoring weights
    recency_weight=0.2,
    relevance_weight=0.4,
    strength_weight=0.2,
    salience_weight=0.2,

    # Forgetting
    decay_rate=0.99,                 # per-day temporal decay
    strength_hard_threshold=0.05,    # below this → forget

    # Palace graph
    room_merge_threshold=0.85,       # similarity to auto-merge rooms

    # LLM provider (pick one)
    llm_model="mistral",                     # Ollama (default)
    # llm_model="gpt-4o",                    # OpenAI
    # llm_model="claude-3-5-sonnet-20241022",# Anthropic
    # llm_model="gemini-1.5-flash",          # Google
    ollama_base_url="http://localhost:11434",

    # Storage
    storage_path="./smriti_data",
)
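As a rough illustration of how these weights and the decay settings might interact, the sketch below combines four signals into one retrieval score and estimates how long an unreinforced memory survives. It rests on assumptions that the README does not confirm: that the score is a plain weighted sum, that the diagram's cos/decay/freq/sal terms correspond to relevance/recency/strength/salience, and that strength decays multiplicatively by decay_rate per day. The function names are hypothetical.

```python
import math

# Values copied from the configuration example above.
WEIGHTS = {"relevance": 0.4, "recency": 0.2, "strength": 0.2, "salience": 0.2}
DECAY_RATE = 0.99        # assumed: multiplicative decay per day
HARD_THRESHOLD = 0.05    # below this, a memory is forgotten


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score(query_vec, memory_vec, days_old, strength, salience):
    """Assumed scoring rule: weighted sum of four signals, each in [0, 1]."""
    recency = DECAY_RATE ** days_old
    return (WEIGHTS["relevance"] * cosine(query_vec, memory_vec)
            + WEIGHTS["recency"] * recency
            + WEIGHTS["strength"] * strength
            + WEIGHTS["salience"] * salience)


def days_until_forgotten(initial_strength=1.0):
    """Smallest whole number of days d with initial * DECAY_RATE**d < HARD_THRESHOLD."""
    ratio = math.log(HARD_THRESHOLD / initial_strength) / math.log(DECAY_RATE)
    return math.floor(ratio) + 1


print(days_until_forgotten())  # 299
```

Under these assumptions, a memory that starts at full strength and is never reinforced or pinned would cross the 0.05 threshold after about 299 days, which gives a feel for how gentle a 0.99 daily decay is.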

What's New in v1.0.0

  • Consolidation robustness overhaul — fixed a critical bug where singleton episodes leaked in the buffer indefinitely, causing consolidation to report "no significant memories" even when important facts were present
  • Smarter salience scoring — the heuristic scorer now differentiates content types (personal facts, knowledge updates, instructions) instead of scoring everything the same
  • Better contradiction detection — Mistral no longer incorrectly discards memories that agree with existing ones
  • Validated across 4 models — benchmarked with gpt-4o-mini, Mistral 7B, CodeLlama 7B, and Llama 3.2 3B

See CHANGELOG.md for full details.


Benchmarks

LoCoMo (Multi-System Comparison)

SMRITI was benchmarked against four baseline architectures on the LoCoMo long-sequence dataset (28 dialog turns, 15 evaluation questions, consolidation enabled):

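| System | F1 Score | Latency | Tokens/Query | Consolidation |
|--------|----------|---------|--------------|---------------|
| FullContext | 0.345 | 1147ms | 550 | |
| MemGPT-style | 0.334 | 1397ms | 478 | |
| NaiveRAG | 0.312 | 1387ms | 145 | |
| SMRITI v2 | 0.279 | 1317ms | 146 | 41.2s (async) |
| Mem0-style | 0.235 | 1088ms | 106 | |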

Results with GPT-4o-mini. SMRITI consolidation runs asynchronously and does not block queries.

Local Model Comparison (v1.0.0)

All runs use the fixed consolidation pipeline with heuristic scoring:

| Model | F1 Score | Exact Match | Latency | Best Category |
|-------|----------|-------------|---------|---------------|
| CodeLlama 7B | 0.317 | 0.200 | 5634ms | Temporal (0.682) |
| Mistral 7B | 0.284 | 0.067 | 3181ms | Knowledge Update (0.516) |
| gpt-4o-mini | 0.262 | 0.000 | 1271ms | Single-hop (0.350) |
| Llama 3.2 3B | 0.184 | 0.067 | 1446ms | Multi-hop (0.134) |

Key finding: CodeLlama 7B outperforms the other models on temporal reasoning (F1 = 0.682) and achieves the highest exact-match rate (20%). Mistral 7B remains the best all-rounder with strong knowledge-update handling.

LongMemEval (Long-Term Interactive Memory)

SMRITI integrates an evaluation harness for the LongMemEval benchmark to test retrieval over 50+ chat sessions:

| System Configuration | Exact Match Accuracy | Average Query Latency |
|----------------------|----------------------|-----------------------|
| Baseline (Full Context) | 100.0% | 11.98s |
| SMRITI Dual-Process | 80.0% | 0.98s |

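SMRITI restricts the LLM context to the 5 most relevant memories, which cuts average query latency by more than 12× (11.98s to 0.98s) compared to context-stuffing, at the cost of 20 percentage points of exact-match accuracy in this run.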
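Restricting context to a fixed number of top-scoring memories is a capacity-bounded selection problem. The sketch below shows one common way to do it with a min-heap; the class `WorkingMemory` and its methods are hypothetical and are not taken from SMRITI's working_memory.py.

```python
import heapq


class WorkingMemory:
    """Keep only the `capacity` highest-scoring items, evicting the lowest."""

    def __init__(self, capacity=5):
        self.capacity = capacity
        self._heap = []    # min-heap of (score, insertion_order, content)
        self._counter = 0  # tie-breaker so contents are never compared

    def add(self, score, content):
        entry = (score, self._counter, content)
        self._counter += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        else:
            # Push the new entry, then pop whichever entry is now the smallest.
            heapq.heappushpop(self._heap, entry)

    def contents(self):
        """Return the retained items, highest score first."""
        return [content for _, _, content in sorted(self._heap, reverse=True)]


wm = WorkingMemory(capacity=5)
for i, s in enumerate([0.1, 0.9, 0.4, 0.7, 0.3, 0.8, 0.2]):
    wm.add(s, f"memory-{i}")
print(wm.contents())
# ['memory-1', 'memory-5', 'memory-3', 'memory-2', 'memory-4']
```

Because the heap never grows past its capacity, the prompt size stays bounded no matter how many memories are scored, which is where the latency saving comes from.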

Vector Search Backend

SMRITI supports two vector search backends. FAISS is auto-detected when installed:

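| Backend | 1K vectors | 10K vectors | 100K vectors | Memory (100K) |
|---------|------------|-------------|--------------|---------------|
| NumPy | 22 µs | 179 µs | 2.75 ms | 146.5 MB |
| FAISS | 28 µs | 200 µs | 2.24 ms | 979 B |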

At 100K vectors, FAISS is about 1.2× faster and reports roughly 150,000× less memory. At 1K and 10K vectors, the NumPy backend is slightly faster.
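Auto-detection of an optional dependency is usually done by checking whether the package can be imported. The sketch below shows that general pattern with the standard library. The function name `pick_vector_backend` and the returned labels are hypothetical, and SMRITI's own detection logic may work differently.

```python
import importlib.util


def pick_vector_backend() -> str:
    """Return "faiss" if the faiss package is importable, otherwise "numpy"."""
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec("faiss") is not None:
        return "faiss"
    return "numpy"


print("Vector backend:", pick_vector_backend())
```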

Reproducing Benchmarks

pip install -e ".[benchmarks]"

# Multi-system comparison (requires API key)
python benchmarks/run_benchmark.py --model gpt-4o-mini --systems smriti --consolidate --dataset locomo

# Local model comparison (requires Ollama)
python benchmarks/run_benchmark.py --model mistral --systems smriti --consolidate --dataset locomo
python benchmarks/run_benchmark.py --model codellama --systems smriti --consolidate --dataset locomo

# Vector backend comparison
python benchmarks/vector_benchmark.py

Project Structure

smriti-memory/
├── smriti/                 # Core library
│   ├── __init__.py
│   ├── core.py            # SMRITI orchestrator
│   ├── models.py          # Data models & SmritiConfig
│   ├── palace.py          # Semantic Palace graph
│   ├── episode_buffer.py  # Append-only temporal log
│   ├── working_memory.py  # Capacity-bounded priority queue
│   ├── attention_gate.py  # Salience filter
│   ├── retrieval.py       # Multi-factor retrieval engine
│   ├── consolidation.py   # Async background processes
│   ├── meta_memory.py     # Confidence mapping
│   ├── vector_store.py    # Vector persistence
│   ├── llm_interface.py   # Multi-provider LLM connector (Ollama/OpenAI/Anthropic/Gemini)
│   ├── metrics.py         # Observability: counters, gauges, histograms, Prometheus export
│   └── integrations/      # Framework adapters
│       ├── langchain_memory.py  # LangChain BaseMemory component
│       └── mcp_server.py        # Claude Code MCP server (11 tools)
├── install_smriti_mcp.sh   # One-command Claude Code setup
├── tests/                 # 190 tests across 14 files
├── baselines/             # Baseline implementations for comparison
├── benchmarks/            # Benchmark harness & scripts
├── examples/              # Usage examples
├── paper/                 # IEEE research paper (LaTeX + Markdown)
│   └── figures/           # Benchmark charts and UI diagrams
├── pyproject.toml
├── CHANGELOG.md
├── LICENSE
└── README.md

Citation

If you use SMRITI in your research, please cite:

@article{tyagi2025smriti,
  title={SMRITI: A Scalable, Neuro-Inspired Architecture for Long-Term Event Memory in LLM Agents},
  author={Tyagi, Shivam},
  year={2025},
  doi={10.13140/RG.2.2.25477.82407}
}

License

MIT — see LICENSE for details.
