
MCP memory server combining Method of Loci, Major System, Songlines, and PAO mnemonic layers

Project description


MemShān

Memory & Context Manager for AI

Python 3.11+ MCP LLM Providers ChromaDB NetworkX


What is MemShān?

MemShān is a Python-based MCP (Model Context Protocol) memory server that gives AI assistants a structured, multi-layered long-term memory. It combines four proven cognitive mnemonic techniques into a single retrieval pipeline:

| Layer | Cognitive Technique | What It Does |
|---|---|---|
| Base | Method of Loci | ChromaDB Wings and Rooms — spatial memory palace for semantic search |
| Layer 1 | Major System | Converts numbers in text to phonetic tags for precise numeric lookup |
| Layer 2 | Songlines | NetworkX knowledge graph records context trails between memory chunks |
| Layer 3 | PAO System | Compresses session logs into Subject-Action-Object triplets for long-term archival |
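For intuition, the Major System maps each digit to a consonant sound so that numbers become pronounceable, exactly matchable tags. A minimal sketch (the `encode_number` helper below is illustrative, not the project's actual `phonetic_encoder` API):

```python
# One common Major System digit-to-consonant mapping (several variants exist).
MAJOR_MAP = {
    "0": "s", "1": "t", "2": "n", "3": "m", "4": "r",
    "5": "l", "6": "j", "7": "k", "8": "f", "9": "p",
}

def encode_number(number: str) -> str:
    """Turn a digit string into a phonetic consonant tag, ignoring non-digits."""
    return "".join(MAJOR_MAP[ch] for ch in number if ch in MAJOR_MAP)

print(encode_number("11434"))  # "ttrmr"
print(encode_number("2024"))   # "nsnr"
```

Storing "11434" alongside the tag "ttrmr" lets an exact tag lookup complement fuzzy semantic search when a query mentions that number in any phrasing.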

Architecture

See ARCHITECTURE.md for the full design, data-flow diagrams, and technology decisions.
See TASK_EXECUTION_PLAN.md for the phase-by-phase build plan.


MCP Tools

| Tool | Description |
|---|---|
| store_memory | Store text into a Wing/Room; all layers run automatically |
| retrieve_memory | Unified query: semantic search + graph expansion + numeric tag matching |
| add_context_trail | Manually link two memory chunks in the Songlines graph |
| get_context_trail | Return the narrative path between two concept nodes |
| snapshot_session | Compress a session log into SAO triplets → long-term storage |
| list_rooms | List all Wings and Rooms in the memory palace |
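Over the wire, each tool is invoked with a standard MCP `tools/call` JSON-RPC message. The argument names below are illustrative, not the server's actual schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "store_memory",
    "arguments": {
      "wing": "project-alpha",
      "room": "decisions",
      "text": "Adopted ChromaDB for the base layer on 2024-06-01."
    }
  }
}
```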

LLM Provider Support

MemShān defaults to Ollama (local, fully offline). Switch providers via a single env var — no code changes required.

| Provider | LLM_PROVIDER value | Requires |
|---|---|---|
| Ollama (default) | ollama | Ollama running locally |
| OpenAI | openai | OPENAI_API_KEY in .env |
| Google Gemini | gemini | GEMINI_API_KEY in .env |
| Anthropic Claude | anthropic | ANTHROPIC_API_KEY in .env |
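The single-env-var switch can be sketched as a small provider factory. Class names below mirror the adapters in src/llm/ but are illustrative, not the project's real API:

```python
# Hypothetical provider factory; real adapters live in src/llm/.
class OllamaClient: ...
class OpenAIClient: ...
class GeminiClient: ...
class AnthropicClient: ...

PROVIDERS = {
    "ollama": OllamaClient,
    "openai": OpenAIClient,
    "gemini": GeminiClient,
    "anthropic": AnthropicClient,
}

def make_client(env: dict) -> object:
    """Select an LLM client from LLM_PROVIDER, defaulting to ollama."""
    name = env.get("LLM_PROVIDER", "ollama").lower()
    if name not in PROVIDERS:
        raise ValueError(f"Unsupported LLM_PROVIDER: {name!r}")
    return PROVIDERS[name]()

print(type(make_client({"LLM_PROVIDER": "gemini"})).__name__)  # GeminiClient
```

Because selection happens at startup from the environment, switching providers really is a one-line .env change.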

Installation

Prerequisites

  • Python 3.11+
  • Ollama installed and running (for default local LLM)
  • Windows: batch scripts provided (.bat)
  • Linux / macOS: shell scripts provided (.sh) — make executable with chmod +x *.sh

Windows Quick Start

```bat
REM 1. Initialize git
000_init.bat

REM 2. Create virtual environment
001_env.bat

REM 3. Activate virtual environment
002_activate.bat

REM 4. Install dependencies
003_setup.bat
```

Linux / macOS Quick Start

```shell
# Make scripts executable (one-time)
chmod +x *.sh

# 1. Initialize git
./000_init.sh

# 2. Create virtual environment
./001_env.sh

# 3. Activate virtual environment (must be sourced)
source 002_activate.sh

# 4. Install dependencies
./003_setup.sh
```

Manual (inside activated venv)

```shell
pip install -r requirements.txt
```

Configure .env

Copy and edit the environment file:

```ini
# LLM Provider (default: ollama)
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.2

# Optional providers (uncomment and add keys to switch)
# LLM_PROVIDER=openai
# OPENAI_API_KEY=sk-...

# ChromaDB storage
CHROMA_PERSIST_DIR=./data/chroma

# Embeddings
EMBEDDING_MODEL=all-MiniLM-L6-v2
```
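Inside the application these variables become typed settings. A minimal stdlib sketch (the real config.py uses Pydantic BaseSettings; field names here are illustrative):

```python
import os
from dataclasses import dataclass, field

def _env(name: str, default: str):
    # Defer the os.getenv lookup until the dataclass is instantiated.
    return field(default_factory=lambda: os.getenv(name, default))

@dataclass
class Settings:
    # Mirrors the .env keys above; defaults match the documented ones.
    llm_provider: str = _env("LLM_PROVIDER", "ollama")
    ollama_base_url: str = _env("OLLAMA_BASE_URL", "http://localhost:11434")
    chroma_persist_dir: str = _env("CHROMA_PERSIST_DIR", "./data/chroma")
    embedding_model: str = _env("EMBEDDING_MODEL", "all-MiniLM-L6-v2")

settings = Settings()
print(settings.llm_provider, settings.embedding_model)
```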

Usage

```bat
REM Windows — Run the MCP server
004_run.bat
```

```shell
# Linux / macOS
./004_run.sh

# Equivalent (inside activated venv)
python main.py
```

Testing

```bat
REM Windows
005_run_test.bat
005_run_code_cov.bat
```

```shell
# Linux / macOS
./005_run_test.sh
./005_run_code_cov.sh
```

```bat
REM Equivalent inside activated venv (Windows)
.venv\Scripts\pytest tests/ -v
.venv\Scripts\pytest tests/ --cov=src --cov-report=term-missing
```

```shell
# Equivalent inside activated venv (Linux / macOS)
.venv/bin/pytest tests/ -v
.venv/bin/pytest tests/ --cov=src --cov-report=term-missing
```

Coverage target: 100% line AND branch coverage per module — no exceptions.


Security Scanning

MemShān enforces a zero-tolerance vulnerability policy on the main branch. All 90 transitive dependencies are audited via pip-audit before every commit that changes requirements.txt.

Run the security scan

```bat
REM Windows — Full scan: requirements + installed environment → JSON + HTML reports
006_pip_audit.bat
```

```shell
# Linux / macOS
./006_pip_audit.sh
```

The script runs two passes:

| Pass | Scope | Output |
|---|---|---|
| Requirements scan | Direct deps + full transitive tree (as pip resolves) | security_reports\pip_audit_<TS>.json + .html |
| Environment scan | Everything installed in the venv | security_reports\pip_audit_env_<TS>.json + .html |

Both JSON files are converted automatically to self-contained, dark-themed HTML reports with package tables, CVE details, and a filter bar.

Utility script

tools/pip_audit_to_html.py — reusable converter. Accepts pip-audit JSON from a file or stdin and writes a timestamped HTML report.

```powershell
# Pipe directly (PowerShell shown; on Linux/macOS use PYTHONUTF8=1 and 2>/dev/null)
$env:PYTHONUTF8="1"
python -m pip_audit -r requirements.txt --format json 2>$null |
    python tools/pip_audit_to_html.py

# From a saved file
python tools/pip_audit_to_html.py security_reports/audit.json

# Custom output path
python tools/pip_audit_to_html.py audit.json --output reports/my_report.html
```

Copilot prompt

Use the /pip-audit prompt in GitHub Copilot Chat to run the full scan interactively: .github/prompts/pip-audit.prompt.md

Policy

  • Run 006_pip_audit.bat (Windows) or ./006_pip_audit.sh (Linux / macOS) before every commit that adds or changes dependencies.
  • Resolve ALL findings before pushing to main.
  • Reports are gitignored — only the scripts and prompt are committed.

Every test file must cover three scenario groups:

```python
@pytest.mark.positive  # happy path
@pytest.mark.negative  # error / failure conditions
@pytest.mark.edge      # boundary values, empty inputs, None, single-item collections
```
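A minimal test module following this convention might look like the sketch below (the `room_count` function is invented for illustration; it is not MemShān code):

```python
import pytest

def room_count(rooms):
    """Toy stand-in for real code under test."""
    if rooms is None:
        raise ValueError("rooms must not be None")
    return len(rooms)

@pytest.mark.positive  # happy path
def test_counts_rooms():
    assert room_count(["wing-a/room-1", "wing-a/room-2"]) == 2

@pytest.mark.negative  # error / failure conditions
def test_rejects_none():
    with pytest.raises(ValueError):
        room_count(None)

@pytest.mark.edge  # boundary values, empty inputs
def test_empty_and_single():
    assert room_count([]) == 0
    assert room_count(["only"]) == 1
```

Register the three markers in pytest configuration (e.g. a `markers` entry in pytest.ini) so pytest does not warn about unknown marks.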

Script Reference

Core Scripts

| Windows (.bat) | Linux / macOS (.sh) | Purpose |
|---|---|---|
| 000_init.bat | 000_init.sh | Initializes git and sets user name / email |
| 001_env.bat | 001_env.sh | Creates a .venv virtual environment |
| 002_activate.bat | source 002_activate.sh | Activates the virtual environment |
| 003_setup.bat | 003_setup.sh | Installs requirements.txt and initialises MemPalace |
| 004_run.bat | 004_run.sh | Runs the MCP server (main.py) |
| 005_run_test.bat | 005_run_test.sh | Runs the full pytest suite with HTML report |
| 005_run_code_cov.bat | 005_run_code_cov.sh | Runs tests with HTML coverage report |
| 006_pip_audit.bat | 006_pip_audit.sh | pip-audit security scan → JSON + HTML reports |
| 008_deactivate.bat | source 008_deactivate.sh | Deactivates the virtual environment |

MemPalace Utility Scripts

| Windows (.bat) | Linux / macOS (.sh) | Purpose |
|---|---|---|
| 007_mp_mine.bat | 007_mp_mine.sh | Mine workspace files into MemPalace |
| 007_mp_status.bat | 007_mp_status.sh | Show palace drawer counts and status |
| 007_mp_search.bat | 007_mp_search.sh | Search the palace with optional wing/room filters |
| 007_mp_compress.bat | 007_mp_compress.sh | Compress drawers using AAAK Dialect (~30× token reduction) |
| 007_mp_diary.bat | 007_mp_diary.sh | Read or write agent diary entries |
| 007_mp_wakeup.bat | 007_mp_wakeup.sh | Output L0 + L1 context (~600-900 tokens) for session start |
| 007_mp_repair.bat | 007_mp_repair.sh | Rebuild vector index after corruption or abrupt exit |

Linux / macOS note: All .sh scripts must be made executable once: chmod +x *.sh
002_activate.sh and 008_deactivate.sh must be sourced (source <script>), not executed.


Project Structure

Implemented

```text
src/
├── config.py                    # ✅ Pydantic BaseSettings — all env config + LLM factory
├── llm/                         # ✅ LLM provider adapters
│   ├── client.py                #    LLMClient ABC
│   ├── ollama_client.py         #    Ollama (default, fully offline)
│   ├── openai_client.py         #    OpenAI
│   ├── gemini_client.py         #    Google Gemini
│   └── anthropic_client.py      #    Anthropic Claude
├── base/                        # ✅ Method of Loci — Base Layer
│   ├── loci_store.py            #    ChromaDB Wings/Rooms abstraction
│   └── embedder.py              #    sentence-transformers wrapper
└── layers/
    ├── major_system/            # ✅ Layer 1 — Numerical Precision
    │   └── phonetic_encoder.py  #    Numbers → phonetic consonant tags
    └── songlines/               # ✅ Layer 2 — Contextual Continuity
        └── knowledge_graph.py   #    NetworkX directed graph; Context Trails; GraphML persistence
```
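The Songlines idea behind knowledge_graph.py can be sketched with NetworkX directly. Node names and edge attributes here are invented for illustration:

```python
import networkx as nx

# Directed graph of memory chunks; edges record context trails between them.
g = nx.DiGraph()
g.add_edge("chose-chromadb", "built-loci-store", reason="implementation followed decision")
g.add_edge("built-loci-store", "added-major-system", reason="layer 1 built on the base")

# A context trail between two concepts is a directed path through the graph.
trail = nx.shortest_path(g, "chose-chromadb", "added-major-system")
print(" -> ".join(trail))  # chose-chromadb -> built-loci-store -> added-major-system
```

Because the graph is directed, traversal preserves the order in which context was laid down, which is what the narrative-path tools build on.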

Planned (upcoming phases)

```text
src/
├── server.py                    # 🔲 MCP server entry point (FastMCP) — Phase 9
└── layers/
    ├── pao/                     # 🔲 Layer 3 — Episodic Compression (SAO triplets) — Phase 6
    │   └── snapshot.py
    ├── pipeline/                # 🔲 Unified retrieval pipeline — Phase 7
    │   └── retrieval.py
    └── tools/                   # 🔲 MCP tool definitions — Phase 8
        └── mcp_tools.py
```

Success Metrics & Observability

MemShān measures intelligence density, not just retrieval correctness. The scorecard below bridges technical performance of the mnemonic layers with enterprise engineering goals.

1. Key Performance Indicators (KPIs)

How the server runs — quantified technical efficiency per retrieval layer.

| Category | KPI | Target | Rationale |
|---|---|---|---|
| Retrieval Quality | Faithfulness / Groundedness | > 95% | Prevents hallucinated context when traversing a Songline trail |
| Numerical Precision | Numerical Recall Accuracy | 100% | Validates the Major System phonetic-tag pipeline for stats and dates |
| Compression | Context Compression Ratio | ≥ 5 : 1 | Measures how efficiently the PAO System converts raw session logs to actionable SAO triplets |
| Latency | P99 Retrieval Latency | < 200 ms | Loci lookups must not throttle the agent's reasoning loop |
| Observability | PulseGuard Hit Rate | 100% of queries | Ensures every retrieval event is logged and validated for semantic drift |

2. Key Result Areas (KRAs)

What MemShān achieves — enterprise-level value domains.

KRA 1 — Amnesia-Proof Persistence

Metric: Context Retention Span

Measure how many consecutive sessions an agent maintains perfect continuity on a complex, multi-phase project without requiring a re-prompt or context refresh.

  • Baseline: standard zero-shot / short-context agent — continuity typically breaks after 1–2 sessions.
  • MemShān target: ≥ 10 sessions with no loss of project state.

KRA 2 — Quality Engineering Modernisation

Metric: Defect Detection Velocity

Measure how much faster AI-assisted QE reviews identify architectural flaws when MemShān provides accumulated project memory versus cold zero-shot prompts.

  • Baseline: cold-prompt review time per module (measured in minutes).
  • MemShān target: ≥ 40% reduction in time-to-defect-detection.

KRA 3 — Resource Optimisation

Metric: Token-to-Knowledge Density (TKD)

$$\text{TKD} = \frac{\text{Relevant facts delivered to model}}{\text{Tokens consumed from context window}}$$

High TKD means MemShān surfaces more signal within the model's 128k / 200k token budget.

  • Target: TKD ≥ 3× vs. naive full-log injection.
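A worked instance of the TKD formula (all numbers are illustrative):

```python
# Hypothetical retrieval: 12 relevant facts delivered in an 800-token context.
facts_delivered = 12
tokens_consumed = 800
tkd = facts_delivered / tokens_consumed  # 0.015 facts per token

# Naive full-log injection: the same 12 facts buried in 3600 tokens.
naive_tkd = 12 / 3600

print(round(tkd / naive_tkd, 1))  # 4.5 -> clears the >= 3x target
```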

3. Layer-Specific Experimental Metrics

Proving that each mnemonic addition (beyond standard vector RAG) earns its place.

Songlines — Narrative Coherence

Test: Ask the AI to reconstruct a project's event history from Songline graph traversal.

| Metric | Description | Pass Threshold |
|---|---|---|
| Temporal Accuracy | Events retrieved in correct causal / chronological order | ≥ 95% |
| Sequence Drift | Compared against standard vector-only RAG (which frequently reorders events) | Songlines must outperform by ≥ 20 pp |

PAO System — Reconstruction Fidelity

Test: Compress a session log into SAO triplets, then ask the AI to reconstruct the full system state from triplets alone.

| Metric | Description | Pass Threshold |
|---|---|---|
| Reconstruction Fidelity | Facts present in original log that survive compression → decompression | ≥ 98% |
| Data Leakage Rate | Critical facts lost during PAO snapshot | 0% for facts tagged critical |
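Both metrics reduce to simple set arithmetic over fact inventories. A sketch with hypothetical facts (the "critical:" prefix is an invented tagging convention):

```python
# Hypothetical session facts; "critical:" tags mark facts that must survive.
original = {
    "deadline:2025-03-01",
    "owner:alice",
    "critical:db=chromadb",
    "critical:port=11434",
}
reconstructed = {"owner:alice", "critical:db=chromadb", "critical:port=11434"}

fidelity = len(original & reconstructed) / len(original)
critical_total = sum(f.startswith("critical:") for f in original)
critical_lost = sum(f.startswith("critical:") for f in original - reconstructed)
leakage_rate = critical_lost / critical_total

print(fidelity)      # 0.75 -> below the 98% threshold, would fail
print(leakage_rate)  # 0.0  -> no critical facts lost, passes
```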

Major System — Numeric Round-Trip

Test: Store text containing numerical data, query using rephrased numeric context.

| Metric | Description | Pass Threshold |
|---|---|---|
| Phonetic Tag Recall | Correct chunk retrieved via phonetic tag alone | 100% |
| False Positive Rate | Unrelated numeric chunks surfaced in results | < 2% |

4. Observability Dashboard Checklist

Recommended metrics to surface in Grafana / a custom VeredianAI UI.

| Signal | What to Track | Alert Condition |
|---|---|---|
| Memory Growth Rate | Wing/Room document count over time | Sudden spike > 3× daily average |
| Room Utilisation | Query hit-count per Room (hot vs. cold context) | Room unutilised for > 30 days → archive candidate |
| Semantic Drift Alarm | Distance between query embedding and retrieved chunk | Cosine distance > 0.4 → PulseGuard flag |
| Hallucination Rate | % of responses where retrieved chunk was not used by model | Target < 5% |
| P99 / P95 Latency | Per-layer breakdown: embed → query → graph expand → merge | P99 > 200 ms → alert |
| Provider Error Rate | Per LLM provider: 4xx / 5xx / timeout rate | Any provider > 1% error rate → alert |
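The semantic-drift check is a plain cosine-distance comparison. A sketch with toy 3-dimensional vectors (real embeddings come from all-MiniLM-L6-v2):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm

DRIFT_THRESHOLD = 0.4  # from the checklist above

query_vec = [1.0, 0.0, 0.0]
chunk_vec = [0.9, 0.1, 0.0]
dist = cosine_distance(query_vec, chunk_vec)
print(dist < DRIFT_THRESHOLD)  # True -> no PulseGuard flag
```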

Engineering Standards

| Standard | Requirement |
|---|---|
| SOLID | Every production class in src/ must demonstrably satisfy all five SOLID principles |
| OOP | Encapsulation, composition over inheritance, constructor injection throughout |
| Test Coverage | 100% line AND branch coverage per module — no exceptions |
| Test Scenarios | Every test file: @pytest.mark.positive + @pytest.mark.negative + @pytest.mark.edge |
| Database Changes | All schema changes via Alembic migration — never mutate schema directly |
| Security Scanning | pip-audit -r requirements.txt before every commit with dependency changes |
| Session Close | Write MemPalace diary entry + update Session_N.md + commit & push every session |
| Drift Review | Compare implementation vs architecture docs every 3 sessions |

See .github/copilot-instructions.md for full details.


Contributing

Contributions are welcome. Please read ARCHITECTURE.md and TASK_EXECUTION_PLAN.md before submitting a PR.

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/my-feature
  3. Commit your changes: git commit -m "feat: add my feature"
  4. Push to the branch: git push origin feature/my-feature
  5. Open a Pull Request

License

MIT License — see LICENSE for details.

Download files

  • Source distribution: memshan-2.2.0.tar.gz (64.4 kB)
  • Built distribution: memshan-2.2.0-py3-none-any.whl (44.0 kB)

File details for memshan-2.2.0.tar.gz

  • Size: 64.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

| Algorithm | Hash digest |
|---|---|
| SHA256 | e71ae77a91a55fae16da3f8ad347667957b964941e155590e98bd2502f6a1a7e |
| MD5 | 8c1073c98d4dd9cfc0d1d74ea194ae88 |
| BLAKE2b-256 | b0b741a7bc2162934a66837b4aa00a401b0f7e733134311bc3770789f49023dd |

File details for memshan-2.2.0-py3-none-any.whl

  • Size: 44.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

| Algorithm | Hash digest |
|---|---|
| SHA256 | fc500944a0daf41ffd444321b7c3331edf478b257127371ce695138266f680d6 |
| MD5 | ed491cac25db4dfbe95510bf1b73f8f4 |
| BLAKE2b-256 | c8017ba3ae728aaea49f070f5469fbf1cd5b2bba5282ce505bb8d458c3c83d5f |
