# MemShān

Memory & Context Manager for AI

## What is MemShān?

MemShān is a Python-based MCP (Model Context Protocol) memory server that gives AI assistants a structured, multi-layered long-term memory. It combines four proven cognitive mnemonic techniques into a single retrieval pipeline:
| Layer | Cognitive Technique | What It Does |
|---|---|---|
| Base | Method of Loci | ChromaDB Wings and Rooms — spatial memory palace for semantic search |
| Layer 1 | Major System | Converts numbers in text to phonetic tags for precise numeric lookup |
| Layer 2 | Songlines | NetworkX Knowledge Graph records context trails between memory chunks |
| Layer 3 | PAO System | Compresses session logs into Subject-Action-Object triplets for long-term archival |
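To make Layer 1 concrete, here is a minimal sketch of how a Major System encoder might turn numbers into phonetic consonant tags. It uses one common variant of the classic digit-to-consonant mapping; the function name `phonetic_tags` is illustrative and not necessarily what MemShān's `phonetic_encoder.py` exposes.

```python
import re

# Classic Major System digit -> consonant-sound mapping (one common variant).
MAJOR_MAP = {
    "0": "s", "1": "t", "2": "n", "3": "m", "4": "r",
    "5": "l", "6": "j", "7": "k", "8": "f", "9": "p",
}

def phonetic_tags(text: str) -> list[str]:
    """Convert every number found in the text into a consonant tag."""
    numbers = re.findall(r"\d+", text)
    return ["".join(MAJOR_MAP[d] for d in num) for num in numbers]

print(phonetic_tags("Released in 1994 with 42 rooms"))  # -> ['tppr', 'rn']
```

Because the tags are exact strings rather than embeddings, a query that mentions the same number can match the chunk deterministically, which is why the table above targets precise numeric lookup.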
## Architecture

See ARCHITECTURE.md for the full design, data-flow diagrams, and technology decisions.
See TASK_EXECUTION_PLAN.md for the phase-by-phase build plan.
## MCP Tools

| Tool | Description |
|---|---|
| `store_memory` | Store text into a Wing/Room; all layers run automatically |
| `retrieve_memory` | Unified query: semantic search + graph expansion + numeric tag matching |
| `add_context_trail` | Manually link two memory chunks in the Songlines graph |
| `get_context_trail` | Return the narrative path between two concept nodes |
| `snapshot_session` | Compress a session log into SAO triplets → long-term storage |
| `list_rooms` | List all Wings and Rooms in the memory palace |
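Conceptually, `get_context_trail` walks the Songlines graph between two concept nodes. The real server uses NetworkX; this standard-library sketch shows the idea with a plain adjacency dict and breadth-first search (all names here are illustrative, not MemShān's actual API):

```python
from collections import deque

def context_trail(graph: dict[str, list[str]], start: str, goal: str) -> list[str]:
    """Breadth-first search returning the shortest node path between two concepts."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return []  # no trail exists

songlines = {
    "auth-bug": ["jwt-refactor"],
    "jwt-refactor": ["token-expiry", "login-tests"],
    "token-expiry": ["prod-incident"],
}
print(context_trail(songlines, "auth-bug", "prod-incident"))
# -> ['auth-bug', 'jwt-refactor', 'token-expiry', 'prod-incident']
```

The returned node sequence is what "narrative path" means in the table: an ordered trail of related memory chunks rather than an unordered similarity hit list.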
## LLM Provider Support

MemShān defaults to Ollama (local, fully offline). Switch providers via a single env var — no code changes required.

| Provider | `LLM_PROVIDER` value | Requires |
|---|---|---|
| Ollama (default) | `ollama` | Ollama running locally |
| OpenAI | `openai` | `OPENAI_API_KEY` in `.env` |
| Google Gemini | `gemini` | `GEMINI_API_KEY` in `.env` |
| Anthropic Claude | `anthropic` | `ANTHROPIC_API_KEY` in `.env` |
## Installation

### Prerequisites

- Python 3.11+
- Ollama installed and running (for the default local LLM)
- Windows: batch scripts provided (`.bat`)
- Linux / macOS: shell scripts provided (`.sh`) — make executable with `chmod +x *.sh`
### Windows Quick Start

```bat
REM 1. Initialize git
000_init.bat

REM 2. Create virtual environment
001_env.bat

REM 3. Activate virtual environment
002_activate.bat

REM 4. Install dependencies
003_setup.bat
```
### Linux / macOS Quick Start

```sh
# Make scripts executable (one-time)
chmod +x *.sh

# 1. Initialize git
./000_init.sh

# 2. Create virtual environment
./001_env.sh

# 3. Activate virtual environment (must be sourced)
source 002_activate.sh

# 4. Install dependencies
./003_setup.sh
```
### Manual (inside activated venv)

```sh
pip install -r requirements.txt
```
### Configure `.env`

Copy and edit the environment file:

```ini
# LLM Provider (default: ollama)
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.2

# Optional providers (uncomment and add keys to switch)
# LLM_PROVIDER=openai
# OPENAI_API_KEY=sk-...

# ChromaDB storage
CHROMA_PERSIST_DIR=./data/chroma

# Embeddings
EMBEDDING_MODEL=all-MiniLM-L6-v2
```
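The "single env var" switch above implies a small provider-selection step at startup. A minimal sketch of how such a factory might resolve `LLM_PROVIDER`, assuming the four provider names from the table (MemShān's actual logic lives in `src/config.py` and may differ):

```python
import os

# Provider names accepted by the hypothetical factory (mirrors the table above).
PROVIDERS = {"ollama", "openai", "gemini", "anthropic"}

def resolve_provider() -> str:
    """Pick the LLM provider from the environment, defaulting to local Ollama."""
    provider = os.environ.get("LLM_PROVIDER", "ollama").lower()
    if provider not in PROVIDERS:
        raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}")
    return provider

os.environ["LLM_PROVIDER"] = "openai"
print(resolve_provider())  # -> openai
```

Validating the value up front turns a misspelled provider name into an immediate, explicit error instead of a confusing failure deeper in the request path.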
## Usage

```bat
REM Windows — Run the MCP server
004_run.bat
```

```sh
# Linux / macOS
./004_run.sh

# Equivalent (inside activated venv)
python main.py
```
## Testing

```bat
REM Windows
005_run_test.bat
005_run_code_cov.bat
```

```sh
# Linux / macOS
./005_run_test.sh
./005_run_code_cov.sh
```

```bat
REM Equivalent inside activated venv (Windows)
.venv\Scripts\pytest tests/ -v
.venv\Scripts\pytest tests/ --cov=src --cov-report=term-missing
```

```sh
# Equivalent inside activated venv (Linux / macOS)
.venv/bin/pytest tests/ -v
.venv/bin/pytest tests/ --cov=src --cov-report=term-missing
```

Coverage target: 100% line AND branch coverage per module — no exceptions.
## Security Scanning

MemShān enforces a zero-tolerance vulnerability policy on the main branch. All 90 transitive dependencies are audited via pip-audit before every commit that changes `requirements.txt`.
### Run the security scan

```bat
REM Windows — Full scan: requirements + installed environment → JSON + HTML reports
006_pip_audit.bat
```

```sh
# Linux / macOS
./006_pip_audit.sh
```
The script runs two passes:
| Pass | Scope | Output |
|---|---|---|
| Requirements scan | Direct deps + full transitive tree (as pip resolves) | security_reports\pip_audit_<TS>.json + .html |
| Environment scan | Everything installed in the venv | security_reports\pip_audit_env_<TS>.json + .html |
Both JSON files are converted automatically to self-contained, dark-themed HTML reports with package tables, CVE details, and a filter bar.
### Utility script

`tools/pip_audit_to_html.py` — reusable converter. Accepts pip-audit JSON from a file or stdin and writes a timestamped HTML report.

```powershell
# Pipe directly
$env:PYTHONUTF8="1"
python -m pip_audit -r requirements.txt --format json 2>$null |
    python tools/pip_audit_to_html.py

# From a saved file
python tools/pip_audit_to_html.py security_reports/audit.json

# Custom output path
python tools/pip_audit_to_html.py audit.json --output reports/my_report.html
```
### Copilot prompt

Use the /pip-audit prompt in GitHub Copilot Chat to run the full scan interactively:

`.github/prompts/pip-audit.prompt.md`
### Policy

- Run `006_pip_audit.bat` (Windows) or `./006_pip_audit.sh` (Linux / macOS) before every commit that adds or changes dependencies.
- Resolve ALL findings before pushing to main.
- Reports are gitignored — only the scripts and prompt are committed.
Every test file must cover three scenario groups:

```python
@pytest.mark.positive   # happy path
@pytest.mark.negative   # error / failure conditions
@pytest.mark.edge       # boundary values, empty inputs, None, single-item collections
```
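A tiny illustrative test file showing all three marker groups in use. The function under test (`reciprocal`) is a made-up example, not part of MemShān; in the real suite the custom markers would also be registered in `pytest.ini` or `pyproject.toml` to silence unknown-marker warnings.

```python
import pytest

def reciprocal(x: float) -> float:
    """Toy function under test."""
    return 1.0 / x

@pytest.mark.positive   # happy path
def test_reciprocal_happy_path():
    assert reciprocal(0.5) == 2.0

@pytest.mark.negative   # error / failure conditions
def test_reciprocal_zero_raises():
    with pytest.raises(ZeroDivisionError):
        reciprocal(0.0)

@pytest.mark.edge       # boundary values
def test_reciprocal_negative_boundary():
    assert reciprocal(-0.25) == -4.0
```

Grouping tests this way makes it easy to verify, per file, that failure modes and boundary inputs are covered and not just the happy path.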
## Script Reference

### Core Scripts

| Windows (.bat) | Linux / macOS (.sh) | Purpose |
|---|---|---|
| `000_init.bat` | `000_init.sh` | Initializes git and sets user name / email |
| `001_env.bat` | `001_env.sh` | Creates a `.venv` virtual environment |
| `002_activate.bat` | `source 002_activate.sh` | Activates the virtual environment |
| `003_setup.bat` | `003_setup.sh` | Installs `requirements.txt` and initialises MemPalace |
| `004_run.bat` | `004_run.sh` | Runs the MCP server (`main.py`) |
| `005_run_test.bat` | `005_run_test.sh` | Runs the full pytest suite with HTML report |
| `005_run_code_cov.bat` | `005_run_code_cov.sh` | Runs tests with HTML coverage report |
| `006_pip_audit.bat` | `006_pip_audit.sh` | pip-audit security scan → JSON + HTML reports |
| `008_deactivate.bat` | `source 008_deactivate.sh` | Deactivates the virtual environment |
### MemPalace Utility Scripts

| Windows (.bat) | Linux / macOS (.sh) | Purpose |
|---|---|---|
| `007_mp_mine.bat` | `007_mp_mine.sh` | Mine workspace files into MemPalace |
| `007_mp_status.bat` | `007_mp_status.sh` | Show palace drawer counts and status |
| `007_mp_search.bat` | `007_mp_search.sh` | Search the palace with optional wing/room filters |
| `007_mp_compress.bat` | `007_mp_compress.sh` | Compress drawers using AAAK Dialect (~30× token reduction) |
| `007_mp_diary.bat` | `007_mp_diary.sh` | Read or write agent diary entries |
| `007_mp_wakeup.bat` | `007_mp_wakeup.sh` | Output L0 + L1 context (~600-900 tokens) for session start |
| `007_mp_repair.bat` | `007_mp_repair.sh` | Rebuild vector index after corruption or abrupt exit |
Linux / macOS note: all `.sh` scripts must be made executable once with `chmod +x *.sh`. `002_activate.sh` and `008_deactivate.sh` must be sourced (`source <script>`), not executed.
## Project Structure

### Implemented

```text
src/
├── config.py                    # ✅ Pydantic BaseSettings — all env config + LLM factory
├── llm/                         # ✅ LLM provider adapters
│   ├── client.py                # LLMClient ABC
│   ├── ollama_client.py         # Ollama (default, fully offline)
│   ├── openai_client.py         # OpenAI
│   ├── gemini_client.py         # Google Gemini
│   └── anthropic_client.py      # Anthropic Claude
├── base/                        # ✅ Method of Loci — Base Layer
│   ├── loci_store.py            # ChromaDB Wings/Rooms abstraction
│   └── embedder.py              # sentence-transformers wrapper
└── layers/
    ├── major_system/            # ✅ Layer 1 — Numerical Precision
    │   └── phonetic_encoder.py  # Numbers → phonetic consonant tags
    └── songlines/               # ✅ Layer 2 — Contextual Continuity
        └── knowledge_graph.py   # NetworkX directed graph; Context Trails; GraphML persistence
```
### Planned (upcoming phases)

```text
src/
├── server.py            # 🔲 MCP server entry point (FastMCP) — Phase 9
└── layers/
    ├── pao/             # 🔲 Layer 3 — Episodic Compression (SAO triplets) — Phase 6
    │   └── snapshot.py
    ├── pipeline/        # 🔲 Unified retrieval pipeline — Phase 7
    │   └── retrieval.py
    └── tools/           # 🔲 MCP tool definitions — Phase 8
        └── mcp_tools.py
```
## Success Metrics & Observability

MemShān measures intelligence density, not just retrieval correctness. The scorecard below bridges the technical performance of the mnemonic layers with enterprise engineering goals.

### 1. Key Performance Indicators (KPIs)

How the server runs — quantified technical efficiency per retrieval layer.
| Category | KPI | Target | Rationale |
|---|---|---|---|
| Retrieval Quality | Faithfulness / Groundedness | > 95% | Prevents hallucinated context when traversing a Songline trail |
| Numerical Precision | Numerical Recall Accuracy | 100% | Validates the Major System phonetic-tag pipeline for stats and dates |
| Compression | Context Compression Ratio | ≥ 5 : 1 | Measures how efficiently the PAO System converts raw session logs to actionable SAO triplets |
| Latency | P99 Retrieval Latency | < 200 ms | Loci lookups must not throttle the agent's reasoning loop |
| Observability | PulseGuard Hit Rate | 100% of queries | Ensures every retrieval event is logged and validated for semantic drift |
### 2. Key Result Areas (KRAs)

What MemShān achieves — enterprise-level value domains.

#### KRA 1 — Amnesia-Proof Persistence
Metric: Context Retention Span
Measure how many consecutive sessions an agent maintains perfect continuity on a complex, multi-phase project without requiring a re-prompt or context refresh.
- Baseline: standard zero-shot / short-context agent — continuity typically breaks after 1–2 sessions.
- MemShān target: ≥ 10 sessions with no loss of project state.
#### KRA 2 — Quality Engineering Modernisation
Metric: Defect Detection Velocity
Measure how much faster AI-assisted QE reviews identify architectural flaws when MemShān provides accumulated project memory versus cold zero-shot prompts.
- Baseline: cold-prompt review time per module (measured in minutes).
- MemShān target: ≥ 40% reduction in time-to-defect-detection.
#### KRA 3 — Resource Optimisation
Metric: Token-to-Knowledge Density (TKD)
$$\text{TKD} = \frac{\text{Relevant facts delivered to model}}{\text{Tokens consumed from context window}}$$
High TKD means MemShān surfaces more signal within the model's 128k / 200k token budget.
- Target: TKD ≥ 3× vs. naive full-log injection.
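The TKD formula above is simple arithmetic; a short sketch makes the ≥ 3× comparison concrete. The fact counts and token budgets below are invented illustrative numbers, not measured results.

```python
def token_to_knowledge_density(relevant_facts: int, tokens_consumed: int) -> float:
    """TKD = relevant facts delivered / tokens consumed from the context window."""
    if tokens_consumed <= 0:
        raise ValueError("tokens_consumed must be positive")
    return relevant_facts / tokens_consumed

# Hypothetical scenario: same 30 relevant facts, very different token cost.
naive = token_to_knowledge_density(relevant_facts=30, tokens_consumed=12000)
memshan = token_to_knowledge_density(relevant_facts=30, tokens_consumed=3000)
print(round(memshan / naive, 1))  # -> 4.0  (meets the >= 3x target)
```

The ratio is what matters: delivering the same facts in a quarter of the tokens yields a 4× TKD improvement over naive full-log injection.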
### 3. Layer-Specific Experimental Metrics

Proving that each mnemonic addition (beyond standard vector RAG) earns its place.

#### Songlines — Narrative Coherence
Test: Ask the AI to reconstruct a project's event history from Songline graph traversal.
| Metric | Description | Pass Threshold |
|---|---|---|
| Temporal Accuracy | Events retrieved in correct causal / chronological order | ≥ 95% |
| Sequence Drift | Compared against standard vector-only RAG (which frequently reorders events) | Songlines must outperform by ≥ 20 pp |
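One simple way to score temporal accuracy is the fraction of adjacent ground-truth event pairs that keep their relative order in the retrieved sequence. This scoring rule is an assumption for illustration; the document does not specify the exact formula.

```python
def temporal_accuracy(retrieved: list[str], ground_truth: list[str]) -> float:
    """Fraction of adjacent ground-truth event pairs that appear in the same
    relative order in the retrieved sequence."""
    position = {event: i for i, event in enumerate(retrieved)}
    pairs = list(zip(ground_truth, ground_truth[1:]))
    correct = sum(
        1 for a, b in pairs
        if a in position and b in position and position[a] < position[b]
    )
    return correct / len(pairs)

truth = ["design", "implement", "test", "deploy"]
print(temporal_accuracy(["design", "implement", "test", "deploy"], truth))  # -> 1.0
print(temporal_accuracy(["design", "test", "implement", "deploy"], truth))
```

A vector-only RAG baseline that reorders two events would immediately lose points under this metric, which is exactly the drift the Songlines comparison targets.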
#### PAO System — Reconstruction Fidelity
Test: Compress a session log into SAO triplets, then ask the AI to reconstruct the full system state from triplets alone.
| Metric | Description | Pass Threshold |
|---|---|---|
| Reconstruction Fidelity | Facts present in original log that survive compression → decompression | ≥ 98% |
| Data Leakage Rate | Critical facts lost during PAO snapshot | 0% for facts tagged critical |
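Treating the facts in a log as a set makes both PAO metrics a one-line computation. The helper names and example facts below are illustrative, not MemShān's actual snapshot API:

```python
def reconstruction_fidelity(original: set[str], reconstructed: set[str]) -> float:
    """Share of original facts that survive compression -> decompression."""
    return len(original & reconstructed) / len(original)

def critical_leakage(original: set[str], reconstructed: set[str],
                     critical: set[str]) -> set[str]:
    """Critical facts lost during the PAO snapshot (policy: must be empty)."""
    return (original - reconstructed) & critical

log_facts = {"db migrated", "auth bug fixed", "deploy at 14:00", "owner: alice"}
restored = {"db migrated", "auth bug fixed", "deploy at 14:00"}
print(reconstruction_fidelity(log_facts, restored))  # -> 0.75
print(critical_leakage(log_facts, restored, critical={"db migrated"}))  # -> set()
```

Note how the two thresholds differ: overall fidelity may tolerate up to 2% loss, but the leakage check is a hard gate that must return an empty set for facts tagged critical.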
#### Major System — Numeric Round-Trip
Test: Store text containing numerical data, query using rephrased numeric context.
| Metric | Description | Pass Threshold |
|---|---|---|
| Phonetic Tag Recall | Correct chunk retrieved via phonetic tag alone | 100% |
| False Positive Rate | Unrelated numeric chunks surfaced in results | < 2% |
### 4. Observability Dashboard Checklist
Recommended metrics to surface in Grafana / a custom VeredianAI UI.
| Signal | What to Track | Alert Condition |
|---|---|---|
| Memory Growth Rate | Wing/Room document count over time | Sudden spike > 3× daily average |
| Room Utilisation | Query hit-count per Room (hot vs. cold context) | Room unutilised for > 30 days → archive candidate |
| Semantic Drift Alarm | Distance between query embedding and retrieved chunk | Cosine distance > 0.4 → PulseGuard flag |
| Hallucination Rate | % of responses where retrieved chunk was not used by model | Target < 5% |
| P99 / P95 Latency | Per-layer breakdown: embed → query → graph expand → merge | P99 > 200 ms → alert |
| Provider Error Rate | Per LLM provider: 4xx / 5xx / timeout rate | Any provider > 1% error rate → alert |
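The semantic-drift alarm in the table reduces to a cosine-distance check between the query embedding and the retrieved chunk's embedding. A self-contained sketch (the 3-dimensional vectors are toy data; real embeddings would be 384-dimensional with `all-MiniLM-L6-v2`):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm

DRIFT_THRESHOLD = 0.4  # PulseGuard flag level from the table above

query_vec = [0.9, 0.1, 0.0]   # toy query embedding
chunk_vec = [0.1, 0.9, 0.0]   # toy retrieved-chunk embedding
distance = cosine_distance(query_vec, chunk_vec)
print(distance > DRIFT_THRESHOLD)  # -> True: flag this retrieval for drift
```

Logging this distance on every retrieval is what makes the 100% PulseGuard hit-rate KPI checkable rather than aspirational.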
## Engineering Standards
| Standard | Requirement |
|---|---|
| SOLID | Every production class in src/ must demonstrably satisfy all five SOLID principles |
| OOP | Encapsulation, composition over inheritance, constructor injection throughout |
| Test Coverage | 100% line AND branch coverage per module — no exceptions |
| Test Scenarios | Every test file: @pytest.mark.positive + @pytest.mark.negative + @pytest.mark.edge |
| Database Changes | All schema changes via Alembic migration — never mutate schema directly |
| Security Scanning | pip-audit -r requirements.txt before every commit with dependency changes |
| Session Close | Write MemPalace diary entry + update Session_N.md + commit & push every session |
| Drift Review | Compare implementation vs architecture docs every 3 sessions |
See .github/copilot-instructions.md for full details.
## Contributing

Contributions are welcome. Please read ARCHITECTURE.md and TASK_EXECUTION_PLAN.md before submitting a PR.

- Fork the repository
- Create a feature branch: `git checkout -b feature/my-feature`
- Commit your changes: `git commit -m "feat: add my feature"`
- Push to the branch: `git push origin feature/my-feature`
- Open a Pull Request
## License
MIT License — see LICENSE for details.