
Give your AI a memory — mine projects and conversations into a searchable palace. No API key required.


MemPal

Every conversation you've ever had with AI — one searchable palace.


Every time you start a new conversation with an AI — Claude, ChatGPT, Copilot, any of them — it has no idea who you are, what you're working on, or what you decided last week. You explain the same things over and over. Important decisions vanish when the chat window closes.

MemPal gives your AI a permanent memory. It reads your past conversations and project files, organizes them into a searchable archive on your computer, and lets your AI recall any of it instantly. Nothing is uploaded. Nothing leaves your machine. Your AI just... remembers.



Getting Started · Why It Matters · How It Works · Examples


graph LR
    A["🗂 Mine\nYour conversations & code"] -->|organize & index| B["🏛 Store\nSearchable archive\non your machine"]
    B -->|instant recall| C["🔍 Search\nFind anything you\never discussed"]
    C -->|plug into any AI| D["🤖 Remember\nYour AI picks up\nwhere you left off"]

    style A fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style B fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:2px
    style C fill:#1a1920,stroke:#8ec96c,color:#e8e4df,stroke-width:2px
    style D fill:#1a1920,stroke:#ff80eb,color:#e8e4df,stroke-width:2px


What Is MemPal?

MemPal is a tool that turns your AI conversations and project files into a searchable, permanent memory that any AI can access.

In plain terms: You've been talking to AI tools for months or years. You've made hundreds of decisions, solved countless bugs, debated architectures, chosen frameworks. All of that knowledge is trapped in old chat logs — or already gone. MemPal extracts it, organizes it, and makes it searchable so your AI can reference it in future conversations.

How it's different from ChatGPT's memory or Claude's projects: Those features store short summaries that the AI writes about you. MemPal stores your actual conversations — the exact words, the reasoning, the context — compressed into a format that's 30x smaller but doesn't lose any information. The difference is like keeping a diary vs. having someone write a one-sentence note about your day.

Who it's for: Developers who use AI tools daily and are tired of re-explaining their projects every session. If you use Claude Code, Cursor, ChatGPT, or any AI coding tool — and you've ever thought "I already told you this last week" — MemPal is for you.

What it requires: Python and one terminal command. No accounts, no API keys, no cloud services. Everything stays on your computer.

For the technical details: MemPal uses ChromaDB for vector search, classifies content with keyword heuristics (no LLM calls), compresses memories 30x with a custom lossless dialect called AAAK, and integrates with AI tools via MCP (Model Context Protocol).



The Problem Nobody Is Solving

Your AI conversations are your new institutional memory — and you're throwing them away every session.

Think about where decisions actually happen now. Not in documentation. Not in project management tools. Not in architecture docs that nobody updates. They happen in conversations with AI. You debug with Claude, you design with ChatGPT, you review with Copilot. The reasoning, the trade-offs, the "we tried X and it failed because Y" — all of it lives in chat sessions that evaporate when the window closes.

Tools that give AI access to documents (called "RAG") index files — PDFs, codebases, wikis. But those documents are the output of decisions, not the decisions themselves. The actual moment where you chose Postgres over Redis, where you figured out why authentication was failing, where you decided to abandon a migration — that happened in a conversation. And nobody is treating conversations as a first-class data source.

MemPal does.

"But I already use CLAUDE.md / cursor rules"

Those are great — and MemPal complements them. Here's the difference:

| | CLAUDE.md / cursor rules | MemPal |
|---|---|---|
| Who writes it | You, manually | Mined automatically from your conversations and code |
| What it contains | What you remember to write down | Everything — including things you forgot were important |
| How it grows | You update it when you remember to | Auto-save hooks capture decisions as they happen |
| Searchable | Ctrl+F on one file | Semantic search across months of conversations |
| Cross-project | One file per project | Search across all projects at once |
| Size | You keep it short so the AI reads it | 4-layer stack loads only what's needed (~600 tokens to wake up) |

CLAUDE.md is your conscious notes. MemPal is your searchable, compressed, persistent memory of everything you've ever discussed with an AI.

"Context windows are huge now — just paste it in"

Let's do the math.

A developer using Claude Code daily generates roughly:

  • ~50,000 tokens per session (your messages + AI responses)
  • ~3 sessions per day = 150,000 tokens/day
  • ~130 working days over 6 months = 19.5 million tokens

That's 6 months of decisions, debugging sessions, architecture discussions, and context. Here's what it costs to load that into a new session:

| Method | Tokens loaded | Fits in context? | Cost per session (Sonnet) | Cost per year (daily use) |
|---|---|---|---|---|
| Raw conversations | 19.5M | No (exceeds every context window) | Impossible | |
| LLM summary | ~650K | Barely (uses 50%+ of window) | ~$1.95 | ~$507/yr |
| AAAK compressed | ~650K | Yes (fits in ~5% of window) | ~$1.95 | ~$507/yr |
| MemPal L0+L1 wake-up | ~900 | Yes (0.5% of window) | ~$0.003 | ~$0.70/yr |
| MemPal + 5 targeted searches | ~13,500 | Yes (1% of window) | ~$0.04 | ~$10/yr |

$507/year vs $10/year to remember the same things. And the $10 version is more accurate because it loads verbatim content, not summaries.
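
A rough back-of-the-envelope for the table above, in Python. The per-token price is an assumption (roughly $3 per million input tokens for a Sonnet-class model); adjust for whatever model you actually use:

# Back-of-the-envelope for the cost table above.
# Assumption: ~$3 per 1M input tokens (Sonnet-class pricing).
TOKENS_PER_SESSION = 50_000
SESSIONS_PER_DAY = 3
WORKING_DAYS = 130                       # ~6 months

total = TOKENS_PER_SESSION * SESSIONS_PER_DAY * WORKING_DAYS
print(f"{total:,} tokens")               # 19,500,000

price_per_token = 3 / 1_000_000
for label, tokens in [("LLM summary / AAAK", 650_000),
                      ("MemPal wake-up", 900),
                      ("MemPal + 5 searches", 13_500)]:
    per_session = tokens * price_per_token
    per_year = per_session * 260         # one fresh session per working day
    print(f"{label}: ${per_session:.3f}/session, ${per_year:.2f}/yr")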

The real win isn't just cost — it's that summaries are lossy. An LLM summarizing 6 months of conversations gives you "the team discussed database options." AAAK compression gives you the exact quote, the people involved, the emotional weight, the connection to what came before, and the reasoning — in 30x less space.

"This is just a ChromaDB wrapper"

The chunking + embedding + search pipeline is table stakes. What's not table stakes:

  1. Conversation-native ingest — MemPal understands that a conversation is Q+A pairs, not paragraphs. It chunks by exchange, not by character count. It normalizes 5 different chat export formats into one structure.

  2. AAAK lossless compression — Not summaries. Not embeddings. A structured dialect that preserves every fact, person, emotion, and connection in 30x less space, and that any LLM reads natively. You can diff it, grep it, version-control it.

  3. The 4-layer memory stack — An AI doesn't need to load everything to "wake up." 600 tokens for identity + essential story. Deep search only when needed. This is an architecture decision, not just "put stuff in ChromaDB."

  4. Auto-save hooks — The hooks don't save data. They tell the AI when to save and let the AI decide what to save. The AI has context about what matters. The hook just creates the checkpoint.

Without MemPal

  • Every session starts blank
  • Context dies when the window fills
  • Decisions get forgotten and re-debated
  • No memory across projects
  • You are the memory

With MemPal

  • AI wakes up knowing your project in ~600 tokens
  • Memories persist across sessions forever
  • Decisions, preferences, milestones are searchable
  • Cross-project semantic search
  • The palace is the memory

vs. Enterprise solutions (OpenViking, etc.)

  • Require Python + Go + C++ + Rust + CMake
  • 14 CI/CD pipelines, Kubernetes, Docker
  • API keys and cloud accounts required
  • Backed by ByteDance engineering teams

MemPal

  • Python 3.9+ and nothing else
  • One dependency: ChromaDB
  • No API key, no cloud, no account
  • Built by one person who needed it

Quick Start

git clone https://github.com/moonmadness1217/mempal-local.git
cd mempal-local
pip install -r requirements.txt

# Mine a project
python mempal.py init ~/projects/my_app
python mempal.py mine ~/projects/my_app
python mempal.py search "why did we switch to GraphQL"

# Mine conversations
python mempal.py mine ~/chats/ --mode convos
python mempal.py search "what did we decide about auth"

# Check what's stored
python mempal.py status

No API key. No account. No data leaves your machine.


Real-World Examples

A dev team with 2 years of AI conversations

Your team has been using Claude Code, ChatGPT, and Slack daily across multiple projects. Two years of decisions, debugging sessions, architecture debates — scattered across exports and chat logs.

# Mine all your Claude Code sessions
python mempal.py mine ~/.claude/projects/ --mode convos --wing work

# Mine ChatGPT exports
python mempal.py mine ~/Downloads/chatgpt-export/ --mode convos --wing work

# Mine Slack exports from your engineering channel
python mempal.py mine ~/Downloads/slack-export/eng-team/ --mode convos --wing work

# Now search across all of it
python mempal.py search "why did we abandon the microservices migration"
python mempal.py search "what was the Redis caching decision"
python mempal.py search "who figured out the auth bug in March"

Six months from now, a new engineer asks "why do we do X this way?" Instead of digging through Slack, you search the palace.

A solo developer across multiple projects

You freelance across 4-5 projects. Each one has its own codebase, its own decisions, its own history of "we tried X and it didn't work."

# Mine each project
python mempal.py init ~/projects/client-a && python mempal.py mine ~/projects/client-a
python mempal.py init ~/projects/client-b && python mempal.py mine ~/projects/client-b
python mempal.py init ~/projects/side-project && python mempal.py mine ~/projects/side-project

# Mine your AI conversations about each
python mempal.py mine ~/chats/client-a/ --mode convos --wing client-a
python mempal.py mine ~/chats/client-b/ --mode convos --wing client-b

# Search across everything or within one project
python mempal.py search "rate limiting approach"                    # all projects
python mempal.py search "rate limiting approach" --wing client-a   # just client-a

# Wake up Claude Code with full context for today's project
python mempal.py wake-up --wing client-b

Each project gets its own wing. Your AI assistant can recall what you decided on Client A without polluting Client B's context.

An open-source maintainer

You maintain a popular library. Contributors ask the same questions. You've explained the same architectural decisions dozens of times in issues and PRs.

# Mine the codebase
python mempal.py init ~/projects/my-library
python mempal.py mine ~/projects/my-library

# Mine your conversation history about the project
python mempal.py mine ~/chats/library-discussions/ --mode convos --wing my-library --extract general

# Now when someone asks "why doesn't this library support X?"
python mempal.py search "why we don't support" --wing my-library --room decisions

The --extract general flag classifies your conversations into decisions, problems, milestones — so you can search specifically for past decisions and their reasoning.

A researcher with text-based notes

You have years of markdown notes, plain text research logs, and CSV data from experiments.

# Mine your research notes
python mempal.py init ~/research/protein-folding
python mempal.py mine ~/research/protein-folding

# Mine your lab notebook (markdown files)
python mempal.py mine ~/research/lab-notes/ --mode convos --wing protein-folding

# Search for methodology decisions
python mempal.py search "why we switched from method A to method B"
python mempal.py search "failed experiments with temperature above 40C"

Note: MemPal reads plain text formats. If your notes are in .md, .txt, .csv, or .json, they work today. See the next section for what doesn't work.


What MemPal Is Not

MemPal does one thing well: it turns text files and AI conversations into searchable, persistent memory. Here's what it's not designed for:

Not a document converter

MemPal reads text-based files. It does not parse:

  • .xlsx / .xls (Excel spreadsheets)
  • .docx / .doc (Word documents)
  • .pdf (PDF files)
  • .pptx (PowerPoint)
  • Images, audio, or video

If you have a folder of Word docs and Excel files from years of company work, you need to convert them to text first (using tools like pandoc, pdfplumber, or python-docx). Once they're text, MemPal can mine them.
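
A minimal pre-processing sketch, assuming python-docx and pdfplumber are installed; the folder paths are placeholders:

# Convert Word and PDF files to plain text so MemPal can mine them.
# Assumes: pip install python-docx pdfplumber. Paths are examples.
from pathlib import Path
import pdfplumber
from docx import Document

src = Path("~/company-docs").expanduser()
out = Path("~/company-docs-text").expanduser()
out.mkdir(exist_ok=True)

for path in src.rglob("*"):
    if path.suffix.lower() == ".docx":
        text = "\n".join(p.text for p in Document(path).paragraphs)
    elif path.suffix.lower() == ".pdf":
        with pdfplumber.open(path) as pdf:
            text = "\n".join(page.extract_text() or "" for page in pdf.pages)
    else:
        continue
    (out / f"{path.stem}.txt").write_text(text, encoding="utf-8")

# Then: python mempal.py init ~/company-docs-text && python mempal.py mine ~/company-docs-text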

Not a knowledge graph

MemPal uses vector search (ChromaDB) — it finds content that's semantically similar to your query. It does not build:

  • DAGs (directed acyclic graphs)
  • Entity-relationship graphs
  • Semantic networks with traversable edges
  • Knowledge graphs (Neo4j, etc.)

If you need "walk from concept A to concept B through their relationships," you need a graph database. MemPal answers "find me everything related to concept A" — which is a different (and often more practical) question.

Not a RAG framework

MemPal is not trying to be LangChain, LlamaIndex, or a retrieval-augmented generation pipeline. It doesn't orchestrate LLM calls, manage prompts, or chain retrievals. It's a storage and retrieval layer that any RAG system or AI tool can query.

Where MemPal is strongest

| Use case | Fit |
|---|---|
| Dev team with years of AI chat history | Perfect — this is exactly what it's built for |
| Solo dev juggling multiple projects | Perfect — per-project wings, cross-project search |
| Codebase + conversation memory for Claude Code | Perfect — MCP integration, auto-save hooks |
| Markdown/text research notes | Great — works if your notes are already text |
| Slack/ChatGPT/Claude exports | Great — built-in format normalization |
| Company docs in Word/Excel/PDF | Not yet — needs text extraction first |
| Building a knowledge graph | Wrong tool — use Neo4j or similar |
| Real-time streaming data | Wrong tool — MemPal is batch-oriented |

Key Features

Two Ingest Modes

Mine codebases (folder structure → rooms) or conversation exports (Claude, ChatGPT, Slack, plain text). Same palace, same search.

4-Layer Memory Stack

L0 identity + L1 essential story = ~600 token wake-up. L2 on-demand recall. L3 deep semantic search. Under 5% of any context window.

30x Compression

AAAK dialect: entity codes, emotion markers, importance flags, tunnels between memories. Plain text any LLM reads natively.

MCP Integration

One command to add MemPal to Claude Code. Three tools: search, status, list wings. AI queries the palace mid-conversation.

5-Type Extractor

Classifies memories into decisions, preferences, milestones, problems, and emotional moments. Keyword heuristics — no LLM needed.

Auto-Save Hooks

Bash hooks trigger saves every N turns and before context compaction. The AI decides what to save — the hook decides when.

How It Works

MemPal has a simple pipeline: normalize → chunk → detect room → store → search.

graph TD
    A["📁 Your Files"] --> B{"What kind?"}

    B -->|"Code, docs, configs"| C["⛏ Project Miner\nminer.py"]
    B -->|"Claude, ChatGPT, Slack"| D["💬 Convo Miner\nconvo_miner.py"]

    D --> E["🔄 Normalize\nnormalize.py\nAny format → standard transcript"]
    E --> F["✂️ Chunk by exchange\n1 user msg + 1 AI response = 1 chunk"]

    C --> G["✂️ Chunk by paragraph\n800 chars, 100-char overlap"]

    F --> H{"🏷 Detect Room"}
    G --> H

    H -->|"Projects: folder name"| I["backend/ → backend\ndocs/ → documentation"]
    H -->|"Convos: keyword scoring"| J["'bug','error' → technical\n'decided','chose' → decisions"]

    I --> K["🏛 Palace\nChromaDB"]
    J --> K

    K --> L["🔍 Search\nsearcher.py\nSemantic vector similarity"]
    K --> M["📦 Compress\ndialect.py\nAAAK ~30x reduction"]
    K --> N["🧠 Memory Stack\nlayers.py\nL0 → L1 → L2 → L3"]

    style A fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style K fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:2px
    style L fill:#1a1920,stroke:#8ec96c,color:#e8e4df,stroke-width:2px
    style M fill:#1a1920,stroke:#ff80eb,color:#e8e4df,stroke-width:2px
    style N fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px

Everything runs locally. ChromaDB handles embeddings and vector search with no external API.
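
To make "local, no external API" concrete, here is a minimal ChromaDB sketch. The collection name and wing/room metadata follow the conventions described below, but this snippet is illustrative, not MemPal's internal code:

# Illustrative only — a local persistent ChromaDB store, the same building block MemPal uses.
import os
import chromadb

client = chromadb.PersistentClient(path=os.path.expanduser("~/.mempal/palace"))   # stays on disk
drawers = client.get_or_create_collection("mempal_drawers")

drawers.add(
    ids=["my_app-decisions-0"],
    documents=["We chose Postgres because Redis sessions weren't ACID compliant."],
    metadatas=[{"wing": "my_app", "room": "decisions"}],
)

hits = drawers.query(query_texts=["why did we switch databases"], n_results=3)
print(hits["documents"][0])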


Architecture

mempal-local/
├── mempal.py                  # CLI entry point — all commands route through here
├── config.py                  # Configuration management (~/.mempal/config.json)
├── normalize.py               # Converts chat exports to standard transcript format
├── room_detector_local.py     # Maps files/content to rooms (70+ folder patterns)
├── miner.py                   # Mines project files (code, docs, configs)
├── convo_miner.py             # Mines conversation exports (chat sessions)
├── general_extractor.py       # Classifies content into 5 memory types
├── searcher.py                # Semantic search across the palace
├── layers.py                  # 4-layer memory stack (L0-L3)
├── dialect.py                 # AAAK dialect compression (~30x)
├── mcp_server.py              # MCP server for Claude Code integration
├── hooks/
│   ├── mempal_save_hook.sh    # Auto-save every N conversation turns
│   └── mempal_precompact_hook.sh  # Emergency save before context compaction
├── tests/
│   ├── test_config.py
│   ├── test_miner.py
│   ├── test_convo_miner.py
│   └── test_normalize.py
├── examples/
│   ├── basic_mining.py        # 3-step workflow example
│   ├── convo_import.py        # Conversation import examples
│   └── mcp_setup.md           # MCP configuration guide
├── pyproject.toml             # Packaging (pip-installable)
├── requirements.txt           # chromadb>=0.4.0, pyyaml>=6.0
└── CONTRIBUTING.md

The Palace: Wings, Rooms, Drawers

MemPal organizes everything into a spatial metaphor — a memory palace:

graph TD
    P["🏛 PALACE<br><i>~/.mempal/palace</i>"]

    P --> W1["🪟 WING: my_app"]
    P --> W2["🪟 WING: team_chats"]
    P --> W3["🪟 WING: ..."]

    W1 --> R1["🚪 backend"]
    W1 --> R2["🚪 frontend"]
    W1 --> R3["🚪 documentation"]

    R1 --> D1["📄 chunk from app.py"]
    R1 --> D2["📄 chunk from routes.py"]
    R1 --> D3["📄 chunk from models.py"]
    R2 --> D4["📄 chunk from index.html"]
    R3 --> D5["📄 chunk from README.md"]

    W2 --> R4["🚪 decisions"]
    W2 --> R5["🚪 technical"]
    W2 --> R6["🚪 problems"]

    R4 --> D6["📄 'we chose Postgres because...'"]
    R5 --> D7["📄 'the API rate limit fix was...'"]
    R6 --> D8["📄 'auth kept failing because...'"]

    style P fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:3px
    style W1 fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:2px
    style W2 fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:2px
    style W3 fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:1px,stroke-dasharray: 5 5
    style R1 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R2 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R3 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R4 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R5 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R6 fill:#1a1920,stroke:#8ec96c,color:#e8e4df

Palace → Wing (a project) → Room (an aspect) → Drawer (verbatim content, never summarized)

Key principle: Drawers store verbatim content — exact words, never summarized. Your AI can summarize at query time if it wants to. The palace preserves the original.

Every drawer has metadata:

| Field | Example | Purpose |
|---|---|---|
| wing | "my_app" | Which project/source |
| room | "backend" | Which aspect |
| source_file | "/path/to/app.py" | Where it came from |
| chunk_index | 0 | Which chunk of that file |
| filed_at | "2026-03-20T..." | When it was stored |
| ingest_mode | "projects" or "convos" | How it was mined |
| added_by | "mempal" | What agent filed it |
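
Put together, one drawer's metadata record looks roughly like this (values are illustrative):

# One drawer's metadata, stored alongside its verbatim text (illustrative values).
drawer_metadata = {
    "wing": "my_app",                    # which project/source
    "room": "backend",                   # which aspect
    "source_file": "/path/to/app.py",    # where it came from
    "chunk_index": 0,                    # which chunk of that file
    "filed_at": "2026-03-20T...",        # when it was stored (ISO timestamp)
    "ingest_mode": "projects",           # "projects" or "convos"
    "added_by": "mempal",                # what agent filed it
}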

Mining: Getting Data In

Mode 1: Projects

Point MemPal at any codebase. It scans the folder structure, auto-detects rooms from directory names, and chunks every readable file into the palace.

python mempal.py init ~/projects/my_app    # scan folders, create mempal.yaml
python mempal.py mine ~/projects/my_app    # chunk and store everything

How room detection works for projects:

| Your folder | Room created |
|---|---|
| frontend/, ui/, components/ | frontend |
| backend/, api/, server/ | backend |
| docs/, documentation/, wiki/ | documentation |
| tests/, testing/, qa/ | testing |
| planning/, roadmap/, strategy/ | planning |
| config/, settings/, infra/ | configuration |
| scripts/, tools/, utils/ | scripts |
| design/, mockups/, wireframes/ | design |
| meetings/, standup/ | meetings |
| costs/, budget/, pricing/ | costs |

70+ folder patterns are recognized. Files that don't match any pattern go to general.

What gets chunked: .txt .md .py .js .ts .jsx .tsx .json .yaml .yml .html .css .java .go .rs .rb .sh .csv .sql .toml

What gets skipped: node_modules/, .git/, __pycache__/, .venv/, venv/, build/, dist/, coverage/, .next/, .mempal/, env/.

Chunking rules: 800 characters per chunk, breaking on paragraph or line boundaries, with 100-character overlap between chunks for context continuity. Chunks under 50 characters are discarded.
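
A minimal sketch of those chunking rules — not the actual miner.py, just the behavior described above:

# Sketch of the described rules: ~800-char chunks on paragraph/line boundaries,
# 100-char overlap, chunks under 50 chars dropped.
def chunk_text(text, size=800, overlap=100, min_len=50):
    chunks, start = [], 0
    while start < len(text):
        end = min(start + size, len(text))
        if end < len(text):
            # prefer to break on a paragraph boundary, then a line boundary
            cut = text.rfind("\n\n", start, end)
            if cut <= start:
                cut = text.rfind("\n", start, end)
            if cut > start:
                end = cut
        chunk = text[start:end].strip()
        if len(chunk) >= min_len:
            chunks.append(chunk)
        if end >= len(text):
            break
        start = max(end - overlap, start + 1)   # overlap keeps context continuity
    return chunks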

Mode 2: Conversations

Point MemPal at your exported chat files. It normalizes any format into a standard transcript, then chunks by exchange pair (one user message + one AI response = one drawer).

python mempal.py mine ~/chats/ --mode convos
python mempal.py mine ~/chats/ --mode convos --wing my_project

Supported formats:

| Format | File type | What MemPal looks for |
|---|---|---|
| Claude Code sessions | .jsonl | {"type": "human"} / {"type": "assistant"} objects |
| ChatGPT exports | .json | conversations.json with tree-structured mapping |
| Claude.ai exports | .json | [{"role": "user", "content": "..."}] arrays |
| Slack exports | .json | [{"type": "message", "user": "...", "text": "..."}] |
| Plain text / Markdown | .txt .md | Lines starting with > are user turns |

All formats get normalized to the same internal structure before chunking:

graph LR
    A["Claude Code\n.jsonl"] --> N["🔄 normalize.py"]
    B["ChatGPT\nconversations.json"] --> N
    C["Claude.ai\n.json"] --> N
    D["Slack\nexport .json"] --> N
    E["Plain text\n.txt / .md"] --> N

    N --> O["> user message\nAI response\n> user message\nAI response"]

    style N fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style O fill:#1a1920,stroke:#8ec96c,color:#e8e4df,stroke-width:2px

The palace doesn't care where the data came from.
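
A sketch of that flow for one format, a Claude.ai-style role/content array: normalize to the "> user / assistant" transcript, then chunk by exchange pair. The file name and structure here are examples, not normalize.py's internals:

# Illustration: role/content array -> transcript -> one drawer per user+AI exchange.
import json

def to_transcript(messages):
    lines = []
    for m in messages:
        prefix = "> " if m["role"] == "user" else ""
        lines.append(prefix + m["content"])
    return "\n".join(lines)

def chunk_by_exchange(transcript):
    chunks, current = [], []
    for line in transcript.split("\n"):
        if line.startswith("> ") and current:
            chunks.append("\n".join(current))   # close the previous user+AI pair
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks

messages = json.load(open("claude_export.json"))    # [{"role": "user", "content": "..."}, ...]
for drawer in chunk_by_exchange(to_transcript(messages)):
    print("--- drawer ---")
    print(drawer)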

How room detection works for conversations:

| Keywords in content | Room created |
|---|---|
| code, python, bug, error, api, deploy | technical |
| architecture, design, pattern, schema | architecture |
| plan, roadmap, milestone, sprint | planning |
| decided, chose, switched, trade-off | decisions |
| problem, issue, broken, fix, solved | problems |
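
A sketch of keyword scoring along the lines of the table above (keyword lists abbreviated; the real detector uses many more):

# Sketch: score a chunk against room keyword lists, pick the best match, else "general".
ROOM_KEYWORDS = {
    "technical":    ["code", "python", "bug", "error", "api", "deploy"],
    "architecture": ["architecture", "design", "pattern", "schema"],
    "planning":     ["plan", "roadmap", "milestone", "sprint"],
    "decisions":    ["decided", "chose", "switched", "trade-off"],
    "problems":     ["problem", "issue", "broken", "fix", "solved"],
}

def detect_room(text):
    lowered = text.lower()
    scores = {room: sum(lowered.count(kw) for kw in kws)
              for room, kws in ROOM_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"

print(detect_room("We decided to use Postgres and chose to drop Redis."))   # decisions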

The General Extractor (5 memory types)

By default, conversation mining chunks by exchange pair. Add --extract general to also classify each chunk into one of five memory types using keyword heuristics (no LLM needed):

python mempal.py mine ~/chats/ --mode convos --extract general

| Type | What it catches | Example signals |
|---|---|---|
| Decision | Choices and their reasoning | "we chose X because", "decided to", "went with" |
| Preference | Rules and style preferences | "always use", "never do", "I prefer" |
| Milestone | Breakthroughs and completions | "finally worked", "shipped", "breakthrough" |
| Problem | What broke and how it was fixed | "bug", "crash", "root cause", "the fix was" |
| Emotional | Feelings and relationship moments | "love", "scared", "proud", "grateful" |

Each type has 16-34 regex markers. Confidence scoring filters out weak matches (threshold: 0.3). Code blocks are stripped before classification, so an error: line inside code doesn't get classified as a "problem."
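
A condensed sketch of that idea — a few markers per type and a simple confidence score. The real extractor has 16-34 markers per type; this is just the shape of the approach:

# Sketch: regex-marker scoring for memory types (markers abbreviated; threshold 0.3).
import re

MARKERS = {
    "decision":   [r"\bwe chose\b", r"\bdecided to\b", r"\bwent with\b"],
    "preference": [r"\balways use\b", r"\bnever do\b", r"\bI prefer\b"],
    "milestone":  [r"\bfinally worked\b", r"\bshipped\b", r"\bbreakthrough\b"],
    "problem":    [r"\bbug\b", r"\bcrash\b", r"\broot cause\b", r"\bthe fix was\b"],
    "emotional":  [r"\blove\b", r"\bscared\b", r"\bproud\b", r"\bgrateful\b"],
}

def classify(text, threshold=0.3):
    text = re.sub(r"```.*?```", "", text, flags=re.S)      # strip code blocks first
    best_type, best_score = None, 0.0
    for mtype, patterns in MARKERS.items():
        hits = sum(bool(re.search(p, text, re.I)) for p in patterns)
        score = hits / len(patterns)
        if score > best_score:
            best_type, best_score = mtype, score
    return best_type if best_score >= threshold else None

print(classify("We decided to use Postgres because Redis kept crashing."))   # decision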


Search: Getting Data Out

python mempal.py search "why did we switch databases"
python mempal.py search "auth bug" --wing my_app
python mempal.py search "deploy process" --room technical --results 10

Search uses ChromaDB's vector similarity. Your query gets embedded and compared against all stored drawers. Results return:

  • The verbatim text
  • Which wing and room it came from
  • The source file path
  • A similarity score (0-1, higher = closer match)

You can filter by --wing (project) and --room (aspect) to narrow results.


4-Layer Memory Stack

The memory stack controls how much context gets loaded and when. The goal: give an AI everything it needs to "wake up" in under 900 tokens, with deeper recall available on demand.

block-beta
    columns 1

    block:L0:1
        columns 3
        L0a["🪪 L0: IDENTITY"] L0b["~100 tokens"] L0c["ALWAYS LOADED"]
    end

    block:L1:1
        columns 3
        L1a["📖 L1: ESSENTIAL STORY"] L1b["~500-800 tokens"] L1c["ALWAYS LOADED"]
    end

    block:L2:1
        columns 3
        L2a["📂 L2: ON-DEMAND"] L2b["~200-500 each"] L2c["WHEN RELEVANT"]
    end

    block:L3:1
        columns 3
        L3a["🔍 L3: DEEP SEARCH"] L3b["unlimited"] L3c["WHEN ASKED"]
    end

    style L0 fill:#2a4a2a,stroke:#8ec96c,color:#e8e4df
    style L1 fill:#2a3a4a,stroke:#6c8ec9,color:#e8e4df
    style L2 fill:#3a3020,stroke:#c9a86c,color:#e8e4df
    style L3 fill:#3a2030,stroke:#ff80eb,color:#e8e4df

Wake-up cost (L0 + L1)

████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ ~5%

~600-900 tokens out of any context window

Everything else: free for conversation

░░░░████████████████████████████████████ ~95%

L2 and L3 load only when needed

Wake-up cost: L0 + L1 = ~600-900 tokens. That's under 5% of even a small context window.

python mempal.py wake-up                    # show L0 + L1
python mempal.py wake-up --wing my_app      # wake-up scoped to one project

Setting up your identity (L0)

Create ~/.mempal/identity.txt:

I am Atlas, a personal AI assistant for Alice.
Traits: warm, direct, remembers everything.
Current project: A journaling app that helps people process emotions.

This is plain text you write yourself. It becomes the first thing loaded into every session.
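
Programmatically, the same wake-up is one call in the Python API (documented in the Python API section below):

# Wake up programmatically — the same L0 + L1 context the CLI command above prints.
from layers import MemoryStack

stack = MemoryStack()
context = stack.wake_up(wing="my_app")   # identity + essential story, ~600-900 tokens
print(context)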


AAAK Dialect Compression

Why this matters

Every AI model has a context window. Even a million-token window fills up when you're loading memory from hundreds of past conversations. The standard approach is to summarize — but summaries are lossy. An LLM summarizing your conversation drops the exact quote, the specific person who said it, the emotional weight of the moment, the connection to what came before. You get a bland paragraph that says "the team discussed database options" instead of the actual decision, who made it, why, and what they were feeling when they did.

AAAK is a lossless compression dialect — it preserves every fact, every person, every emotion, every connection, just in a denser format. The key insight: LLMs can read structured shorthand just as well as prose. So instead of spending 5,000 tokens on a full conversation transcript, you spend 170 tokens on an AAAK-compressed version that contains the same information.

This is the difference between "the team discussed auth" and knowing that Alice and Bob decided to switch to Postgres because Redis sessions weren't ACID-compliant, that it was a high-stakes pivot, and that the emotional arc went from doubt to determination to relief.

What it looks like

Before (raw conversation, ~5,000 tokens):

So Alice and I had a long discussion about the auth system yesterday. We've been going back and forth for weeks honestly. Bob was really pushing for keeping Redis for session storage but Alice kept pointing out that the sessions weren't ACID compliant and we'd already had two incidents where session data got corrupted during peak load. It was a tough call because the Redis implementation was already done, but ultimately we decided to switch to Postgres. Alice felt really relieved afterward — she'd been worried about this for a while. She stayed up that night and wrote the entire migration test suite, which honestly was impressive...

After (AAAK compressed, ~170 tokens):

FILE_042|ALC|2026-03-15|Auth migration decision
Z1:ALC,BOB|auth,postgres,migration|"switched to Postgres because Redis sessions weren't ACID"|W8|trust,relief|DECISION,PIVOT
Z2:ALC|testing,integration|"wrote the migration test suite overnight"|W6|determ,satis|MILESTONE
T:Z1<->Z2|caused_by
ARC:doubt->determ->relief

~30x smaller. Nothing lost. Any LLM can read this and reconstruct the full meaning: who was involved, what was decided, why, the key quote, the emotional context, and how the two events connect.

What an LLM summary would produce from the same conversation:

"The team discussed the auth system and decided to migrate from Redis to Postgres for session storage. A test suite was written afterward."

That summary dropped: who specifically pushed for what (Bob wanted Redis, Alice pushed for Postgres), why (ACID compliance, two prior incidents), the emotional weight (Alice was worried for weeks, felt relieved), that Alice wrote the tests overnight (dedication signal), and the causal connection between the decision and the test suite. An LLM reading the summary months later has no way to recover any of that. An LLM reading the AAAK version has all of it.

How to use it

python mempal.py compress --wing my_app --dry-run   # preview what compression looks like
python mempal.py compress --wing my_app              # compress and store

Anatomy of a compressed record

FILE_042|ALC|2026-03-15|Auth migration decision
   │      │       │              │
   │      │       │              └─── Title: what this memory is about
   │      │       └──────────────── Date: when it happened
   │      └──────────────────────── Primary entity: who (3-letter code)
   └─────────────────────────────── File number: unique ID
Z1:ALC,BOB|auth,postgres,migration|"switched to Postgres..."|W8|trust,relief|DECISION,PIVOT
│    │            │                        │                   │      │            │
│    │            │                        │                   │      │            └─ Flags: WHY it matters
│    │            │                        │                   │      └────────────── Emotions: HOW it felt
│    │            │                        │                   └───────────────────── Weight: 1-10 importance
│    │            │                        └───────────────────────────────────────── Key quote: verbatim
│    │            └────────────────────────────────────────────────────────────────── Topics: searchable tags
│    └─────────────────────────────────────────────────────────────────────────────── WHO was involved
└──────────────────────────────────────────────────────────────────────────────────── Zettel ID: atomic memory unit
T:Z1<->Z2|caused_by          ← Tunnel: Z1 caused Z2 (the decision led to the test suite)
ARC:doubt->determ->relief    ← Arc: the emotional journey across the whole file

How it all connects

graph LR
    Z1["Z1: Auth Decision\nALC, BOB\nW8 · DECISION, PIVOT\ntrust, relief"] -->|caused_by| Z2["Z2: Test Suite\nALC\nW6 · MILESTONE\ndeterm, satis"]

    ARC["ARC: doubt → determ → relief"]

    style Z1 fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style Z2 fill:#1a1920,stroke:#8ec96c,color:#e8e4df,stroke-width:2px
    style ARC fill:#1a1920,stroke:#ff80eb,color:#e8e4df,stroke-width:1px,stroke-dasharray: 5 5

Components reference

| Component | Format | What it preserves |
|---|---|---|
| Header | FILE_NUM\|ENTITY\|DATE\|TITLE | Who, when, what this is about |
| Zettels | Z1:ENTITIES\|topics\|"quote"\|WEIGHT\|EMOTIONS\|FLAGS | The atomic facts — each one is a distinct memory |
| Entity codes | Alice → ALC | People, preserved as 3-letter codes (configurable via entities.json) |
| Emotion codes | 29 types: vul, joy, fear, trust, grief, wonder, rage, love, hope, determ, etc. | The emotional weight of the moment — not just what happened but how it felt |
| Importance flags | ORIGIN, CORE, PIVOT, DECISION, TECHNICAL | Why this memory matters |
| Weight | W1-W10 | How important (1 = minor, 10 = critical) |
| Tunnels | T:Z1<->Z2\|caused_by | Connections between memories — what led to what |
| Arcs | ARC:doubt->determ->relief | Emotional trajectory — not just one feeling but the journey |
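
Because the format is just delimited plain text, you can pull it apart with a few lines of Python. A reader sketch for the zettel line from the example above — field meanings per the reference table, for illustration only (this is not dialect.py):

# Illustration: parse one AAAK zettel line into its named fields.
def parse_zettel(line):
    zid_entities, topics, quote, weight, emotions, flags = line.split("|")
    zid, entities = zid_entities.split(":")
    return {
        "id": zid,                          # Z1
        "who": entities.split(","),         # ["ALC", "BOB"]
        "topics": topics.split(","),        # searchable tags
        "quote": quote.strip('"'),          # verbatim key quote
        "weight": int(weight.lstrip("W")),  # 1-10 importance
        "emotions": emotions.split(","),    # how it felt
        "flags": flags.split(","),          # why it matters
    }

z = parse_zettel('Z1:ALC,BOB|auth,postgres,migration|"switched to Postgres because Redis sessions weren\'t ACID"|W8|trust,relief|DECISION,PIVOT')
print(z["who"], z["weight"], z["flags"])    # ['ALC', 'BOB'] 8 ['DECISION', 'PIVOT']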

Why not just use embeddings?

Embeddings are great for search (MemPal uses them via ChromaDB). But embeddings are opaque — you can't read a vector and understand what it means. AAAK compression is human-readable and LLM-readable. You can open a compressed file, read it yourself, and understand every memory in it. An LLM can load it into context and reason about it. You can version-control it, diff it, and grep it. Try doing that with a vector database.

The compression spectrum

graph LR
    A["Raw Transcript\n~5,000 tokens\n100% fidelity"] -->|"LLM summary"| B["Summary\n~500 tokens\n❌ lossy — drops quotes,\npeople, emotions, connections"]

    A -->|"AAAK compress"| C["AAAK Dialect\n~170 tokens\n✅ lossless — preserves\neverything in shorthand"]

    A -->|"embedding"| D["Vector\n768 floats\n❌ opaque — can't read,\ncan't diff, can't grep"]

    style A fill:#1a1920,stroke:#e8e4df,color:#e8e4df,stroke-width:1px
    style B fill:#3a2020,stroke:#ff6666,color:#e8e4df,stroke-width:2px
    style C fill:#2a4a2a,stroke:#8ec96c,color:#e8e4df,stroke-width:2px
    style D fill:#3a2020,stroke:#ff6666,color:#e8e4df,stroke-width:2px

MCP Server (Claude Code Integration)

MemPal includes an MCP server so Claude Code can search your palace mid-conversation.

sequenceDiagram
    participant You
    participant Claude as Claude Code
    participant MCP as MemPal MCP Server
    participant Palace as Palace (ChromaDB)

    You->>Claude: "Why did we switch to Postgres?"
    Claude->>MCP: mempal_search("switch to Postgres")
    MCP->>Palace: vector similarity query
    Palace-->>MCP: matching drawers + scores
    MCP-->>Claude: verbatim results
    Claude-->>You: "Based on your past conversations,<br>you switched because Redis<br>sessions weren't ACID compliant..."

Setup

claude mcp add mempal -- python /path/to/mempal-local/mcp_server.py

Or add to your .mcp.json:

{
  "mcpServers": {
    "mempal": {
      "command": "python",
      "args": ["/path/to/mempal-local/mcp_server.py"]
    }
  }
}

Available tools

| Tool | What it does |
|---|---|
| mempal_status | Total drawers, wings, rooms — palace overview |
| mempal_search | Semantic search with optional wing/room filters |
| mempal_list_wings | All wings with drawer counts |

The server speaks JSON-RPC 2.0 over stdio (stdin/stdout). Claude Code handles the protocol automatically.
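
Under the hood the traffic is ordinary JSON-RPC on stdin/stdout. Roughly, a tool call for mempal_search looks like the sketch below; "tools/call" follows the MCP convention and the argument names are assumptions — Claude Code constructs these messages for you:

# Illustrative JSON-RPC request as it would arrive on the server's stdin.
# "tools/call" follows the MCP convention; the argument names are assumptions.
import json

request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {
        "name": "mempal_search",
        "arguments": {"query": "why did we switch to Postgres", "wing": "my_app"},
    },
}
print(json.dumps(request, indent=2))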


Auto-Save Hooks

Two bash hooks that integrate with Claude Code's event system to automatically save memories during conversations. No extra API calls — the hooks run locally.

graph TD
    A["You chat with Claude Code"] --> B{"Every 15th\nhuman message"}
    B -->|"not yet"| A
    B -->|"checkpoint!"| C["🛑 Hook BLOCKS\nthe AI response"]
    C --> D["AI saves key topics,\ndecisions, quotes\nto the palace"]
    D --> A

    E["Context window\nfilling up"] --> F["🚨 PreCompact hook\nALWAYS blocks"]
    F --> G["AI saves EVERYTHING\nbefore compaction"]
    G --> H["Context compacts\nsafely"]

    style C fill:#3a3020,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style F fill:#3a2020,stroke:#ff6666,color:#e8e4df,stroke-width:2px

Save Hook (hooks/mempal_save_hook.sh)

Fires after every assistant response. Counts human messages in the session transcript. Every N messages (default: 15), it blocks the AI response and injects a system message:

"AUTO-SAVE checkpoint. Save key topics, decisions, quotes from the last 15 exchanges."

The AI then saves to the palace using its own judgment about what's important and which wing/room to file it under. After saving, it resumes normally.
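
The counting logic itself is simple. A Python rendering of what the hook checks, assuming a Claude Code-style .jsonl transcript with {"type": "human"} entries (the real hook is the bash script hooks/mempal_save_hook.sh):

# What the save hook checks, rendered in Python for clarity.
import json

SAVE_INTERVAL = 15

def is_checkpoint(transcript_path):
    human_turns = 0
    with open(transcript_path) as f:
        for line in f:
            if line.strip() and json.loads(line).get("type") == "human":
                human_turns += 1
    return human_turns > 0 and human_turns % SAVE_INTERVAL == 0

if is_checkpoint("session.jsonl"):    # path is an example
    print("AUTO-SAVE checkpoint. Save key topics, decisions, quotes from the last 15 exchanges.")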

PreCompact Hook (hooks/mempal_precompact_hook.sh)

Fires when Claude Code is about to compress the context window. Always blocks — compaction means information is about to be lost, so this is an emergency save:

"COMPACTION IMMINENT. Save ALL topics, decisions, and context before the window shrinks."

Installing hooks

# In your Claude Code settings:
claude config set hooks.stop "./hooks/mempal_save_hook.sh"
claude config set hooks.precompact "./hooks/mempal_precompact_hook.sh"

Hook state is tracked in ~/.mempal/hook_state/ with logging to hook.log. The save interval is configurable via SAVE_INTERVAL in the hook script.


All Commands

# Setup
python mempal.py init <dir>                              # detect rooms, create mempal.yaml

# Mining — projects
python mempal.py mine <dir>                              # mine project files into palace
python mempal.py mine <dir> --dry-run                    # preview without storing
python mempal.py mine <dir> --limit 10                   # only process 10 files

# Mining — conversations
python mempal.py mine <dir> --mode convos                # mine chat exports
python mempal.py mine <dir> --mode convos --wing my_app  # tag with a project name
python mempal.py mine <dir> --mode convos --extract general  # classify into 5 memory types

# Search
python mempal.py search "query"                          # search everything
python mempal.py search "query" --wing my_app            # search one project
python mempal.py search "query" --room backend           # search one room
python mempal.py search "query" --results 10             # return more results

# Compression
python mempal.py compress --wing my_app                  # AAAK compress a wing
python mempal.py compress --wing my_app --dry-run        # preview compression
python mempal.py compress --config entities.json         # use custom entity mappings

# Memory stack
python mempal.py wake-up                                 # show L0 + L1 context
python mempal.py wake-up --wing my_app                   # project-specific wake-up

# Status
python mempal.py status                                  # drawer counts by wing and room

All commands accept --palace <path> to override the default palace location (~/.mempal/palace).


Configuration

Global config (~/.mempal/config.json)

{
  "palace_path": "/custom/path/to/palace",
  "collection_name": "mempal_drawers",
  "people_map": {"Alice": "ALC", "Bob": "BOB"},
  "topic_wings": ["emotions", "consciousness"],
  "hall_keywords": {"emotions": ["scared", "happy"]}
}

Priority: environment variables > config.json > defaults.

Project config (mempal.yaml in project root)

Created by mempal init. Defines wing name and rooms:

wing: my_app
rooms:
  - name: backend
    description: Files from backend/
  - name: frontend
    description: Files from frontend/
  - name: general
    description: Files that don't fit other rooms

Edit it to rename or add rooms, then re-run mempal mine. Re-mining skips files already stored (deduplicated by source_file).

Identity (~/.mempal/identity.txt)

Plain text file you write. Becomes Layer 0. See 4-Layer Memory Stack.

People map (~/.mempal/people_map.json)

Maps name variants to canonical entity codes for compression:

{
  "Alice": "ALC",
  "Alice Chen": "ALC",
  "A.": "ALC",
  "Bob": "BOB"
}

File-by-File Reference

| File | What it does |
|---|---|
| mempal.py | CLI entry point. Parses args, routes to the right module. |
| config.py | MempalConfig class. Loads ~/.mempal/config.json, handles defaults, env var overrides. |
| normalize.py | Converts 5 chat formats (Claude Code JSONL, Claude.ai JSON, ChatGPT JSON, Slack JSON, plain text) to a standard > user turn transcript format. |
| room_detector_local.py | Maps folders to room names using 70+ patterns. No API calls. Falls back to keyword frequency in filenames. Writes mempal.yaml. |
| miner.py | Project ingest. Scans directories, chunks files (800 chars, paragraph boundaries), routes to rooms by folder path, stores to ChromaDB. |
| convo_miner.py | Conversation ingest. Normalizes format, chunks by exchange pair (Q+A) or by paragraph, detects room from content keywords. |
| general_extractor.py | Classifies text into 5 memory types (decision, preference, milestone, problem, emotional) using regex marker scoring. No LLM. |
| searcher.py | Semantic search via ChromaDB vectors. Filters by wing/room. Returns verbatim text + similarity scores. |
| layers.py | MemoryStack with 4 layers. L0 (identity file), L1 (auto-generated top-15 moments), L2 (on-demand room recall), L3 (full search). |
| dialect.py | Dialect class. AAAK compression with entity codes, emotion markers (29 types), importance flags, tunnels, and arcs. ~30x compression ratio. |
| mcp_server.py | JSON-RPC 2.0 server over stdio. Exposes mempal_status, mempal_search, mempal_list_wings to Claude Code. |
| hooks/mempal_save_hook.sh | Bash hook for Claude Code stop event. Counts exchanges, triggers save every N turns. |
| hooks/mempal_precompact_hook.sh | Bash hook for Claude Code precompact event. Emergency save before context window compression. |

Python API

Use MemPal as a library in your own scripts:

from miner import mine
from convo_miner import mine_convos
from searcher import search_memories
from layers import MemoryStack
from general_extractor import extract_memories
from dialect import Dialect

# Mine a project
mine("/path/to/project", palace_path="~/.mempal/palace")

# Mine conversations
mine_convos("/path/to/chats", palace_path="~/.mempal/palace", wing="my_project")

# Search
results = search_memories("what was the auth decision", palace_path="~/.mempal/palace")
for r in results["results"]:
    print(f"[{r['wing']}/{r['room']}] {r['text'][:100]}... ({r['similarity']:.2f})")

# Wake up (L0 + L1)
stack = MemoryStack()
context = stack.wake_up(wing="my_app")

# Classify content
memories = extract_memories("We decided to use Postgres because...")
# → [{"content": "...", "memory_type": "decision", "chunk_index": 0}]

# Compress
d = Dialect()
compressed = d.compress(text, metadata={"wing": "my_app", "room": "backend"})
print(d.compression_stats())  # → {"original_tokens": 5000, "compressed_tokens": 170, "ratio": "29.4x"}

Requirements

  • Python 3.9+
  • chromadb>=0.4.0
  • pyyaml>=6.0

No API key. No internet required after install.

pip install -r requirements.txt

Contributing

See CONTRIBUTING.md for dev setup, testing, and PR guidelines.


License

MIT — see LICENSE for details.
