
Give your AI a memory — mine projects and conversations into a searchable palace. No API key required.


MemPal

Every conversation you've ever had with AI — one searchable palace.


Every time you start a new conversation with an AI — Claude, ChatGPT, Copilot, any of them — it has no idea who you are, what you're working on, or what you decided last week. You explain the same things over and over. Important decisions vanish when the chat window closes.

MemPal gives your AI a permanent memory. It reads your past conversations and project files, organizes them into a searchable archive on your computer, and lets your AI recall any of it instantly. Nothing is uploaded. Nothing leaves your machine. Your AI just... remembers.



Getting Started · Why It Matters · How It Works · Examples


graph LR
    A["🗂 Mine\nYour conversations & code"] -->|organize & index| B["🏛 Store\nSearchable archive\non your machine"]
    B -->|instant recall| C["🔍 Search\nFind anything you\never discussed"]
    C -->|plug into any AI| D["🤖 Remember\nYour AI picks up\nwhere you left off"]

    style A fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style B fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:2px
    style C fill:#1a1920,stroke:#8ec96c,color:#e8e4df,stroke-width:2px
    style D fill:#1a1920,stroke:#ff80eb,color:#e8e4df,stroke-width:2px


What Is MemPal?

MemPal is a tool that turns your AI conversations and project files into a searchable, permanent memory that any AI can access.

In plain terms: You've been talking to AI tools for months or years. You've made hundreds of decisions, solved countless bugs, debated architectures, chosen frameworks. All of that knowledge is trapped in old chat logs — or already gone. MemPal extracts it, organizes it, and makes it searchable so your AI can reference it in future conversations.

How it's different from ChatGPT's memory or Claude's projects: Those features store short summaries that the AI writes about you. MemPal stores your actual conversations — the exact words, the reasoning, the context — compressed into a format that's 30x smaller but doesn't lose any information. The difference is like keeping a diary vs. having someone write a one-sentence note about your day.

Who it's for: Developers who use AI tools daily and are tired of re-explaining their projects every session. If you use Claude Code, Cursor, ChatGPT, or any AI coding tool — and you've ever thought "I already told you this last week" — MemPal is for you.

What it requires: Python and one terminal command. No accounts, no API keys, no cloud services. Everything stays on your computer.

For the technical details: MemPal uses ChromaDB for vector search, classifies content with keyword heuristics (no LLM calls), compresses memories 30x with a custom lossless dialect called AAAK, and integrates with AI tools via MCP (Model Context Protocol).



The Problem Nobody Is Solving

Your AI conversations are your new institutional memory — and you're throwing them away every session.

Think about where decisions actually happen now. Not in documentation. Not in project management tools. Not in architecture docs that nobody updates. They happen in conversations with AI. You debug with Claude, you design with ChatGPT, you review with Copilot. The reasoning, the trade-offs, the "we tried X and it failed because Y" — all of it lives in chat sessions that evaporate when the window closes.

Tools that give AI access to documents (called "RAG") index files — PDFs, codebases, wikis. But those documents are the output of decisions, not the decisions themselves. The actual moment where you chose Postgres over Redis, where you figured out why authentication was failing, where you decided to abandon a migration — that happened in a conversation. And nobody is treating conversations as a first-class data source.

MemPal does.

"But I already use CLAUDE.md / cursor rules"

Those are great — and MemPal complements them. Here's the difference:

| | CLAUDE.md / cursor rules | MemPal |
|---|---|---|
| Who writes it | You, manually | Mined automatically from your conversations and code |
| What it contains | What you remember to write down | Everything — including things you forgot were important |
| How it grows | You update it when you remember to | Auto-save hooks capture decisions as they happen |
| Searchable | Ctrl+F on one file | Semantic search across months of conversations |
| Cross-project | One file per project | Search across all projects at once |
| Size | You keep it short so the AI reads it | 4-layer stack loads only what's needed (~600 tokens to wake up) |

CLAUDE.md is your conscious notes. MemPal is your searchable, compressed, persistent memory of everything you've ever discussed with an AI.

"Context windows are huge now — just paste it in"

Let's do the math.

A developer using Claude Code daily generates roughly:

  • ~50,000 tokens per session (your messages + AI responses)
  • ~3 sessions per day = 150,000 tokens/day
  • ~130 working days over 6 months = 19.5 million tokens

That's 6 months of decisions, debugging sessions, architecture discussions, and context. Here's what it costs to load that into a new session:

| Method | Tokens loaded | Fits in context? | Cost per session (Sonnet) | Cost per year (daily use) |
|---|---|---|---|---|
| Raw conversations | 19.5M | No (exceeds every context window) | Impossible | |
| LLM summary | ~650K | Barely (uses 50%+ of window) | ~$1.95 | ~$507/yr |
| AAAK compressed | ~650K | Yes (fits in ~5% of window) | ~$1.95 | ~$507/yr |
| MemPal L0+L1 wake-up | ~900 | Yes (0.5% of window) | ~$0.003 | ~$0.70/yr |
| MemPal + 5 targeted searches | ~13,500 | Yes (1% of window) | ~$0.04 | ~$10/yr |

$507/year vs $10/year to remember the same things. And the $10 version is more accurate because it loads verbatim content, not summaries.
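
A rough back-of-the-envelope for the table above, in Python. The per-token price is an assumption (roughly $3 per million input tokens for a Sonnet-class model); adjust for whatever model you actually use:

# Back-of-the-envelope for the cost table above.
# Assumption: ~$3 per 1M input tokens (Sonnet-class pricing).
TOKENS_PER_SESSION = 50_000
SESSIONS_PER_DAY = 3
WORKING_DAYS = 130                       # ~6 months

total = TOKENS_PER_SESSION * SESSIONS_PER_DAY * WORKING_DAYS
print(f"{total:,} tokens")               # 19,500,000

price_per_token = 3 / 1_000_000
for label, tokens in [("LLM summary / AAAK", 650_000),
                      ("MemPal wake-up", 900),
                      ("MemPal + 5 searches", 13_500)]:
    per_session = tokens * price_per_token
    per_year = per_session * 260         # one fresh session per working day
    print(f"{label}: ${per_session:.3f}/session, ${per_year:.2f}/yr")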

The real win isn't just cost — it's that summaries are lossy. An LLM summarizing 6 months of conversations gives you "the team discussed database options." AAAK compression gives you the exact quote, the people involved, the emotional weight, the connection to what came before, and the reasoning — in 30x less space.

"This is just a ChromaDB wrapper"

The chunking + embedding + search pipeline is table stakes. What's not table stakes:

  1. Conversation-native ingest — MemPal understands that a conversation is Q+A pairs, not paragraphs. It chunks by exchange, not by character count. It normalizes 5 different chat export formats into one structure.

  2. AAAK lossless compression — Not summaries. Not embeddings. A structured dialect that preserves every fact, person, emotion, and connection in 30x less space, and that any LLM reads natively. You can diff it, grep it, version-control it.

  3. The 4-layer memory stack — An AI doesn't need to load everything to "wake up." 600 tokens for identity + essential story. Deep search only when needed. This is an architecture decision, not just "put stuff in ChromaDB."

  4. Auto-save hooks — The hooks don't save data. They tell the AI when to save and let the AI decide what to save. The AI has context about what matters. The hook just creates the checkpoint.

Without MemPal

  • Every session starts blank
  • Context dies when the window fills
  • Decisions get forgotten and re-debated
  • No memory across projects
  • You are the memory

With MemPal

  • AI wakes up knowing your project in ~600 tokens
  • Memories persist across sessions forever
  • Decisions, preferences, milestones are searchable
  • Cross-project semantic search
  • The palace is the memory

vs. Enterprise solutions (OpenViking, etc.)

  • Require Python + Go + C++ + Rust + CMake
  • 14 CI/CD pipelines, Kubernetes, Docker
  • API keys and cloud accounts required
  • Backed by ByteDance engineering teams

MemPal

  • Python 3.9+ and nothing else
  • One dependency: ChromaDB
  • No API key, no cloud, no account
  • Built by one person who needed it

Quick Start

git clone https://github.com/moonmadness1217/mempal-local.git
cd mempal-local
pip install -r requirements.txt

# Mine a project
python mempal.py init ~/projects/my_app
python mempal.py mine ~/projects/my_app
python mempal.py search "why did we switch to GraphQL"

# Mine conversations
python mempal.py mine ~/chats/ --mode convos
python mempal.py search "what did we decide about auth"

# Check what's stored
python mempal.py status

No API key. No account. No data leaves your machine.


Real-World Examples

A dev team with 2 years of AI conversations

Your team has been using Claude Code, ChatGPT, and Slack daily across multiple projects. Two years of decisions, debugging sessions, architecture debates — scattered across exports and chat logs.

# Mine all your Claude Code sessions
python mempal.py mine ~/.claude/projects/ --mode convos --wing work

# Mine ChatGPT exports
python mempal.py mine ~/Downloads/chatgpt-export/ --mode convos --wing work

# Mine Slack exports from your engineering channel
python mempal.py mine ~/Downloads/slack-export/eng-team/ --mode convos --wing work

# Now search across all of it
python mempal.py search "why did we abandon the microservices migration"
python mempal.py search "what was the Redis caching decision"
python mempal.py search "who figured out the auth bug in March"

Six months from now, a new engineer asks "why do we do X this way?" Instead of digging through Slack, you search the palace.

A solo developer across multiple projects

You freelance across 4-5 projects. Each one has its own codebase, its own decisions, its own history of "we tried X and it didn't work."

# Mine each project
python mempal.py init ~/projects/client-a && python mempal.py mine ~/projects/client-a
python mempal.py init ~/projects/client-b && python mempal.py mine ~/projects/client-b
python mempal.py init ~/projects/side-project && python mempal.py mine ~/projects/side-project

# Mine your AI conversations about each
python mempal.py mine ~/chats/client-a/ --mode convos --wing client-a
python mempal.py mine ~/chats/client-b/ --mode convos --wing client-b

# Search across everything or within one project
python mempal.py search "rate limiting approach"                    # all projects
python mempal.py search "rate limiting approach" --wing client-a   # just client-a

# Wake up Claude Code with full context for today's project
python mempal.py wake-up --wing client-b

Each project gets its own wing. Your AI assistant can recall what you decided on Client A without polluting Client B's context.

An open-source maintainer

You maintain a popular library. Contributors ask the same questions. You've explained the same architectural decisions dozens of times in issues and PRs.

# Mine the codebase
python mempal.py init ~/projects/my-library
python mempal.py mine ~/projects/my-library

# Mine your conversation history about the project
python mempal.py mine ~/chats/library-discussions/ --mode convos --wing my-library --extract general

# Now when someone asks "why doesn't this library support X?"
python mempal.py search "why we don't support" --wing my-library --room decisions

The --extract general flag classifies your conversations into decisions, problems, milestones — so you can search specifically for past decisions and their reasoning.

A researcher with text-based notes

You have years of markdown notes, plain text research logs, and CSV data from experiments.

# Mine your research notes
python mempal.py init ~/research/protein-folding
python mempal.py mine ~/research/protein-folding

# Mine your lab notebook (markdown files)
python mempal.py mine ~/research/lab-notes/ --mode convos --wing protein-folding

# Search for methodology decisions
python mempal.py search "why we switched from method A to method B"
python mempal.py search "failed experiments with temperature above 40C"

Note: MemPal reads plain text formats. If your notes are in .md, .txt, .csv, or .json, they work today. See the next section for what doesn't work.


What MemPal Is Not

MemPal does one thing well: it turns text files and AI conversations into searchable, persistent memory. Here's what it's not designed for:

Not a document converter

MemPal reads text-based files. It does not parse:

  • .xlsx / .xls (Excel spreadsheets)
  • .docx / .doc (Word documents)
  • .pdf (PDF files)
  • .pptx (PowerPoint)
  • Images, audio, or video

If you have a folder of Word docs and Excel files from years of company work, you need to convert them to text first (using tools like pandoc, pdfplumber, or python-docx). Once they're text, MemPal can mine them.
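
A minimal pre-processing sketch, assuming python-docx and pdfplumber are installed; the folder paths are placeholders:

# Convert Word and PDF files to plain text so MemPal can mine them.
# Assumes: pip install python-docx pdfplumber. Paths are examples.
from pathlib import Path
import pdfplumber
from docx import Document

src = Path("~/company-docs").expanduser()
out = Path("~/company-docs-text").expanduser()
out.mkdir(exist_ok=True)

for path in src.rglob("*"):
    if path.suffix.lower() == ".docx":
        text = "\n".join(p.text for p in Document(path).paragraphs)
    elif path.suffix.lower() == ".pdf":
        with pdfplumber.open(path) as pdf:
            text = "\n".join(page.extract_text() or "" for page in pdf.pages)
    else:
        continue
    (out / f"{path.stem}.txt").write_text(text, encoding="utf-8")

# Then: python mempal.py init ~/company-docs-text && python mempal.py mine ~/company-docs-text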

Not a knowledge graph

MemPal uses vector search (ChromaDB) — it finds content that's semantically similar to your query. It does not build:

  • DAGs (directed acyclic graphs)
  • Entity-relationship graphs
  • Semantic networks with traversable edges
  • Knowledge graphs (Neo4j, etc.)

If you need "walk from concept A to concept B through their relationships," you need a graph database. MemPal answers "find me everything related to concept A" — which is a different (and often more practical) question.

Not a RAG framework

MemPal is not trying to be LangChain, LlamaIndex, or a retrieval-augmented generation pipeline. It doesn't orchestrate LLM calls, manage prompts, or chain retrievals. It's a storage and retrieval layer that any RAG system or AI tool can query.

Where MemPal is strongest

| Use case | Fit |
|---|---|
| Dev team with years of AI chat history | Perfect — this is exactly what it's built for |
| Solo dev juggling multiple projects | Perfect — per-project wings, cross-project search |
| Codebase + conversation memory for Claude Code | Perfect — MCP integration, auto-save hooks |
| Markdown/text research notes | Great — works if your notes are already text |
| Slack/ChatGPT/Claude exports | Great — built-in format normalization |
| Company docs in Word/Excel/PDF | Not yet — needs text extraction first |
| Building a knowledge graph | Wrong tool — use Neo4j or similar |
| Real-time streaming data | Wrong tool — MemPal is batch-oriented |

Key Features

Two Ingest Modes

Mine codebases (folder structure → rooms) or conversation exports (Claude, ChatGPT, Slack, plain text). Same palace, same search.

4-Layer Memory Stack

L0 identity + L1 essential story = ~600 token wake-up. L2 on-demand recall. L3 deep semantic search. Under 5% of any context window.

30x Compression

AAAK dialect: entity codes, emotion markers, importance flags, tunnels between memories. Plain text any LLM reads natively.

MCP Integration

One command to add MemPal to Claude Code. Three tools: search, status, list wings. AI queries the palace mid-conversation.

5-Type Extractor

Classifies memories into decisions, preferences, milestones, problems, and emotional moments. Keyword heuristics — no LLM needed.

Auto-Save Hooks

Bash hooks trigger saves every N turns and before context compaction. The AI decides what to save — the hook decides when.

How It Works

MemPal has a simple pipeline: normalize → chunk → detect room → store → search.

graph TD
    A["📁 Your Files"] --> B{"What kind?"}

    B -->|"Code, docs, configs"| C["⛏ Project Miner\nminer.py"]
    B -->|"Claude, ChatGPT, Slack"| D["💬 Convo Miner\nconvo_miner.py"]

    D --> E["🔄 Normalize\nnormalize.py\nAny format → standard transcript"]
    E --> F["✂️ Chunk by exchange\n1 user msg + 1 AI response = 1 chunk"]

    C --> G["✂️ Chunk by paragraph\n800 chars, 100-char overlap"]

    F --> H{"🏷 Detect Room"}
    G --> H

    H -->|"Projects: folder name"| I["backend/ → backend\ndocs/ → documentation"]
    H -->|"Convos: keyword scoring"| J["'bug','error' → technical\n'decided','chose' → decisions"]

    I --> K["🏛 Palace\nChromaDB"]
    J --> K

    K --> L["🔍 Search\nsearcher.py\nSemantic vector similarity"]
    K --> M["📦 Compress\ndialect.py\nAAAK ~30x reduction"]
    K --> N["🧠 Memory Stack\nlayers.py\nL0 → L1 → L2 → L3"]

    style A fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style K fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:2px
    style L fill:#1a1920,stroke:#8ec96c,color:#e8e4df,stroke-width:2px
    style M fill:#1a1920,stroke:#ff80eb,color:#e8e4df,stroke-width:2px
    style N fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px

Everything runs locally. ChromaDB handles embeddings and vector search with no external API.
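
To make "local, no external API" concrete, here is a minimal ChromaDB sketch. The collection name and wing/room metadata follow the conventions described below, but this snippet is illustrative, not MemPal's internal code:

# Illustrative only — a local persistent ChromaDB store, the same building block MemPal uses.
import os
import chromadb

client = chromadb.PersistentClient(path=os.path.expanduser("~/.mempal/palace"))   # stays on disk
drawers = client.get_or_create_collection("mempal_drawers")

drawers.add(
    ids=["my_app-decisions-0"],
    documents=["We chose Postgres because Redis sessions weren't ACID compliant."],
    metadatas=[{"wing": "my_app", "room": "decisions"}],
)

hits = drawers.query(query_texts=["why did we switch databases"], n_results=3)
print(hits["documents"][0])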


Architecture

mempal-local/
├── mempal.py                  # CLI entry point — all commands route through here
├── config.py                  # Configuration management (~/.mempal/config.json)
├── normalize.py               # Converts chat exports to standard transcript format
├── room_detector_local.py     # Maps files/content to rooms (70+ folder patterns)
├── miner.py                   # Mines project files (code, docs, configs)
├── convo_miner.py             # Mines conversation exports (chat sessions)
├── general_extractor.py       # Classifies content into 5 memory types
├── searcher.py                # Semantic search across the palace
├── layers.py                  # 4-layer memory stack (L0-L3)
├── dialect.py                 # AAAK dialect compression (~30x)
├── mcp_server.py              # MCP server for Claude Code integration
├── hooks/
│   ├── mempal_save_hook.sh    # Auto-save every N conversation turns
│   └── mempal_precompact_hook.sh  # Emergency save before context compaction
├── tests/
│   ├── test_config.py
│   ├── test_miner.py
│   ├── test_convo_miner.py
│   └── test_normalize.py
├── examples/
│   ├── basic_mining.py        # 3-step workflow example
│   ├── convo_import.py        # Conversation import examples
│   └── mcp_setup.md           # MCP configuration guide
├── pyproject.toml             # Packaging (pip-installable)
├── requirements.txt           # chromadb>=0.4.0, pyyaml>=6.0
└── CONTRIBUTING.md

The Palace: Wings, Rooms, Drawers

MemPal organizes everything into a spatial metaphor — a memory palace:

graph TD
    P["🏛 PALACE<br><i>~/.mempal/palace</i>"]

    P --> W1["🪟 WING: my_app"]
    P --> W2["🪟 WING: team_chats"]
    P --> W3["🪟 WING: ..."]

    W1 --> R1["🚪 backend"]
    W1 --> R2["🚪 frontend"]
    W1 --> R3["🚪 documentation"]

    R1 --> D1["📄 chunk from app.py"]
    R1 --> D2["📄 chunk from routes.py"]
    R1 --> D3["📄 chunk from models.py"]
    R2 --> D4["📄 chunk from index.html"]
    R3 --> D5["📄 chunk from README.md"]

    W2 --> R4["🚪 decisions"]
    W2 --> R5["🚪 technical"]
    W2 --> R6["🚪 problems"]

    R4 --> D6["📄 'we chose Postgres because...'"]
    R5 --> D7["📄 'the API rate limit fix was...'"]
    R6 --> D8["📄 'auth kept failing because...'"]

    style P fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:3px
    style W1 fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:2px
    style W2 fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:2px
    style W3 fill:#1a1920,stroke:#6c8ec9,color:#e8e4df,stroke-width:1px,stroke-dasharray: 5 5
    style R1 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R2 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R3 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R4 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R5 fill:#1a1920,stroke:#8ec96c,color:#e8e4df
    style R6 fill:#1a1920,stroke:#8ec96c,color:#e8e4df

Palace → Wing (a project) → Room (an aspect) → Drawer (verbatim content, never summarized)

Key principle: Drawers store verbatim content — exact words, never summarized. Your AI can summarize at query time if it wants to. The palace preserves the original.

Every drawer has metadata:

| Field | Example | Purpose |
|---|---|---|
| wing | "my_app" | Which project/source |
| room | "backend" | Which aspect |
| source_file | "/path/to/app.py" | Where it came from |
| chunk_index | 0 | Which chunk of that file |
| filed_at | "2026-03-20T..." | When it was stored |
| ingest_mode | "projects" or "convos" | How it was mined |
| added_by | "mempal" | What agent filed it |
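
Put together, one drawer's metadata record looks roughly like this (values are illustrative):

# One drawer's metadata, stored alongside its verbatim text (illustrative values).
drawer_metadata = {
    "wing": "my_app",                    # which project/source
    "room": "backend",                   # which aspect
    "source_file": "/path/to/app.py",    # where it came from
    "chunk_index": 0,                    # which chunk of that file
    "filed_at": "2026-03-20T...",        # when it was stored (ISO timestamp)
    "ingest_mode": "projects",           # "projects" or "convos"
    "added_by": "mempal",                # what agent filed it
}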

Mining: Getting Data In

Mode 1: Projects

Point MemPal at any codebase. It scans the folder structure, auto-detects rooms from directory names, and chunks every readable file into the palace.

python mempal.py init ~/projects/my_app    # scan folders, create mempal.yaml
python mempal.py mine ~/projects/my_app    # chunk and store everything

How room detection works for projects:

| Your folder | Room created |
|---|---|
| frontend/, ui/, components/ | frontend |
| backend/, api/, server/ | backend |
| docs/, documentation/, wiki/ | documentation |
| tests/, testing/, qa/ | testing |
| planning/, roadmap/, strategy/ | planning |
| config/, settings/, infra/ | configuration |
| scripts/, tools/, utils/ | scripts |
| design/, mockups/, wireframes/ | design |
| meetings/, standup/ | meetings |
| costs/, budget/, pricing/ | costs |

70+ folder patterns are recognized. Files that don't match any pattern go to general.

What gets chunked: .txt .md .py .js .ts .jsx .tsx .json .yaml .yml .html .css .java .go .rs .rb .sh .csv .sql .toml

What gets skipped: node_modules/, .git/, __pycache__/, .venv/, venv/, build/, dist/, coverage/, .next/, .mempal/, env/.

Chunking rules: 800 characters per chunk, breaking on paragraph or line boundaries, with 100-character overlap between chunks for context continuity. Chunks under 50 characters are discarded.
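
A minimal sketch of those chunking rules — not the actual miner.py, just the behavior described above:

# Sketch of the described rules: ~800-char chunks on paragraph/line boundaries,
# 100-char overlap, chunks under 50 chars dropped.
def chunk_text(text, size=800, overlap=100, min_len=50):
    chunks, start = [], 0
    while start < len(text):
        end = min(start + size, len(text))
        if end < len(text):
            # prefer to break on a paragraph boundary, then a line boundary
            cut = text.rfind("\n\n", start, end)
            if cut <= start:
                cut = text.rfind("\n", start, end)
            if cut > start:
                end = cut
        chunk = text[start:end].strip()
        if len(chunk) >= min_len:
            chunks.append(chunk)
        if end >= len(text):
            break
        start = max(end - overlap, start + 1)   # overlap keeps context continuity
    return chunks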

Mode 2: Conversations

Point MemPal at your exported chat files. It normalizes any format into a standard transcript, then chunks by exchange pair (one user message + one AI response = one drawer).

python mempal.py mine ~/chats/ --mode convos
python mempal.py mine ~/chats/ --mode convos --wing my_project

Supported formats:

| Format | File type | What MemPal looks for |
|---|---|---|
| Claude Code sessions | .jsonl | {"type": "human"} / {"type": "assistant"} objects |
| ChatGPT exports | .json | conversations.json with tree-structured mapping |
| Claude.ai exports | .json | [{"role": "user", "content": "..."}] arrays |
| Slack exports | .json | [{"type": "message", "user": "...", "text": "..."}] |
| Plain text / Markdown | .txt .md | Lines starting with > are user turns |

All formats get normalized to the same internal structure before chunking:

graph LR
    A["Claude Code\n.jsonl"] --> N["🔄 normalize.py"]
    B["ChatGPT\nconversations.json"] --> N
    C["Claude.ai\n.json"] --> N
    D["Slack\nexport .json"] --> N
    E["Plain text\n.txt / .md"] --> N

    N --> O["> user message\nAI response\n> user message\nAI response"]

    style N fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style O fill:#1a1920,stroke:#8ec96c,color:#e8e4df,stroke-width:2px

The palace doesn't care where the data came from.
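
A sketch of that flow for one format, a Claude.ai-style role/content array: normalize to the "> user / assistant" transcript, then chunk by exchange pair. The file name and structure here are examples, not normalize.py's internals:

# Illustration: role/content array -> transcript -> one drawer per user+AI exchange.
import json

def to_transcript(messages):
    lines = []
    for m in messages:
        prefix = "> " if m["role"] == "user" else ""
        lines.append(prefix + m["content"])
    return "\n".join(lines)

def chunk_by_exchange(transcript):
    chunks, current = [], []
    for line in transcript.split("\n"):
        if line.startswith("> ") and current:
            chunks.append("\n".join(current))   # close the previous user+AI pair
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks

messages = json.load(open("claude_export.json"))    # [{"role": "user", "content": "..."}, ...]
for drawer in chunk_by_exchange(to_transcript(messages)):
    print("--- drawer ---")
    print(drawer)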

How room detection works for conversations:

| Keywords in content | Room created |
|---|---|
| code, python, bug, error, api, deploy | technical |
| architecture, design, pattern, schema | architecture |
| plan, roadmap, milestone, sprint | planning |
| decided, chose, switched, trade-off | decisions |
| problem, issue, broken, fix, solved | problems |
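
A sketch of keyword scoring along the lines of the table above (keyword lists abbreviated; the real detector uses many more):

# Sketch: score a chunk against room keyword lists, pick the best match, else "general".
ROOM_KEYWORDS = {
    "technical":    ["code", "python", "bug", "error", "api", "deploy"],
    "architecture": ["architecture", "design", "pattern", "schema"],
    "planning":     ["plan", "roadmap", "milestone", "sprint"],
    "decisions":    ["decided", "chose", "switched", "trade-off"],
    "problems":     ["problem", "issue", "broken", "fix", "solved"],
}

def detect_room(text):
    lowered = text.lower()
    scores = {room: sum(lowered.count(kw) for kw in kws)
              for room, kws in ROOM_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"

print(detect_room("We decided to use Postgres and chose to drop Redis."))   # decisions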

The General Extractor (5 memory types)

By default, conversation mining chunks by exchange pair. Add --extract general to also classify each chunk into one of five memory types using keyword heuristics (no LLM needed):

python mempal.py mine ~/chats/ --mode convos --extract general

| Type | What it catches | Example signals |
|---|---|---|
| Decision | Choices and their reasoning | "we chose X because", "decided to", "went with" |
| Preference | Rules and style preferences | "always use", "never do", "I prefer" |
| Milestone | Breakthroughs and completions | "finally worked", "shipped", "breakthrough" |
| Problem | What broke and how it was fixed | "bug", "crash", "root cause", "the fix was" |
| Emotional | Feelings and relationship moments | "love", "scared", "proud", "grateful" |

Each type has 16-34 regex markers. Confidence scoring filters out weak matches (threshold: 0.3). Code blocks are stripped before classification, so an error: line inside code doesn't get classified as a "problem."
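
A condensed sketch of that idea — a few markers per type and a simple confidence score. The real extractor has 16-34 markers per type; this is just the shape of the approach:

# Sketch: regex-marker scoring for memory types (markers abbreviated; threshold 0.3).
import re

MARKERS = {
    "decision":   [r"\bwe chose\b", r"\bdecided to\b", r"\bwent with\b"],
    "preference": [r"\balways use\b", r"\bnever do\b", r"\bI prefer\b"],
    "milestone":  [r"\bfinally worked\b", r"\bshipped\b", r"\bbreakthrough\b"],
    "problem":    [r"\bbug\b", r"\bcrash\b", r"\broot cause\b", r"\bthe fix was\b"],
    "emotional":  [r"\blove\b", r"\bscared\b", r"\bproud\b", r"\bgrateful\b"],
}

def classify(text, threshold=0.3):
    text = re.sub(r"```.*?```", "", text, flags=re.S)      # strip code blocks first
    best_type, best_score = None, 0.0
    for mtype, patterns in MARKERS.items():
        hits = sum(bool(re.search(p, text, re.I)) for p in patterns)
        score = hits / len(patterns)
        if score > best_score:
            best_type, best_score = mtype, score
    return best_type if best_score >= threshold else None

print(classify("We decided to use Postgres because Redis kept crashing."))   # decision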


Search: Getting Data Out

python mempal.py search "why did we switch databases"
python mempal.py search "auth bug" --wing my_app
python mempal.py search "deploy process" --room technical --results 10

Search uses ChromaDB's vector similarity. Your query gets embedded and compared against all stored drawers. Results return:

  • The verbatim text
  • Which wing and room it came from
  • The source file path
  • A similarity score (0-1, higher = closer match)

You can filter by --wing (project) and --room (aspect) to narrow results.


4-Layer Memory Stack

The memory stack controls how much context gets loaded and when. The goal: give an AI everything it needs to "wake up" in under 900 tokens, with deeper recall available on demand.

block-beta
    columns 1

    block:L0:1
        columns 3
        L0a["🪪 L0: IDENTITY"] L0b["~100 tokens"] L0c["ALWAYS LOADED"]
    end

    block:L1:1
        columns 3
        L1a["📖 L1: ESSENTIAL STORY"] L1b["~500-800 tokens"] L1c["ALWAYS LOADED"]
    end

    block:L2:1
        columns 3
        L2a["📂 L2: ON-DEMAND"] L2b["~200-500 each"] L2c["WHEN RELEVANT"]
    end

    block:L3:1
        columns 3
        L3a["🔍 L3: DEEP SEARCH"] L3b["unlimited"] L3c["WHEN ASKED"]
    end

    style L0 fill:#2a4a2a,stroke:#8ec96c,color:#e8e4df
    style L1 fill:#2a3a4a,stroke:#6c8ec9,color:#e8e4df
    style L2 fill:#3a3020,stroke:#c9a86c,color:#e8e4df
    style L3 fill:#3a2030,stroke:#ff80eb,color:#e8e4df

Wake-up cost (L0 + L1)

████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ ~5%

~600-900 tokens out of any context window

Everything else: free for conversation

░░░░████████████████████████████████████ ~95%

L2 and L3 load only when needed

Wake-up cost: L0 + L1 = ~600-900 tokens. That's under 5% of even a small context window.

python mempal.py wake-up                    # show L0 + L1
python mempal.py wake-up --wing my_app      # wake-up scoped to one project

Setting up your identity (L0)

Create ~/.mempal/identity.txt:

I am Atlas, a personal AI assistant for Alice.
Traits: warm, direct, remembers everything.
Current project: A journaling app that helps people process emotions.

This is plain text you write yourself. It becomes the first thing loaded into every session.
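
Programmatically, the same wake-up is one call in the Python API (documented in the Python API section below):

# Wake up programmatically — the same L0 + L1 context the CLI command above prints.
from layers import MemoryStack

stack = MemoryStack()
context = stack.wake_up(wing="my_app")   # identity + essential story, ~600-900 tokens
print(context)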


AAAK Dialect Compression

Why this matters

Every AI model has a context window. Even a million-token window fills up when you're loading memory from hundreds of past conversations. The standard approach is to summarize — but summaries are lossy. An LLM summarizing your conversation drops the exact quote, the specific person who said it, the emotional weight of the moment, the connection to what came before. You get a bland paragraph that says "the team discussed database options" instead of the actual decision, who made it, why, and what they were feeling when they did.

AAAK is a lossless compression dialect — it preserves every fact, every person, every emotion, every connection, just in a denser format. The key insight: LLMs can read structured shorthand just as well as prose. So instead of spending 5,000 tokens on a full conversation transcript, you spend 170 tokens on an AAAK-compressed version that contains the same information.

This is the difference between "the team discussed auth" and knowing that Alice and Bob decided to switch to Postgres because Redis sessions weren't ACID-compliant, that it was a high-stakes pivot, and that the emotional arc went from doubt to determination to relief.

What it looks like

Before (raw conversation, ~5,000 tokens):

So Alice and I had a long discussion about the auth system yesterday. We've been going back and forth for weeks honestly. Bob was really pushing for keeping Redis for session storage but Alice kept pointing out that the sessions weren't ACID compliant and we'd already had two incidents where session data got corrupted during peak load. It was a tough call because the Redis implementation was already done, but ultimately we decided to switch to Postgres. Alice felt really relieved afterward — she'd been worried about this for a while. She stayed up that night and wrote the entire migration test suite, which honestly was impressive...

After (AAAK compressed, ~170 tokens):

FILE_042|ALC|2026-03-15|Auth migration decision
Z1:ALC,BOB|auth,postgres,migration|"switched to Postgres because Redis sessions weren't ACID"|W8|trust,relief|DECISION,PIVOT
Z2:ALC|testing,integration|"wrote the migration test suite overnight"|W6|determ,satis|MILESTONE
T:Z1<->Z2|caused_by
ARC:doubt->determ->relief

~30x smaller. Nothing lost. Any LLM can read this and reconstruct the full meaning: who was involved, what was decided, why, the key quote, the emotional context, and how the two events connect.

What an LLM summary would produce from the same conversation:

"The team discussed the auth system and decided to migrate from Redis to Postgres for session storage. A test suite was written afterward."

That summary dropped: who specifically pushed for what (Bob wanted Redis, Alice pushed for Postgres), why (ACID compliance, two prior incidents), the emotional weight (Alice was worried for weeks, felt relieved), that Alice wrote the tests overnight (dedication signal), and the causal connection between the decision and the test suite. An LLM reading the summary months later has no way to recover any of that. An LLM reading the AAAK version has all of it.

How to use it

python mempal.py compress --wing my_app --dry-run   # preview what compression looks like
python mempal.py compress --wing my_app              # compress and store

Anatomy of a compressed record

FILE_042|ALC|2026-03-15|Auth migration decision
   │      │       │              │
   │      │       │              └─── Title: what this memory is about
   │      │       └──────────────── Date: when it happened
   │      └──────────────────────── Primary entity: who (3-letter code)
   └─────────────────────────────── File number: unique ID
Z1:ALC,BOB|auth,postgres,migration|"switched to Postgres..."|W8|trust,relief|DECISION,PIVOT
│    │            │                        │                   │      │            │
│    │            │                        │                   │      │            └─ Flags: WHY it matters
│    │            │                        │                   │      └────────────── Emotions: HOW it felt
│    │            │                        │                   └───────────────────── Weight: 1-10 importance
│    │            │                        └───────────────────────────────────────── Key quote: verbatim
│    │            └────────────────────────────────────────────────────────────────── Topics: searchable tags
│    └─────────────────────────────────────────────────────────────────────────────── WHO was involved
└──────────────────────────────────────────────────────────────────────────────────── Zettel ID: atomic memory unit
T:Z1<->Z2|caused_by          ← Tunnel: Z1 caused Z2 (the decision led to the test suite)
ARC:doubt->determ->relief    ← Arc: the emotional journey across the whole file

How it all connects

graph LR
    Z1["Z1: Auth Decision\nALC, BOB\nW8 · DECISION, PIVOT\ntrust, relief"] -->|caused_by| Z2["Z2: Test Suite\nALC\nW6 · MILESTONE\ndeterm, satis"]

    ARC["ARC: doubt → determ → relief"]

    style Z1 fill:#1a1920,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style Z2 fill:#1a1920,stroke:#8ec96c,color:#e8e4df,stroke-width:2px
    style ARC fill:#1a1920,stroke:#ff80eb,color:#e8e4df,stroke-width:1px,stroke-dasharray: 5 5

Components reference

| Component | Format | What it preserves |
|---|---|---|
| Header | FILE_NUM\|ENTITY\|DATE\|TITLE | Who, when, what this is about |
| Zettels | Z1:ENTITIES\|topics\|"quote"\|WEIGHT\|EMOTIONS\|FLAGS | The atomic facts — each one is a distinct memory |
| Entity codes | Alice → ALC | People, preserved as 3-letter codes (configurable via entities.json) |
| Emotion codes | 29 types: vul, joy, fear, trust, grief, wonder, rage, love, hope, determ, etc. | The emotional weight of the moment — not just what happened but how it felt |
| Importance flags | ORIGIN, CORE, PIVOT, DECISION, TECHNICAL | Why this memory matters |
| Weight | W1-W10 | How important (1 = minor, 10 = critical) |
| Tunnels | T:Z1<->Z2\|caused_by | Connections between memories — what led to what |
| Arcs | ARC:doubt->determ->relief | Emotional trajectory — not just one feeling but the journey |
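
Because the format is just delimited plain text, you can pull it apart with a few lines of Python. A reader sketch for the zettel line from the example above — field meanings per the reference table, for illustration only (this is not dialect.py):

# Illustration: parse one AAAK zettel line into its named fields.
def parse_zettel(line):
    zid_entities, topics, quote, weight, emotions, flags = line.split("|")
    zid, entities = zid_entities.split(":")
    return {
        "id": zid,                          # Z1
        "who": entities.split(","),         # ["ALC", "BOB"]
        "topics": topics.split(","),        # searchable tags
        "quote": quote.strip('"'),          # verbatim key quote
        "weight": int(weight.lstrip("W")),  # 1-10 importance
        "emotions": emotions.split(","),    # how it felt
        "flags": flags.split(","),          # why it matters
    }

z = parse_zettel('Z1:ALC,BOB|auth,postgres,migration|"switched to Postgres because Redis sessions weren\'t ACID"|W8|trust,relief|DECISION,PIVOT')
print(z["who"], z["weight"], z["flags"])    # ['ALC', 'BOB'] 8 ['DECISION', 'PIVOT']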

Why not just use embeddings?

Embeddings are great for search (MemPal uses them via ChromaDB). But embeddings are opaque — you can't read a vector and understand what it means. AAAK compression is human-readable and LLM-readable. You can open a compressed file, read it yourself, and understand every memory in it. An LLM can load it into context and reason about it. You can version-control it, diff it, and grep it. Try doing that with a vector database.

The compression spectrum

graph LR
    A["Raw Transcript\n~5,000 tokens\n100% fidelity"] -->|"LLM summary"| B["Summary\n~500 tokens\n❌ lossy — drops quotes,\npeople, emotions, connections"]

    A -->|"AAAK compress"| C["AAAK Dialect\n~170 tokens\n✅ lossless — preserves\neverything in shorthand"]

    A -->|"embedding"| D["Vector\n768 floats\n❌ opaque — can't read,\ncan't diff, can't grep"]

    style A fill:#1a1920,stroke:#e8e4df,color:#e8e4df,stroke-width:1px
    style B fill:#3a2020,stroke:#ff6666,color:#e8e4df,stroke-width:2px
    style C fill:#2a4a2a,stroke:#8ec96c,color:#e8e4df,stroke-width:2px
    style D fill:#3a2020,stroke:#ff6666,color:#e8e4df,stroke-width:2px

MCP Server (Claude Code Integration)

MemPal includes an MCP server so Claude Code can search your palace mid-conversation.

sequenceDiagram
    participant You
    participant Claude as Claude Code
    participant MCP as MemPal MCP Server
    participant Palace as Palace (ChromaDB)

    You->>Claude: "Why did we switch to Postgres?"
    Claude->>MCP: mempal_search("switch to Postgres")
    MCP->>Palace: vector similarity query
    Palace-->>MCP: matching drawers + scores
    MCP-->>Claude: verbatim results
    Claude-->>You: "Based on your past conversations,<br>you switched because Redis<br>sessions weren't ACID compliant..."

Setup

claude mcp add mempal -- python /path/to/mempal-local/mcp_server.py

Or add to your .mcp.json:

{
  "mcpServers": {
    "mempal": {
      "command": "python",
      "args": ["/path/to/mempal-local/mcp_server.py"]
    }
  }
}

Available tools

| Tool | What it does |
|---|---|
| mempal_status | Total drawers, wings, rooms — palace overview |
| mempal_search | Semantic search with optional wing/room filters |
| mempal_list_wings | All wings with drawer counts |

The server speaks JSON-RPC 2.0 over stdio (stdin/stdout). Claude Code handles the protocol automatically.
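
Under the hood the traffic is ordinary JSON-RPC on stdin/stdout. Roughly, a tool call for mempal_search looks like the sketch below; "tools/call" follows the MCP convention and the argument names are assumptions — Claude Code constructs these messages for you:

# Illustrative JSON-RPC request as it would arrive on the server's stdin.
# "tools/call" follows the MCP convention; the argument names are assumptions.
import json

request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {
        "name": "mempal_search",
        "arguments": {"query": "why did we switch to Postgres", "wing": "my_app"},
    },
}
print(json.dumps(request, indent=2))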


Auto-Save Hooks

Two bash hooks that integrate with Claude Code's event system to automatically save memories during conversations. No extra API calls — the hooks run locally.

graph TD
    A["You chat with Claude Code"] --> B{"Every 15th\nhuman message"}
    B -->|"not yet"| A
    B -->|"checkpoint!"| C["🛑 Hook BLOCKS\nthe AI response"]
    C --> D["AI saves key topics,\ndecisions, quotes\nto the palace"]
    D --> A

    E["Context window\nfilling up"] --> F["🚨 PreCompact hook\nALWAYS blocks"]
    F --> G["AI saves EVERYTHING\nbefore compaction"]
    G --> H["Context compacts\nsafely"]

    style C fill:#3a3020,stroke:#c9a86c,color:#e8e4df,stroke-width:2px
    style F fill:#3a2020,stroke:#ff6666,color:#e8e4df,stroke-width:2px

Save Hook (hooks/mempal_save_hook.sh)

Fires after every assistant response. Counts human messages in the session transcript. Every N messages (default: 15), it blocks the AI response and injects a system message:

"AUTO-SAVE checkpoint. Save key topics, decisions, quotes from the last 15 exchanges."

The AI then saves to the palace using its own judgment about what's important and which wing/room to file it under. After saving, it resumes normally.
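
The counting logic itself is simple. A Python rendering of what the hook checks, assuming a Claude Code-style .jsonl transcript with {"type": "human"} entries (the real hook is the bash script hooks/mempal_save_hook.sh):

# What the save hook checks, rendered in Python for clarity.
import json

SAVE_INTERVAL = 15

def is_checkpoint(transcript_path):
    human_turns = 0
    with open(transcript_path) as f:
        for line in f:
            if line.strip() and json.loads(line).get("type") == "human":
                human_turns += 1
    return human_turns > 0 and human_turns % SAVE_INTERVAL == 0

if is_checkpoint("session.jsonl"):    # path is an example
    print("AUTO-SAVE checkpoint. Save key topics, decisions, quotes from the last 15 exchanges.")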

PreCompact Hook (hooks/mempal_precompact_hook.sh)

Fires when Claude Code is about to compress the context window. Always blocks — compaction means information is about to be lost, so this is an emergency save:

"COMPACTION IMMINENT. Save ALL topics, decisions, and context before the window shrinks."

Installing hooks

# In your Claude Code settings:
claude config set hooks.stop "./hooks/mempal_save_hook.sh"
claude config set hooks.precompact "./hooks/mempal_precompact_hook.sh"

Hook state is tracked in ~/.mempal/hook_state/ with logging to hook.log. The save interval is configurable via SAVE_INTERVAL in the hook script.


All Commands

# Setup
python mempal.py init <dir>                              # detect rooms, create mempal.yaml

# Mining — projects
python mempal.py mine <dir>                              # mine project files into palace
python mempal.py mine <dir> --dry-run                    # preview without storing
python mempal.py mine <dir> --limit 10                   # only process 10 files

# Mining — conversations
python mempal.py mine <dir> --mode convos                # mine chat exports
python mempal.py mine <dir> --mode convos --wing my_app  # tag with a project name
python mempal.py mine <dir> --mode convos --extract general  # classify into 5 memory types

# Search
python mempal.py search "query"                          # search everything
python mempal.py search "query" --wing my_app            # search one project
python mempal.py search "query" --room backend           # search one room
python mempal.py search "query" --results 10             # return more results

# Compression
python mempal.py compress --wing my_app                  # AAAK compress a wing
python mempal.py compress --wing my_app --dry-run        # preview compression
python mempal.py compress --config entities.json         # use custom entity mappings

# Memory stack
python mempal.py wake-up                                 # show L0 + L1 context
python mempal.py wake-up --wing my_app                   # project-specific wake-up

# Status
python mempal.py status                                  # drawer counts by wing and room

All commands accept --palace <path> to override the default palace location (~/.mempal/palace).


Configuration

Global config (~/.mempal/config.json)

{
  "palace_path": "/custom/path/to/palace",
  "collection_name": "mempal_drawers",
  "people_map": {"Alice": "ALC", "Bob": "BOB"},
  "topic_wings": ["emotions", "consciousness"],
  "hall_keywords": {"emotions": ["scared", "happy"]}
}

Priority: environment variables > config.json > defaults.

Project config (mempal.yaml in project root)

Created by mempal init. Defines wing name and rooms:

wing: my_app
rooms:
  - name: backend
    description: Files from backend/
  - name: frontend
    description: Files from frontend/
  - name: general
    description: Files that don't fit other rooms

Edit it to rename or add rooms, then re-run mempal mine. Re-mining skips files already stored (deduplicated by source_file).

Identity (~/.mempal/identity.txt)

Plain text file you write. Becomes Layer 0. See 4-Layer Memory Stack.

People map (~/.mempal/people_map.json)

Maps name variants to canonical entity codes for compression:

{
  "Alice": "ALC",
  "Alice Chen": "ALC",
  "A.": "ALC",
  "Bob": "BOB"
}

File-by-File Reference

| File | What it does |
|---|---|
| mempal.py | CLI entry point. Parses args, routes to the right module. |
| config.py | MempalConfig class. Loads ~/.mempal/config.json, handles defaults, env var overrides. |
| normalize.py | Converts 5 chat formats (Claude Code JSONL, Claude.ai JSON, ChatGPT JSON, Slack JSON, plain text) to a standard > user turn transcript format. |
| room_detector_local.py | Maps folders to room names using 70+ patterns. No API calls. Falls back to keyword frequency in filenames. Writes mempal.yaml. |
| miner.py | Project ingest. Scans directories, chunks files (800 chars, paragraph boundaries), routes to rooms by folder path, stores to ChromaDB. |
| convo_miner.py | Conversation ingest. Normalizes format, chunks by exchange pair (Q+A) or by paragraph, detects room from content keywords. |
| general_extractor.py | Classifies text into 5 memory types (decision, preference, milestone, problem, emotional) using regex marker scoring. No LLM. |
| searcher.py | Semantic search via ChromaDB vectors. Filters by wing/room. Returns verbatim text + similarity scores. |
| layers.py | MemoryStack with 4 layers. L0 (identity file), L1 (auto-generated top-15 moments), L2 (on-demand room recall), L3 (full search). |
| dialect.py | Dialect class. AAAK compression with entity codes, emotion markers (29 types), importance flags, tunnels, and arcs. ~30x compression ratio. |
| mcp_server.py | JSON-RPC 2.0 server over stdio. Exposes mempal_status, mempal_search, mempal_list_wings to Claude Code. |
| hooks/mempal_save_hook.sh | Bash hook for Claude Code stop event. Counts exchanges, triggers save every N turns. |
| hooks/mempal_precompact_hook.sh | Bash hook for Claude Code precompact event. Emergency save before context window compression. |

Python API

Use MemPal as a library in your own scripts:

from miner import mine
from convo_miner import mine_convos
from searcher import search_memories
from layers import MemoryStack
from general_extractor import extract_memories
from dialect import Dialect

# Mine a project
mine("/path/to/project", palace_path="~/.mempal/palace")

# Mine conversations
mine_convos("/path/to/chats", palace_path="~/.mempal/palace", wing="my_project")

# Search
results = search_memories("what was the auth decision", palace_path="~/.mempal/palace")
for r in results["results"]:
    print(f"[{r['wing']}/{r['room']}] {r['text'][:100]}... ({r['similarity']:.2f})")

# Wake up (L0 + L1)
stack = MemoryStack()
context = stack.wake_up(wing="my_app")

# Classify content
memories = extract_memories("We decided to use Postgres because...")
# → [{"content": "...", "memory_type": "decision", "chunk_index": 0}]

# Compress
d = Dialect()
compressed = d.compress(text, metadata={"wing": "my_app", "room": "backend"})
print(d.compression_stats())  # → {"original_tokens": 5000, "compressed_tokens": 170, "ratio": "29.4x"}

Requirements

  • Python 3.9+
  • chromadb>=0.4.0
  • pyyaml>=6.0

No API key. No internet required after install.

pip install -r requirements.txt

Contributing

See CONTRIBUTING.md for dev setup, testing, and PR guidelines.


License

MIT — see LICENSE for details.
