Git for AI memory — version-controlled context persistence across Claude, GPT, Gemini, Cursor, Windsurf, and more

memgit — git for AI memory

Your AI assistants forget everything when the session ends. memgit fixes that.

Version-controlled, cross-AI context that persists, diffs, rolls back, and syncs like code. Switch from Claude to Cursor to ChatGPT mid-project — your context is already there.


Why not claude.md? Why not mem-search?

You've probably already tried both. Here's why they hit a ceiling:

| Capability | claude.md | mem-search plugin | memgit |
|---|---|---|---|
| Loads only relevant context | ❌ loads everything | ⚠️ loads recent observations | ✅ BM25 search, top-k per query |
| Version history | | | ✅ full commit log |
| Diff between sessions | | | memgit diff |
| Roll back a wrong memory | ❌ manual edit | | memgit rollback |
| Works in Cursor, Windsurf, GPT | ❌ Claude only | ❌ Claude only | ✅ all via MCP / HTTP |
| Team sync | ❌ copy-paste files | | memgit git push |
| Scales to 10k+ sessions | ❌ file grows | ❌ search slows | memgit squash |
| Measurable token savings | | | memgit stats |
| Export / import standard format | | | ✅ TOON + git |

Proof — token savings you can measure

Run this on your own store to see the actual numbers:

$ memgit stats

  Total memories:   108   (41 feedback · 23 user · 19 project · 12 reference · 8 convention · 5 lesson)
  Priority:          3 critical · 67 medium · 38 low

  Token cost comparison:
  ┌─────────────────────────────────────┬──────────────────┬───────────────────┬─────────────────────┐
  │ Approach                            │ Tokens/session   │ vs full load      │ $/session (GPT-4o)  │
  ├─────────────────────────────────────┼──────────────────┼───────────────────┼─────────────────────┤
  │ claude.md / dump all memories       │ 12,840           │ 100%  baseline    │ $0.0321             │
  │ memgit search (BM25 top-8)          │ 640              │ 5%  (95% savings) │ $0.0016             │
  └─────────────────────────────────────┴──────────────────┴───────────────────┴─────────────────────┘

  Weekly savings (10 sessions/week):
    Tokens saved:   122,000/week
    Cost saved:     $0.31/week  →  $15.86/year  (at GPT-4o input pricing, $2.50/M)

Why such a big difference? claude.md loads all context every session. memgit uses BM25 relevance scoring — it loads only the 8 memories most relevant to the current session, not everything you've ever recorded.
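
The ranking idea can be sketched in a few lines of Python. This is a minimal, self-contained BM25 scorer, not memgit's implementation: the whitespace tokenizer, the `k1` and `b` defaults, and the function name `bm25_top_k` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase whitespace tokenization; a real tool may stem or strip punctuation.
    return text.lower().split()


def bm25_top_k(query: str, docs: dict[str, str], k: int = 8,
               k1: float = 1.5, b: float = 0.75) -> list[tuple[str, float]]:
    """Return up to k (slug, score) pairs with a positive BM25 score, best first."""
    tokens = {slug: tokenize(text) for slug, text in docs.items()}
    n_docs = len(tokens)
    if n_docs == 0:
        return []
    avg_len = sum(len(t) for t in tokens.values()) / n_docs or 1.0
    # Document frequency: how many documents contain each term.
    df = Counter(term for t in tokens.values() for term in set(t))
    query_terms = set(tokenize(query))
    scores: dict[str, float] = {}
    for slug, terms in tokens.items():
        tf = Counter(terms)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(terms) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[slug] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical memories, keyed by slug.
memories = {
    "no-db-mock": "never mock the database in tests",
    "tabs": "use tabs for indentation in go files",
    "deploy": "deploy on fridays only with approval",
}
results = bm25_top_k("database tests", memories, k=2)
print(results)
```

Only memories that share a term with the query get a score at all, which is why the retrieved set stays small no matter how large the store grows.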


The git analogy is literal

memgit's data model maps exactly to git:

| memgit | git |
|---|---|
| mnemonic | file |
| MindState | tree |
| checkpoint | commit |
| thread | branch |
| memgit commit | git commit |
| memgit diff | git diff |
| memgit log | git log |
| memgit squash --keep-last 100 | git rebase -i --autosquash |
| memgit git push | git push |

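This is not metaphorical: memgit uses a content-addressed object store (SHA-256 blobs) modeled on git's object database. Git itself hashes with SHA-1 by default, so the design is the same but the object IDs are not interchangeable. Every memory has a stable SHA. Identical content has identical SHAs. Old state stays recoverable from any retained checkpoint.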
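
To make the content-addressing idea concrete, here is a small sketch of a SHA-256 object store with gzip-compressed blobs. The two-character fan-out directory is borrowed from git's layout and is an assumption here, as are the names `put_object` and `get_object`; memgit's on-disk format may differ.

```python
import gzip
import hashlib
import tempfile
from pathlib import Path


def put_object(objects_dir: Path, content: bytes) -> str:
    """Store content under its SHA-256 digest and return the digest."""
    sha = hashlib.sha256(content).hexdigest()
    # Assumption: git-style fan-out (first two hex chars as a directory).
    path = objects_dir / sha[:2] / sha[2:]
    if not path.exists():  # identical content is written only once
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(gzip.compress(content))
    return sha


def get_object(objects_dir: Path, sha: str) -> bytes:
    """Read an object back by digest."""
    return gzip.decompress((objects_dir / sha[:2] / sha[2:]).read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    first = put_object(store, b"RULE:Never mock the database in tests")
    second = put_object(store, b"RULE:Never mock the database in tests")
    same_sha = first == second          # identical content, identical SHA
    roundtrip = get_object(store, first)
```

Because the key is derived from the content, deduplication and integrity checking come for free: re-hashing a blob and comparing it to its name is enough to detect corruption.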


The store IS a git repo

Every memory is a readable .toon file under memories/. Push your entire memory set to GitHub with standard git:

memgit git init --remote git@github.com:yourteam/ai-memory.git
memgit git push

Teammates pull and start with your AI's learned rules from session 1:

git clone git@github.com:yourteam/ai-memory.git ~/.claude/memgit-store
memgit setup all

You can grep, git blame, and git diff your memories just like code:

grep -rl "database" ~/.claude/memgit-store/memories/
git log --follow memories/no-db-mock.toon
git diff HEAD~7 memories/

Install

Mac / Linux:

pip install memgit

Mac (Homebrew):

brew tap code4161/tap && brew install memgit

Windows:

pip install memgit

(choco install memgit is not live yet — the Chocolatey package is not on community.chocolatey.org. Use pip until it lands.)

Any AI tool config (no Python needed — npx auto-installs on first run):

{ "mcpServers": { "memgit": { "command": "npx", "args": ["-y", "memgit-mcp"] } } }
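
If your tool already has an MCP config with other servers in it, the entry above has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the config is plain JSON with a top-level `mcpServers` object (the helper name `add_memgit_server` and the `other` server are hypothetical):

```python
import json


def add_memgit_server(config: dict) -> dict:
    """Return a copy of config with the memgit MCP entry added, keeping other servers."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["memgit"] = {"command": "npx", "args": ["-y", "memgit-mcp"]}
    updated["mcpServers"] = servers
    return updated


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
merged = add_memgit_server(existing)
print(json.dumps(merged, indent=2))
```

Where the config file lives depends on the tool, so the sketch works on an in-memory dict and leaves reading and writing the file to you.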

Quickstart (3 minutes)

# 1. Install and initialize
pip install memgit
memgit init               # auto-detects best location (~/.claude/memgit-store etc.)

# 2. Import existing memories (if you use Claude Code)
memgit import claude-code ~/.claude/projects/

# 3. Register with your AI tools (interactive picker)
memgit setup

# 4. See your token savings
memgit stats

Restart your AI tool — it now searches your memory store at the start of every session.


Resume where you left off

Ask an AI "can we proceed on the pending tasks?" in a fresh session and it will guess from whatever file happens to be open. memgit resume replaces the guess with the record:

memgit resume            # last checkpoints, work in flight, recent + critical memories
memgit resume --plain    # plain text, for piping into an AI context
memgit resume --json     # for tooling

Wire it into Claude Code so every new session starts with this digest in context — no tool call, no judgment required:

memgit setup hooks       # installs a SessionStart hook (~/.claude/settings.json)

The digest is deliberately bounded (~350 tokens measured on a 500-memory store): rules are clipped, the critical list is capped, and full text is one get_memory call away.


Scale to 10,000+ sessions

After months of use, your checkpoint history grows. Squash compresses it, gc reclaims the disk:

memgit squash --keep-last 100    # keep last 100 checkpoints, squash everything older
memgit squash --older-than 30    # squash everything older than 30 days
memgit squash --dry-run          # preview first

memgit gc                        # delete unreachable objects, trim reflogs
memgit gc --dry-run              # preview
memgit gc --squash-keep 200      # compact history, then sweep

The current memory state is always preserved — and squash is lossless-in-substance: every collapsed checkpoint leaves a one-line record (time, author, diff, message) in an append-only archive under .memgit/logs/archive/ that gc never touches. Benchmark on a 2,000-checkpoint store: 94% smaller (39.5 MB → 2.2 MB), fsck clean. History operations stay O(1) as the chain grows (SHA resolution and checkpoint counting measured at ~0.08 ms at 2,000 checkpoints).
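
The keep-last behaviour described above can be sketched as follows. This is an illustration under assumptions, not memgit's code: the `Checkpoint` fields, the tab-separated archive line, and the synthetic "squash" author are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    sha: str
    time: str
    author: str
    diff: str      # short diff summary, e.g. "+2 -1"
    message: str


def squash_keep_last(history: list[Checkpoint], keep_last: int):
    """history is oldest-first. Returns (new_history, archive_lines)."""
    if keep_last < 1 or len(history) <= keep_last:
        return list(history), []
    old, kept = history[:-keep_last], history[-keep_last:]
    # One archive line per collapsed checkpoint: time, author, diff, message.
    archive = [f"{c.time}\t{c.author}\t{c.diff}\t{c.message}" for c in old]
    # The newest collapsed checkpoint's state becomes the new base of the chain.
    base = Checkpoint(old[-1].sha, old[-1].time, "squash", "",
                      f"squashed {len(old)} checkpoints")
    return [base] + kept, archive


history = [
    Checkpoint(f"sha{i}", f"2026-07-0{i + 1}T10:00Z", "alice", "+1 -0", f"checkpoint {i}")
    for i in range(5)
]
new_history, archive = squash_keep_last(history, keep_last=2)
```

The newest checkpoints pass through untouched, so the current memory state is the same before and after; only the older links in the chain are replaced by one base entry plus archive lines.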


Multiple agents, one memory

All writes go through a git-style store lock (0.08 ms overhead), so concurrent agents can't corrupt the store or lose each other's updates. Two patterns:

Shared thread — agents write concurrently; if one commits while another has work staged, the second commit auto-merges (three-way, against the recorded base) instead of clobbering. Set MEMGIT_AUTHOR=agent-name so each checkpoint says who did it.

Thread per agent — isolate, then integrate:

memgit thread create agent-1     # branch off for each agent
# ... agents work on their own threads ...
memgit merge agent-1             # three-way merge back (common-ancestor based)

Conflicts (same memory changed on both sides) resolve to the newest version; an edit always beats a delete. Both histories are preserved.
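
The resolution rules above (newest version wins, an edit beats a delete) can be expressed as a small three-way merge over slug-to-version maps. This is a sketch under assumptions: the `Version` type, the string comparison of timestamps (valid only when both sides use the same ISO-8601 format and timezone), and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    sha: str
    updated: str  # ISO-8601 timestamp; compared as a string (assumes one format and timezone)


State = dict[str, Version]  # slug -> version; a missing slug means "deleted"


def merge_three_way(base: State, ours: State, theirs: State) -> State:
    merged: State = {}
    for slug in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(slug), ours.get(slug), theirs.get(slug)
        winner: Optional[Version]
        if o == t:
            winner = o                                    # both sides agree
        elif o == b:
            winner = t                                    # only theirs changed
        elif t == b:
            winner = o                                    # only ours changed
        elif o is None or t is None:
            winner = o if o is not None else t            # an edit beats a delete
        else:
            winner = o if o.updated >= t.updated else t   # newest version wins
        if winner is not None:
            merged[slug] = winner
    return merged


base = {"a": Version("a1", "2026-07-01T09:00Z"),
        "b": Version("b1", "2026-07-01T09:00Z"),
        "c": Version("c1", "2026-07-01T09:00Z")}
ours = {"a": Version("a2", "2026-07-01T10:00Z"),
        "b": base["b"],
        "c": Version("c2", "2026-07-01T10:30Z")}
theirs = {"a": Version("a3", "2026-07-01T11:00Z")}  # edited a; deleted b and c
merged = merge_three_way(base, ours, theirs)
```

In the example, `a` was edited on both sides and resolves to the newer edit, `b` was deleted on one side and untouched on the other so the delete stands, and `c` was edited on one side and deleted on the other so the edit survives.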


What the AI sees

Once registered via MCP, every AI tool gets 6 tools:

| Tool | When the AI uses it |
|---|---|
| resume_session | When the request depends on prior state: "continue", "the pending tasks", session start |
| search_memories | Before answering anything that touches past work or preferences |
| get_memory | When it needs full details of a specific memory |
| list_memories | To browse or audit what's stored |
| save_memory | When it learns something worth keeping for next time |
| get_checkpoint_log | To check when memories were last synced |

The tool descriptions teach the AI judgment ("does this request depend on state you don't have in context?") rather than keyword triggers. Measured cost of the whole tool surface: ~1,150 tokens once per session; a resume_session reply is ~335 tokens.


Commands

# Core (git-like)
memgit init                       # initialize store (auto-detects best path)
memgit add <slug> <rule>          # stage a memory
memgit commit -m "message"        # checkpoint current state
memgit log                        # history
memgit diff [sha1] [sha2]         # what changed
memgit show <slug>                # display a memory
memgit remove <slug>              # remove from active index (history preserved)
memgit status                     # staged changes
memgit search <query>             # BM25 relevance search
memgit rollback <ref>             # restore state to a checkpoint (HEAD~N or SHA)
memgit resume                     # where we left off — session-start digest
memgit merge <thread>             # three-way merge a thread into the current one

# Scale & proof
memgit squash                     # compress old history (archives what it collapses)
memgit gc                         # reclaim disk: sweep unreachable objects
memgit stats                      # token savings + disk usage
memgit lint                       # validate all memories
memgit fsck                       # verify store integrity

# Import / export
memgit sync                       # sync from Claude Code files + commit
memgit import claude-code <path>
memgit import file <path>
memgit export <slug>

# Git sync (team features)
memgit git init [--remote URL]
memgit git push [remote] [branch]
memgit git pull [remote] [branch]
memgit git export
memgit git status

# AI tool registration
memgit setup                      # interactive step-by-step picker
memgit setup all                  # auto-register every detected tool
memgit setup claude-code
memgit setup cursor
memgit setup windsurf
memgit setup cline
memgit setup continue
memgit setup gemini-cli
memgit setup hooks                # Claude Code SessionStart hook → auto-inject resume digest

# Server
memgit serve                      # MCP stdio (Claude Code, Cursor, Windsurf, Cline)
memgit serve --http               # HTTP REST (ChatGPT Custom Actions, Gemini)

# Visualization
memgit graph                      # D3.js interactive relationship map
memgit thread list / switch / create

AI tool support

| Tool | Protocol | Command |
|---|---|---|
| Claude Code | MCP stdio | memgit setup claude-code |
| Claude Desktop | MCP stdio | memgit setup claude-desktop |
| Cursor | MCP stdio | memgit setup cursor |
| Windsurf | MCP stdio | memgit setup windsurf |
| Cline / Roo-Code | MCP stdio | memgit setup cline |
| Continue.dev | MCP stdio | memgit setup continue |
| ChatGPT (Custom Actions) | HTTP + OpenAPI | memgit serve --http → import http://localhost:7474/openapi.json |
| Gemini API | HTTP function calling | memgit serve --http + llm-tool-definitions.json |
| Any MCP tool | MCP stdio | Add {"command": "memgit", "args": ["serve"]} to config |

TOON format — compact, readable, diffable

Standard markdown memory file:

## Rule: Never mock the database in tests
**Type:** feedback  
**Priority:** medium  
**Why:** We got burned last quarter — mocked tests passed but the prod migration failed.  
**When to apply:** Any time writing tests that touch persistence layers.  
**Tags:** testing, database

The same memory in TOON:

TOON1|fb|no-db-mock|2026-07-01T10:00Z
#testing #database
RULE:Never mock the database in tests
WHY:Mocked tests passed but prod migration failed last quarter
WHEN:Any persistence test
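
Because the format is line-oriented, a reader for it is short. The parser below is inferred from the single example above and is only a sketch: the field meanings in the header (format version, type code, slug, timestamp), the `Memory` class, and `parse_toon` are assumptions, not the official TOON grammar.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    version: str
    type_code: str
    slug: str
    timestamp: str
    tags: list[str] = field(default_factory=list)
    fields: dict[str, str] = field(default_factory=dict)


def parse_toon(text: str) -> Memory:
    """Parse one TOON record: a pipe-delimited header, tag lines, then KEY:value lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    version, type_code, slug, timestamp = lines[0].split("|", 3)
    mem = Memory(version, type_code, slug, timestamp)
    for line in lines[1:]:
        if line.startswith("#"):
            mem.tags.extend(tag.lstrip("#") for tag in line.split())
        else:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            mem.fields[key] = value
    return mem


record = """TOON1|fb|no-db-mock|2026-07-01T10:00Z
#testing #database
RULE:Never mock the database in tests
WHY:Mocked tests passed but prod migration failed last quarter
WHEN:Any persistence test
"""
mem = parse_toon(record)
```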

Measured with a real tokenizer, TOON is ~5–10% leaner than equivalent markdown — a nice bonus, not the headline. The headline saving is retrieval: memgit loads the top-8 relevant memories per query instead of everything.

At 108 memories: 12,840 tokens (dump everything) → 640 tokens (memgit BM25 top-8)
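
The figures in the stats output follow from simple arithmetic, which you can reproduce for your own numbers. The sketch below uses the values quoted in this README (12,840 and 640 tokens, 10 sessions per week, $2.50 per million input tokens); the function name `savings` is hypothetical.

```python
def savings(full_tokens: int, retrieved_tokens: int,
            sessions_per_week: int, usd_per_million: float) -> dict[str, float]:
    """Compute token and cost savings of top-k retrieval versus loading everything."""
    saved_per_session = full_tokens - retrieved_tokens
    weekly_tokens = saved_per_session * sessions_per_week
    weekly_usd = weekly_tokens * usd_per_million / 1_000_000
    return {
        "pct_saved": 100 * saved_per_session / full_tokens,
        "weekly_tokens": weekly_tokens,
        "weekly_usd": weekly_usd,
        "yearly_usd": weekly_usd * 52,
    }


report = savings(12_840, 640, sessions_per_week=10, usd_per_million=2.50)
for name, value in report.items():
    print(f"{name}: {value:,.2f}")
```

With these inputs the weekly saving is 122,000 tokens, about $0.31 per week or $15.86 per year, matching the sample output earlier in this document.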

For exact token counts in memgit stats, install the optional tokenizer: pip install "memgit[tokens]".


Architecture

~/.claude/memgit-store/
  .memgit/
    objects/     ← SHA-256 content-addressed blobs (gzip compressed)
    refs/threads/main   ← HEAD checkpoint SHA
    TOON_INDEX   ← active slug→sha mapping
    config       ← author, default thread
    logs/        ← ref change audit trail
  memories/      ← flat .toon files (git-trackable, human-readable)
  .git/          ← standard git repo (after `memgit git init`)

Contributing

git clone https://github.com/code4161/memgit.git
cd memgit
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest    # 48 tests, all passing, < 1 second

See CONTRIBUTING.md.


Roadmap

  • Content-addressed object store (git-identical architecture)
  • TOON format (compact line-oriented memory format)
  • MCP server — Claude Code, Cursor, Windsurf, Cline, Continue.dev
  • HTTP server — ChatGPT Custom Actions, Gemini function calling
  • BM25 relevance search (load only what matters)
  • memgit stats — measured token savings proof
  • memgit squash — scale to 10k+ sessions
  • memgit git push/pull — team sync via standard git
  • Flat memories/ directory — grep/diff/blame your memories
  • D3.js graph visualization of memory relationships
  • memgit resume + SessionStart hook — sessions start with "where we left off"
  • memgit gc — space reclamation (mark-and-sweep, lossless squash archive)
  • Multi-agent write safety — store lock, auto-merge commits, memgit merge
  • PyPI + Homebrew (tap) + npm published (v0.1.5)
  • Chocolatey (not yet live on community.chocolatey.org)
  • Interactive setup wizard (memgit setup)
  • Smart memgit init (auto-detects tool, no path needed)
  • VS Code extension (v0.1.5, Marketplace: code416-memgit.memgit)
  • JetBrains plugin (Phase 3)
  • Semantic search via embeddings (Phase 4)
  • memgit.dev website (live)
  • Memory compression / auto-summarization (Phase 5)
  • Team access control + audit trail (Phase 5)
  • Memory marketplace — share reusable context packs (Phase 6)

License

MIT — see LICENSE.
