Project-scoped, cross-session memory for AI coding agents — hybrid keyword+semantic retrieval over a tagged SQLite store.

Maintainers

timgordontg

Project description

engrim

Cross-session memory for Claude Code. A small local store that remembers your project's decisions and facts, tags them by repo, and hands your agent just the relevant slice — at the start of each session and on every prompt — so you can /clear aggressively without losing the why behind your work.

[Figure: engrim live in the Claude Code status bar, showing 44 curated records and a ✓ clear-safe indicator. Caption: engrim, ambient in the Claude Code status bar; it even tells you when it's ✓ clear-safe to clear.]

Clearing your agent's context without engrim feels like closing a doc without hitting save. We've all got the high-school muscle memory of losing an unsaved assignment — so we don't clear, we just keep hammering one bloated session. engrim is the save button. Externalize your decisions as you go, then clear freely and watch them reload intact.

The problem, concretely: an agent's context window is expensive and forgetful. Everything it knows about your project either sits in the window (costly, and re-billed every turn) or vanishes when you clear. The first thing lost is the rationale that was never written into your code or files. engrim keeps that on disk and returns a precise, meaning-ranked slice on demand — so your memory grows unbounded while the context you actually load stays small.

It's the supercharger to your agent's engine, not a replacement for it. Claude Code's native context handling is excellent; engrim doesn't compete with it but complements it: the engine stays lean and fast, while engrim holds the deep, queryable, cross-session layer you pull from on demand.

pip install engrim
engrim setup

That's the whole install. setup wires engrim into Claude Code and warms the semantic model once. Next session, your agent starts already knowing your project — and it understands what you mean, not just keywords you typed. A prompt about "the database" surfaces the decision that said "Postgres."

Who it's for

engrim is built for heavy, long-horizon work: the multi-week build where the why spans dozens of sessions and lives in nobody's head by Friday. Trading systems, data pipelines, gnarly migrations, anything you'll /clear and resume across days. If that's you, this is the tool you would have wanted three months ago.

If you do short, one-off sessions, you probably don't need it — and that's fine. engrim earns its keep precisely when a project is too big to hold in one window.

It compounds. The longer and deeper your Claude sessions run, the more decisions and rationale engrim has captured — and the more valuable each reload becomes. A fresh project barely needs it; a project you've been hammering for weeks has a memory layer no single context window could ever hold, handed back to you a relevant slice at a time. Value grows with the depth of the work, exactly when a plain context window is hitting its limits.

Try the save button (60 seconds)

Don't take it on faith — prove it to yourself:

engrim add -t decision -s "Chose Postgres over Mongo for relational integrity" --tags db
# ...now /clear your Claude Code session (or close it and open a new one)...
engrim context        # there's your decision, already reloaded and oriented

The first time you clear and watch your context come back, the fear is gone for good.

See it working — the live status line

engrim setup adds an ambient line to Claude Code's status bar, so you can watch the memory layer work without it ever cluttering the chat:

🧠 engrim · 32 curated · +14 logged · ✎ 2 to capture

32 curated — high-signal records in this project's memory (grows when you add).
+14 logged — turns recorded to the transcript log this session. This ticks up every turn: even when you're not adding curated records, engrim is capturing the raw back-and-forth for free (zero tokens, never loaded into context). It's the live proof that nothing is silently dropping on the floor.
✎ 2 to capture / ✓ clear-safe — recent decisions that don't yet appear in curated memory. Your at-a-glance "is it safe to /clear?" answer: capture what's flagged, watch it flip to ✓, then clear freely. The pencil is an invitation, not a warning — capturing as you work is normal. (Same honest signal as engrim review, kept light enough to live in the status bar.)

The number that moves is the log; the number that matters long-term is curated. Seeing both is how you learn the rhythm: work, capture the decisions, clear, reload.

Smart, hot context loading

engrim is retrieval-first by design, not a notes file that grew a search box. The bet — straight out of ML research — is that learned, query-conditioned selection of what to load beats stuffing everything into the window and hoping the model sorts it out.

Retrieval is hybrid: every record is embedded when written, and recall fuses keyword relevance (SQLite FTS5 / bm25) with semantic similarity (embedding cosine) via reciprocal-rank fusion.

Keyword nails exact IDs, symbols, error strings, file names.
Semantic catches paraphrase and concept — "how should I reach customers" finds the record about outreach, with zero shared words.
Fused, they beat either alone.
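
The fusion step above can be sketched in a few lines. This is a minimal illustration of reciprocal-rank fusion in general, not engrim's actual code: the function name `rrf_fuse`, the constant `k=60` (a commonly used default) and the sample record ids are all assumptions.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of record ids with reciprocal-rank fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, record_id in enumerate(ranking, start=1):
            scores[record_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda rid: (-scores[rid], rid))

keyword_hits = [7, 3, 9]   # hypothetical bm25 order, best first
semantic_hits = [3, 5, 7]  # hypothetical cosine order, best first
fused = rrf_fuse([keyword_hits, semantic_hits])
print(fused)
```

Records that rank well in both lists (3 and 7 here) rise above records that appear in only one, which is the property the hybrid design relies on.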

Semantic is on out of the box — the embedding model (a fast static embedder, model2vec, no GPU and no per-prompt neural pass) ships as a dependency and engrim setup warms it for you. Want a zero-third-party-dependency, fully-offline posture instead? Set ENGRIM_EMBED=off and engrim runs pure-lexical on the standard library alone.

Clear often — keep the context that matters

Because engrim persists your project's decisions and reloads them at session start, you can /clear or /compact aggressively to keep your agent fast and lean — without losing the context you care about:

Capture as you go — engrim add ... the moment a decision or fact lands (your agent can do this for you).
Clear freely — when the session gets heavy, /clear. The bloat goes; the knowledge stays.
Reload lean — the SessionStart hook injects your project's memory pack, so the next session starts oriented in a few hundred tokens instead of dragging a giant window.

Externalize → clear → reload. A small, durable store beats a giant, expensive context window: it's lighter on rate limits and latency, and better for output quality (a clean, relevant slice gets sharper attention than 100k tokens of accumulated cruft).

Not sure it's safe to clear? engrim review scans your recent transcript for decisions that aren't in curated memory yet and flags them — so you capture them before you clear, not after you realize they're gone. It's deliberately honest: it nudges you when unsure rather than falsely reassuring you, and only says "looks safe to clear" when nothing's outstanding.

Continue-as-clear: pick up exactly where you left off

/clear banks the tokens; the boot pack brings back the spine. Two mechanics make the reload feel like --continue — full awareness of where you stopped, at boot-pack cost (hundreds of tokens, not a replayed transcript):

A pinned resume cursor. Tag any record resume-pointer and the newest one is pinned to the top of the boot pack, untruncated, under a [▶ RESUME HERE] banner — the first thing the next session reads. That's your cursor (the next action), not just your conclusions. Refresh it before a heavy clear (engrim supersede --id <old> --status superseded, then a fresh add … --tags resume-pointer) and the next boot opens on exactly what's next.
An open-task continuity tail. Even when you forget to curate, the boot pack surfaces the last stretch of uncaptured turns that carry open-loop cues — "pick this up later", "the next step is…", "still need to…" — so a mid-task clear can't drop your place. It's hard-capped (a few lines, its own tiny budget) so it never bloats context, and the "safe to clear?" nudge stays decision-only so it won't nag on ordinary work-in-progress.

Net effect: /clear aggressively and reopen oriented on what you were doing, not just what you decided.
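
As a rough sketch of how such a boot pack could be assembled: the newest resume-pointer record is pinned first and left untruncated, and the rest follows the type priority listed under "How it works" (user, feedback, state, decision, fact, reference), newest first within each type, up to a character budget. The record shape, the `budget_chars` value and the function name are hypothetical, and the open-task continuity tail is left out.

```python
PRIORITY = ["user", "feedback", "state", "decision", "fact", "reference"]

def build_boot_pack(records, budget_chars=1200):
    """records: dicts with 'type', 'summary', 'tags' and a sortable 'created'.

    budget_chars is a made-up cap for illustration only.
    """
    lines = []
    pointers = [r for r in records if "resume-pointer" in r["tags"]]
    pinned = max(pointers, key=lambda r: r["created"]) if pointers else None
    if pinned:
        # The pinned cursor goes first and is never truncated.
        lines.append("[▶ RESUME HERE] " + pinned["summary"])
    rest = [r for r in records if r is not pinned]
    # Two stable sorts: newest first, then by type priority.
    rest.sort(key=lambda r: r["created"], reverse=True)
    rest.sort(key=lambda r: PRIORITY.index(r["type"]))
    used = 0
    for r in rest:
        line = f"[{r['type']}] {r['summary']}"
        if used + len(line) > budget_chars:
            break  # budget reached; lower-priority records stay on disk
        lines.append(line)
        used += len(line)
    return lines

records = [
    {"type": "fact", "summary": "CI runs on 3.12", "tags": [], "created": 1},
    {"type": "decision", "summary": "Chose Postgres", "tags": ["db"], "created": 2},
    {"type": "state", "summary": "Next: write migration", "tags": ["resume-pointer"], "created": 3},
    {"type": "decision", "summary": "Use WAL", "tags": [], "created": 4},
]
pack = build_boot_pack(records)
```

Because the cap applies only to the prioritized body, a tight budget drops references and facts before it ever touches the resume cursor.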

"Isn't this just markdown notes / CLAUDE.md?"

No — and the difference is the whole point. Markdown memory is great until it outgrows the context window, and then it forces a bad trade: bloat the window, or truncate the file. (Claude Code literally warns you when a memory file gets too big.) engrim removes the choice — your memory grows unbounded while the context you load stays small and relevant.

Aspect	Markdown files	engrim
Loading	the whole file (or truncated)	a budget-capped relevant slice
Retrieval	"load it all and hope"	hybrid keyword + semantic ranking, on demand
Scale	context cost grows until it hits a wall	stays bounded as memory grows 100×
Structure	unstructured prose	types, tags, status, recency
Lifecycle	rots; you hand-prune	`supersede` stale records, keep history honest
Multi-project	per-folder, manual	one store, auto-scoped by project path
Programmatic	hand-edit	a queryable DB any tool or hook reads & writes

It's not anti-markdown and not a replacement for your agent's md memory. Think engine and supercharger: your CLAUDE.md is the engine — the hot working set your agent loads natively every session. engrim is the supercharger bolted alongside — the deep, queryable archive you pull from on demand so the window stays small. Clean division of labor: md = hot working set, db = deep archive. The seed-once context build below establishes that division without you babysitting it.

Day-1 context build, then db-canonical (seed-once)

The first time engrim runs for a project it does a one-time context build: it mirrors your existing markdown memory (e.g. Claude Code's per-project memory dir) into the store, so day one already carries your pre-install history. After that it steps aside — the store is canonical, sessions read from it, and new knowledge is logged via engrim add. It will not keep re-importing historical md over your live store (that would let stale history overwrite real work). Re-pull md deliberately any time with engrim sync --claude --force.

engrim sync ./docs            # build the store from a markdown dir (idempotent; re-runs are safe)
engrim sync --claude          # seed-once from Claude Code's memory dir for this project
engrim sync --claude --force  # deliberately rebuild from md (the only way to re-import)
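
The seed-once behavior can be pictured as a per-project flag checked before any import. The sketch below assumes a hypothetical `seeded` table and an `import_markdown` callback; engrim's real bookkeeping may look different.

```python
import sqlite3

def seed_once(conn, project, import_markdown, force=False):
    """Run import_markdown() only on a project's first sync, or when forced."""
    conn.execute("CREATE TABLE IF NOT EXISTS seeded (project TEXT PRIMARY KEY)")
    already = conn.execute(
        "SELECT 1 FROM seeded WHERE project = ?", (project,)
    ).fetchone()
    if already and not force:
        return False  # the store is canonical now; skip the re-import
    import_markdown()
    conn.execute("INSERT OR IGNORE INTO seeded (project) VALUES (?)", (project,))
    conn.commit()
    return True

conn = sqlite3.connect(":memory:")
calls = []
first = seed_once(conn, "my-app", lambda: calls.append("import"))
second = seed_once(conn, "my-app", lambda: calls.append("import"))
forced = seed_once(conn, "my-app", lambda: calls.append("import"), force=True)
```

The second call is a no-op, which is what keeps stale markdown from overwriting live records; only the forced call imports again.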

Two tiers: curated memory + the full-conversation log

engrim keeps two separate stores so they never compete:

Curated memory (memories) — the decisions/facts/feedback you (or your agent) record. Small, high-signal, and the only tier loaded into the boot pack. This is what keeps context lean.
Transcript log (log) — an append-only record of the raw back-and-forth, never injected into context. A flight recorder: complete, replayable, queried only on demand.

With engrim setup, a Stop hook tails Claude Code's own transcript JSONL into SQLite after each turn — reading only the new bytes (a per-session byte-offset cursor) and de-duping on each turn's stable uuid, so it's cheap and idempotent. Writing to the log costs zero tokens and zero context (it's a disk append), so you can /clear every single turn and lose nothing. Browse or search it with engrim logs [-q "..."].
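
A minimal sketch of that tailing pattern, assuming each transcript line is a JSON object with a `uuid` field. The table names (`log`, `log_cursor`), the column names and the function name are illustrative, not engrim's actual schema.

```python
import json
import os
import sqlite3
import tempfile

def tail_transcript(conn, path, session):
    """Copy unseen transcript lines into the log table; return rows added."""
    conn.execute("CREATE TABLE IF NOT EXISTS log (uuid TEXT PRIMARY KEY, body TEXT)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log_cursor "
        "(session TEXT PRIMARY KEY, byte_offset INTEGER)"
    )
    row = conn.execute(
        "SELECT byte_offset FROM log_cursor WHERE session = ?", (session,)
    ).fetchone()
    offset = row[0] if row else 0
    added = 0
    with open(path, "rb") as f:
        f.seek(offset)  # skip everything already ingested
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial line still being written; pick it up next turn
            offset += len(raw)
            if not raw.strip():
                continue
            turn = json.loads(raw)
            cur = conn.execute(
                "INSERT OR IGNORE INTO log (uuid, body) VALUES (?, ?)",
                (turn["uuid"], raw.decode("utf-8")),
            )
            added += cur.rowcount  # 0 when the uuid was already logged
    conn.execute(
        "INSERT OR REPLACE INTO log_cursor (session, byte_offset) VALUES (?, ?)",
        (session, offset),
    )
    conn.commit()
    return added

# Usage: two turns arrive, then one more; a re-run with no new bytes adds nothing.
path = os.path.join(tempfile.mkdtemp(), "transcript.jsonl")
conn = sqlite3.connect(":memory:")
with open(path, "w", encoding="utf-8") as f:
    f.write('{"uuid": "a", "text": "hi"}\n{"uuid": "b", "text": "ok"}\n')
first = tail_transcript(conn, path, "session-1")
with open(path, "a", encoding="utf-8") as f:
    f.write('{"uuid": "c", "text": "next"}\n')
second = tail_transcript(conn, path, "session-1")
third = tail_transcript(conn, path, "session-1")
```

The byte offset keeps each run proportional to the new bytes only, while the primary key on `uuid` makes a repeated or overlapping read harmless.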

engrim does not enlarge Claude's context window — nothing can. It lets you use the window efficiently: load a lean slice, keep the rest on disk, pull more only when you ask. That's the trade that keeps a big chat gentle on session and rate limits instead of re-billing a giant window every turn.

A global layer that follows you everywhere

Most memory is project-specific — this repo chose Postgres. But some truths are about you, not any one repo: who authors your code, how you like commits written, conventions you hold across every project. engrim keeps a single global user-layer for exactly those and co-loads it alongside whatever project you're in — so you state them once and they ride into every session, everywhere.

engrim add --global -t user -s "Sole author — never add Co-Authored-By trailers"

It's purely additive: the global layer rides alongside project memory — the boot pack, the minder, and recall all span both — never instead of it, and one project's own records never leak into another. Budget discipline is unchanged: the global layer flows through the same fair, capped boot pack, so it can't bloat your context. This is vertical layering (user ⊕ project), deliberately not arbitrary multi-project loading — that would dilute the lean slice that's the whole point. Don't want it? ENGRIM_NO_GLOBAL=1 turns it off and reads go back to project-only.
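
One way to picture the user ⊕ project layering is a read that spans exactly two scopes. The sketch assumes a `project` column on a `memories` table and a made-up `__global__` tag for the user layer; only the `ENGRIM_NO_GLOBAL` variable comes from the text above.

```python
import os
import sqlite3

GLOBAL_SCOPE = "__global__"  # hypothetical tag for the user layer

def scopes_for(project, env=os.environ):
    """Project scope plus the global layer, unless ENGRIM_NO_GLOBAL is set."""
    if env.get("ENGRIM_NO_GLOBAL"):
        return [project]
    return [project, GLOBAL_SCOPE]

def load_records(conn, project, env=os.environ):
    scopes = scopes_for(project, env)
    marks = ",".join("?" for _ in scopes)  # one bound placeholder per scope
    return conn.execute(
        f"SELECT project, summary FROM memories WHERE project IN ({marks}) ORDER BY id",
        scopes,
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, project TEXT, summary TEXT)")
conn.executemany(
    "INSERT INTO memories (project, summary) VALUES (?, ?)",
    [("app-a", "Chose Postgres"), ("app-b", "Chose Mongo"), (GLOBAL_SCOPE, "Sole author")],
)
with_global = load_records(conn, "app-a", env={})
project_only = load_records(conn, "app-a", env={"ENGRIM_NO_GLOBAL": "1"})
```

Records from `app-b` never appear in either result, which mirrors the "one project's records never leak into another" guarantee.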

How it works

Store: one SQLite file with an FTS5 virtual table kept in sync by triggers, plus an embedding table for the semantic tier. WAL mode for safe concurrency.
Hybrid recall: bm25 lexical rank fused with embedding cosine (reciprocal-rank fusion). Embeddings are computed once at write time (add auto-embeds) and stored; per prompt the minder embeds only the short query and does cosine — no per-turn neural pass.
The four-hook loop (engrim setup wires it, idempotently):
- SessionStart → engrim hook: on a project's first run, seed-once from file-memory; then inject the budget-capped boot pack (you see exactly what was restored). It also reconciles recent transcripts, so a session that crashed or hard-closed (skipping Stop/SessionEnd) gets its dropped tail swept up on the next boot — idempotent.
- UserPromptSubmit → engrim assist: the minder. Ranks the store against your prompt and injects only the top few records (gated on ≥2 substantive terms; ~150-token cap; hits-only).
- Stop → engrim log --hook: tails the transcript JSONL into the append-only log table — new bytes only, de-duped by uuid. Never enters context.
- SessionEnd → engrim sync --claude: a final, seed-gated mirror (no-op once seeded).
Boot-pack priority: user → feedback → state → decision → fact → reference, recent-first within each type, capped to a character budget so it loads cheaply.
Continue-as-clear: the newest resume-pointer-tagged record is pinned to the top of the pack, untruncated, under [▶ RESUME HERE]; the pack's recency tail also surfaces recent uncaptured open-task/decision turns — so a clear mid-task still reloads where you left off, not just what you decided.
Perf: each hook is a short-lived process, off the critical path except UserPromptSubmit, where it's invisible against model latency. A project with nothing embedded never pays the model-load cost.
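
The store layout described above (a base table, an FTS5 index kept in sync by triggers, bm25 ranking) follows a standard SQLite pattern. Here is a small sketch of it, assuming a Python build whose SQLite includes FTS5; the table, column and trigger names are illustrative rather than engrim's real schema, and the embedding table is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memories (id INTEGER PRIMARY KEY, summary TEXT);
CREATE VIRTUAL TABLE memories_fts USING fts5(
    summary, content='memories', content_rowid='id'
);
-- Triggers mirror every write on the base table into the index.
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, summary) VALUES (new.id, new.summary);
END;
CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, summary)
    VALUES ('delete', old.id, old.summary);
END;
CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, summary)
    VALUES ('delete', old.id, old.summary);
    INSERT INTO memories_fts(rowid, summary) VALUES (new.id, new.summary);
END;
""")
conn.executemany(
    "INSERT INTO memories (summary) VALUES (?)",
    [("Chose Postgres over Mongo for relational integrity",),
     ("Deploy with docker compose",)],
)

def keyword_search(conn, term):
    # bm25() returns lower values for better matches, so ascending order is best-first.
    return conn.execute(
        "SELECT rowid FROM memories_fts WHERE memories_fts MATCH ? "
        "ORDER BY bm25(memories_fts)",
        (term,),
    ).fetchall()

before_update = keyword_search(conn, "postgres")
conn.execute("UPDATE memories SET summary = 'Chose SQLite for simplicity' WHERE id = 1")
after_update = keyword_search(conn, "postgres")
```

Since the triggers do the bookkeeping, callers only ever write to the base table and the index cannot drift from it.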

Configuration

Everything is controllable by flag or environment variable — nothing is hard-coded.

Env var	Default	Purpose
`ENGRIM_DB`	`~/.engrim/memory.db`	Path to the SQLite store. Point many machines/containers at one shared file.
`ENGRIM_PROJECT`	(unset)	A stable project tag. Set it so the same project resolves identically across host and containers.
`ENGRIM_EMBED`	(on)	Semantic recall. On by default; set to `off` for pure-lexical, zero-third-party-dependency mode.
`ENGRIM_EMBED_MODEL`	`minishlab/potion-base-8M`	The embedding model (any model2vec static model).
`ENGRIM_NO_GLOBAL`	(unset)	Set to disable the global user-layer co-load, so reads are project-only.

Project-tag precedence: --project flag → $ENGRIM_PROJECT → git root of cwd → cwd.

git root means you can launch from any subdirectory of a repo and get the same memory.
$ENGRIM_PROJECT gives a stable tag when paths differ (a host's /home/you/app vs a container's /app).
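
The precedence chain can be sketched as a single function. This version finds the git root by walking up the directory tree looking for a `.git` entry rather than calling `git`; the function name and that lookup strategy are assumptions, not a description of engrim's internals.

```python
import os
import tempfile
from pathlib import Path

def resolve_project(flag=None, env=None, cwd=None):
    """Return the project tag: flag, then $ENGRIM_PROJECT, then git root, then cwd."""
    env = os.environ if env is None else env
    if flag:
        return flag
    if env.get("ENGRIM_PROJECT"):
        return env["ENGRIM_PROJECT"]
    start = Path(cwd or os.getcwd()).resolve()
    for directory in (start, *start.parents):
        if (directory / ".git").exists():  # a directory, or a file in worktrees
            return str(directory)
    return str(start)

# Usage: a throwaway repo layout with a nested working directory.
root = Path(tempfile.mkdtemp()).resolve()
(root / ".git").mkdir()
nested = root / "src" / "pkg"
nested.mkdir(parents=True)

from_subdir = resolve_project(cwd=nested, env={})
from_env = resolve_project(cwd=nested, env={"ENGRIM_PROJECT": "my-app"})
from_flag = resolve_project(flag="override", cwd=nested, env={"ENGRIM_PROJECT": "my-app"})
```

Launching from `src/pkg` resolves to the repository root, while the environment variable and the flag each override everything below them in the chain.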

Docker — works with and without it

engrim needs no Docker. But if you live in containers, you can give every container and the host one shared brain in two steps:

Share the store — mount one host directory into each container and point ENGRIM_DB at it.
Stabilize the tag — set ENGRIM_PROJECT so a project resolves the same name everywhere.

# examples/docker-compose.yml
services:
  app:
    image: your-app
    environment:
      ENGRIM_DB: /memory/memory.db      # shared store inside the container
      ENGRIM_PROJECT: my-app            # stable tag across host + every container
    volumes:
      - ~/.engrim:/memory               # one host dir, shared by all containers

SQLite runs in WAL mode (concurrent readers, serialized writers) — ideal for a shared local volume. (Avoid a single store over a network filesystem like NFS.)
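
To see why WAL suits a store shared by several processes, the sketch below opens two connections to one file: a reader keeps working while a writer holds an open transaction. The temporary path stands in for the shared `ENGRIM_DB` file, and the `memories` schema and the 5-second timeout are illustrative.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "memory.db")  # stand-in for the shared store

writer = sqlite3.connect(db_path, timeout=5.0)  # wait up to 5 s if another writer holds the lock
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, summary TEXT)")
writer.commit()

# Start a write and leave the transaction open.
writer.execute("INSERT INTO memories (summary) VALUES (?)", ("Chose Postgres",))

# A second connection can still read; it sees the last committed state.
reader = sqlite3.connect(db_path, timeout=5.0)
before = reader.execute("SELECT COUNT(*) FROM memories").fetchone()[0]

writer.commit()
after = reader.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
```

Writers still take turns, so the timeout matters when several containers write at once: a second writer waits for the lock instead of failing immediately.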

Commands

command	what it does
`engrim setup`	wire the four-hook loop into Claude Code + warm the semantic model (one-time, idempotent)
`engrim add`	write a memory: `-t TYPE -s "summary" [-d detail] [--tags a,b]` (auto-embeds); `--global` writes the user-layer that loads in every project
`engrim recall -q "..."`	hybrid keyword + semantic recall for the current project
`engrim assist`	the minder — UserPromptSubmit hook: auto-inject the relevant slice for a prompt
`engrim context`	priority-ordered, budget-capped session-boot pack
`engrim review`	"safe to clear" coverage check — flags recent decisions in the log not yet in curated memory
`engrim sync [DIR|--claude]`	seed-once md→store mirror (upsert + reconcile); `--force` to rebuild
`engrim import <path>`	bulk-import markdown notes as records (insert-only)
`engrim embed`	backfill embeddings (rarely needed — `add` auto-embeds; use after a model change)
`engrim logs [-q "..."]`	browse/search the transcript log (kept out of the boot pack)
`engrim supersede --id N`	mark a record `superseded`/`done`
`engrim list` · `projects` · `stats`	browse and summarize (stats reports context economics)
`engrim mcp`	run engrim as an MCP server (stdio) for any MCP client

Record types: decision, fact, feedback, state, user, reference. Stale records are superseded rather than deleted, so history stays honest.
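
Supersession is a status change rather than a delete. A minimal sketch of the idea, assuming a `status` column whose default value is `active` (that default name and the schema are guesses; only `superseded` and `done` appear in the command table above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, summary TEXT, "
    "status TEXT NOT NULL DEFAULT 'active')"
)
conn.executemany(
    "INSERT INTO memories (summary) VALUES (?)",
    [("Chose Mongo",), ("Chose Postgres over Mongo",)],
)

def supersede(conn, record_id, status="superseded"):
    """Retire a record by changing its status; the row itself is kept."""
    if status not in ("superseded", "done"):
        raise ValueError(f"unsupported status: {status}")
    conn.execute("UPDATE memories SET status = ? WHERE id = ?", (status, record_id))
    conn.commit()

supersede(conn, 1)
live = conn.execute("SELECT summary FROM memories WHERE status = 'active'").fetchall()
history = conn.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
```

Reads that filter on status see only the current decision, while the retired row remains available for anyone auditing how the project got there.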

Use it as an MCP server

Prefer to wire engrim in through the Model Context Protocol? It ships a server too, with no extra dependency, using the stdio transport (newline-delimited JSON-RPC) over engrim's zero-dep core:

claude mcp add engrim -- engrim mcp        # Claude Code

…or point any MCP client at the engrim mcp command. It exposes three tools backed by the exact recall / boot-pack / write logic the CLI uses:

tool	does
`engrim_recall`	hybrid keyword + semantic search over the project's records
`engrim_context`	the curated, budget-capped session-boot pack
`engrim_add`	write a durable record (`decision`/`fact`/`feedback`/`state`/`user`/`reference`)

Project scope is the client's working directory by default ("auto"), same as the CLI. The hook loop and the MCP server are two front doors to one store — use either, or both.
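
For a sense of what travels over that stdio transport, here is a sketch that encodes one MCP `tools/call` request as a single JSON-RPC line. The envelope follows the MCP specification; the `query` argument name is an assumption, since the tool's input schema is not shown here, and a real client would complete the `initialize` handshake before sending it.

```python
import json

def tool_call(request_id, tool, arguments):
    """Encode one MCP tools/call request as a newline-delimited JSON-RPC line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes embedded newlines, so the message stays on one line.
    return json.dumps(message) + "\n"

# "query" is a hypothetical argument name for engrim_recall.
line = tool_call(1, "engrim_recall", {"query": "which database did we choose"})
decoded = json.loads(line)
```

In practice an MCP client library handles this framing for you; the sketch only shows that nothing beyond the standard library is needed to speak it.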

Security & privacy

engrim is built to be safe to run without a second thought. Full posture in SECURITY.md.

No telemetry, never phones home. engrim itself opens no sockets. The one exception is a one-time download of the embedding model from HuggingFace on first run (cached forever after, then fully offline). Run ENGRIM_EMBED=off for a no-network, pure-standard-library posture.
Minimal dependencies. The core is pure standard library; the semantic tier adds model2vec and its well-known ML deps (numpy, safetensors, tokenizers, huggingface-hub). Lexical mode pulls none of them.
No dangerous primitives. No eval/exec/pickle/subprocess/shell, and no yaml.load (frontmatter is parsed by hand) — the classic RCE vectors are designed out.
No SQL-injection surface. Every input is bound via ?; search terms are tokenized/quoted before reaching SQLite, so code symbols and punctuation can't inject query operators.
Private on disk. The store is created owner-only (0600) on POSIX.
Your data never ships to git. *.db is gitignored; you commit the tool, never your memory.
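
The "tokenized/quoted before reaching SQLite" point can be illustrated with a short sketch: free text is split into word tokens, each token is wrapped in double quotes so FTS5 treats it as a literal term, and the resulting string is passed as a bound parameter. The function names, the OR-joining and the `notes` table are assumptions, and the example needs a SQLite build with FTS5.

```python
import re
import sqlite3

def fts_query(text):
    """Turn free text into an FTS5 query of quoted terms joined by OR."""
    terms = re.findall(r"\w+", text)  # word tokens never contain a double quote
    return " OR ".join('"' + term + '"' for term in terms)

def search(conn, text):
    query = fts_query(text)
    if not query:
        return []  # nothing searchable, so skip the MATCH entirely
    # The query string is bound with ?, never spliced into the SQL text.
    return conn.execute(
        "SELECT body FROM notes WHERE notes MATCH ?", (query,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE notes USING fts5(body)")
conn.execute("INSERT INTO notes (body) VALUES (?)", ("call db.connect() before retry",))

built = fts_query('db.connect() AND "x" NOT')
hits = search(conn, 'db.connect() AND "x" NOT')
empty = search(conn, "()")
```

Once quoted, words such as AND and NOT are searched for as plain terms instead of being parsed as operators, so code symbols and punctuation in a prompt cannot change the shape of the query.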

Testing

A real test suite covers persistence, hybrid + semantic recall, supersession, the boot pack, the hook JSON contract, git-root tagging, and the env overrides. CI runs it across Python 3.10–3.13.

pip install pytest
pytest -q

Limitations & edge cases (the honest list)

engrim is small and opinionated on purpose. Known edges, stated plainly:

It remembers what gets recorded, not everything you say. This is the most important one. Curated memory holds what you (or the agent) add; the boot pack and the minder only surface that. If a decision is discussed but never recorded before the session closes, it isn't in curated memory — capture matters. (The transcript log keeps the raw back-and-forth, but it's a flight recorder you query deliberately, not what the minder reasons over.)
Retrieval, not comprehension. The minder ranks records by relevance to your prompt — fast and good, but it doesn't understand them. Hybrid ranking can still over-weight a stray keyword match on very abstract queries; rerank and adaptive fusion are on the roadmap.
First-run model download. The semantic tier fetches a small model from HuggingFace once (engrim setup does this up front with a visible message). Offline/locked host? ENGRIM_EMBED=off runs lexical with no network.
It doesn't enlarge the context window. Nothing can. It lets you use the window efficiently.
Monorepos: git-root tagging treats the whole repo as one project. For per-package memory, set ENGRIM_PROJECT per package.
Docker on macOS/Windows: SQLite over a Docker Desktop bind mount can hit filesystem-locking flakiness. Use a named volume there; native Linux bind mounts are fine.
Single-user by design. No auth or multi-tenant model — it's your local memory, not a team server.
Moved a project folder? Memory stays under the old path tag. Re-tag by exporting/re-adding, or pin a stable ENGRIM_PROJECT.

Agent support

engrim's core is agent-agnostic — a CLI over a SQLite file, so anything that can run a shell command can use it. The auto-load integration (hooks that inject memory automatically) is Claude Code-first in these early releases; the brain is universal, only that adapter is per-agent. Broader provider support is on the roadmap. Open SQLite, no lock-in.

Why it's free

engrim is MIT-licensed and free because the people doing the hardest builds shouldn't have to lose their context to do them. If it saves you a fraction of what it would've saved me, that's the point. Ideas and pull requests welcome.

License

MIT.


Release history

Version	Release date
1.0.0	Jun 24, 2026
0.8.1	Jun 23, 2026 (this version)
0.8.0	Jun 23, 2026
0.7.2	Jun 23, 2026
0.7.1	Jun 23, 2026
0.7.0	Jun 23, 2026

Download files

Source Distribution

engrim-0.8.1.tar.gz (64.7 kB)

Uploaded Jun 23, 2026 Source

Built Distribution

engrim-0.8.1-py3-none-any.whl (45.8 kB)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file engrim-0.8.1.tar.gz.

File metadata

Download URL: engrim-0.8.1.tar.gz
Upload date: Jun 23, 2026
Size: 64.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for engrim-0.8.1.tar.gz
Algorithm	Hash digest
SHA256	`9b8ac8733b5af941f329863d51419a7406bc88093a0ba149862828562f6d1b8c`
MD5	`48f6177caddac404fe49f4bbcdcd28cd`
BLAKE2b-256	`4a3ab30abee495fd1ceba0317122567fccbab5281443f83f3d8afe8c323e3bcd`

Provenance

The following attestation bundles were made for engrim-0.8.1.tar.gz:

Publisher: release.yml on timgordontg/engrim

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: engrim-0.8.1.tar.gz
- Subject digest: 9b8ac8733b5af941f329863d51419a7406bc88093a0ba149862828562f6d1b8c
- Sigstore transparency entry: 1932416284
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: timgordontg/engrim@a77d16f3011389eb11b56db5e17c1dae58ea951c
- Branch / Tag: refs/tags/v0.8.1
- Owner: https://github.com/timgordontg
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a77d16f3011389eb11b56db5e17c1dae58ea951c
- Trigger Event: release

File details

Details for the file engrim-0.8.1-py3-none-any.whl.

File metadata

Download URL: engrim-0.8.1-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 45.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for engrim-0.8.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7bbca1901ea01caf4f08d39a0c9afc91051db533d0b1c13ad92a22ff4be6c2f0`
MD5	`0c25f1ea5731506a604cfc59f906d6eb`
BLAKE2b-256	`8aa9f15cdcf751001bffcc9606c1028eb200180c62ac2e28e3e8f992ed082b25`

Provenance

The following attestation bundles were made for engrim-0.8.1-py3-none-any.whl:

Publisher: release.yml on timgordontg/engrim

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: engrim-0.8.1-py3-none-any.whl
- Subject digest: 7bbca1901ea01caf4f08d39a0c9afc91051db533d0b1c13ad92a22ff4be6c2f0
- Sigstore transparency entry: 1932416329
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: timgordontg/engrim@a77d16f3011389eb11b56db5e17c1dae58ea951c
- Branch / Tag: refs/tags/v0.8.1
- Owner: https://github.com/timgordontg
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a77d16f3011389eb11b56db5e17c1dae58ea951c
- Trigger Event: release
