Hierarchical memory with consolidation and principled decay for LLM agents and assistants.

Engram

Engram is a memory layer for LLM systems that does what existing memory libraries don't: it consolidates. Raw events get abstracted into general patterns, redundant or contradicted memories decay, and retrieval reads across a hierarchy from specific episodes to compressed knowledge — the way human memory actually works.

It is designed as a single primitive that serves both agents (procedural memory: "in situations like this, this approach worked") and assistants (semantic memory: "the user has a golden retriever they care about"), with the same algorithm and the same API.

Why Engram exists

Most production LLM systems today have a memory problem. The familiar complaint is "it doesn't remember me," and the typical solution is some variation of "dump everything into a vector store and search by cosine similarity."

That isn't memory. It's a logbook with a search bar.

Real memory does three things that current systems don't:

Consolidates. Many specific events become one general principle. "User mentioned dog in conv 3, vet in conv 7, kibble in conv 12" becomes "user has a dog they actively care about."
Forgets selectively. Routine, redundant, or contradicted memories decay; surprising, frequently-used, or recently-relevant ones strengthen.
Reads across abstractions. Retrieval sometimes wants the general pattern, sometimes the specific episode, sometimes both. Flat stores can't do this cleanly.

Engram is built around these three principles.

The core idea

Engram organizes memory as a three-level hierarchy, with a decay function applied to the memories in it:

                      ┌──────────────────────────────┐
                      │   Consolidated abstractions  │   ← retrieved when general
                      │   (semantic / procedural)    │      patterns suffice
                      └──────────────▲───────────────┘
                                     │  consolidation
                                     │  (clustering + abstraction)
                      ┌──────────────┴───────────────┐
                      │   Mid-level summaries        │
                      │   (episode clusters)         │
                      └──────────────▲───────────────┘
                                     │
                      ┌──────────────┴───────────────┐
                      │   Raw event log              │   ← retrieved when
                      │   (episodic memory)          │      specifics matter
                      └──────────────────────────────┘
                                     │
                                     ▼
                              decay function
                          (recency, reinforcement,
                           corroboration, contradiction)

The four moving parts:

Event log. Every observation lands here first, with provenance and timestamp.
Consolidation pass. Periodically (or on trigger), the system clusters related events, extracts abstractions, and links them to their supporting evidence.
Decay. Memories at every level have weights driven by reinforcement (was this retrieved? did it lead to a successful outcome?), corroboration (how many independent events support it?), contradiction (was it ever overruled?), and recency.
Hierarchical retrieval. Queries read across the whole hierarchy, preferring abstractions when they suffice and drilling into specifics when they don't.
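
One way to picture the parts above is as plain records: events that carry provenance and a timestamp, and abstractions that point back at the events supporting them. The sketch below is illustrative only. `Event`, `Abstraction`, and all of their fields are hypothetical names chosen for this example, not Engram's actual internals.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    """A raw observation in the event log (hypothetical shape)."""
    event_id: str
    content: str
    source: str  # provenance, e.g. a conversation or tool-call id
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    weight: float = 1.0


@dataclass
class Abstraction:
    """A consolidated memory linked to its supporting evidence."""
    content: str
    supported_by: list[str]  # ids of the events this was derived from
    weight: float = 1.0

    def corroboration(self) -> int:
        # Count distinct supporting events.
        return len(set(self.supported_by))


events = [
    Event("e1", "User mentioned dog", source="conv-3"),
    Event("e2", "User mentioned vet", source="conv-7"),
    Event("e3", "User asked about kibble", source="conv-12"),
]
pattern = Abstraction(
    content="User has a dog they actively care about.",
    supported_by=[e.event_id for e in events],
)
print(pattern.corroboration())  # 3
```

Keeping the event ids on the abstraction is what lets retrieval drill from a general pattern back down to the episodes behind it.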

Quick start

pip install engrampy

The PyPI distribution name is engrampy. The Python import name is just engram (so from engram import Memory works as expected). The bare engram and engram-memory names on PyPI are squatted by unrelated parties — PEP 541 reclaim requests are pending.

from engram import Memory

memory = Memory(
    backend="sqlite",                    # or "postgres", "duckdb" (the roadmap lists Postgres under v0.4)
    embedding_model="text-embedding-3-small",
    consolidation_model="gpt-4o-mini",   # used for abstraction extraction
    consolidation_interval=50,           # consolidate every 50 events
)

That's the whole setup. Engram is model-agnostic — bring whichever LLM and embedding model you want for the consolidation and retrieval steps.

Usage

As assistant memory

# Observe events as they happen
memory.observe("User mentioned they have a golden retriever named Max.")
memory.observe("User asked about senior dog food.")
memory.observe("User said Max is 9 years old and slowing down.")

# ... many sessions later ...

context = memory.retrieve("what should I know about the user's pets?")
# Returns:
# [
#   {
#     "level": "abstraction",
#     "content": "User has an aging golden retriever (Max, ~9yo) and is
#                 actively researching senior dog care.",
#     "confidence": 0.91,
#     "supported_by": [event_ids...],
#   },
#   ... lower-level episodes available if needed
# ]
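
Retrieved memories still have to be turned into prompt context. The helper below sketches one way to do that, assuming results are dictionaries with the `level`, `content`, and `confidence` keys shown in the comment above. `format_context`, the `min_confidence` cutoff, and the `"episode"` level value are hypothetical and not part of Engram's documented API.

```python
def format_context(results, min_confidence=0.5):
    """Render retrieved memories as a bulleted block for a system prompt."""
    lines = []
    for item in results:
        # Skip memories whose reported confidence is below the cutoff.
        if item.get("confidence", 0.0) < min_confidence:
            continue
        lines.append(f"- [{item['level']}] {item['content']}")
    return "\n".join(lines)


# Stand-in for the return value of memory.retrieve(...).
# The "episode" level name is an assumption made for this example.
sample_results = [
    {"level": "abstraction",
     "content": "User has an aging golden retriever (Max, ~9yo).",
     "confidence": 0.91},
    {"level": "episode",
     "content": "User asked about senior dog food.",
     "confidence": 0.40},
]

print(format_context(sample_results))
# - [abstraction] User has an aging golden retriever (Max, ~9yo).
```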

As agent procedural memory

# Record what the agent tried and what happened
memory.observe({
    "type": "procedure",
    "situation": "API returned 429 rate limit",
    "action": "exponential backoff with jitter, retry up to 5x",
    "outcome": "success",
})

# Later, in a new situation
procedures = memory.retrieve_procedures(
    situation="hitting 503 errors from downstream service"
)
# Returns consolidated procedures from analogous past situations,
# ranked by how often they worked.
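
"Ranked by how often they worked" can be sketched as a success-rate ranking over past records. The example below is an assumption about how such a ranking could work, not Engram's implementation: `rank_procedures` is a hypothetical helper, and the Laplace smoothing is a choice made here for illustration.

```python
from collections import defaultdict


def rank_procedures(records):
    """Rank actions by smoothed success rate across past records."""
    counts = defaultdict(lambda: [0, 0])  # action -> [successes, attempts]
    for rec in records:
        tally = counts[rec["action"]]
        tally[1] += 1
        if rec["outcome"] == "success":
            tally[0] += 1
    # Laplace smoothing, (s + 1) / (n + 2), pulls small samples toward 0.5,
    # so one success out of one attempt (0.67) ranks below ten out of ten (0.92).
    scored = [
        (action, (s + 1) / (n + 2)) for action, (s, n) in counts.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


records = [
    {"action": "exponential backoff", "outcome": "success"},
    {"action": "exponential backoff", "outcome": "success"},
    {"action": "exponential backoff", "outcome": "failure"},
    {"action": "immediate retry", "outcome": "success"},
    {"action": "immediate retry", "outcome": "failure"},
    {"action": "immediate retry", "outcome": "failure"},
]
for action, score in rank_procedures(records):
    print(f"{action}: {score:.2f}")
# exponential backoff: 0.60
# immediate retry: 0.40
```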

Manual consolidation and inspection

# Trigger consolidation explicitly (e.g. during downtime)
memory.consolidate()

# Inspect what's been consolidated
memory.summary()

# Resolve a contradiction manually
memory.reconcile(memory_id="...", resolution="prefer_recent")

How it works under the hood

Consolidation

When triggered, the consolidation pass:

Clusters recent unconsolidated events using embedding similarity, with a configurable cohesion threshold.
Extracts abstractions from each cluster using a cheap LLM call. The prompt is structured to produce generalizations, not summaries.
Links abstractions to their supporting events (provenance is always preserved).
Detects contradictions with existing abstractions and flags them for resolution.
Promotes stable, frequently-corroborated abstractions to higher levels of the hierarchy.
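
The clustering step can be illustrated with a greedy, threshold-based pass over embeddings. This is a minimal sketch under stated assumptions: the source says only that clustering uses embedding similarity with a configurable cohesion threshold, so the greedy centroid strategy, the `cluster_events` name, and the `cohesion_threshold` default are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_events(embeddings, cohesion_threshold=0.8):
    """Greedy clustering: join the first cluster whose centroid is close enough."""
    clusters = []  # each cluster: {"members": [indices], "centroid": [floats]}
    for idx, vec in enumerate(embeddings):
        for cluster in clusters:
            if cosine(vec, cluster["centroid"]) >= cohesion_threshold:
                cluster["members"].append(idx)
                n = len(cluster["members"])
                # Update the running mean of the cluster's vectors.
                cluster["centroid"] = [
                    c + (v - c) / n for c, v in zip(cluster["centroid"], vec)
                ]
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append({"members": [idx], "centroid": list(vec)})
    return [c["members"] for c in clusters]


# Toy 2-D "embeddings": the first two point the same way, the third does not.
toy = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]]
print(cluster_events(toy))  # [[0, 1], [2]]
```

Each resulting cluster would then be handed to the abstraction-extraction call, with the member indices retained as provenance.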

Decay

Each memory item carries a weight $w \in [0, 1]$. The weight evolves as:

$$ w_{t+1} = w_t \cdot \alpha^{\Delta t} + \beta \cdot r_t + \gamma \cdot c_t - \delta \cdot x_t $$

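Where $\Delta t$ is the time elapsed since the previous update, $r_t$ is reinforcement (was it retrieved and useful?), $c_t$ is new corroboration, and $x_t$ is contradiction. The coefficients $\alpha, \beta, \gamma, \delta$ are tunable; for the first term to act as decay, $\alpha$ must lie in $(0, 1)$. Items below a threshold are pruned (or kept as cold storage, depending on configuration).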
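
The update rule translates directly into code. The sketch below is not Engram's implementation: `update_weight` is a hypothetical function, the coefficient defaults are arbitrary illustrative values, and the clamp to $[0, 1]$ is an assumption added here because the formula on its own can produce values outside that range.

```python
def update_weight(w, dt, r=0.0, c=0.0, x=0.0,
                  alpha=0.99, beta=0.2, gamma=0.1, delta=0.3):
    """One step of w * alpha**dt + beta*r + gamma*c - delta*x, clamped to [0, 1]."""
    raw = w * alpha ** dt + beta * r + gamma * c - delta * x
    # Clamping is an assumption of this sketch, not something the source states.
    return min(1.0, max(0.0, raw))


w = 0.8
w = update_weight(w, dt=10)        # ten idle time units: passive decay only
print(round(w, 3))                 # 0.724
w = update_weight(w, dt=1, r=1.0)  # retrieved and useful
print(round(w, 3))                 # 0.916
w = update_weight(w, dt=1, x=1.0)  # contradicted
print(round(w, 3))                 # 0.607
```

With these values a single contradiction outweighs a single reinforcement, which is one way to make stale beliefs fall quickly; the right balance is exactly what the tunable coefficients are for.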

Retrieval

Retrieval is coarse-to-fine by default: search abstractions first, then drill into supporting episodes only if the query demands specifics or the top-level results are low-confidence. The aim is retrieval that is both faster than flat vector search (fewer items to scan at the top level) and better grounded (each abstraction links back to its evidence).
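
A coarse-to-fine pass can be sketched in a few lines. The example below covers only the low-confidence trigger, not the "query demands specifics" case, and uses a toy word-overlap score in place of embedding similarity. The `retrieve` function here, the `min_confidence` parameter, and the record shapes are all hypothetical stand-ins rather than Engram's API.

```python
def overlap(query, text):
    """Toy relevance score: fraction of query words found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def retrieve(query, abstractions, episodes, min_confidence=0.7, top_k=1):
    """Search abstractions first; add supporting episodes only when unsure."""
    ranked = sorted(
        abstractions,
        key=lambda a: overlap(query, a["content"]),
        reverse=True,
    )[:top_k]
    results = []
    for abstraction in ranked:
        results.append({"level": "abstraction", **abstraction})
        if abstraction["confidence"] < min_confidence:
            # Low confidence: drill into the evidence behind this abstraction.
            for event_id in abstraction["supported_by"]:
                results.append({"level": "episode", "id": event_id,
                                "content": episodes[event_id]})
    return results


abstractions = [
    {"content": "user has an aging dog named Max",
     "confidence": 0.91, "supported_by": ["e1", "e2"]},
    {"content": "user may be moving to Berlin",
     "confidence": 0.45, "supported_by": ["e3"]},
]
episodes = {
    "e1": "User mentioned a golden retriever named Max.",
    "e2": "User said Max is 9 years old.",
    "e3": "User asked about Berlin rents.",
}

print([r["level"] for r in retrieve("dog named Max", abstractions, episodes)])
# ['abstraction']
print([r["level"] for r in retrieve("moving to Berlin", abstractions, episodes)])
# ['abstraction', 'episode']
```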

What makes Engram different

Capability	Flat vector store	mem0 / similar	Engram
Stores raw events	✓	✓	✓
Summarization	—	✓	✓
Multi-level hierarchy	—	—	✓
Principled decay	—	partial	✓
Contradiction handling	—	—	✓
Provenance tracking	—	partial	✓
Procedural memory	—	—	✓
Coarse-to-fine retrieval	—	—	✓

The headline difference is that Engram treats memory as a living hierarchy that changes over time, not as a static append-only store with a search index. The intended downstream effects (better recall on long conversations, transferable agent procedures, graceful handling of stale or contradictory information) follow from that design.

Benchmarks

The success criterion for Engram is beating state-of-the-art on long-horizon memory benchmarks, not just being correct in principle.

Tracked suites:

LongMemEval — long-horizon conversational memory.
LoCoMo — multi-session dialogue with memory recall (especially the temporal and adversarial splits, where flat RAG breaks).
Custom procedural transfer benchmark — does an agent with Engram do better on tasks it has seen analogues of? (Constructed from agent traces.)

Tracked baselines: Chroma (flat dense), Chroma + BM25 (hybrid), Letta / MemGPT, Zep / Graphiti, Cognee, HippoRAG, mem0, A-MEM, full-context (as upper bound).

The full plan — targets, why-we-think-we-can-win, and the reproducibility discipline — is in benchmarks/SOTA.md. The running comparison is in benchmarks/SCOREBOARD.md. A claim of "we beat X" requires a committed manifest in benchmarks/runs/; without one it doesn't count.

Roadmap

Stage-by-stage breakdown — including cross-cutting standards on speed, quality, and security — in ROADMAP.md. High-level milestones:

v0.1 — Core primitive. Event log, basic consolidation, decay, coarse-to-fine retrieval. SQLite backend. Reference benchmarks against flat vector store.

v0.2 — Procedural memory. First-class support for action/situation/outcome triples. Procedure retrieval API. Agent-framework integrations (LangGraph, LlamaIndex, raw OpenAI).

v0.3 — Contradiction and temporal reasoning. Trust-weighted conflict resolution, temporal segmentation ("X was true until March"), explicit invalidation.

v0.4 — Multi-tenant and production. Postgres backend, async API, observability, memory inspector UI.

v1.0 — Paper + stable API. Frozen public API, full benchmark suite, peer-reviewed paper.

Research

Engram is an applied research project as much as a library. The paper-track contributions:

A formal framing of memory as a hierarchical decay process with measurable consolidation quality.
Algorithmic choices (when to consolidate, what to abstract, how to decay) with ablations.
A unified primitive for episodic→semantic consolidation and episodic→procedural abstraction.
Benchmarks against existing memory libraries and flat baselines.

Drafts and notes live in /research.

Citation

If you use Engram in research, please cite (citation will be added when the paper is on arXiv):

@misc{engram2026,
  title  = {Engram: Hierarchical Memory with Consolidation and Decay for LLM Systems},
  author = {[your name]},
  year   = {2026},
  note   = {Preprint forthcoming},
}

Contributing

Engram is early. The most useful contributions right now:

Benchmark runs — reproducing baselines, finding failure modes.
Algorithmic experiments — alternative consolidation strategies, decay functions, retrieval policies.
Integrations — bindings for popular agent/RAG frameworks.
Edge cases — adversarial conversations or agent traces that break the current implementation.

See CONTRIBUTING.md for setup and conventions.

License

MIT. See LICENSE.

Acknowledgments

Engram draws on ideas from cognitive neuroscience (complementary learning systems, episodic-to-semantic consolidation, Ebbinghaus decay), spaced repetition systems, and prior memory libraries (mem0, Letta/MemGPT, Zep). Standing on shoulders.

Release history

0.3.0: May 18, 2026
0.2.1: May 12, 2026
0.2.0: May 12, 2026 (this version)

Download files

Source Distribution

engrampy-0.2.0.tar.gz (288.0 kB)

Uploaded May 12, 2026 Source

Built Distribution

engrampy-0.2.0-py3-none-any.whl (194.2 kB)

Uploaded May 12, 2026 Python 3

File details

Details for the file engrampy-0.2.0.tar.gz.

File metadata

Download URL: engrampy-0.2.0.tar.gz
Upload date: May 12, 2026
Size: 288.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for engrampy-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`0a4d1c09e653e9712ea1f15f4f3b2a81dd0414823914c063d8c41d367ddf2c52`
MD5	`5cdf1930ec2b1599a583cf0fd4443560`
BLAKE2b-256	`5e8127403f8f2ae6eb74a0759879176e2ef56d4b12698cce4222fd0a4c13abfd`

File details

Details for the file engrampy-0.2.0-py3-none-any.whl.

File metadata

Download URL: engrampy-0.2.0-py3-none-any.whl
Upload date: May 12, 2026
Size: 194.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for engrampy-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0a1d7689d61e9ae20ef85f564f8da568a527c2f63f476851b29e5c96cf56d326`
MD5	`a5ff82f10c765459d97f5ed367ffc31b`
BLAKE2b-256	`f4f4f7dcadb4f5ccd21c885444cb64e1990a9b4d79ef63287336664d0ee4fd1c`
