
elfmem

Adaptive, self-aware memory for LLM agents.

elfmem gives your LLM agent a memory that grows, evolves, and forgets — just like a human's. Knowledge that gets used survives; knowledge that doesn't fades away. Identity persists across sessions. Context is always relevant.

import asyncio
from elfmem import MemorySystem

async def main():
    system = await MemorySystem.from_config("agent.db", {
        "llm": {"model": "claude-sonnet-4-6"},
        "embeddings": {"model": "text-embedding-3-small", "dimensions": 1536},
    })

    async with system.session():
        # Teach the agent something
        await system.learn("Use Celery with Redis for background tasks in Django.")
        await system.learn("I always explain my reasoning before giving recommendations.")

        # Retrieve relevant context for a prompt
        identity = await system.frame("self")         # Who am I?
        context  = await system.frame("attention",    # What do I know about this?
                                      query="background job processing")

        print(identity.text)   # Agent identity, values, style
        print(context.text)    # Relevant knowledge, ranked by importance

asyncio.run(main())

Features

  • Adaptive decay — Knowledge survives when reinforced through use, fades when ignored. Session-aware clock means your agent's memory doesn't decay over weekends.
  • SELF frame — Persistent agent identity. Values, style, and constraints survive across sessions with near-permanent decay rates.
  • Hybrid retrieval — 4-stage pipeline: pre-filter, vector search, graph expansion, composite scoring. Finds knowledge that's relevant and important.
  • Knowledge graph — Semantic edges between memory blocks. Co-retrieved knowledge strengthens connections. Graph expansion recovers related-but-not-similar context.
  • Contradiction detection — LLM-powered detection of conflicting knowledge. Newer, higher-confidence blocks win.
  • Near-duplicate resolution — Detects when new knowledge updates existing knowledge. Old block archived, new block inherits history.
  • Zero infrastructure — SQLite backend. No Redis, no Postgres, no vector database. One file, fully portable.
  • Any LLM provider — LiteLLM backend supports 100+ providers. Switch from OpenAI to Anthropic to local Ollama with a config change.

Installation

uv add elfmem

Or with pip:

pip install elfmem

Requires Python 3.11+.

How It Works

The Lifecycle

Every piece of knowledge follows the same path:

learn()        →  Instant ingestion. Content-hash dedup. No API calls.
consolidate()  →  Batch processing. Embeddings, self-alignment scoring,
                  tag inference, near-duplicate detection, graph edges.
recall()       →  4-stage hybrid retrieval. Reinforces returned blocks.
curate()       →  Maintenance. Archives decayed blocks, prunes weak edges,
                  reinforces top-scoring knowledge.
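
For example, the lifecycle can be driven by hand inside a session (in normal use, consolidate() and curate() are triggered automatically; see the API reference below):

async with system.session():
    await system.learn("Prefer structured logging over print statements.")
    await system.consolidate()   # embeddings, scoring, tags, graph edges
    blocks = await system.recall("attention", query="logging practices")
    await system.curate()        # archive decayed blocks, prune weak edges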

Three Frames

Frames are pre-configured retrieval pipelines optimized for different contexts:

Frame      Purpose                   Scoring Priority                        Use Case
SELF       Agent identity            Confidence, reinforcement, centrality   System prompt injection
ATTENTION  Query-relevant knowledge  Similarity, recency                     RAG-style retrieval
TASK       Goal-oriented context     Balanced across all signals             Task planning

# Identity context — cached, no embedding needed
self_ctx = await system.frame("self")

# Knowledge retrieval — hybrid pipeline with graph expansion
attn_ctx = await system.frame("attention", query="async error handling")

# Task context — balanced scoring, goal blocks guaranteed
task_ctx = await system.frame("task", query="refactor the API layer")

Decay Tiers

Knowledge decays at different rates based on its nature:

Tier       Half-life      Use Case
Permanent  ~80,000 hours  Constitutional beliefs, core identity
Durable    ~693 hours     Stable preferences, learned values
Standard   ~69 hours      General knowledge
Ephemeral  ~14 hours      Session observations, temporary facts

Decay is session-aware: the clock only ticks during active use. Your agent's memory doesn't degrade over holidays or downtime.
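
As an illustration of the half-life numbers above (this is standard exponential-decay arithmetic over active session hours, not elfmem's internal code):

import math

def retention(active_hours: float, half_life_hours: float) -> float:
    """Fraction of strength remaining after `active_hours` of session time."""
    return math.exp(-math.log(2) / half_life_hours * active_hours)

retention(69, 69)       # ~0.50: a Standard block after one half-life
retention(69, 80_000)   # ~1.00: a Permanent block barely moves
retention(69, 14)       # ~0.03: an Ephemeral block is nearly gone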

Composite Scoring

Every block is scored across five dimensions:

Score = w_similarity    * cosine_similarity(query, block)
      + w_confidence    * block.confidence
      + w_recency       * exp(-lambda * hours_since_reinforced)
      + w_centrality    * normalized_weighted_degree(block)
      + w_reinforcement * log(1 + count) / log(1 + max_count)

Each frame uses different weights. SELF emphasizes confidence and reinforcement. ATTENTION emphasizes similarity and recency.
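
Translated into Python, the formula reads as below; the block attribute names are illustrative assumptions, not elfmem's actual field names (see scoring.py for the frozen formula):

import math

def composite_score(block, similarity: float, w: dict, lam: float, max_count: int) -> float:
    # Attribute names are hypothetical, chosen to mirror the formula above.
    return (
        w["similarity"]      * similarity
        + w["confidence"]    * block.confidence
        + w["recency"]       * math.exp(-lam * block.hours_since_reinforced)
        + w["centrality"]    * block.normalized_weighted_degree
        + w["reinforcement"] * math.log1p(block.reinforcement_count) / math.log1p(max_count)
    )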

Configuration

Minimal (defaults)

system = await MemorySystem.from_config("agent.db")
# Uses claude-sonnet-4-6 for LLM, text-embedding-3-small for embeddings
# Requires ANTHROPIC_API_KEY environment variable

YAML config file

# elfmem.yaml
llm:
  model: "claude-sonnet-4-6"
  contradiction_model: "claude-opus-4-6"  # higher precision for contradictions

embeddings:
  model: "text-embedding-3-small"
  dimensions: 1536

memory:
  inbox_threshold: 10
  curate_interval_hours: 40
  self_alignment_threshold: 0.70
  prune_threshold: 0.05

system = await MemorySystem.from_config("agent.db", "elfmem.yaml")

Local models (no API key)

llm:
  model: "ollama/llama3.2"
  base_url: "http://localhost:11434"

embeddings:
  model: "ollama/nomic-embed-text"
  dimensions: 768
  base_url: "http://localhost:11434"

Environment variables

export ANTHROPIC_API_KEY=sk-ant-...
# or
export OPENAI_API_KEY=sk-...
# or any provider LiteLLM supports

API keys are read by LiteLLM from standard environment variables. They never appear in config files.

Agent Integration Pattern

# `llm`, `worth_remembering`, and `extract_knowledge` are placeholders
# for your own stack; elfmem supplies only the memory calls.
async def run_turn(system, user_message):
    # 1. Assemble context
    self_ctx = await system.frame("self")
    attn_ctx = await system.frame("attention", query=user_message)

    # 2. Build prompt with memory context
    prompt = f"""
    {self_ctx.text}

    {attn_ctx.text}

    User: {user_message}
    """

    # 3. Generate response
    response = await llm.complete(prompt)

    # 4. Learn from the interaction
    if worth_remembering(response):
        await system.learn(extract_knowledge(response))

    return response

API Reference

MemorySystem

# Factory
system = await MemorySystem.from_config(db_path, config=None)

# Session management (required)
async with system.session():
    ...

# Write
result = await system.learn(content, tags=None, category="knowledge")

# Read
frame_result = await system.frame(name, query=None, top_k=5)
blocks = await system.recall(name, query=None, top_k=5)  # raw, no side effects

# Maintenance (usually automatic)
await system.consolidate()  # process inbox → active
await system.curate()       # archive decayed, prune edges, reinforce top-N

Return Types

LearnResult(block_id, status)           # "created" | "duplicate_rejected"
FrameResult(text, blocks, frame_name)   # rendered text + scored blocks
ConsolidateResult(processed, promoted, deduplicated, edges_created)
CurateResult(archived, edges_pruned, reinforced)
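
For example, learn() signals content-hash deduplication through the status field:

result = await system.learn("Use Celery with Redis for background tasks in Django.")
if result.status == "created":
    print(f"Stored as block {result.block_id}")
else:  # "duplicate_rejected": this content hash is already known
    print("Duplicate rejected")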

Custom Prompts

Override the LLM prompts for domain-specific agents:

prompts:
  self_alignment: |
    You are evaluating a memory block for a medical AI assistant...
    {self_context}
    {block}
    Respond: {"score": <float>}

  valid_self_tags:
    - "self/constitutional"
    - "self/domain/oncology"
    - "self/regulatory/hipaa"

Custom Adapters

For full control, implement the port protocols directly:

from elfmem.ports.services import LLMService, EmbeddingService

class MyLLMService:
    async def score_self_alignment(self, block: str, self_context: str) -> float: ...
    async def infer_self_tags(self, block: str, self_context: str) -> list[str]: ...
    async def detect_contradiction(self, block_a: str, block_b: str) -> float: ...

system = MemorySystem(engine, llm_service=MyLLMService(), embedding_service=MyEmbedder())
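
An EmbeddingService adapter follows the same pattern. The method name and signature below are assumptions for illustration; check the protocol in elfmem/ports/services.py for the real interface:

class MyEmbedder:
    # Hypothetical signature; verify against the EmbeddingService protocol.
    async def embed(self, texts: list[str]) -> list[list[float]]:
        ...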

Architecture

src/elfmem/
├── api.py                  # MemorySystem — public API
├── config.py               # ElfmemConfig — Pydantic configuration
├── scoring.py              # Composite scoring formula (frozen)
├── types.py                # Domain types — shared vocabulary
├── prompts.py              # LLM prompt templates
├── session.py              # Session lifecycle, active hours tracking
├── ports/
│   └── services.py         # LLMService + EmbeddingService protocols
├── adapters/
│   ├── mock.py             # Deterministic mocks for testing
│   ├── litellm.py          # Real adapters (LiteLLM + instructor)
│   └── models.py           # Pydantic response models
├── db/
│   ├── models.py           # SQLAlchemy Core table definitions
│   ├── engine.py           # Async engine factory
│   └── queries.py          # All database operations
├── memory/
│   ├── blocks.py           # Block state, content hashing, decay tiers
│   ├── dedup.py            # Near-duplicate detection and resolution
│   ├── graph.py            # Centrality, expansion, edge reinforcement
│   └── retrieval.py        # 4-stage hybrid retrieval pipeline
├── context/
│   ├── frames.py           # Frame definitions, registry, cache
│   ├── rendering.py        # Blocks → rendered text
│   └── contradiction.py    # Contradiction suppression
└── operations/
    ├── learn.py            # learn() — fast-path ingestion
    ├── consolidate.py      # consolidate() — batch promotion
    ├── recall.py           # recall() — retrieval + reinforcement
    └── curate.py           # curate() — maintenance

Four layers, clear boundaries:

Layer                     Responsibility                     Side Effects
Storage (db/)             Tables, queries, engine            Database writes
Memory (memory/)          Blocks, dedup, graph, retrieval    None (pure)
Context (context/)        Frames, rendering, contradictions  None (pure)
Operations (operations/)  Orchestration, lifecycle           All side effects

Development

# Clone
git clone https://github.com/emson/elfmem.git
cd elfmem

# Install with dev dependencies
uv sync --extra dev

# Run tests (no API key needed — uses deterministic mocks)
uv run pytest

# Type checking
uv run mypy --ignore-missing-imports src/elfmem/

# Lint
uv run ruff check src/ tests/

Testing Philosophy

All tests run against deterministic mock services. No API keys, no network calls, fully reproducible. The mock embedding service produces hash-seeded vectors — same input always gives the same embedding. The mock LLM service returns configurable scores and tags via substring matching.

from elfmem.adapters.mock import make_mock_llm, make_mock_embedding

# Control exactly what the LLM returns
llm = make_mock_llm(
    alignment_overrides={"identity": 0.95},
    tag_overrides={"identity": ["self/value"]},
)

# Control similarity between specific texts
embedding = make_mock_embedding(
    similarity_overrides={
        frozenset({"cats are great", "dogs are great"}): 0.85,
    },
)
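
Because overrides match on substrings, tests can pin mock behavior exactly (a sketch using the protocol methods from the Custom Adapters section):

score = await llm.score_self_alignment("my identity statement", "")
assert score == 0.95   # the substring "identity" triggers the override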

Design Decisions

Decision                                      Rationale
SQLAlchemy Core, not ORM                      Bulk updates, embedding BLOBs, avoiding N+1 centrality queries
Session-aware decay, not wall-clock           Knowledge survives holidays and downtime
Soft bias for identity, not hard gates        Everything is learned; self-aligned knowledge just survives longer
Retrieval is pure; reinforcement is separate  Clean separation of read path and side effects
LiteLLM as unified backend                    One adapter for 100+ providers; switch with config
Mock-first testing                            All logic verified without API keys; adapters are thin wrappers

License

MIT

Acknowledgements

elfmem was designed through 26 structured explorations and 6 subsystem playgrounds, building mathematical confidence in every architectural decision before writing code. The complete design documentation is in sim/explorations/.
