Online associative memory for LLMs

Modular Dynamic Architecture (MDA)

Online associative memory for LLMs. Learns during inference. No backpropagation.

What is MDA?

Large language models can reason but cannot remember. RAG partially addresses this but cannot update during a conversation or learn from it.

MDA fills precisely these gaps.

It encodes knowledge as 512-dimensional Holographic Distributed Representations (HDRs), connects concepts through a sparse synapse graph, and retrieves context by activating entity networks rather than by text-chunk similarity search. New knowledge is integrated immediately, without rebuilding any index.

MDA is not a RAG replacement. It is the persistent learning layer that RAG and LLMs are missing.

Key Properties

Token-free — no tokenizer, no vocabulary
Attention-free — no transformer encoder required
Online learning — learns during inference via the Oja rule
No catastrophic forgetting — entities are independent; new knowledge never overwrites old
CPU-first — runs on numpy; GPU acceleration via PyTorch when available
Model-agnostic — works with Ollama, OpenAI, Anthropic, llama.cpp, or any LLM

Quick Start

pip install mda-memory

from mda.integrations.engine import MDAEngine

engine = MDAEngine(model="qwen3:4b", user_id="demo")
engine.learn("Solaris Station was founded by Dr. Mira Voss in 2041.")

response = engine.chat("Who founded Solaris?")
print(response)
# Dr. Mira Voss founded Solaris Station in 2041.

CLI

mda --model qwen3:4b
mda --model claude-haiku-4-5-20251001 --provider anthropic

How Memory Works

Each turn, MDA retrieves relevant context from previous turns and injects it into the LLM prompt with confidence scores, inferred connections, and structured events. Memory grows and strengthens across the conversation without any reindexing.

Benchmark Results

Evaluated against a strong RAG baseline (bge-large-en-v1.5 + ChromaDB, top-6 retrieval) across 80 questions spanning 8 cognitive categories:

Category	RAG	MDA	Δ
ATOMIC_RECALL	100%	85%	−15%
MULTI_HOP	90%	90%	0%
CROSS_DOCUMENT	80%	70%	−10%
REASONING	70%	90%	+20%
INCREMENTAL_LEARNING	0%	60%	+60%
NOISE_RESISTANCE	100%	100%	0%
MEMORY_COMPRESSION	90%	70%	−20%
BOUNDARY	100%	100%	0%
OVERALL	78.8%	83.1%	+4.3%
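As a quick check on the table, the OVERALL row can be reproduced as the unweighted mean of the eight category scores. The sketch below assumes each category carries equal weight (80 questions over 8 categories); the dictionaries simply restate the table and the helper name `overall` is illustrative.

```python
# Category accuracies (percent) copied from the table above.
rag = {
    "ATOMIC_RECALL": 100, "MULTI_HOP": 90, "CROSS_DOCUMENT": 80,
    "REASONING": 70, "INCREMENTAL_LEARNING": 0, "NOISE_RESISTANCE": 100,
    "MEMORY_COMPRESSION": 90, "BOUNDARY": 100,
}
mda = {
    "ATOMIC_RECALL": 85, "MULTI_HOP": 90, "CROSS_DOCUMENT": 70,
    "REASONING": 90, "INCREMENTAL_LEARNING": 60, "NOISE_RESISTANCE": 100,
    "MEMORY_COMPRESSION": 70, "BOUNDARY": 100,
}


def overall(scores: dict) -> float:
    """Unweighted mean across categories (assumes equal-sized categories)."""
    return sum(scores.values()) / len(scores)


rag_overall = overall(rag)   # 78.75, reported as 78.8%
mda_overall = overall(mda)   # 83.125, reported as 83.1%
print(f"RAG {rag_overall:.3f}  MDA {mda_overall:.3f}  "
      f"diff {mda_overall - rag_overall:.3f}")
```

The unrounded difference is 4.375 percentage points; the table's +4.3 is what results from subtracting the two rounded totals.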

MDA uses 3.1× less context per query than RAG while achieving higher overall accuracy.

Long-context retention (200 turns): RAG 0% — MDA 92%.

How It Works

Entity & W Matrix

Every concept is an Entity with a 512-dim identity vector v and a lazy-initialized weight matrix W (512×512). W is None until first activation — memory overhead is proportional to usage.
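To illustrate why lazy initialization keeps memory proportional to usage, here is a minimal sketch. `LazyEntity` is a hypothetical stand-in, not the library's Entity class; the float32 dtype, the unit-normalised identity vector, and the zero-initialised W are all assumptions made for the example.

```python
import numpy as np

DIM = 512  # HDR dimensionality stated in the text


class LazyEntity:
    """Hypothetical stand-in for an MDA Entity (not the library's class)."""

    def __init__(self, name: str, rng: np.random.Generator):
        self.name = name
        # Identity vector v; unit normalisation is an assumption.
        v = rng.standard_normal(DIM).astype(np.float32)
        self.v = v / np.linalg.norm(v)
        self.W = None  # no 512x512 matrix is allocated yet

    def activate(self, x: np.ndarray) -> np.ndarray:
        # Allocate W only on first activation (zero init is an assumption).
        if self.W is None:
            self.W = np.zeros((DIM, DIM), dtype=np.float32)
        return self.W @ x

    def weight_bytes(self) -> int:
        return 0 if self.W is None else self.W.nbytes


rng = np.random.default_rng(0)
entities = [LazyEntity(f"e{i}", rng) for i in range(100)]
entities[0].activate(entities[1].v)  # only this entity allocates W

total = sum(e.weight_bytes() for e in entities)
print(total)  # 512 * 512 * 4 bytes = 1048576 for the one activated entity
```

Under these assumptions each activated entity costs 1 MiB for W, while the 99 untouched entities hold only their 512-dim identity vectors.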

Online Learning (Oja Rule)

ΔW = η(yxᵀ − y²W)

No backpropagation. No gradient descent. Runs in O(d²) per entity per turn.
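The update can be sketched in a few lines of NumPy. The formula above does not define its symbols, so this sketch assumes x is the input vector, y = Wx is the entity's response, and y² is applied element-wise per row (each row of W follows the single-neuron Oja rule). The function name `oja_update` and the learning rate are illustrative, not the library's API.

```python
import numpy as np


def oja_update(W: np.ndarray, x: np.ndarray, eta: float = 0.01) -> np.ndarray:
    """One row-wise Oja step: dW = eta * (y x^T - diag(y^2) W), with y = W x."""
    y = W @ x                                            # O(d^2)
    dW = eta * (np.outer(y, x) - (y ** 2)[:, None] * W)  # O(d^2)
    return W + dW


d = 8  # small dimension for the demo; MDA uses 512
rng = np.random.default_rng(0)
W = rng.standard_normal((d, d)) * 0.1
x = rng.standard_normal(d)
x /= np.linalg.norm(x)  # unit-length input

# Present the same input repeatedly; no gradients or backprop are involved.
for _ in range(2000):
    W = oja_update(W, x, eta=0.05)

row_norms = np.linalg.norm(W, axis=1)
alignment = np.abs(W @ x)
print(row_norms.round(3), alignment.round(3))
```

The y² term acts as a built-in normaliser: with a single repeated unit input, each row should settle at unit norm and align with ±x instead of growing without bound, which is the property that distinguishes Oja's rule from plain Hebbian learning.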

AssociativeChain

Query → origin entity → BFS synapse traversal (depth 3 to 6) → context assembly. A dynamic inhibition threshold is cached per entity count.
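The traversal step can be pictured as a depth-limited breadth-first search over the synapse graph. The sketch below is a generic BFS under assumed data structures (a plain adjacency dict) and omits the inhibition threshold entirely; `bfs_context` and the example graph are hypothetical, not the library's AssociativeChain.

```python
from collections import deque


def bfs_context(graph: dict, origin: str, max_depth: int = 3) -> dict:
    """Return {entity: depth} for entities reachable within max_depth hops."""
    seen = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        depth = seen[node]
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen[neighbour] = depth + 1
                queue.append(neighbour)
    return seen


# Hypothetical synapse graph built from the Quick Start sentence.
synapses = {
    "Solaris Station": ["Dr. Mira Voss", "2041"],
    "Dr. Mira Voss": ["founder"],
    "founder": ["organisation"],
    "organisation": ["legal entity"],
}

reached = bfs_context(synapses, "Solaris Station", max_depth=3)
print(reached)
```

With `max_depth=3`, "organisation" (3 hops) is included while "legal entity" (4 hops) is not; a real implementation would additionally prune weak synapses before assembling context.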

BrocaModule

score = 0.35·s_query + 0.45·s_W + 0.20·s_sense
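A small worked example shows how the weights interact. The source does not define the three sub-scores, so this sketch assumes each is a similarity in [0, 1]; `broca_score` and the candidate values are made up for illustration.

```python
# Weights copied from the formula above; they sum to 1.0.
WEIGHTS = {"query": 0.35, "W": 0.45, "sense": 0.20}


def broca_score(s_query: float, s_W: float, s_sense: float) -> float:
    """Weighted sum of the three sub-scores (assumed to lie in [0, 1])."""
    return (WEIGHTS["query"] * s_query
            + WEIGHTS["W"] * s_W
            + WEIGHTS["sense"] * s_sense)


# Hypothetical candidates: A matches the query text well,
# B is strongly supported by the learned W matrix.
candidates = {
    "A": (0.9, 0.2, 0.5),
    "B": (0.4, 0.9, 0.5),
}
scores = {name: broca_score(*parts) for name, parts in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Because s_W carries the largest weight, candidate B (about 0.645) outranks candidate A (about 0.505) even though A matches the query more closely.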

Open WebUI Integration

MDA works as a native Open WebUI Filter Function; no pipeline server is required.

1. Copy the contents of mda/integrations/owui_function.py.
2. In Open WebUI, go to Admin Panel → Functions → "+", paste the code, and save.
3. Enable the function globally.

AgentMemory

Tool-call aware memory layer for agentic loops. No LLM call inside — pure memory read/write with typed, step-indexed observations.

from mda import AgentMemory

agent_mem = AgentMemory()

# After each tool call
agent_mem.observe("http_get", "status 200 OK", step=1)
agent_mem.error("payments API /v2/charge returned 410 Gone", step=7)
agent_mem.decision("using batch mode because rate limit is 100/min", step=12)
agent_mem.environment("API base URL is https://api.example.com/v3")

# Before next decision
context = agent_mem.query("payments API")
# → [ERROR@7] payments API /v2/charge returned 410 Gone
# → [DECISION@12] using batch mode because rate limit is 100/min

# Detect repeated attempts
if agent_mem.is_looping("retry charge endpoint"):
    agent_mem.decision("switching to /v3/charge endpoint", step=15)

# Find related past errors
related = agent_mem.similar_errors("connection refused")

# Persist & restore across runs
agent_mem.save()
agent_mem.load()

Context tags: [OBSERVE@N tool], [ERROR@N], [DECISION@N], [ENV]

The checkpoint is written to .memory/{user_id}/agent_{run_id}.json relative to the current working directory, not the library install path.

Batch Engine

For multi-agent workloads or large context windows:

from mda.integrations.engine import MDABatchEngine

engine = MDABatchEngine(depth=6, top_k_branches=5)
contexts = engine.build_context_batch([
    "legal contract risk analysis",
    "MDA memory architecture",
])
# depth=6 → 15,625 associative paths per query

GPU acceleration activates automatically when PyTorch + CUDA is available.
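The path count quoted in the batch example follows from simple branching arithmetic. Assuming each traversal step expands at most `top_k_branches` synapses, the number of root-to-leaf paths is bounded by `top_k_branches ** depth`; the helper `max_paths` below is illustrative and not part of the library.

```python
def max_paths(top_k_branches: int, depth: int) -> int:
    """Upper bound on associative paths when every node expands fully."""
    return top_k_branches ** depth


# Growth across the depth range the traversal uses (3 to 6).
for depth in range(3, 7):
    print(depth, max_paths(5, depth))
# 3 125
# 4 625
# 5 3125
# 6 15625
```

Each extra level multiplies the work by `top_k_branches`, so raising depth from 3 to 6 with five branches increases the bound by a factor of 125.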

Roadmap

GPU acceleration — EntityMatrix matmul, persistent tensor cache
512-dim HDR — higher representation capacity
Open WebUI integration — native Filter Function
Batch engine — N queries in single GPU pass
AgentMemory — tool-call aware step-indexed memory for agentic loops
mda.cloud API — persistent memory as a service
MDA + RAG hybrid — offline corpus retrieval + online learning
Low-rank W — W ≈ A×B for higher-dimensional HDRs
Independent benchmark — community-constructed evaluation set

License

SSPL 1.0 — free for research and personal use. Commercial use requires a separate agreement.

For commercial licensing: mert@kairfy.com

Release history

0.2.0 (this version): May 30, 2026
0.1.3: Apr 30, 2026
0.1.2: Apr 26, 2026
0.1.1: Apr 26, 2026

