Online associative memory for LLMs

Modular Dynamic Architecture (MDA)

Online associative memory for LLMs. Learns during inference. No backpropagation.

License: SSPL · Python 3.10+


What is MDA?

Large language models can reason but cannot remember. Retrieval-augmented generation (RAG) partially addresses this, but a RAG index cannot update during a conversation or learn from it.

MDA fills precisely these gaps.

It encodes knowledge as 512-dimensional Holographic Distributed Representations (HDRs), connects concepts through a sparse synapse graph, and retrieves context by activating entity networks rather than by running a similarity search over text chunks. New knowledge is integrated immediately, without rebuilding any index.

MDA is not a RAG replacement. It is the persistent learning layer that RAG and LLMs are missing.


Key Properties

  • Token-free — no tokenizer, no vocabulary
  • Attention-free — no transformer encoder required
  • Online learning — learns during inference via the Oja rule
  • No catastrophic forgetting — entities are independent; new knowledge never overwrites old
  • CPU-first — runs on numpy; GPU acceleration via PyTorch when available
  • Model-agnostic — works with Ollama, OpenAI, Anthropic, llama.cpp, or any LLM

Quick Start

pip install mda-memory

from mda.integrations.engine import MDAEngine

engine = MDAEngine(model="qwen3:4b", user_id="demo")
engine.learn("Solaris Station was founded by Dr. Mira Voss in 2041.")

response = engine.chat("Who founded Solaris?")
print(response)
# Dr. Mira Voss founded Solaris Station in 2041.

CLI

mda --model qwen3:4b
mda --model claude-haiku-4-5-20251001 --provider anthropic

How Memory Works

Each turn, MDA retrieves relevant context from previous turns and injects it into the LLM prompt with confidence scores, inferred connections, and structured events. Memory grows and strengthens across the conversation without any reindexing.


Benchmark Results

Evaluated against a strong RAG baseline (bge-large-en-v1.5 + ChromaDB, top-6 retrieval) across 80 questions spanning 8 cognitive categories:

| Category | RAG | MDA | Δ (MDA − RAG) |
|---|---|---|---|
| ATOMIC_RECALL | 100% | 85% | −15% |
| MULTI_HOP | 90% | 90% | 0% |
| CROSS_DOCUMENT | 80% | 70% | −10% |
| REASONING | 70% | 90% | +20% |
| INCREMENTAL_LEARNING | 0% | 60% | +60% |
| NOISE_RESISTANCE | 100% | 100% | 0% |
| MEMORY_COMPRESSION | 90% | 70% | −20% |
| BOUNDARY | 100% | 100% | 0% |
| OVERALL | 78.8% | 83.1% | +4.3% |

MDA uses 3.1× less context per query than RAG while achieving higher overall accuracy (83.1% vs. 78.8%).
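
The OVERALL row is consistent with an unweighted mean of the eight category scores, which is what one would expect if every category holds the same number of questions (80 / 8 = 10 each; this split is an assumption, the source does not state it). A minimal Python check using only the numbers from the table:

```python
# Category accuracies in percent, copied from the benchmark table: (RAG, MDA).
scores = {
    "ATOMIC_RECALL": (100, 85),
    "MULTI_HOP": (90, 90),
    "CROSS_DOCUMENT": (80, 70),
    "REASONING": (70, 90),
    "INCREMENTAL_LEARNING": (0, 60),
    "NOISE_RESISTANCE": (100, 100),
    "MEMORY_COMPRESSION": (90, 70),
    "BOUNDARY": (100, 100),
}


def unweighted_mean(column: int) -> float:
    """Average one column, treating every category as equally weighted."""
    return sum(row[column] for row in scores.values()) / len(scores)


rag_overall = unweighted_mean(0)  # 78.75, reported as 78.8%
mda_overall = unweighted_mean(1)  # 83.125, reported as 83.1%
print(f"RAG {rag_overall:.3f}  MDA {mda_overall:.3f}")
```

The reported +4.3 gap matches the difference of the rounded figures (83.1 − 78.8); the unrounded difference is 4.375.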

Long-context retention (200 turns): RAG 0% — MDA 92%.


How It Works

Entity & W Matrix

Every concept is an Entity with a 512-dim identity vector v and a lazy-initialized weight matrix W (512×512). W is None until first activation — memory overhead is proportional to usage.
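
To illustrate why lazy initialization keeps memory proportional to usage, here is a minimal sketch. The class name `LazyEntity`, the `activate` method, the float32 dtype, and the identity starting value for W are all assumptions made for illustration; the source does not document how the library's Entity allocates or initializes W.

```python
import numpy as np

DIM = 512  # HDR dimensionality stated in the text


class LazyEntity:
    """Hypothetical stand-in for an MDA entity; not the library's class."""

    def __init__(self, rng: np.random.Generator) -> None:
        v = rng.standard_normal(DIM).astype(np.float32)
        self.v = v / np.linalg.norm(v)  # unit-length identity vector
        self.W = None                   # no matrix until first activation

    def activate(self, x: np.ndarray) -> np.ndarray:
        """Allocate W on first use, then return y = W @ x."""
        if self.W is None:
            # Assumption: start from the identity; the real init is not documented.
            self.W = np.eye(DIM, dtype=np.float32)
        return self.W @ x


rng = np.random.default_rng(0)
entities = [LazyEntity(rng) for _ in range(1000)]
entities[0].activate(entities[1].v)  # only this entity now owns a matrix

allocated = sum(e.W.nbytes for e in entities if e.W is not None)
print(allocated)  # 512 * 512 * 4 bytes = 1048576 for the single active entity
```

Under the float32 assumption each activated entity costs 1 MiB for W, while an entity that is never activated costs only its 512-element identity vector.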

Online Learning (Oja Rule)

ΔW = η(yxᵀ − y²W)

No backpropagation. No gradient descent. Runs in O(d²) per entity per turn.
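
For a single output neuron with scalar y = wᵀx, the formula above is the classic Oja rule Δw = η(yx − y²w). For a full matrix W with vector output y = Wx, one common generalization is Oja's subspace rule, ΔW = η(yxᵀ − yyᵀW). Which form MDA uses is not stated in the source, so the sketch below is an assumption; the function name `oja_step` and the learning rate are illustrative only. It also shows how the decay term can be grouped to stay within O(d²).

```python
import numpy as np


def oja_step(W: np.ndarray, x: np.ndarray, eta: float = 0.01) -> np.ndarray:
    """One subspace-rule update: W + eta * (y x^T - y (y^T W)), with y = W x."""
    y = W @ x                    # O(d^2) matrix-vector product
    hebbian = np.outer(y, x)     # O(d^2) outer product: the y x^T term
    decay = np.outer(y, y @ W)   # y (y^T W): grouped so no d x d matrix product is needed
    return W + eta * (hebbian - decay)


d = 8  # small dimension for a quick demonstration; the text uses 512
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((d, d))
x = rng.standard_normal(d)
x /= np.linalg.norm(x)

W_next = oja_step(W, x)
```

Computing yᵀW first yields a length-d vector, so the decay term costs one matrix-vector product and one outer product instead of a d×d by d×d multiplication.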

AssociativeChain

Query → origin entity → BFS synapse traversal (depth 3–6) → context assembly. A dynamic inhibition threshold is cached per entity count.
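
The traversal step can be sketched as a depth-limited breadth-first search over an adjacency list. This is a simplified illustration, not the library's AssociativeChain: the function name `bfs_context`, the dictionary graph format, and the toy data are hypothetical, and the inhibition threshold is left out because the source does not describe how it is computed.

```python
from collections import deque


def bfs_context(graph: dict[str, list[str]], origin: str, max_depth: int = 3) -> list[str]:
    """Collect entities reachable from `origin` within `max_depth` hops, nearest first."""
    visited = {origin}
    order = [origin]
    queue = deque([(origin, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append((neighbor, depth + 1))
    return order


# Toy synapse graph (hypothetical data based on the Quick Start sentence).
synapses = {
    "Solaris Station": ["Dr. Mira Voss", "2041"],
    "Dr. Mira Voss": ["founder"],
    "founder": ["organization"],
    "organization": ["charter"],
}
print(bfs_context(synapses, "Solaris Station", max_depth=2))
```

With `max_depth=2` the search stops at "founder" and never reaches "organization" or "charter", which is how a depth limit bounds the size of the assembled context.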

BrocaModule

score = 0.35·s_query + 0.45·s_W + 0.20·s_sense
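
The three coefficients sum to 1.0, so if each component lies in [0, 1] the combined score does too. The source does not define the three terms; the names suggest a query-similarity signal, a W-based association signal, and a "sense" signal, but that reading is an assumption. A minimal sketch of the weighted sum, with a hypothetical function name and made-up candidate values:

```python
WEIGHTS = {"query": 0.35, "W": 0.45, "sense": 0.20}  # coefficients from the formula


def broca_score(s_query: float, s_W: float, s_sense: float) -> float:
    """Weighted sum of the three component scores."""
    return WEIGHTS["query"] * s_query + WEIGHTS["W"] * s_W + WEIGHTS["sense"] * s_sense


# Hypothetical component scores for two candidate entities.
candidates = {
    "Dr. Mira Voss": (0.9, 0.8, 0.5),
    "2041": (0.4, 0.7, 0.2),
}
ranked = sorted(candidates, key=lambda name: broca_score(*candidates[name]), reverse=True)
print(ranked)
```

Because s_W carries the largest weight, a candidate with a strong learned association can outrank one that merely resembles the query text.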

Open WebUI Integration

MDA works as a native Open WebUI Filter Function; no separate pipeline server is required.

  1. Copy mda/integrations/owui_function.py contents
  2. Open WebUI → Admin Panel → Functions → "+" → paste → Save
  3. Enable globally

AgentMemory

Tool-call aware memory layer for agentic loops. No LLM call inside — pure memory read/write with typed, step-indexed observations.

from mda import AgentMemory

agent_mem = AgentMemory()

# After each tool call
agent_mem.observe("http_get", "status 200 OK", step=1)
agent_mem.error("payments API /v2/charge returned 410 Gone", step=7)
agent_mem.decision("using batch mode because rate limit is 100/min", step=12)
agent_mem.environment("API base URL is https://api.example.com/v3")

# Before next decision
context = agent_mem.query("payments API")
# → [ERROR@7] payments API /v2/charge returned 410 Gone
# → [DECISION@12] using batch mode because rate limit is 100/min

# Detect repeated attempts
if agent_mem.is_looping("retry charge endpoint"):
    agent_mem.decision("switching to /v3/charge endpoint", step=15)

# Find related past errors
related = agent_mem.similar_errors("connection refused")

# Persist & restore across runs
agent_mem.save()
agent_mem.load()

Context tags: [OBSERVE@N tool], [ERROR@N], [DECISION@N], [ENV]

Checkpoint is written to .memory/{user_id}/agent_{run_id}.json relative to the current working directory — not the library install path.
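
To see where a checkpoint would land for a given run, the path pattern can be resolved with pathlib. The helper name `checkpoint_path` and the example identifiers are hypothetical; only the path pattern itself comes from the text.

```python
from pathlib import Path


def checkpoint_path(user_id: str, run_id: str) -> Path:
    """Build .memory/{user_id}/agent_{run_id}.json under the current working directory."""
    return Path.cwd() / ".memory" / user_id / f"agent_{run_id}.json"


path = checkpoint_path("demo", "run42")
print(path.relative_to(Path.cwd()).as_posix())  # .memory/demo/agent_run42.json
```

Because the path is anchored at the working directory, launching the same agent from two different directories produces two separate checkpoint files.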


Batch Engine

For multi-agent workloads or large context windows:

from mda.integrations.engine import MDABatchEngine

engine = MDABatchEngine(depth=6, top_k_branches=5)
contexts = engine.build_context_batch([
    "legal contract risk analysis",
    "MDA memory architecture",
])
# depth=6 → 15,625 associative paths per query
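
The path count in the comment above follows from raising the branching factor to the depth: with top_k_branches=5 and depth=6 there are 5^6 = 15,625 root-to-leaf paths. This assumes every node expands to exactly top_k_branches successors, so it is an upper bound when some entities have fewer synapses. A quick check (`path_count` is an illustrative helper, not a library function):

```python
def path_count(top_k_branches: int, depth: int) -> int:
    """Number of root-to-leaf paths in a full tree with the given fan-out and depth."""
    return top_k_branches ** depth


for depth in range(3, 7):  # depths 3 through 6
    print(depth, path_count(5, depth))
```

Each extra level multiplies the count by five, which is why deeper traversals are the case where batching is most useful.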

GPU acceleration activates automatically when PyTorch and CUDA are available.


Roadmap

  • GPU acceleration — EntityMatrix matmul, persistent tensor cache
  • 512-dim HDR — higher representation capacity
  • Open WebUI integration — native Filter Function
  • Batch engine — N queries in single GPU pass
  • AgentMemory — tool-call aware step-indexed memory for agentic loops
  • mda.cloud API — persistent memory as a service
  • MDA + RAG hybrid — offline corpus retrieval + online learning
  • Low-rank W — W ≈ A×B for higher-dimensional HDRs
  • Independent benchmark — community-constructed evaluation set

License

SSPL 1.0 — free for research and personal use. Commercial use requires a separate agreement.

For commercial licensing: mert@kairfy.com

