Give your AI a real brain - Agents today store text. MnemeBrain stores beliefs with evidence, confidence, provenance, and revision logic.
MnemeBrain Lite
The belief layer for AI agents.
⭐ Building AI agents? Run the BMB benchmark on your memory stack.
pip install mnemebrain-benchmark
bmb run
No API keys. No LLM calls. See how your system handles contradictions, belief revision, and temporal decay — in 60 seconds.
Why This Exists
Most AI agent memory systems treat memory as text retrieval.
This works for storing information. It fails when beliefs evolve.
User: "I'm vegetarian"
[later]
User: "I ate steak yesterday"
A RAG system either silently overwrites the first statement — or returns both without acknowledging the conflict. It cannot represent contradictions.
MnemeBrain was built to measure and close this gap. It stores beliefs, not text — with evidence, confidence, provenance, and revision logic baked in.
Benchmark Results
Belief Maintenance Benchmark (BMB) — 48 tasks · 8 categories · ~100 checks
mnemebrain ████████████████████ 100%
structured_memory ███████ 36%
mem0 (real API) █████ 29%
openai_rag (real) 0%
langchain_buffer 0%
naive_baseline 0%
rag_baseline 0%
| System | Score | Notes |
|---|---|---|
| MnemeBrain | 100% | 62/62 checks, all 30 scenarios |
| Structured Memory | 36% | No Belnap logic, no polarity tracking |
| Mem0 (real API) | 29% | Always truth_state=true, aggressive dedup |
| OpenAI RAG (real API) | 0% | truth_state=None, overwrites on conflict |
| LangChain Buffer | 0% | Store + query only |
| RAG Baseline | 0% | Store + query only |
Every RAG-based system scored 0% on contradiction detection. They overwrite instead of tracking conflicting evidence.
Full results: BMB_REPORT.md
How It Works
Evidence (supports / attacks)
↓
Belief Node
↓
TruthState (TRUE / FALSE / BOTH / NEITHER)
↓
Confidence + Temporal Decay
↓
Agent API
TruthState uses Belnap's four-valued logic. Instead of overwriting on conflict, the system represents the contradiction explicitly with BOTH — then lets you resolve it with new evidence.
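To make the lattice concrete, here is a minimal sketch of how aggregate evidence could map onto the four states. The function name, signature, and the 0.1 threshold are illustrative assumptions, not MnemeBrain's actual engine code:

```python
from enum import Enum

class TruthState(Enum):
    TRUE = "TRUE"
    FALSE = "FALSE"
    BOTH = "BOTH"
    NEITHER = "NEITHER"

def compute_truth_state(support: float, attack: float,
                        threshold: float = 0.1) -> TruthState:
    """Map total supporting/attacking evidence weight onto Belnap's lattice.

    Illustrative thresholds; the real engine may use different rules.
    """
    has_support = support >= threshold
    has_attack = attack >= threshold
    if has_support and has_attack:
        return TruthState.BOTH      # contradiction held explicitly, not overwritten
    if has_support:
        return TruthState.TRUE
    if has_attack:
        return TruthState.FALSE
    return TruthState.NEITHER       # not enough evidence either way
```

The key design point is that `BOTH` is a stable return value, not an error: downstream code can query for contradicted beliefs and resolve them with new evidence.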
Evidence Ledger is append-only. Evidence is never deleted, only invalidated. Every belief carries a full justification chain: what supports it, what attacks it, and what has expired.
Temporal Decay degrades evidence weight by belief type:
| BeliefType | Half-life |
|---|---|
| FACT | 365 days |
| PREFERENCE | 90 days |
| INFERENCE | 30 days |
| PREDICTION | 3 days |
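The half-lives above plug into a standard exponential half-life decay. A minimal sketch, assuming the table's values (the dict and function names are illustrative, not the library's API):

```python
HALF_LIFE_DAYS = {       # from the BeliefType table above
    "FACT": 365.0,
    "PREFERENCE": 90.0,
    "INFERENCE": 30.0,
    "PREDICTION": 3.0,
}

def decay(age_days: float, belief_type: str) -> float:
    """Evidence weight multiplier: halves every half_life days."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS[belief_type])
```

So a FACT keeps half its weight after a year, while a PREDICTION loses half in three days.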
Quick Start
pip install mnemebrain-lite # core only (no embeddings)
pip install mnemebrain-lite[embeddings] # + local sentence-transformers
pip install mnemebrain-lite[openai] # + OpenAI embeddings (set OPENAI_API_KEY)
pip install mnemebrain-lite[all] # everything
from mnemebrain_core.memory import BeliefMemory
from mnemebrain_core.providers.base import EvidenceInput
memory = BeliefMemory(db_path="./my_data")
# Store a belief
belief = memory.believe(
claim="user is vegetarian",
evidence=[EvidenceInput(
source_ref="msg_12",
content="They said no meat please",
polarity="supports",
weight=0.8,
reliability=0.9,
)]
)
# Introduce conflicting evidence — belief becomes BOTH (contradiction)
memory.revise(
belief_id=belief.id,
new_evidence=EvidenceInput(
source_ref="msg_47",
content="User ordered steak",
polarity="attacks",
weight=0.9,
reliability=0.95,
)
)
# Explain the contradiction
result = memory.explain(claim="user is vegetarian")
# → truth_state=BOTH, supporting_count=1, attacking_count=1
# Full install (Linux / Apple Silicon)
uv sync --extra dev --extra embeddings
# OpenAI embeddings (any platform, requires OPENAI_API_KEY)
uv sync --extra dev --extra openai
# Without embeddings (Intel Mac — torch 2.3+ has no x86_64 wheels)
uv sync --extra dev
# Run tests
uv run pytest tests/ -v
# Start API server
uv run python -m mnemebrain_core
# Default DB_PATH: ./mnemebrain_data
# Listens on 0.0.0.0:8000
Intel Mac note:
`sentence-transformers` requires PyTorch, which no longer ships x86_64 macOS wheels (torch 2.3+). Use `mnemebrain-lite[openai]` instead — it works on any platform. Without any embedding provider, `/believe` and `/explain` return 501 Not Implemented. All other endpoints work.
Core Operations
| Operation | Description | Embeddings? |
|---|---|---|
| believe() | Store a belief with evidence. Merges duplicates via embedding similarity. | Yes |
| retract() | Invalidate evidence and recompute affected beliefs. | No |
| explain() | Return full justification chain — supporting, attacking, and expired evidence. | Yes |
| revise() | Add new evidence to an existing belief and recompute. | No |
Formal Model
MnemeBrain is grounded in two well-established theories from knowledge representation and belief revision:
- Belnap four-valued logic (1977) — used to represent contradictory evidence without collapsing the belief system. Instead of overwriting, the system holds BOTH as a valid, stable state.
- AGM belief revision (Alchourrón, Gärdenfors, Makinson, 1985) — defines how a rational agent updates beliefs when new evidence arrives, with minimal disturbance to existing knowledge.
TruthState is computed over the evidence ledger using Belnap's lattice:
TruthState ∈ { TRUE, FALSE, BOTH, NEITHER }
TRUE — net supporting evidence dominates
FALSE — net attacking evidence dominates
BOTH — significant supporting AND attacking evidence (contradiction)
NEITHER — insufficient evidence to determine
Confidence is derived from weighted, time-decayed evidence:
confidence = Σ(support_weight × decay(t)) / (Σ(support_weight × decay(t)) + Σ(attack_weight × decay(t)))
where decay(t) = 0.5 ^ (t / half_life) and half_life varies by belief type (3 days for PREDICTION → 365 days for FACT).
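The same formula written out as a small sketch. The helper below is hypothetical (not the library's internal function) and takes evidence as `(weight, age_days)` pairs:

```python
def confidence(supports, attacks, half_life):
    """Weighted, time-decayed support ratio in [0, 1].

    supports / attacks: iterables of (weight, age_days) pairs.
    """
    def decayed_sum(evidence):
        return sum(w * 0.5 ** (t / half_life) for w, t in evidence)

    s, a = decayed_sum(supports), decayed_sum(attacks)
    if s + a == 0:
        return 0.0   # no usable evidence (NEITHER territory)
    return s / (s + a)
```

With only supporting evidence the ratio is 1.0; perfectly balanced evidence yields 0.5, which is where the BOTH state becomes relevant.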
Belief ranking uses a composite score across three signals:
rank_score = 0.60 × similarity # semantic relevance to query
+ 0.25 × confidence # evidence strength
+ 0.15 × stability # inverse of revision volatility
Stability is 1 / (1 + revision_count) — beliefs that have been revised frequently rank lower than beliefs that have been stable, even at equal confidence. This prevents contradicted high-confidence beliefs from polluting retrieval.
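The composite score is straightforward to sketch. The weights come from the formula above; the function name is illustrative:

```python
def rank_score(similarity: float, confidence: float, revision_count: int,
               w_sim: float = 0.60, w_conf: float = 0.25,
               w_stab: float = 0.15) -> float:
    """Composite retrieval score: relevance, evidence strength, stability."""
    stability = 1.0 / (1.0 + revision_count)   # frequently revised -> lower
    return w_sim * similarity + w_conf * confidence + w_stab * stability
```

At equal similarity and confidence, a never-revised belief outranks one revised five times, which is exactly the anti-pollution property described above.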
Revision policy follows AGM minimal change: when new evidence contradicts an existing belief, the system retracts the minimum set of evidence necessary to restore consistency. Pluggable policies (recency, confidence-weighted, entrenchment-based) determine selection order.
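One way a pluggable minimal-change policy could be sketched: each policy orders candidate evidence for retraction, and evidence is invalidated one item at a time until only one polarity remains active. All names and data shapes here are illustrative assumptions, not MnemeBrain's actual interfaces:

```python
# Each policy returns candidates in retraction order (first = retract first).
POLICIES = {
    # recency: keep the newest evidence, retract the oldest first
    "recency": lambda ev: sorted(ev, key=lambda e: e["age_days"], reverse=True),
    # confidence-weighted: retract the weakest evidence first
    "confidence_weighted": lambda ev: sorted(ev, key=lambda e: e["weight"]),
}

def restore_consistency(evidence, policy="recency"):
    """Retract evidence until the belief is no longer contradictory."""
    order = POLICIES[policy](list(evidence))
    active = list(evidence)
    retracted = []
    while {e["polarity"] for e in active} == {"supports", "attacks"}:
        victim = next(e for e in order if e in active)
        active.remove(victim)
        retracted.append(victim)
    return active, retracted
```

Under the confidence-weighted policy, a weak attack loses to a strong support; under recency, a fresh attack survives and the stale support is retracted.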
Counterfactual reasoning uses copy-on-write sandbox isolation: hypothetical evidence is applied to a forked belief graph, leaving the canonical state unchanged.
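A toy sketch of the sandbox idea, with the belief graph reduced to a claim-to-evidence dict. For simplicity this uses an eager `deepcopy` rather than true copy-on-write sharing of unchanged nodes; the class and method names are illustrative:

```python
import copy

class BeliefSandbox:
    """Fork a belief graph for what-if reasoning; canonical state stays untouched."""

    def __init__(self, canonical: dict):
        self._canonical = canonical
        # Eager full copy stands in for copy-on-write node sharing.
        self._fork = copy.deepcopy(canonical)

    def add_evidence(self, claim: str, evidence: dict) -> None:
        """Apply hypothetical evidence to the fork only."""
        self._fork.setdefault(claim, []).append(evidence)

    def diff(self) -> dict:
        """Claims whose evidence diverged from the canonical graph."""
        return {c: ev for c, ev in self._fork.items()
                if self._canonical.get(c) != ev}
```

After sandbox experiments, only the `diff()` would be committed back (or discarded), so hypothetical reasoning never corrupts stored beliefs.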
REST API
# Start the server
uv run python -m mnemebrain_core
# Store a belief
curl -X POST http://localhost:8000/believe \
-H "Content-Type: application/json" \
-d '{
"claim": "user is vegetarian",
"evidence": [{
"source_ref": "msg_12",
"content": "They said no meat please",
"polarity": "supports",
"weight": 0.8,
"reliability": 0.9
}]
}'
# Explain a belief (returns truth_state, confidence, full evidence chain)
curl "http://localhost:8000/explain?claim=user+is+vegetarian"
# Revise with new evidence
curl -X POST http://localhost:8000/revise \
-H "Content-Type: application/json" \
-d '{
"belief_id": "<uuid>",
"evidence": {
"source_ref": "msg_50",
"content": "User ordered steak",
"polarity": "attacks",
"weight": 0.9,
"reliability": 0.95
}
}'
# Retract evidence
curl -X POST http://localhost:8000/retract \
-H "Content-Type: application/json" \
-d '{"evidence_id": "<uuid>"}'
# Semantic search with ranked scoring
curl "http://localhost:8000/search?query=vegetarian&limit=5&alpha=0.7"
# List beliefs by state
curl "http://localhost:8000/beliefs?truth_state=BOTH&min_confidence=0.5"
Working Memory Frames
Frames are active context buffers for multi-step reasoning episodes.
# Open a frame
curl -X POST http://localhost:8000/frame/open \
-H "Content-Type: application/json" \
-d '{"query_id": "<uuid>", "top_k": 20, "ttl_seconds": 300}'
# Add a belief to the frame
curl -X POST http://localhost:8000/frame/<frame_id>/add \
-H "Content-Type: application/json" \
-d '{"claim": "user is vegetarian"}'
# Get context for LLM prompt injection
curl http://localhost:8000/frame/<frame_id>/context
# Commit results back to the belief graph
curl -X POST http://localhost:8000/frame/<frame_id>/commit \
-H "Content-Type: application/json" \
-d '{"new_beliefs": [], "revisions": []}'
Full endpoint docs: docs/integration-api.md
Architecture
src/mnemebrain_core/
├── models.py # Belief, Evidence, TruthState, BeliefType
├── engine.py # Pure functions: compute_truth_state, confidence, decay
├── store.py # KuzuGraphStore — embedded graph DB
├── memory.py # BeliefMemory — 4 core operations + search/list
├── working_memory.py # WorkingMemoryFrame — active context for multi-step reasoning
├── triple_relations.py # TripleRelation — typed inter-triple edges (attacks, supports, depends_on)
├── providers/
│ ├── base.py # Abstract EmbeddingProvider
│ └── embeddings/ # sentence-transformers or OpenAI (optional)
└── api/
├── app.py # FastAPI application factory
├── routes.py # REST endpoints
└── schemas.py # Request/response models
Architecture phases:
| Phase | Adds | Status |
|---|---|---|
| 1 | EvidenceLedger + TruthState + 4 core operations | ✅ Shipped |
| 1.5 | Confidence ranking + stability score + TruthState multiplier | ✅ Shipped |
| 2 | WorkingMemoryFrame (context cache) | ✅ Shipped |
| 2.5 | BeliefSandbox (copy-on-write hypothetical reasoning) | ✅ Shipped |
| 3 | AGM revision policies + ATTACKS edges | ✅ Shipped |
| 4 | Reconsolidation windows + GoalNode | ✅ Shipped |
| 4.5 | PolicyNode + EWMA learning + blame attribution | ✅ Shipped |
| 5 | ConsolidationDaemon + HippoRAG retrieval + pattern separation | Planned (see mnemebrain) |
BMB Leaderboard
The Belief Maintenance Benchmark is an open evaluation for agent memory systems. 48 tasks, 8 categories — contradiction detection, belief revision, evidence tracking, temporal decay, retraction, dedup, extraction, and lifecycle.
| System | Score |
|---|---|
| MnemeBrain | 100% |
| Structured Memory | 36% |
| Mem0 (real API) | 29% |
| OpenAI RAG (real API) | 0% |
| LangChain Buffer | 0% |
| RAG Baseline | 0% |
Add your system. Implement the MemorySystemAdapter interface, drop it in adapters/, and run:
pip install mnemebrain-benchmark
bmb run --adapters your_adapter
All tests are deterministic. All scoring is open-source. We publish every result — including systems that outscore ours.
Adapter docs: mnemebrain-benchmark README · Full results: BMB_REPORT.md
References
- Belnap, N. D. (1977). A useful four-valued logic. In Modern Uses of Multiple-Valued Logic. Reidel.
- Alchourrón, C. E., Gärdenfors, P., & Makinson, D. (1985). On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic, 50(2), 510–530.
- Lewis, D. (1973). Counterfactuals. Harvard University Press.
- Gutierrez, B. J., et al. (2024). HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models. NeurIPS 2024.
Tech Stack
Python 3.12+, uv, FastAPI, Kuzu, Pydantic v2, pytest, sentence-transformers or OpenAI embeddings (optional)
Contributing
See CONTRIBUTING.md for development setup and guidelines.
License
MIT