Probabilistic agent state layer. Bayesian belief tracking, surprise-driven updates, and delta logs for AI agents. Paper: arXiv:2606.22030
Project description
nous-state
A probabilistic agent state layer for long-running personal AI agents.
"Knowledge is prediction, not storage."
Paper
This repository accompanies the preprint:
Nous: A Predictive World Model for Long-Term Agent Memory Pranav Singh — Indian Institute of Technology Ropar arXiv:2606.22030 · cs.AI · June 2026
If you use this work, please cite:
@article{singh2026nous,
title = {Nous: A Predictive World Model for Long-Term Agent Memory},
author = {Singh, Pranav},
journal = {arXiv preprint arXiv:2606.22030},
year = {2026}
}
To reproduce the LoCoMo benchmark results reported in the paper, see benchmark/README.md.
The Problem
Every long-running AI agent eventually hits the same wall:
User (Month 1): "I work at Sarvam AI using Mistral for NyayaSahayak."
User (Month 4): "I switched to GPT-4 and joined Google DeepMind."
Agent (Month 5): *confidently tells someone Pranav uses Mistral at Sarvam AI*
Vector databases store both facts. Knowledge graphs require manual conflict resolution. Neither gives you a mathematically principled answer to which fact is currently true.
nous-state solves this with Bayesian probability distributions — the same math used in GPS navigation, spam filters, and medical diagnostics.
How It Works
Instead of storing facts, Nous maintains belief distributions over entity attributes:
Pranav.employer = { "Sarvam AI": 0.82, "unknown": 0.18 }
When new evidence arrives, it performs a Bayesian update:
"Pranav joined Google DeepMind" →
Pranav.employer = { "Google DeepMind": 0.86, "Sarvam AI": 0.12, "unknown": 0.02 }
Every update is recorded as an immutable delta — a change in understanding, not just a fact. This means:
- Contradictions are resolved mathematically, not heuristically
- History is queryable — "What did the agent believe about Pranav in March?"
- Forgetting is principled — unused beliefs decay toward uncertainty via entropy decay
- Identity hints — two entity names with high mutual information are flagged as likely the same person
Install
pip install nous-state
Zero runtime dependencies. Pure Python stdlib only (math, sqlite3, json, urllib).
⚠️ Honest caveat: The engine's math is only as good as the extraction feeding it. The built-in rule-based extractor handles simple patterns. For messy real-world text, use the
LLMExtractor— or bring your own parser. Garbage extraction → garbage beliefs, no matter how elegant the Bayesian update.
Quickstart
Rule-based extraction (no LLM needed)
from nous import Nous
memory = Nous("agent_memory.db")
# Session 1
memory.observe("Pranav works at Sarvam AI as an ML engineer.")
memory.observe("He is building NyayaSahayak using Mistral.")
# Session 4 — things changed
memory.observe("Pranav left Sarvam AI and joined Google DeepMind.")
memory.observe("NyayaSahayak now uses GPT-4 for better legal reasoning.")
# Query current beliefs
print(memory.predict("Pranav", "employer"))
# → {"Google DeepMind": 0.86, "Sarvam AI": 0.12}
print(memory.predict("NyayaSahayak", "model"))
# → {"GPT-4": 0.86, "Mistral": 0.12}
With LLM extraction (natural language → structured beliefs)
from nous import Nous
from nous.llm_extractor import LLMExtractor
extractor = LLMExtractor(
api_key="sk-or-...", # Any OpenRouter key
user_context={"name": "Pranav"} # Resolves "I/me/my" → "Pranav"
)
memory = Nous("agent_memory.db", extractor=extractor)
memory.observe("I switched from Mistral to GPT-4 because legal reasoning improved.")
memory.observe("Actually wait, someone said Pranav is still at Sarvam AI?")
memory.observe("No confirmed, he's definitely at Google DeepMind on Gemini.")
print(memory.predict("Pranav", "employer"))
# → {"Google DeepMind": 0.97, "Sarvam AI": 0.03}
Explainability
# Full auditable history
for delta in memory.history("Pranav", "employer"):
print(f"Surprise: {delta.surprise:.1f} bits | {delta.evidence[:60]}")
# → Surprise: 4.3 bits | Actually, I left Sarvam AI and joined Google...
# → Surprise: 0.2 bits | Pranav is definitely at Google DeepMind, I saw...
# Time-travel: what did the agent believe 30 days ago?
past_belief = memory.query_at("Pranav", "employer", at_time=timestamp_30_days_ago)
Surprise scoring
bits = memory.surprise("Pranav is still at Sarvam AI.")
# → 5.1 bits (high — contradicts current belief)
bits = memory.surprise("Pranav works at Google DeepMind.")
# → 0.1 bits (low — we already know this)
API Reference
Nous(db_path, extractor=None)
| Method | Description |
|---|---|
observe(text, source, reliability) |
Ingest text, update beliefs |
predict(entity, attribute) |
Get current probability distribution |
query_at(entity, attribute, timestamp) |
Time-travel query |
history(entity, attribute) |
Full delta log for an attribute |
explain(entity, attribute, value) |
Why does the agent believe X? |
surprise(text) |
Information content in bits before observing |
get_coupling(entity_a, entity_b) |
Identity similarity hint (0–1) |
get_entity_profile(entity) |
All attributes for an entity |
apply_decay(current_time) |
Apply forgetting to stale dimensions |
LLMExtractor(api_key, model, user_context)
Works with any OpenAI-compatible API endpoint (OpenRouter, OpenAI, local via LM Studio).
| Parameter | Default | Description |
|---|---|---|
api_key |
required | Your API key |
model |
google/gemini-2.5-flash |
Any model on OpenRouter |
user_context |
{} |
Dict with name key to resolve "I/me/my" |
Architecture
Natural Language
↓
LLMExtractor (or rule-based Extractor)
↓
(entity, attribute, value) tuples
↓
BayesianUpdater → surprise score → posterior distribution
↓
Dimension (probability distribution) + Delta (immutable history)
↓ ↓
WorldModel (in-memory cache) DeltaLog (SQLite)
↓
PersistenceLayer (SQLite — survives restarts)
Key properties:
- O(1) per-attribute reads — dictionary lookup, not vector search
- O(k) writes — multiply k floats, normalise (k = number of known values, typically < 10)
- Append-only delta log
- Zero external dependencies
Where Nous Fits (and Where It Doesn't)
| Problem | Vector DB | Knowledge Graph | nous-state |
|---|---|---|---|
| Contradictory facts | Stores both, LLM decides | Manual conflict rules | Bayesian update (automatic) |
| Stale high-confidence facts | Still retrieved | Still in graph | Probability mass shifts |
| "Why does agent believe X?" | Not possible | Requires audit log | Native (delta history) |
| Multi-hop relational queries | ❌ | ✅ Native | ❌ |
| Semantic document retrieval | ✅ Native | ❌ | ❌ |
| Per-attribute read cost | O(n) ANN search | O(edges) traversal | O(1) dict lookup |
Research Status
This is research-stage code. The preprint (arXiv:2606.22030) describes this as a first public report — not a final or fully audited result. Ablations, a second benchmark (LongMemEval), and broader backbone evaluation are planned as immediate next steps and will be reported in a future revision.
The benchmark/ directory contains the evaluation code that produced the LoCoMo numbers in
the paper. See benchmark/README.md for reproduction instructions.
License
MIT — see LICENSE.
Contributing
Issues and PRs welcome. If you hit a real-world edge case, opening an issue is genuinely valuable.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file nous_state-0.2.0.tar.gz.
File metadata
- Download URL: nous_state-0.2.0.tar.gz
- Upload date:
- Size: 52.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ac6524a2b9efb823979eba2368e80dbd628fb284f0e64d819e3afce01a3365e5
|
|
| MD5 |
c9e5ffe952a7bf2835cbe9027d52f9c5
|
|
| BLAKE2b-256 |
dd8978f33727776a30cb35ca07bb51dedd8b600b2f94eeba9ffcd0a2d0999296
|
File details
Details for the file nous_state-0.2.0-py3-none-any.whl.
File metadata
- Download URL: nous_state-0.2.0-py3-none-any.whl
- Upload date:
- Size: 49.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ad5be91ab5e0e68c6b228a2e484a3ebc4af2beda3dac45ebb1a4a0c9dff500b0
|
|
| MD5 |
02ff4deb6fb2db47fa565c1ba67014d4
|
|
| BLAKE2b-256 |
950d5b003169d3ce0f5eb68852b779e106e7b6de29480fe109c63994dcbb97f8
|