Cognition layer for AI agents — persistent memory, performance tracking, and insight synthesis
# Dhee

**The cognition layer that turns your agent into a HyperAgent.**

4-operation API. Deferred enrichment. Minimal hot-path cost. Your agent remembers, learns from outcomes, and predicts what you need next.
## What is Dhee?
Most memory layers are glorified vector stores. Store text, retrieve text. Your agent is still stateless — it doesn't learn, doesn't track what worked, doesn't warn you when something is regressing.
Dhee is a cognition layer. It gives any agent — Claude, GPT, Gemini, custom — four capabilities that turn it into a self-improving HyperAgent:
| Capability | What Dhee does | What your agent gets |
|---|---|---|
| Persistent memory | Stores facts with echo-augmented retrieval (paraphrases, keywords, question-forms) | "What theme does the user prefer?" matches "User likes dark mode" even though the words are different |
| Performance tracking | Records task outcomes, detects trends automatically | Knows it's regressing on code reviews, warns you before you notice |
| Insight synthesis | Extracts causal hypotheses from outcomes — not raw data, synthesized learnings | "What worked: checking git blame first" transfers to the next bug fix |
| Prospective memory | Stores future triggers — "remember to X when Y" | Surfaces intentions when the trigger context matches |
## Benchmark: LongMemEval
Dhee is being evaluated on LongMemEval, the standard benchmark for long-term conversational memory — temporal reasoning, multi-session aggregation, knowledge updates, and counterfactual tracking across 500+ questions. Preliminary results are promising.
Full methodology and results will be published in the benchmark report.
## Status
Dhee is experimental software under active development. The core 4-operation API (remember/recall/context/checkpoint) is stable. Advanced subsystems (belief tracking, policy extraction, episodic indexing) are functional but evolving.
Use it. Build on it. But know that internals will change.
## Quick Start

```shell
pip install dhee[openai,mcp]
export OPENAI_API_KEY=sk-...
```
### MCP (Claude Code, Cursor — zero code)

```json
{
  "mcpServers": {
    "dhee": { "command": "dhee-mcp" }
  }
}
```
Your agent now has 4 tools. It will use them automatically.
### Python SDK

```python
from dhee import Dhee

d = Dhee()
d.remember("User prefers dark mode")
d.recall("what theme does the user like?")
d.context("fixing auth bug")
d.checkpoint("Fixed it", what_worked="git blame first")
```
### CLI

```shell
dhee remember "User prefers Python"
dhee recall "programming language"
dhee checkpoint "Fixed auth bug" --what-worked "checked logs"
```
### Docker

```shell
docker compose up -d   # uses OPENAI_API_KEY from env
```
## The 4 Tools
Every interface — MCP, Python, CLI, JS — exposes the same 4 operations.
### `remember(content)`
Store a fact, preference, or observation.
Hot path: 0 LLM calls, 1 embedding (~$0.0002 typical). The memory is stored immediately. Echo enrichment (paraphrases, keywords, question-forms that make future recall dramatically better) is deferred to checkpoint.
```python
d.remember("User prefers FastAPI over Flask")
d.remember("Project uses PostgreSQL 15 with pgvector")
```
### `recall(query)`
Search memory. Returns top-K results ranked by relevance.
Hot path: 0 LLM calls, 1 embedding (~$0.0002 typical). Pure vector search with echo-boosted re-ranking.
```python
results = d.recall("what database does the project use?")
# [{"memory": "Project uses PostgreSQL 15 with pgvector", "score": 0.94}]
```
### `context(task_description)`
HyperAgent session bootstrap. Call once at the start of a conversation.
Returns everything the agent needs to be effective immediately:
- Last session state — pick up where you left off, zero cold start
- Performance trends — improving or regressing on this task type
- Synthesized insights — "What worked for bug_fix: checking git blame first"
- Triggered intentions — "Remember to run auth tests after modifying login.py"
- Proactive warnings — "Performance on code_review is declining"
- Relevant memories — top matches for the task
```python
ctx = d.context("fixing the auth bug in login.py")
# ctx["warnings"]   → ["Performance on 'bug_fix' declining (trend: -0.05)"]
# ctx["insights"]   → [{"content": "What worked: git blame → found breaking commit"}]
# ctx["intentions"] → [{"description": "run auth tests after login.py changes"}]
```
### `checkpoint(summary, ...)`
Save session state before ending. This is where the cognition happens:
- Session digest — saved for cross-agent handoff (Claude Code crashes? Cursor picks up instantly)
- Batch enrichment — 1 LLM call per ~10 memories stored since the last checkpoint. Adds echo paraphrases and keywords that make `recall` work across phrasings
- Outcome recording — tracks score per task type, auto-detects regressions and breakthroughs
- Insight synthesis — "what worked" and "what failed" become transferable learnings
- Intention storage — "remember to X when Y" fires when the trigger matches
```python
d.checkpoint(
    "Fixed auth bug in login.py",
    task_type="bug_fix",
    outcome_score=1.0,
    what_worked="git blame showed the exact commit that broke auth",
    what_failed="grep was too slow on the monorepo",
    remember_to="run auth tests after any login.py change",
    trigger_keywords=["login", "auth"],
)
```
## Cost
| Operation | LLM calls | Embed calls | Cost |
|---|---|---|---|
| `remember` | 0 | 1 | ~$0.0002 |
| `recall` | 0 | 1 | ~$0.0002 |
| `context` | 0 | 0-1 | ~$0.0002 |
| `checkpoint` | 1 per ~10 memories | 0 | ~$0.001 |
| **Typical session** | 1 | ~15 | ~$0.004 |

Costs assume OpenAI `text-embedding-3-small` at current pricing. Actual costs vary by provider, model, and configuration.
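The "typical session" row is just arithmetic over the per-operation figures. A back-of-envelope check, using the table's approximate costs (assumed averages, not measured values):

```python
# Rough check of the "typical session" estimate from the cost table.
EMBED_COST = 0.0002      # approx. cost per embedding call (remember/recall/context)
CHECKPOINT_LLM = 0.001   # approx. cost of one batched enrichment LLM call

embed_calls = 15         # remember/recall/context calls over a session
session_cost = embed_calls * EMBED_COST + 1 * CHECKPOINT_LLM
print(f"~${session_cost:.3f}")  # ~$0.004
```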
## How It Works (Under the Hood)
Dhee has two layers: the memory store and the cognition engine.
### Memory Store — Engram
Stores memories in SQLite + a vector index. On the hot path (remember/recall), zero LLM calls — just embedding. At checkpoint, unified enrichment runs in a single batched LLM call:
- Echo encoding — generates paraphrases, keywords, and question-forms so "User prefers dark mode" also matches queries like "what theme?" or "UI preferences"
- Category inference — auto-tags for filtering
- Fact decomposition — splits compound statements into atomic, searchable facts
- Entity + profile extraction — builds a knowledge graph of people, tools, projects
All of this happens in 1 LLM call per ~10 memories. Not 4 calls per memory. One batched call.
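The batching idea can be sketched in a few lines. This is illustrative only — the function name, prompt wording, and JSON schema below are invented for the example, not Dhee's internals:

```python
# Sketch of batched enrichment: N pending memories go into ONE LLM request
# instead of N separate requests. (Hypothetical helper, not Dhee's API.)
import json

def build_enrichment_prompt(pending_memories):
    """Pack all memories stored since the last checkpoint into one request."""
    items = [{"id": i, "text": m} for i, m in enumerate(pending_memories)]
    return (
        "For each memory, return paraphrases, keywords, and question-forms "
        "as JSON keyed by id:\n" + json.dumps(items, indent=2)
    )

pending = ["User prefers dark mode", "Project uses PostgreSQL 15"]
prompt = build_enrichment_prompt(pending)  # one call covers both memories
```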
Memory decays naturally along an Ebbinghaus-style forgetting curve. Frequently accessed memories are promoted from short-term to long-term; unused ones fade, so storage shrinks over time instead of growing without bound like systems that keep everything indefinitely.
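A forgetting curve of this shape is typically modeled as exponential decay, retention = exp(-t/S), where stability S grows with each access. A minimal sketch under that assumption (Dhee's actual decay parameters and promotion thresholds are internal and may differ):

```python
# Ebbinghaus-style retention: decays with time since last access,
# slower for memories with higher stability (e.g. frequently accessed).
import math

def retention(hours_since_access: float, stability: float) -> float:
    return math.exp(-hours_since_access / stability)

fresh = retention(1, stability=24)     # accessed an hour ago: near 1.0
stale = retention(240, stability=24)   # untouched for 10 days: near 0.0
assert fresh > 0.9 > 0.001 > stale
```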
### Cognition Engine — Buddhi
A parallel intelligence layer that observes the memory pipeline and builds meta-knowledge:
- Performance tracking — records outcomes per task type, computes trends (moving average). Auto-generates regression warnings and breakthrough insights.
- Insight synthesis — stores causal hypotheses ("what worked", "what failed"), not raw data. Insights have confidence scores that update on validation/invalidation.
- Prospective memory — stores future triggers with keyword matching. "Remember to run tests after modifying auth" fires when the next query mentions "auth".
- Intention detection — auto-detects "remember to X when Y" patterns in stored memories.
Zero LLM calls on the hot path. Pure pattern matching + statistics. Persistence via JSONL files (~3 files total).
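Both the trend detection and the intention triggers reduce to a few lines of plain Python. A sketch of the idea — the helper names, window size, and matching rule here are illustrative, not Dhee's internals:

```python
# Hot-path cognition as pure statistics + keyword matching, no LLM.

def moving_average(scores, window=5):
    tail = scores[-window:]
    return sum(tail) / len(tail)

def trend(scores, window=5):
    """Recent window's average minus the previous window's average."""
    if len(scores) < 2 * window:
        return 0.0
    return moving_average(scores, window) - moving_average(scores[:-window], window)

def triggered(intention_keywords, query):
    """Prospective memory fires on a simple keyword match."""
    q = query.lower()
    return any(k in q for k in intention_keywords)

bug_fix_scores = [0.9, 0.9, 0.8, 0.8, 0.7, 0.7, 0.6, 0.6, 0.5, 0.5]
print(trend(bug_fix_scores))                   # negative → regression warning
print(triggered(["login", "auth"], "fixing the auth bug"))  # True
```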
Inspired by Meta's DGM-Hyperagents — agents that emergently develop persistent memory and performance tracking achieve self-accelerating improvement that transfers across domains. Dhee provides these capabilities as infrastructure.
## Experimental Extensions
Beyond the core cognition engine, Dhee includes experimental subsystems that are functional but still evolving:
- Belief store — confidence-tracked facts with Bayesian updates and contradiction detection
- Policy store — outcome-linked condition→action rules extracted from task completions
- Episodic indexing — structured event extraction for temporal and aggregation queries
- Contrastive pairs & heuristic distillation — learning from what worked vs. what failed
These are surfaced through `context()` and `checkpoint()` automatically when enabled.
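One common way to implement confidence-tracked facts with simple Bayesian-style updates is a Beta-distribution counter per belief. This sketch shows the idea only — the class and update rule are assumptions for illustration, not the belief store's actual scheme:

```python
# Confidence-tracked belief with a Beta(support, against) update:
# each confirming/contradicting observation nudges the confidence.

class Belief:
    def __init__(self, statement: str):
        self.statement = statement
        self.support, self.against = 1, 1  # Beta(1, 1): uniform prior

    def observe(self, confirms: bool) -> None:
        if confirms:
            self.support += 1
        else:
            self.against += 1   # contradiction lowers confidence

    @property
    def confidence(self) -> float:
        return self.support / (self.support + self.against)

b = Belief("User prefers FastAPI over Flask")
b.observe(True); b.observe(True); b.observe(False)
print(round(b.confidence, 2))  # 0.6
```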
## Architecture
```
Agent (Claude, GPT, Cursor, custom)
│
├── remember(content)    → Engram: embed + store (0 LLM)
├── recall(query)        → Engram: embed + vector search (0 LLM)
├── context(task)        → Buddhi: performance + insights + intentions + memories
└── checkpoint(summary)  → Engram: batch enrich (1 LLM/10 mems)
                         → Buddhi: outcome + reflect + intention
```

```
~/.dhee/
├── history.db              # SQLite: memories, history, entities
├── zvec/                   # Vector index (embeddings)
└── buddhi/
    ├── insights.jsonl      # Synthesized learnings
    ├── intentions.jsonl    # Future triggers
    └── performance.json    # Task type scores + trends
```
## Advanced
### Full MCP Server (24 tools)

For power users who need granular control over skills, trajectories, structural search, and enrichment:

```shell
dhee-mcp-full   # exposes all 24 tools
```
### Python — Direct Memory Access

```python
from dhee import FullMemory

m = FullMemory()
m.add("conversation content", user_id="u1", infer=True)
m.search("query", user_id="u1", limit=10)
m.think("complex question requiring reasoning across memories")
```
### Provider Options

```shell
pip install dhee[openai,mcp]   # OpenAI (recommended, cheapest embeddings)
pip install dhee[gemini,mcp]   # Google Gemini
pip install dhee[ollama,mcp]   # Ollama (local inference, no API costs)
```
## Contributing

```shell
git clone https://github.com/Sankhya-AI/Dhee.git
cd Dhee
./scripts/bootstrap_dev_env.sh
source .venv-dhee/bin/activate

# optional if you prefer manual bootstrap:
# python3 -m venv .venv-dhee
# .venv-dhee/bin/python -m pip install -e ./dhee-accel -e ./engram-bus -e ".[dev]"

pytest

# live vendor-backed suites are explicit opt-in:
# DHEE_RUN_LIVE_TESTS=1 pytest -q tests/test_e2e_all_features.py tests/test_power_packages.py

# manual smoke scripts live under scripts/manual/
```
**4 operations. Deferred enrichment. Your agent remembers, learns, and predicts.**

GitHub · PyPI · Issues
MIT License — Sankhya AI