LLM context contradiction detector and pruner — give your AI a sleep cycle
delta-prune
LLM context contradiction detector and pruner.
Problem: When LLMs accumulate contradictory information in their context, reasoning accuracy collapses (GPT-4o-mini: 100% → 10%, Gemini: 100% → 0%). This isn't a context length problem — it's a contradiction problem. Making the window bigger doesn't help.
Solution: Scan (1) chat messages or (2) RAG retrieval chunks for contradictions before you call the downstream LLM. Remove or annotate them. Same strategies (prune / annotate / report) for both.
When to use which API
| Use case | API | Input |
|---|---|---|
| Agents, chatbots, multi-turn dialogue | prune(messages) | [{"role":"user","content":"..."}, ...] |
| RAG, enterprise search, doc Q&A | prune.filter_chunks(chunks) | list[str] (one string per retrieved chunk) |
New in v0.3.0: filter_chunks() and ChunkPruneResult. The RAG path shares the same conflict detector and optional embedding pre-filter as chat mode.
Paper: "Cognitive Sleep for LLMs: How Contradiction Metabolism Prevents Context Rot"
Install
pip install delta-prune # core (zero dependencies)
pip install delta-prune[fast] # with embedding pre-filter (recommended for long contexts)
pip install delta-prune[openai] # with OpenAI backend
pip install delta-prune[ollama] # with Ollama Python client (local models)
Quick Start
from delta_prune import DeltaPrune
from delta_prune.llm import ClaudeCLI
prune = DeltaPrune(llm=ClaudeCLI())
messages = [
{"role": "user", "content": "My favorite food is curry."},
{"role": "assistant", "content": "Curry, nice!"},
{"role": "user", "content": "I hate curry. I like ramen."},
{"role": "assistant", "content": "Ramen it is!"},
{"role": "user", "content": "What's my favorite food?"},
]
result = prune(messages)
clean_messages = result.messages # contradictions annotated
print(f"delta = {result.delta}") # contradiction density
print(f"conflicts = {len(result.conflicts)}")
RAG: retrieved chunks
Use the same DeltaPrune instance and strategy for retrieval chunks (plain strings). Each non-empty chunk is treated as one factual unit, so no claim-extraction LLM call is made per chunk (unlike chat mode). Pairwise checks use the same conflict detector; pass embedding= and max_llm_pairs= to the constructor when you retrieve many chunks.
from delta_prune import DeltaPrune, ChunkPruneResult
from delta_prune.llm import ClaudeCLI
prune = DeltaPrune(llm=ClaudeCLI(), strategy="prune", locale="en")
chunks = [
"Product ships in 3–5 business days.",
"All orders arrive next day guaranteed.", # may contradict previous chunk
]
result = prune.filter_chunks(chunks)
assert isinstance(result, ChunkPruneResult)
context = "\n\n".join(result.filtered_chunks) # feed to your answer-generation LLM
# result.delta, result.conflicts, result.has_conflicts
| Strategy | filter_chunks behavior |
|---|---|
| "annotate" (default) | Prepend one chunk that lists contradictions; then original chunks |
| "prune" | Drop older conflicting chunks; order preserved for the rest |
| "report" | Return the input list unchanged (detect only) |
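The three behaviors in the table can be illustrated with a small stand-alone sketch. This is not the library's implementation: `resolve_chunks` is a hypothetical helper, conflicts are assumed to be already detected and given as `(older_index, newer_index)` pairs, "older" is assumed to mean the chunk that appears earlier in the list, and the annotation text is made up for illustration.

```python
from typing import List, Tuple


def resolve_chunks(
    chunks: List[str],
    conflicts: List[Tuple[int, int]],
    strategy: str = "annotate",
) -> List[str]:
    """Toy version of the three strategies (hypothetical helper, not the delta-prune API).

    conflicts holds (older_index, newer_index) pairs that were detected elsewhere.
    """
    # Assumption: with no conflicts, every strategy returns the chunks unchanged.
    if strategy == "report" or not conflicts:
        return list(chunks)
    if strategy == "prune":
        # Drop the older side of each conflict; the remaining chunks keep their order.
        older = {old for old, _new in conflicts}
        return [chunk for i, chunk in enumerate(chunks) if i not in older]
    if strategy == "annotate":
        # Prepend a single chunk that lists the contradictions, then the originals.
        notes = [f"- chunk {old} conflicts with chunk {new}" for old, new in conflicts]
        header = "Detected contradictions:\n" + "\n".join(notes)
        return [header] + list(chunks)
    raise ValueError(f"unknown strategy: {strategy!r}")


chunks = [
    "Product ships in 3-5 business days.",
    "All orders arrive next day guaranteed.",
    "Returns accepted within 30 days.",
]
print(resolve_chunks(chunks, [(0, 1)], strategy="prune"))
```

With the single conflict `(0, 1)`, the "prune" call keeps the second and third chunks, "annotate" returns four chunks (the note plus the originals), and "report" returns the input as-is.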
Strategies
| Strategy | Behavior |
|---|---|
| "annotate" (default) | Add a system message listing detected contradictions |
| "prune" | Remove messages containing the older side of contradictions |
| "report" | Detect only, return original messages unchanged |
prune = DeltaPrune(llm=llm, strategy="prune") # remove old contradictions
prune = DeltaPrune(llm=llm, strategy="annotate") # add context annotation
prune = DeltaPrune(llm=llm, strategy="report") # detect only, no changes
LLM Backends
from delta_prune import DeltaPrune
from delta_prune.llm import ClaudeCLI, OllamaLLM, OpenAILLM
# Claude CLI (subscription, $0)
prune = DeltaPrune(llm=ClaudeCLI(model="sonnet"))
# Local Ollama
prune = DeltaPrune(llm=OllamaLLM(model="gemma3:27b"))
# OpenAI API
prune = DeltaPrune(llm=OpenAILLM(model="gpt-4o-mini"))
Performance: Embedding Pre-Filter
For long chats or many RAG chunks, use an embedding pre-filter to avoid O(n²) LLM calls (the same DeltaPrune constructor arguments apply to both __call__ and filter_chunks):
from delta_prune import DeltaPrune
from delta_prune.llm import ClaudeCLI
from delta_prune.embedding import SentenceTransformerEmbedding
prune = DeltaPrune(
    llm=ClaudeCLI(),
    embedding=SentenceTransformerEmbedding(),  # local, free
    similarity_threshold=0.7,                  # only check similar pairs
    max_llm_pairs=30,                          # hard cap on LLM calls
)
| Mode | 20 claims | 50 claims | 100 claims |
|---|---|---|---|
| No embedding (O(n²)) | 190 LLM calls | 1,225 calls | 4,950 calls |
| With embedding | ~10-30 LLM calls | ~10-30 calls | ~10-30 calls |
Install with: pip install delta-prune[fast]
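The "No embedding" row of the table is simply the number of unordered claim pairs, n(n-1)/2. The short sketch below reproduces those figures; the function names are illustrative and not part of delta-prune, and `capped_llm_calls` only assumes that `max_llm_pairs` acts as an upper bound on the number of pairs sent to the LLM.

```python
from math import comb


def pairwise_llm_calls(n_claims: int) -> int:
    """Number of unordered claim pairs, i.e. one LLM check per pair."""
    return comb(n_claims, 2)


def capped_llm_calls(n_claims: int, max_llm_pairs: int) -> int:
    """Upper bound on LLM checks when a hard cap is applied (assumed semantics)."""
    return min(pairwise_llm_calls(n_claims), max_llm_pairs)


for n in (20, 50, 100):
    print(n, pairwise_llm_calls(n), capped_llm_calls(n, max_llm_pairs=30))
# 20 190 30
# 50 1225 30
# 100 4950 30
```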
Language
English is the default. Japanese is available:
prune = DeltaPrune(llm=llm, locale="ja") # Japanese prompts
How It Works
Chat (prune(messages)):
messages (raw conversation)
↓
① Extract: pull factual claims from each message (LLM)
→ [("likes curry", turn=0), ("hates curry, likes ramen", turn=2)]
↓
② Pre-filter (optional): embedding similarity → keep only similar pairs
↓
③ Detect: LLM checks filtered pairs for contradictions
↓
④ Resolve: prune / annotate / report → PruneResult
RAG (filter_chunks(chunks)):
list[str] (retrieved chunks)
↓
① Each non-empty chunk → one claim (no extraction LLM)
↓
② Pre-filter (optional): same as chat
↓
③ Detect / ④ Resolve → ChunkPruneResult.filtered_chunks
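Step ② can be pictured with a minimal pure-Python sketch: compute cosine similarity for every pair of claim vectors, keep the pairs at or above the threshold, and cap how many go on to the LLM. This is a sketch under stated assumptions, not the library's code: `candidate_pairs` is a hypothetical name, the toy 2-D vectors stand in for real embeddings, and sorting by similarity before applying the cap is an assumption about how a cap could be enforced.

```python
from itertools import combinations
from math import sqrt
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_pairs(
    vectors: List[Sequence[float]],
    similarity_threshold: float = 0.7,
    max_llm_pairs: int = 30,
) -> List[Tuple[int, int]]:
    """Return index pairs worth sending to the contradiction-checking LLM."""
    scored = [
        (cosine(vectors[i], vectors[j]), i, j)
        for i, j in combinations(range(len(vectors)), 2)
    ]
    # Keep only pairs that talk about similar things.
    kept = [item for item in scored if item[0] >= similarity_threshold]
    # Assumption: the most similar pairs are checked first when the cap applies.
    kept.sort(key=lambda item: item[0], reverse=True)
    return [(i, j) for _score, i, j in kept[:max_llm_pairs]]


# Toy vectors: claims 0 and 1 are about the same topic, claim 2 is unrelated.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(candidate_pairs(vectors))  # [(0, 1)]
```

Only the pair (0, 1) clears the 0.7 threshold here, so a single LLM check would be needed instead of three.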
Delta Score
result.delta = contradiction density (conflicts / claims). 0.0 = clean, higher = more contradictions.
The score is based on the survival equation S = μ × e^(-δ×k). Because δ appears in the exponent, reducing it has an exponential effect on reasoning quality.
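As a worked illustration of the two formulas above, the sketch below computes δ as conflicts divided by claims and plugs it into S = μ × e^(-δ×k). The README does not define μ or k, so the values used here (μ = 1.0, k = 2.0) are arbitrary placeholders chosen only to show the shape of the curve, and the function names are hypothetical.

```python
from math import exp


def delta_score(n_conflicts: int, n_claims: int) -> float:
    """Contradiction density: conflicts / claims (0.0 when there are no claims)."""
    return n_conflicts / n_claims if n_claims else 0.0


def survival(mu: float, delta: float, k: float) -> float:
    """S = mu * e^(-delta * k); mu and k are placeholders, not values from the paper."""
    return mu * exp(-delta * k)


# Example: 1 conflict among 2 claims, then the same context after pruning to 0 conflicts.
before = delta_score(1, 2)
after = delta_score(0, 2)
print(before, after)  # 0.5 0.0
print(round(survival(1.0, before, 2.0), 4), survival(1.0, after, 2.0))
```

Under these placeholder values, halving δ does not halve the loss in S; it changes S by a multiplicative factor of e^(Δδ×k), which is what "exponential effect" refers to.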
Background
delta-prune is based on research showing that context rot is caused by contradiction accumulation, not context length. The effect was tested across 8 LLMs with statistically significant results (Kruskal-Wallis p=0.027, complete rank separation). See DeltaZero for the full research.
Changelog
0.3.0
- RAG API: DeltaPrune.filter_chunks(chunks) -> ChunkPruneResult with filtered_chunks, delta, conflicts (same strategies and optional embedding pre-filter as chat mode).
- README: use-case table (chat vs RAG), install extra delta-prune[ollama].
0.2.x
- Chat-only pipeline: claim extraction, conflict detection, prune / annotate / report; EN/JA prompts; optional embedding pre-filter.