Skip to main content

LLM context contradiction detector and pruner — give your AI a sleep cycle

Project description

delta-prune

LLM context contradiction detector and pruner.

Problem: When LLMs accumulate contradictory information in their context, reasoning accuracy collapses (GPT-4o-mini: 100% → 10%, Gemini: 100% → 0%). This isn't a context length problem — it's a contradiction problem. Making the window bigger doesn't help.

Solution: Scan (1) chat messages or (2) RAG retrieval chunks for contradictions before you call the downstream LLM. Remove or annotate them. Same strategies (prune / annotate / report) for both.

When to use which API

Use case API Input
Agents, chatbots, multi-turn dialogue prune(messages) [{"role":"user","content":"..."}, ...]
RAG, enterprise search, doc Q&A prune.filter_chunks(chunks) list[str] (one string per retrieved chunk)

v0.3.0: filter_chunks() and ChunkPruneResult — RAG path shares the same conflict detector and optional embedding pre-filter as chat mode.

Paper: "Cognitive Sleep for LLMs: How Contradiction Metabolism Prevents Context Rot"

Install

pip install delta-prune                  # core (zero dependencies)
pip install delta-prune[fast]            # with embedding pre-filter (recommended for long contexts)
pip install delta-prune[openai]        # with OpenAI backend
pip install delta-prune[ollama]          # with Ollama Python client (local models)

Quick Start

from delta_prune import DeltaPrune
from delta_prune.llm import ClaudeCLI

prune = DeltaPrune(llm=ClaudeCLI())

messages = [
    {"role": "user", "content": "My favorite food is curry."},
    {"role": "assistant", "content": "Curry, nice!"},
    {"role": "user", "content": "I hate curry. I like ramen."},
    {"role": "assistant", "content": "Ramen it is!"},
    {"role": "user", "content": "What's my favorite food?"},
]

result = prune(messages)
clean_messages = result.messages  # contradictions annotated
print(f"delta = {result.delta}")  # contradiction density
print(f"conflicts = {len(result.conflicts)}")

RAG: retrieved chunks

Use the same DeltaPrune instance and strategy for retrieval chunks (plain strings). Each non-empty chunk is treated as one factual unit — no per-chunk claim-extraction LLM (unlike chat mode). Pairwise checks use the same conflict detector; pass embedding= and max_llm_pairs when you retrieve many chunks.

from delta_prune import DeltaPrune, ChunkPruneResult
from delta_prune.llm import ClaudeCLI

prune = DeltaPrune(llm=ClaudeCLI(), strategy="prune", locale="en")

chunks = [
    "Product ships in 3–5 business days.",
    "All orders arrive next day guaranteed.",  # may contradict previous chunk
]
result = prune.filter_chunks(chunks)
assert isinstance(result, ChunkPruneResult)

context = "\n\n".join(result.filtered_chunks)  # feed to your answer-generation LLM
# result.delta, result.conflicts, result.has_conflicts
Strategy filter_chunks behavior
"annotate" (default) Prepend one chunk that lists contradictions; then original chunks
"prune" Drop older conflicting chunks; order preserved for the rest
"report" Return the input list unchanged (detect only)

Strategies

Strategy Behavior
"annotate" (default) Add a system message listing detected contradictions
"prune" Remove messages containing the older side of contradictions
"report" Detect only, return original messages unchanged
prune = DeltaPrune(llm=llm, strategy="prune")     # remove old contradictions
prune = DeltaPrune(llm=llm, strategy="annotate")   # add context annotation
prune = DeltaPrune(llm=llm, strategy="report")     # detect only, no changes

LLM Backends

from delta_prune.llm import ClaudeCLI, OllamaLLM, OpenAILLM

# Claude CLI (subscription, $0)
prune = DeltaPrune(llm=ClaudeCLI(model="sonnet"))

# Local Ollama
prune = DeltaPrune(llm=OllamaLLM(model="gemma3:27b"))

# OpenAI API
prune = DeltaPrune(llm=OpenAILLM(model="gpt-4o-mini"))

Performance: Embedding Pre-Filter

For long chats or many RAG chunks, use an embedding to avoid O(n²) LLM calls (same DeltaPrune constructor applies to both __call__ and filter_chunks):

from delta_prune.embedding import SentenceTransformerEmbedding

prune = DeltaPrune(
    llm=ClaudeCLI(),
    embedding=SentenceTransformerEmbedding(),  # local, free
    similarity_threshold=0.7,                  # only check similar pairs
    max_llm_pairs=30,                          # hard cap on LLM calls
)
Mode 20 claims 50 claims 100 claims
No embedding (O(n²)) 190 LLM calls 1,225 calls 4,950 calls
With embedding ~10-30 LLM calls ~10-30 calls ~10-30 calls

Install with: pip install delta-prune[fast]

Language

English is the default. Japanese is available:

prune = DeltaPrune(llm=llm, locale="ja")  # Japanese prompts

How It Works

Chat (prune(messages)):

messages (raw conversation)
    ↓
① Extract: pull factual claims from each message (LLM)
    → [("likes curry", turn=0), ("hates curry, likes ramen", turn=2)]
    ↓
② Pre-filter (optional): embedding similarity → keep only similar pairs
    ↓
③ Detect: LLM checks filtered pairs for contradictions
    ↓
④ Resolve: prune / annotate / report → PruneResult

RAG (filter_chunks(chunks)):

list[str] (retrieved chunks)
    ↓
① Each non-empty chunk → one claim (no extraction LLM)
    ↓
② Pre-filter (optional): same as chat
    ↓
③ Detect / ④ Resolve → ChunkPruneResult.filtered_chunks

Delta Score

result.delta = contradiction density (conflicts / claims). 0.0 = clean, higher = more contradictions.

Based on the survival equation: S = μ × e^(-δ×k). Reducing δ has an exponential effect on reasoning quality.

Background

Based on research showing that context rot is caused by contradiction accumulation, not context length. Tested across 8 LLM models with statistically significant results (Kruskal-Wallis p=0.027, complete rank separation). See DeltaZero for the full research.

Changelog

0.3.0

  • RAG API: DeltaPrune.filter_chunks(chunks) -> ChunkPruneResult with filtered_chunks, delta, conflicts (same strategies and optional embedding pre-filter as chat mode).
  • README: use-case table (chat vs RAG), install extra delta-prune[ollama].

0.2.x

  • Chat-only pipeline: claim extraction, conflict detection, prune / annotate / report; EN/JA prompts; optional embedding pre-filter.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

delta_prune-0.3.0.tar.gz (22.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

delta_prune-0.3.0-py3-none-any.whl (18.3 kB view details)

Uploaded Python 3

File details

Details for the file delta_prune-0.3.0.tar.gz.

File metadata

  • Download URL: delta_prune-0.3.0.tar.gz
  • Upload date:
  • Size: 22.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for delta_prune-0.3.0.tar.gz
Algorithm Hash digest
SHA256 1f2bfe005d56793d4d6fa6564254c7c13680e486ef15aa96462754264439abc2
MD5 366154134feeef4eedeec3dd9f903a34
BLAKE2b-256 ff696a26ab00a2472624ecfd401d940a80c8cbf7e03cdbf2f9a50560100095ee

See more details on using hashes here.

File details

Details for the file delta_prune-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: delta_prune-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 18.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for delta_prune-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e0cc2485e0623511b3f0a6778424253650b0514740816eae3b02cfffe987e9db
MD5 e14128facd854a7872c0e5b58e0ea975
BLAKE2b-256 bd7b26af47348f1531b35309bb2a3860ef28f9fe243769ad45df3111a83dee9b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page