A real-time, provenance-invalidated cognitive cache layer for AI agents and RAG.

Coalent

Real-time, provenance-invalidated context for AI agents & RAG.
Build understanding once. Reuse it everywhere. Keep it fresh — automatically.

📖 Documentation  ·  coalent.ai

Quickstart · How it works · Bring your own stack · Benchmark · CLI


Your agent re-reads the same sources on every call — and the moment a source changes, every cached answer is silently wrong.

Coalent builds the understanding once, caches it by what the query means, and invalidates it surgically the instant an underlying source changes. As correct as re-reading everything, at a fraction of the cost — and never stale.

Why Coalent

Every context layer is forced to trade off three things. Coalent is built to hold all three at once:

  • 🧠 Understanding, not chunks. It caches the decision-ready understanding your LLM produced — and retains the raw evidence with every unit, so it can never return less than plain retrieval.
  • ♻️ Reuse across queries and agents. A semantic cache keyed by query meaning: ask again, or from another agent, and it's a warm hit — no re-retrieval, no re-synthesis.
  • 🌿 Fresh by provenance. Every unit remembers the exact sources it used. When one changes, only the units that actually used it go stale — precisely, automatically, and lazily.
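
To make "keyed by query meaning" concrete, here is a toy lookup. It is a minimal sketch and not Coalent's implementation: the bag-of-words `embed`, the `lookup` helper, and the 0.6 similarity threshold are all assumptions chosen for illustration, and a real setup would use a proper embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words counts."""
    words = (w.strip("?.,!") for w in text.lower().split())
    return Counter(w for w in words if w)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lookup(cache: dict, query: str, threshold: float = 0.6):
    """Return the value stored under the most similar query, or None on a miss."""
    q = embed(query)
    best_key, best_score = None, 0.0
    for key in cache:
        score = cosine(q, embed(key))
        if score > best_score:
            best_key, best_score = key, score
    return cache[best_key] if best_score >= threshold else None


cache = {"what is our leave policy?": "21 days of annual leave per year"}
hit = lookup(cache, "what is the leave policy?")   # differently worded, same meaning
miss = lookup(cache, "remote work rules")          # unrelated query
```

The rephrased question shares four of five words with the cached one, so it clears the threshold and reuses the entry; the unrelated query falls through to a miss.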

Coalent sits above retrieval — bring any retriever (vector DB, hybrid search, GraphRAG, tools, APIs). It's the freshness-and-reuse layer, not another retriever.

Install

pip install coalent          # the core has zero required dependencies

Quickstart

Runs as-is — StubSynthesizer needs no API key, so you can feel the loop in ten seconds:

from coalent import SemanticCache, InMemoryRetriever, StubSynthesizer

# 1. Any retriever — a vector DB, a tool, an API. (In-memory here for the demo.)
retriever = InMemoryRetriever()
retriever.add("confluence:hr", "Leave policy: 21 days of annual leave per year.")

# 2. Build the cache. Swap StubSynthesizer for a real LLM below.
cache = SemanticCache(retriever, StubSynthesizer())

# 3. Ask. The first call builds understanding and caches it; the next is a warm hit.
result = cache.get("what is our leave policy?")
print(result.context["understanding"])
print(result.cache_hit)        # False (cold) -> True on the next call

# 4. A source changed? Only the units that used it go stale — surgically.
cache.source_changed("confluence:hr", text="Leave policy: now 25 days.")
# the next matching read rebuilds just that one unit, lazily

Wire in a real model — any text-in / text-out LLM works:

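# Needs the OpenAI extra: pip install "coalent[openai]"
from coalent import SemanticCache, LLMSynthesizer, OpenAIProvider

# `retriever` is the one built in the Quickstart above.
cache = SemanticCache(retriever, LLMSynthesizer(OpenAIProvider(), model="gpt-4o-mini"))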

How it works

        query ──► embed ──► semantic cache
                               │  hit & fresh?  ──► serve cached understanding  (no retrieval, no LLM)
                               │  miss / stale? ─┐
                               ▼                 ▼
                          your Retriever ──► your Synthesizer ──► Cognition unit
                          (vector/tool/API)   (LLM or passthrough)  { understanding
                               ▲                                      + raw evidence
                               │                                      + provenance }
   source changed ────────────┘   dirties ONLY the units that used that source

  1. Embed the query and look for an existing unit with similar meaning.
  2. Hit + fresh → return the cached understanding (no retrieval, no LLM call).
  3. Miss or stale → retrieve, synthesize understanding, retain the raw evidence, record provenance (the exact sources used), and cache it.
  4. A source changes → source_changed(id) marks only the units whose provenance includes that id; they rebuild lazily on the next read.
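
The four steps can be sketched as a toy, dependency-free cache. This is an illustration under stated assumptions, not Coalent's code: `ToyCache`, `Unit`, and `keywords` are hypothetical names, an exact-string key stands in for the embedding lookup, keyword overlap stands in for retrieval, and "synthesis" is just concatenating the evidence.

```python
from dataclasses import dataclass


def keywords(text: str) -> set[str]:
    """Crude keyword extraction (assumption): lowercase words longer than 3 letters."""
    cleaned = (w.strip("?.:,") for w in text.lower().split())
    return {w for w in cleaned if len(w) > 3}


@dataclass
class Unit:
    understanding: str
    evidence: list[str]
    provenance: set[str]      # ids of the sources this unit actually used
    stale: bool = False


class ToyCache:
    def __init__(self, sources: dict[str, str]):
        self.sources = dict(sources)
        self.units: dict[str, Unit] = {}
        self.builds = 0       # how many times retrieve + synthesize ran

    def get(self, query: str) -> tuple[Unit, bool]:
        unit = self.units.get(query)                  # step 1 (exact match, for brevity)
        if unit is not None and not unit.stale:
            return unit, True                         # step 2: hit and fresh
        # step 3: retrieve, "synthesize", retain evidence, record provenance
        used = {sid: text for sid, text in self.sources.items()
                if keywords(query) & keywords(text)}
        self.builds += 1
        unit = Unit(" ".join(used.values()), list(used.values()), set(used))
        self.units[query] = unit
        return unit, False

    def source_changed(self, source_id: str, text: str) -> None:
        # step 4: mark only the units whose provenance includes this source
        self.sources[source_id] = text
        for unit in self.units.values():
            if source_id in unit.provenance:
                unit.stale = True


cache = ToyCache({
    "confluence:hr": "Leave policy: 21 days of annual leave per year.",
    "wiki:remote": "Remote work is allowed two days a week.",
})
_, first = cache.get("what is our leave policy?")      # cold
_, second = cache.get("what is our leave policy?")     # warm hit
cache.get("remote work rules")                         # a second, unrelated unit
cache.source_changed("confluence:hr", "Leave policy: now 25 days.")
_, untouched = cache.get("remote work rules")          # still a warm hit
rebuilt, fourth = cache.get("what is our leave policy?")  # rebuilt lazily on this read
```

Only the leave-policy unit is rebuilt after the change; the remote-work unit never used `confluence:hr`, so it stays warm.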

Unchanged content is skipped via a content-hash compare, so a no-op change costs nothing.
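
A content-hash compare of this kind might look like the sketch below. The README does not say which hash Coalent uses, so SHA-256 and the `SourceTracker` class are assumptions made for illustration.

```python
import hashlib


class SourceTracker:
    """Remembers a digest per source so identical re-submissions can be ignored."""

    def __init__(self) -> None:
        self._digests: dict[str, str] = {}

    def changed(self, source_id: str, text: str) -> bool:
        """Return True only if the text differs from what was last seen for this source."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self._digests.get(source_id) == digest:
            return False          # no-op change: nothing to invalidate
        self._digests[source_id] = digest
        return True


tracker = SourceTracker()
first_seen = tracker.changed("confluence:hr", "Leave policy: 21 days.")   # new source
resubmitted = tracker.changed("confluence:hr", "Leave policy: 21 days.")  # same content
edited = tracker.changed("confluence:hr", "Leave policy: now 25 days.")   # real change
```

Only a real change would need to reach the invalidation step; a webhook that re-sends identical content is dropped at the hash check.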

Bring your own stack

Coalent owns a tiny contract and passes everything else through to your tools.

Retrievers — a ladder from one-liner to full control:

You have…                      Use
Qdrant / Chroma / pgvector     a shipped adapter (bring-your-own-client)
another vector DB              extend BaseVectorRetriever
an existing search function    FunctionRetriever
several sources to fuse        CompositeRetriever
anything else                  implement Retriever (one method)

from coalent import QdrantRetriever

retriever = QdrantRetriever(client=my_client, collection="docs", embed=my_embed)
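
For the lower rungs of the ladder, the sketch below shows what a one-method retriever contract, a function wrapper, and a fusing wrapper could look like. This page does not show the real method name or return type, so `RetrieverContract`, `retrieve`, `Evidence`, `SearchFunctionRetriever`, and `FusedRetriever` are all hypothetical stand-ins; check the docs for the actual `Retriever`, `FunctionRetriever`, and `CompositeRetriever` signatures.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class Evidence:
    source_id: str   # the id that provenance would later be keyed on
    text: str


class RetrieverContract(Protocol):
    """Hypothetical one-method contract: query in, evidence with source ids out."""

    def retrieve(self, query: str) -> list[Evidence]: ...


class SearchFunctionRetriever:
    """Wraps an existing search function that returns (source_id, text) pairs."""

    def __init__(self, fn: Callable[[str], list[tuple[str, str]]]) -> None:
        self._fn = fn

    def retrieve(self, query: str) -> list[Evidence]:
        return [Evidence(sid, text) for sid, text in self._fn(query)]


class FusedRetriever:
    """Queries several retrievers in order and keeps the first result per source id."""

    def __init__(self, *retrievers: RetrieverContract) -> None:
        self._retrievers = retrievers

    def retrieve(self, query: str) -> list[Evidence]:
        seen: dict[str, Evidence] = {}
        for retriever in self._retrievers:
            for ev in retriever.retrieve(query):
                seen.setdefault(ev.source_id, ev)
        return list(seen.values())


def wiki_search(query: str) -> list[tuple[str, str]]:
    return [("wiki:leave", "21 days of annual leave.")]


def hr_search(query: str) -> list[tuple[str, str]]:
    return [("wiki:leave", "21 days of annual leave."),
            ("hr:faq", "Leave resets in January.")]


fused = FusedRetriever(SearchFunctionRetriever(wiki_search),
                       SearchFunctionRetriever(hr_search))
results = fused.retrieve("leave policy")
```

Whatever the real signature is, the part that matters for freshness is that every piece of evidence carries a stable source id.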

Synthesizers — turn evidence into understanding:

  • LLMSynthesizer — structured, citation-grounded understanding via your LLM (OpenAI, Anthropic, or any provider). You own the instruction and fields; Coalent owns the source / strict-JSON / citation envelope, so provenance is captured no matter what you ask for.
  • JSONPassthroughSynthesizer — for already-structured tool/API JSON: caches it as the understanding, no LLM call.
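
One way to picture the strict-JSON / citation envelope: the model's reply is parsed as JSON, and its citations are checked against the sources that were actually supplied before they become provenance. The envelope keys `fields` and `citations` and the `parse_envelope` helper are assumptions for illustration; Coalent's actual wire format is not shown on this page.

```python
import json


def parse_envelope(raw: str, supplied_sources: set[str]) -> tuple[dict, set[str]]:
    """Split a model reply into (your fields, cited source ids), rejecting bad replies."""
    data = json.loads(raw)   # strict JSON: malformed replies raise ValueError
    if not isinstance(data, dict) or "fields" not in data or "citations" not in data:
        raise ValueError("reply is missing the 'fields' / 'citations' envelope")
    cited = set(data["citations"])
    unknown = cited - supplied_sources
    if unknown:
        raise ValueError(f"reply cites sources that were never supplied: {sorted(unknown)}")
    return data["fields"], cited


raw_reply = '{"fields": {"annual_leave_days": 21}, "citations": ["confluence:hr"]}'
fields, provenance = parse_envelope(raw_reply, {"confluence:hr", "wiki:remote"})
```

Because the citation check sits outside the user-defined fields, the instruction can change freely without losing track of which sources a unit depends on.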

Stores — durable and restart-safe (the invalidation graph rebuilds on startup):

from coalent import SemanticCache, SQLiteCognitionStore   # stdlib, no server
from coalent import RedisCognitionStore                   # shared across processes / hosts

# `retriever` and `synthesizer` are whichever ones you built above.
cache = SemanticCache(retriever, synthesizer, store=SQLiteCognitionStore("coalent.db"))
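
To show what "the invalidation graph rebuilds on startup" can mean in practice, here is a stdlib-only sketch. The two-table schema and the `open_store`, `save_unit`, and `rebuild_graph` helpers are assumptions for illustration and say nothing about how `SQLiteCognitionStore` lays out its data.

```python
import os
import sqlite3
import tempfile
from collections import defaultdict

SCHEMA = """
CREATE TABLE IF NOT EXISTS units (
    id TEXT PRIMARY KEY,
    understanding TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS provenance (
    unit_id TEXT NOT NULL,
    source_id TEXT NOT NULL,
    PRIMARY KEY (unit_id, source_id)
);
"""


def open_store(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_unit(conn: sqlite3.Connection, unit_id: str,
              understanding: str, sources: list[str]) -> None:
    with conn:  # one transaction: the unit and its provenance rows commit together
        conn.execute("INSERT OR REPLACE INTO units VALUES (?, ?)",
                     (unit_id, understanding))
        conn.execute("DELETE FROM provenance WHERE unit_id = ?", (unit_id,))
        conn.executemany("INSERT INTO provenance VALUES (?, ?)",
                         [(unit_id, s) for s in sources])


def rebuild_graph(conn: sqlite3.Connection) -> dict[str, set[str]]:
    """Rebuild the in-memory reverse index: source id -> ids of units that used it."""
    graph: dict[str, set[str]] = defaultdict(set)
    for unit_id, source_id in conn.execute("SELECT unit_id, source_id FROM provenance"):
        graph[source_id].add(unit_id)
    return dict(graph)


path = os.path.join(tempfile.mkdtemp(), "toy_cognition.db")
conn = open_store(path)
save_unit(conn, "cog:leave", "21 days of annual leave", ["confluence:hr"])
save_unit(conn, "cog:onboarding", "Leave and remote rules for new hires",
          ["confluence:hr", "wiki:remote"])
conn.close()

conn = open_store(path)          # simulated process restart
graph = rebuild_graph(conn)
conn.close()
```

After the simulated restart, a change to `confluence:hr` can still be mapped to both units without re-running any retrieval, because the provenance rows were persisted next to the units.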

Any agent framework — the read API is a single call, so it drops in anywhere. Shipped helpers for graph nodes and MCP tools:

from coalent import make_cognition_node, build_mcp_tools

node = make_cognition_node(cache)     # a graph node: state -> { context: fresh understanding }
tools = build_mcp_tools(cache)        # expose the cache as an MCP tool

Benchmark

A real-LLM, quality-first benchmark (gpt-4o-mini, graded by an independent gpt-4o judge) on number-dense documents — answering from Coalent's understanding vs the full raw context, with a source change midway:

System                      Accuracy  Stays fresh  Context tokens / read
Full-context RAG            100%      ✓            283
Normal cache (raw chunks)   86%       ✗ stale      283
Coalent                     100%      ✓            96

Coalent matches full-context RAG accuracy (independently graded), never goes stale after a source change (a normal cache does), and sends ~66% fewer context tokens — up to 75% on large documents. Cost optimization without trading away quality. (gpt-4o corroborates within ~3%; full two-model breakdown in the docs.)
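
The ~66% figure follows directly from the table's token counts. The `token_reduction` helper is only a convenience for this worked check:

```python
def token_reduction(baseline_tokens: int, reduced_tokens: int) -> float:
    """Fraction of context tokens saved relative to the baseline."""
    return 1 - reduced_tokens / baseline_tokens


# Per-read context tokens from the benchmark table above.
full_context_tokens = 283
coalent_tokens = 96

reduction = token_reduction(full_context_tokens, coalent_tokens)
print(f"{reduction:.0%} fewer context tokens per read")
```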

CLI

Installing Coalent gives you a coalent command — a redis-cli for your cognition cache (over a SQLite store):

$ coalent ls
STATUS  HITS  AGE SRC  ID                  QUERY
fresh      6   2m   2  cog:c95a9d2897e0af  what is our leave policy?
dirty      1  12m   1  cog:7f1a0b9c3d2e4f  remote work rules

$ coalent show cog:c95a9d2897e0af      # understanding + provenance + raw evidence
$ coalent invalidate confluence:98231  # fire a change event
$ coalent stats

Documentation

📚 Full docs: coalent.ai/docs — concepts, provenance & freshness, retrievers, synthesizers, persistence, worked examples (vector search, MCP & tools, agents), and the complete get() / data-model reference.

Install options

pip install coalent                 # core, zero required deps
pip install "coalent[openai]"       # OpenAI provider      (also: anthropic)
pip install "coalent[qdrant]"       # vector adapters      (also: chroma, pgvector)
pip install "coalent[redis]"        # distributed store
pip install "coalent[dev]"          # tests + lint + types

Contributing

Issues and PRs welcome. Run the gate before pushing:

pip install -e ".[dev]"
pytest && ruff check src && mypy src

Status & license

Alpha — the API may change before 1.0. Fully typed (mypy --strict), linted, and tested.

Licensed under Apache-2.0.

Context that's trustworthy, not just cheap.
