Stateful RAG kernel with corpus nectar — agents know what's in the corpus before any query runs


ragmake


RAG without corpus memory is blind guessing.

Standard retrieval pipelines give an agent whatever chunks scored highest for a query. The agent has no idea what the corpus actually contains — it reasons from fragments and hopes the retriever surfaced the right ones. Every run starts from zero, re-embedding everything, recomputing everything, knowing nothing.

ragmake changes the equation.

It compiles your documents into persistent state and distills a nectar — a corpus-level synthesis that tells the agent what the corpus is, not just what a query happened to surface. Nectar is compiled once and refreshed only when content changes. The agent walks into every conversation already knowing the terrain.

(Diagram: compile plane and serve plane)


The core idea: nectar

In a standard RAG pipeline the agent is reactive. It sees retrieved chunks and has to infer the shape of the corpus from those fragments alone. If retrieval misses something relevant, the agent never knew it existed.

ragmake adds a proactive layer. Before any query runs, the compiler reads the entire corpus and distills it into nectar: the scope, dominant concepts, recurring terminology, and document set — all in one compact, pre-built summary. Nectar is not a retrieval result. It exists independently of any query and persists across runs.
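To make the idea concrete, here is a toy stand-in for corpus-level distillation: a term-frequency pass over all chunks. This is an illustrative sketch, not the actual logic of ragmake's HeuristicSynopsisCompiler; `distill_nectar` is a hypothetical name.

```python
from collections import Counter

def distill_nectar(chunks: list[str], top_terms: int = 8) -> str:
    """Toy corpus-level synthesis: surface the dominant terms across all chunks.

    Illustrative only; not ragmake's actual HeuristicSynopsisCompiler.
    """
    words: Counter[str] = Counter()
    for chunk in chunks:
        # Count longer words only, stripping trailing punctuation.
        words.update(w.lower().strip(".,") for w in chunk.split() if len(w) > 4)
    dominant = [term for term, _ in words.most_common(top_terms)]
    return f"Corpus of {len(chunks)} chunks; dominant terms: {', '.join(dominant)}."
```

The point is that this runs over the whole corpus once, before any query exists, and its output is cacheable.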

At query time ragmake assembles two things for the agent:

  • Nectar: what the corpus contains (its scope, concepts, and shape). Built at compile time, cached, and rebuilt only on corpus change.
  • Evidence: what's specifically relevant to this query. Retrieved from the vector store at query time.

Together they give the agent both the map and the territory. The agent doesn't have to guess what the knowledge base covers — it already knows.


What "stateful" means

ragmake builds persistent state from your documents and reuses it. Nothing is recomputed unless content has actually changed:

  • Document hash check — unchanged documents are skipped entirely
  • Chunk embedding cache — unchanged chunks within a changed document reuse cached vectors
  • Corpus signature — nectar is rebuilt only when the set of document hashes changes

Only changed content costs anything. Everything else is free on re-ingest.
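The change detection above can be sketched with content hashes: a per-document fingerprint plus an order-independent signature over the whole document set. A minimal sketch under assumed semantics; `doc_hash` and `corpus_signature` are hypothetical names, not ragmake's API.

```python
import hashlib

def doc_hash(text: str) -> str:
    # Per-document fingerprint: unchanged text -> unchanged hash -> skip re-embedding.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def corpus_signature(doc_hashes: dict[str, str]) -> str:
    # Order-independent fingerprint of the whole document set; nectar would
    # be rebuilt only when this value changes.
    joined = "\n".join(f"{doc_id}:{h}" for doc_id, h in sorted(doc_hashes.items()))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

Any single changed document flips its own hash and therefore the corpus signature; re-ingesting identical content changes neither.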

Artifact stack

This is the "make" in ragmake. Like a build system, it tracks what changed and only rebuilds those targets. The rest of the state is reused as-is.


Install

Python 3.11+ required.

pip install ragmake

Optional extras:

pip install 'ragmake[openai]'    # OpenAI-compatible embeddings and synopsis
pip install 'ragmake[azure]'     # Azure Blob + Azure AI Search
pip install 'ragmake[postgres]'  # Postgres + pgvector

Or from source:

pip install -e .

Quick start

from pathlib import Path

from stateful_rag import (
    FileArtifactStore,
    HashingEmbedder,
    HeuristicSynopsisCompiler,
    InMemoryVectorStore,
    SourceDocument,
    StatefulRAGCompiler,
    StatefulRAGRuntime,
    WordChunker,
)

# Build the compiler — this is what creates and maintains corpus state
compiler = StatefulRAGCompiler(
    artifact_store=FileArtifactStore(Path(".ragmake_state")),
    vector_store=InMemoryVectorStore(),
    embedder=HashingEmbedder(),
    synopsis_compiler=HeuristicSynopsisCompiler(),
    chunker=WordChunker(max_words=120, overlap_words=20),
)

# Ingest documents — nectar is compiled after this
compiler.ingest_documents([
    SourceDocument(
        corpus_id="support-kb",
        document_id="refund-policy.txt",
        text="Enterprise refunds are allowed within 30 days when the onboarding pack is unused.",
    ),
    SourceDocument(
        corpus_id="support-kb",
        document_id="api-access.txt",
        text="Workspace admins can rotate API keys from the admin console.",
    ),
])

# Build the runtime — this is what the agent uses at query time
runtime = StatefulRAGRuntime(
    compiler=compiler,
    vector_store=compiler.vector_store,
    embedder=compiler.embedder,
)

# The agent receives nectar (what the corpus is) + evidence (what's relevant)
context = runtime.build_context(
    "What do I need for an enterprise refund?",
    corpus_id="support-kb",
)
payload = runtime.render_prompt_payload(context)

# payload["synopsis"] — the nectar: corpus-level memory, query-independent
# payload["sources"]  — the evidence: top matching chunks for this query

Re-ingest the same documents unchanged and the work is free: zero embedder calls, nectar untouched, state fully reused.
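The WordChunker used in the quick start can be pictured as a sliding word window, where each chunk repeats the tail of its predecessor. This sketch is one plausible reading of the `max_words`/`overlap_words` parameters, not ragmake's actual implementation.

```python
def word_chunks(text: str, max_words: int = 120, overlap_words: int = 20) -> list[str]:
    # Sliding window over whitespace-split words; each chunk repeats the last
    # `overlap_words` words of the previous one so context isn't cut mid-thought.
    words = text.split()
    step = max_words - overlap_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks
```

Because chunk boundaries depend only on the text, identical documents produce identical chunks, which is what makes the per-chunk embedding cache effective.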


With a real embedder (OpenAI)

from openai import OpenAI
from stateful_rag.adapters import (
    OpenAICompatibleEmbedder,
    OpenAISynopsisCompiler,
    SQLiteArtifactStore,
    SQLiteVectorStore,
)

client = OpenAI(api_key="...")

compiler = StatefulRAGCompiler(
    artifact_store=SQLiteArtifactStore(".ragmake/state.db"),
    vector_store=SQLiteVectorStore(".ragmake/state.db"),
    embedder=OpenAICompatibleEmbedder(client, model="text-embedding-3-large"),
    synopsis_compiler=OpenAISynopsisCompiler(client, model="gpt-4o-mini"),
)

The OpenAISynopsisCompiler uses a chat completion to write the nectar from representative corpus chunks. The result is stored and reused until the corpus changes.


What the agent prompt payload looks like

{
  "query": "What do I need for an enterprise refund?",
  "corpus_id": "support-kb",
  "content_signature": "a3f9...",
  "synopsis": "Corpus covers enterprise billing, refund eligibility, and API key management...",
  "sources": [
    {
      "document_id": "refund-policy.txt",
      "chunk_id": 0,
      "score": 0.91,
      "text": "Enterprise refunds are allowed within 30 days..."
    }
  ]
}

synopsis is the nectar. It is always present, always current, always independent of the query. The agent can orient itself before reading a single chunk.


CLI

The demo entry point ingests plain UTF-8 text files into a persistent SQLite state directory and renders a prompt payload for a query:

ragmake-demo --query "What changed in the refund policy?" docs/refund.txt docs/api.txt

Options:

  • --corpus — corpus id (default: demo)
  • --state-dir — where the SQLite file lives

The benchmark compares stateful ingest against a stateless rebuild across cold, warm, and changed-document phases:

ragmake-benchmark --documents 200 --change-fraction 0.1
ragmake-benchmark --documents 200 --change-fraction 0.1 --backend sqlite --format json

(Chart: stateful vs. stateless benchmark)


Included implementations

Core interfaces — implement any of these to swap a backend:

  • ArtifactStore — read/write/iterate JSON artifacts
  • Embedder — embed a batch of texts
  • VectorStore — upsert, delete, list, search chunks
  • SynopsisCompiler — compile nectar from corpus chunks

Local defaults (zero extra dependencies):

  • FileArtifactStore — JSON files on disk, easy to inspect
  • InMemoryArtifactStore — tests and ephemeral runs
  • InMemoryVectorStore — tests and ephemeral runs
  • HashingEmbedder — deterministic fallback for local demos and tests
  • HeuristicSynopsisCompiler — LLM-free nectar for local demos and tests
  • SessionStateManager — per-session learned concepts, entities, and intents

Optional adapters:

  • OpenAICompatibleEmbedder — OpenAI or Azure OpenAI embeddings
  • OpenAISynopsisCompiler — LLM-backed nectar via chat completions
  • SQLiteArtifactStore / SQLiteVectorStore — durable local backend, one file
  • AzureBlobArtifactStore / AzureAISearchVectorStore — Azure cloud backend
  • PostgresArtifactStore / PgVectorStore — Postgres + pgvector backend

Session memory

ragmake can optionally carry per-session state into the prompt payload — learned concepts, discussed entities, recent intents, and recent decisions. This lets the agent layer short-term conversational memory on top of the long-term corpus memory (nectar).

from stateful_rag import SessionStateManager

session_manager = SessionStateManager(artifact_store)
runtime = StatefulRAGRuntime(..., session_manager=session_manager)

session_manager.record_learning(
    "customer-42",
    summary="User is handling enterprise billing questions.",
    concepts=["enterprise refunds", "invoice workflow"],
    entities=["Acme Corp"],
)

context = runtime.build_context(
    "What do I need for an enterprise refund?",
    corpus_id="support-kb",
    session_id="customer-42",
)
# payload["session"] now carries learned_concepts, discussed_entities,
# recent_intents, recent_decisions

Current limits

  • WordChunker is word-based, not tokenizer-aware.
  • HashingEmbedder and HeuristicSynopsisCompiler are for tests and local demos, not production.
  • Omitting a document from a later ingest does not delete it; the compiler only updates documents you pass in.
  • SQLiteVectorStore uses Python-side cosine search — suitable for local and moderate corpus sizes.
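Python-side cosine search, as in the last point, amounts to brute-force scoring of every stored vector. A minimal sketch of the idea (illustrative; not SQLiteVectorStore's actual code), which makes the O(n·d)-per-query cost visible:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query_vec: list[float], store: dict[str, list[float]], k: int = 3):
    # Brute-force scan: score every stored chunk, return the top-k.
    # O(n * d) per query; fine locally, not for large corpora.
    scored = sorted(store.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return scored[:k]
```

For larger corpora, the Azure AI Search or pgvector adapters push this scoring into the backend instead.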

Reference

Full architecture, artifact model, adapter notes, a persistent local walkthrough, and benchmark behavior are covered in the project's reference documentation.


License

MIT — see LICENSE.
