Skip to main content

Pipeline instrumentation SDK for ragradar observability system

Project description

ragradar-capture

Developer-side SDK that instruments a RAG or agentic pipeline at key stages and writes structured run records to a local SQLite store (~/.ragradar/runs.db). Capturing is the verb — the stored data are runs, addressed as sNrN ids (e.g. s2r3).

pip install ragradar-capture

Stdlib-only at runtime (plus the tiny ragradar-core kernel it shares with the other ragradar packages).

The one-liner

import ragradar_capture

run_id = ragradar_capture.capture("what is RRF?", "RRF fuses rankings.")
print(run_id)   # "s1r1"

Two fields are enough. Everything else is optional keyword-only: chunks, final_prompt, token_budget, history_pre, history_post, eviction_reason, cache_events, tool_calls, model, token_usage, pipeline. A misspelled keyword fails immediately with TypeError — the signature is explicit, not **kwargs.

The staged pattern

start() returns a Capture — the action object for one pipeline run. Every argument below takes plain Python — dicts, tuples, a bare int — never a schema type you have to import:

import ragradar_capture

cap = ragradar_capture.start(query="what is RRF?", pipeline="my_project")

cap.chunks([                                    # retrieval stage — only
    {"content": "RRF combines rankings from multiple retrievers.",   # "content" is required
     "retrieval_score": 0.9, "rerank_score": 0.95},
])
cap.context(                                    # assembly stage — a bare
    "System: answer using context.\nContext: ...\nQuery: what is RRF?",
    4096,                                        # int = total token limit; headroom is derived
)
cap.history(                                    # history management stage
    pre=[{"user": "hello"}],
    post=[{"user": "hello"}],
    eviction_reason="token_budget",
)
cap.cache({"c1": True})                            # any time pre-commit
cap.tool_call({"tool_name": "rerank",              # appends, once per call
               "arguments": {"chunk_ids": ["c1"]}})
run_id = cap.response(                          # LLM output stage — auto-commits
    "RRF merges ranked lists into one.",
    token_usage={"input_tokens": 300, "output_tokens": 40},  # total derived
)
print(run_id)

The schema dataclasses (ChunkRecord, TokenBudget, Turn, CacheEvent, TokenUsage, ToolCallRecord — tables below) still exist for callers who want static typing or already have data in that shape; pass one in anywhere a dict is shown above and it's used as-is.

cap.response() auto-commits and returns the run id; cap.commit() is only needed if you skip response(). Commit is idempotent — a second call returns the same id without writing again.

The returned run id

Every committed capture hands back the run's sNrN id — also available as cap.run_id (which is None before commit). Feed it straight to the other tools:

ragradar explain s2r3          # analyze the run
ragradar-evaluate run s2r3     # score it

If an internal failure was swallowed (see below), the id is None instead — check for that before passing it on.

Thread-local proxies

After ragradar_capture.start(), module-level functions (ragradar_capture.chunks(), .context(), .history(), .cache(), .tool_call(), .response(), .commit()) route to the same capture from anywhere on that thread — no need to pass the Capture object through every function signature. With no active capture they log an error and no-op.

Dataclasses

These are the advanced/typed path — every capture call above accepts plain dicts (with sensible defaults filled in — e.g. chunks only needs content) or shorthand forms ({"user": "..."} turns, a bare int budget), not these types directly. Reach for them only if you want static typing or are round-tripping data already in this shape. All of them tolerate unknown keyword arguments (silently dropped), so adding fields later never breaks existing instrumentation.

RunRecord

Field Type Notes
query str required
response str required
chunks list[ChunkRecord] optional
final_prompt str optional
token_budget TokenBudget optional
history_pre / history_post list[Turn] optional
eviction_reason str optional
cache_events list[CacheEvent] optional
tool_calls list[ToolCallRecord] optional
model str optional
token_usage TokenUsage optional

ChunkRecord

Field Type Notes
chunk_id str required
source_doc_id str required
content str required
token_count int required
retrieval_score float optional
rerank_score float optional
retrieval_path str optional — e.g. "bm25", "ann", "hybrid"
truncated bool default False
cache_hit bool optional

TokenBudget

Field Type
total_limit int
chunks_allocated int
history_allocated int
system_allocated int
headroom int

TokenUsage

Field Type
input_tokens int
output_tokens int
total_tokens int

Turn

Field Type Notes
role str e.g. "user", "assistant"
content str
tokens int optional

CacheEvent

Field Type Notes
chunk_id str
hit bool
cache_source str optional — e.g. "disk"

ToolCallRecord

Field Type Notes
tool_name str
arguments dict
result str optional
error str optional
latency_ms float optional

The never-raise philosophy (and strict mode)

In production, instrumentation must never take down the pipeline it observes. Every capture call is wrapped internally: conversion errors, store failures — all swallowed, logged to ~/.ragradar/errors.log, and the call returns (None where a run id was expected). Your pipeline never sees an exception from ragradar-capture.

During development you usually want the opposite. Strict mode makes those same errors raise:

import ragradar_capture

ragradar_capture.set_strict(True)          # in code
# or, without a code change:
# RAGRADAR_CAPTURE_STRICT=1 python my_pipeline.py

The only error that raises regardless of mode is a misspelled keyword to capture() — that's a bug at the call site, caught by Python itself as TypeError.

Scaffold CLI

ragradar-capture init

Generates a starter ctx_pipeline.py in the current directory with capture calls pre-positioned at the right pipeline stages (refuses to overwrite an existing one).

Store

First capture creates ~/.ragradar/runs.db (schema managed by ragradar-core; migrations are automatic). Sessions group runs automatically on a 30-minute idle gap — no developer action needed. Browse with ragradar, score with ragradar-evaluate.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ragradar_capture-0.1.0.tar.gz (11.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ragradar_capture-0.1.0-py3-none-any.whl (10.3 kB view details)

Uploaded Python 3

File details

Details for the file ragradar_capture-0.1.0.tar.gz.

File metadata

  • Download URL: ragradar_capture-0.1.0.tar.gz
  • Upload date:
  • Size: 11.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for ragradar_capture-0.1.0.tar.gz
Algorithm Hash digest
SHA256 4ce40558f55bf0b029a8537f0628c77fec1573ba6f9990b6ed4a1e21c27c18c7
MD5 fc86abb554fd6dac23be6e8e3b4aeccb
BLAKE2b-256 ab51a0e18307df8660a007a106a4fb3a3b13d5b2482b0970a16396e87f1239e4

See more details on using hashes here.

File details

Details for the file ragradar_capture-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for ragradar_capture-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b640c968f54422b8599831df71c38643be7ce4478aa150efd95d6930cb0b4bd1
MD5 48f30530ffc9d4ea5341aceb0e51d5b1
BLAKE2b-256 5e17e9643929ba7d2742a34b1fc590022a16731a494149eff589d3fbd90de2f5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page