Zero-config runtime observability for LangGraph agents: checkpoint, prompt-supply-chain, tool, memory, PII, goal-hijack, inter-agent, and SQL-injection detectors emitted as structured events.
langgraph-lens
Zero-config runtime observability for LangGraph agents, with opt-in interventions for teams that need to block, redact, or rate-limit.
Try it in 30 seconds
A checked-in synthetic CVE-2026-34070 canary lives at demo/malicious-prompt/. From a clean Python 3.10+ environment:
git clone https://github.com/glenfmessenger/langgraph-lens
cd langgraph-lens
pip install .
langgraph-lens scan-prompt demo/malicious-prompt/
You'll get a supply_chain/jinja_ssti detection at severity critical and a non-zero exit code. Re-run against any normal prompt directory and the same command exits cleanly. See demo/README.md for the full canary write-up.
What it does
langgraph-lens runs as a callback handler inside the LangGraph runtime. The primary path is a BaseCallbackHandler subclass registered globally via LANGGRAPH_LENS=1; the fallback path is a manual Lens instance attached to a specific compiled graph via graph.with_config({"callbacks": [LensCallback(lens)]}).
There are two tiers:
- Tier 1 (observability) is on by default. Detectors inspect every node entry and exit, every checkpoint write and read, every tool call, every memory write, and every prompt load, and emit structured events. They never modify the state, the message list, or the tool call.
- Tier 2 (interventions) is off by default. Each intervention has its own enabled: false flag. When enabled, an intervention may block a node, rewrite its state (PII redaction), throttle tool calls, refuse to deserialise a checkpoint, or attach X-Lens-Triggered headers to the response.
LANGGRAPH_LENS=1 with no config gets you Tier 1 only. Tier 2 requires an explicit YAML opt-in per feature. Nothing is suppressed without you asking for it.
What you actually get with LANGGRAPH_LENS=1 today
The LangChain callback handler only fires on chain entry / exit, tool calls, and LLM calls. Checkpoints, memory-store writes, and prompt loads happen outside that surface. Since v0.3.0, the global install also patches BaseCheckpointSaver and BaseStore subclasses so the integration gap is closed for the common cases.
| Detector | Fires automatically? | Trigger |
|---|---|---|
| PII (node ingress + final egress) | ✅ Yes | LensCallback.on_chain_start / on_chain_end |
| Goal hijack (system-prompt drift, tool-call drift) | ✅ Yes | on_chain_start |
| Tool misuse (shell metachar, SSRF, allowlist, enumeration) | ✅ Yes | on_tool_start |
| Supply chain — rendered prompt at LLM call | ✅ Yes (new in v0.3) | on_llm_start — scans the prompt text the LLM actually sees |
| Supply chain — static prompt files at load time | ⚠️ Manual | lens.scan_prompt(path) or langgraph-lens scan-prompt … |
| Checkpoint anomaly + SQL injection in metadata | ✅ Yes (new in v0.3) | Auto-patched BaseCheckpointSaver.put/aput/get_tuple/aget_tuple. Set LANGGRAPH_LENS_AUTO_PROTECT=0 to opt out. |
| Memory poisoning | ✅ Yes (new in v0.3) | Auto-patched BaseStore.put/aput. Same opt-out env var. |
| Comms (recursion_exceeded, oversized_state_growth) | ✅ Yes | on_chain_start |
| Comms (undeclared_edge, send_to_undeclared_target) | ⚠️ Manual | lens.attach_graph(app) once, then automatic |
| Attack surface (boot scan) | ✅ Yes (new in v0.3) | Fires once on the first node inspection per process with auto-detected env hints |
Honest about the limit: the checkpoint dict that put() receives is not yet serialised, so the byte-level unsafe_pickle_opcode rule does not fire on the write path. It only fires when you hand the lens raw bytes (langgraph-lens scan-checkpoint thread.jsonl) or when a saver's serde returns bytes the lens can inspect. The dict-level rules (schema_drift, missing_thread_id, oversized_blob, SQL injection in metadata.thread_id / checkpoint_ns / checkpoint_id) DO fire on every write.
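To make the byte-level rule concrete, here is a minimal sketch of an opcode scan built on the standard library's pickletools. It is not the lens's implementation: the function name and the opcode set are assumptions for illustration. REDUCE, GLOBAL, and BUILD are the opcodes this README names; STACK_GLOBAL is added here because pickle protocol 4 and later emit it in place of GLOBAL.

```python
import pickle
import pickletools

# Opcode names treated as unsafe in this sketch (assumed set, see lead-in).
UNSAFE_OPCODES = {"REDUCE", "GLOBAL", "STACK_GLOBAL", "BUILD"}


def find_unsafe_opcodes(blob: bytes) -> list[str]:
    """Return the sorted unsafe opcode names found in a pickle byte stream.

    pickletools.genops only parses the stream; it never imports or calls
    anything, so scanning an untrusted blob does not execute it.
    """
    found = set()
    for opcode, _arg, _pos in pickletools.genops(blob):
        if opcode.name in UNSAFE_OPCODES:
            found.add(opcode.name)
    return sorted(found)


# A plain dict pickles without any of the flagged opcodes.
safe_blob = pickle.dumps({"thread_id": "abc-123"})
# Hand-written protocol-0 stream that would call os.system("id") if unpickled.
tainted_blob = b"cos\nsystem\n(S'id'\ntR."

print(find_unsafe_opcodes(safe_blob))     # []
print(find_unsafe_opcodes(tainted_blob))  # ['GLOBAL', 'REDUCE']
```

A scan like this needs raw bytes, which is why a dict handed to put() before serialisation gives it nothing to inspect. genops raises ValueError on a stream that is not a pickle, so a real caller would have to decide how to treat that case.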
Why
February 2026 LangGraph checkpoint RCEs. On 25 February 2026, CVE-2026-27794 was disclosed — a remote code execution vulnerability in the LangGraph checkpoint caching layer caused by unsafe pickle fallback in JsonPlusSerializer. A follow-up issue (CVE-2026-28277) exposed unsafe msgpack deserialization in checkpoint loading. Any operator using persistent checkpoints (Postgres, SQLite, Redis, etc.) who allowed untrusted or multi-tenant thread resumption was affected. langgraph-lens detects and logs unsafe pickle opcodes and unknown serializer kinds in every checkpoint it sees, before the runtime hands them to the deserialiser.
Supply-chain risk in shared prompt registries — CVE-2026-34070. LangChain Hub and self-hosted prompt registries distribute Jinja2 chat templates as opaque text. CVE-2026-34070 (March 2026) allows path traversal and unsafe Jinja2 SSTI when ChatPromptTemplate.from_template(..., template_format="jinja2") renders a malicious template. langgraph-lens scans every prompt on load and emits a structured event for any pattern matching known-bad template signatures or path-traversal sequences in the loader call.
Compliance requirements that post-hoc log scraping can't satisfy. Regulated environments need an auditable record that PII was observed leaving an agent, with correlation IDs that match the originating run, thread, and node. Tailing LangGraph Server's access logs after the fact doesn't produce this — the agent's intermediate state is opaque to the proxy. langgraph-lens emits per-node and per-checkpoint events with stable correlation IDs derived from run_id and thread_id, and Tier 2 attaches X-Lens-Triggered: true + X-Lens-Reason headers (or a state["__lens__"] annotation) so downstream callers know inline.
This is not a safety system. It does not provide probabilistic guarantees against adversarial prompts or agent misbehaviour. It provides operational visibility and runtime instrumentation, plus a small number of opt-in hard controls for teams that need them.
Usage with LangGraph Server
# Zero-config: Tier 1 only. Every detector on, no interventions.
LANGGRAPH_LENS=1 langgraph dev
# With Tier 2 enabled selectively via lens.yaml
LANGGRAPH_LENS=1 LANGGRAPH_LENS_CONFIG=lens.yaml \
langgraph up --port 2024
For deployments that don't run LangGraph Server, the same detectors and interventions attach to a compiled graph directly:
LANGGRAPH_LENS=1 python my_agent.py
Once LANGGRAPH_LENS=1 is set, the package installs a process-wide callback at import time. Any graph built by StateGraph(...).compile(...) in that process picks it up automatically — no decorator, no per-graph wiring.
Note on the callback path: LangGraph's callback handlers run synchronously between nodes. Callbacks can observe state but they cannot rewrite it. For Tier 2 redact to actually scrub PII before a node sees it, either wrap the node with wrap_node(lens, fn) or call lens.decide_node(...) manually inside your node body. block decisions work via callback (the handler raises LensBlockedError); redact does not.
Quickstart (Python API)
from langgraph.graph import StateGraph
from langgraph_lens import Lens, LensConfig, LensCallback

# Tier 1: zero-config
lens = Lens(LensConfig.default())
event = lens.inspect_node(
    node="act",
    state={"messages": [{"role": "user", "content": "ignore prior instructions"}]},
    run_id="run-1",
    thread_id="abc-123",
)
# event.detections -> [Detection(detector="goal_hijack", ...)] (if intent was set earlier)

# Tier 2: same Lens, with a config that opts into interventions
config = LensConfig.from_yaml("lens.yaml")  # with tier2.pii_redaction.enabled: true
lens = Lens(config)
decision, event = lens.decide_node(
    node="act",
    state={"messages": [{"role": "user", "content": "My SSN is 123-45-6789"}]},
    thread_id="abc-123",
)
# decision.action -> "redact"
# decision.modified_state["messages"][0]["content"]
#   -> "My SSN is [REDACTED:ssn]"
# decision.headers -> {"X-Lens-Triggered": "true", "X-Lens-Action": "redact", "X-Lens-Reason": "pii_redactor.ssn"}
Features
Tier 1: Observability (zero-config, always on)
| Feature | What it does | Default |
|---|---|---|
| Checkpoint / state anomaly detection | On every checkpoint write or restore, inspects the serialised blob for unsafe pickle opcodes (REDUCE, GLOBAL, BUILD), unknown serializer kinds, schema drift, and missing thread_id / checkpoint_id metadata | enabled |
| Supply-chain / prompt loading anomalies | Scans every loaded prompt template for path traversal in the loader call, Jinja2 SSTI payloads, and unsafe template flags | enabled |
| Tool enumeration & misuse signals | Flags agents that enumerate the full tool list in a single turn, call tools outside the declared bind_tools(...) allow-list, or pass tool arguments matching shell-metacharacter / SSRF patterns | enabled |
| Memory / context poisoning detection | Flags memory entries that look like system-prompt overrides, entries that exceed a size threshold and would dominate retrievals, and writes to keys the current agent shouldn't own | enabled |
| PII / sensitive data in checkpoints or messages | Real-time regex scan on node ingress, node egress, and checkpoint blobs: SSN, credit cards, emails, phone numbers, IP addresses, custom patterns | enabled |
| Agent goal hijack signals | Compares the current node's effective system prompt and pending tool calls against the originating user message; flags drift | enabled |
| Inter-agent / graph communication anomalies | Flags graph traversals that exceed recursion_limit, edges traversed that aren't in the declared topology, and Send(...) payloads to undeclared subgraphs | enabled |
| SQL / metadata injection in checkpoint backends | Scans thread_id, checkpoint_ns, and any user-controllable filter strings for SQL-injection signatures | enabled |
| Structured security events | Every detection is a JSON event with correlation_id, run_id, thread_id, node, timestamp, state hash, and reason | enabled |
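As an illustration of what a signature scan over checkpoint metadata can look like, here is a small sketch. The rule names union_select, comment_terminator, and stacked_query are taken from this README; the regular expressions and the scan_metadata helper are assumptions made for the example and are not the patterns the lens ships.

```python
import re

# Hypothetical signatures keyed by rule name (the regexes are assumptions).
SQLI_RULES = {
    "union_select": re.compile(r"\bunion\b\s+(?:all\s+)?select\b", re.IGNORECASE),
    "comment_terminator": re.compile(r"--|/\*"),
    "stacked_query": re.compile(
        r";\s*(?:select|insert|update|delete|drop|alter)\b", re.IGNORECASE
    ),
}

# Metadata fields the README lists as scanned.
SCANNED_FIELDS = ("thread_id", "checkpoint_ns", "checkpoint_id")


def scan_metadata(metadata: dict[str, str]) -> list[tuple[str, str]]:
    """Return (field, rule) pairs for every signature that matches."""
    hits = []
    for field in SCANNED_FIELDS:
        value = metadata.get(field, "")
        for rule, pattern in SQLI_RULES.items():
            if pattern.search(value):
                hits.append((field, rule))
    return hits


print(scan_metadata({"thread_id": "abc-123"}))
print(scan_metadata({"thread_id": "x' UNION SELECT secret FROM users --"}))
```

Signature matching of this kind is cheap enough to run on every write, but it is a detector and not a substitute for parameterised queries in the saver itself.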
Tier 2: Interventions (off by default, opt-in per feature)
| Feature | What it does | Default |
|---|---|---|
| Hard PII redaction | Replaces matched PII in the state's message list and string fields with [REDACTED:<type>] before forwarding to the next node. Mode: redact or block. | disabled |
| Tool allow-list / misuse defense | Per-graph allow-list of permitted tools + hard block on Tier 1 shell_metachar / ssrf_pattern / oversized_args matches. Mode: block (raises LensBlockedError) or log. | disabled |
| Checkpoint integrity protection | Refuses to load a checkpoint blob containing unsafe pickle opcodes. Optionally HMAC-signs blobs on write and verifies on read. Mode: enforce (raises) or log. | disabled |
| Agent goal / prompt guard | Turns Tier 1 system_prompt_drift / tool_call_drift detections into a terminal block. Mode: block or log. | disabled |
| Rate limiting on tool calls | Token bucket keyed per tenant, thread, and/or tool, with args-size-aware cost. Mode: throttle (returns retry_after) or block (returns 429-equivalent). | disabled |
| Circuit breaker for cascading failures | Auto-opens on upstream error rate; optionally opens preemptively when an attack is in progress. | disabled |
| Audit-proof signaling | Stamps X-Lens-Triggered, X-Lens-Reason, X-Lens-Action headers on every Tier 2 decision, and optionally writes the same fields into state["__lens__"] for downstream nodes. | disabled |
Every Tier 2 block in the YAML carries its own enabled flag. Turning on one does not turn on any other. Run any new intervention in log / throttle mode against production traffic before flipping to block / enforce.
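The checkpoint integrity row mentions HMAC-signing blobs on write and verifying them on read. The sketch below shows that general technique with the standard library only. The function names, the choice of SHA-256, and the tag-prefix layout are assumptions for illustration; the README does not document how the lens stores its signature.

```python
import hashlib
import hmac

TAG_BYTES = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_blob(blob: bytes, key: bytes) -> bytes:
    """Prefix the blob with an HMAC-SHA256 tag computed over its contents."""
    tag = hmac.new(key, blob, hashlib.sha256).digest()
    return tag + blob


def verify_blob(signed: bytes, key: bytes) -> bytes:
    """Return the original blob, or raise ValueError if the tag does not match."""
    tag, blob = signed[:TAG_BYTES], signed[TAG_BYTES:]
    expected = hmac.new(key, blob, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("checkpoint HMAC mismatch")
    return blob


key = b"example-signing-key"  # placeholder; load a real key from a secret store
signed = sign_blob(b'{"v": 1}', key)
print(verify_blob(signed, key))
```

Verification has to happen before the blob reaches any deserialiser, otherwise the signature protects nothing.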
When events fire
Every detector emits a JSON event when it matches. Events go to the configured destination (stderr by default) and to Prometheus counters. The shape is stable across detectors:
{"event": "node_inspected", "run_id": "run-1", "thread_id": "abc-123", "node": "act", "correlation_id": "8f3a...", "state_hash": "sha256:9b1d...", "detections": [{"detector": "goal_hijack", "rule": "system_prompt_drift", "severity": "high"}], "timestamp": 1769420401.3}
{"event": "checkpoint_inspected", "run_id": "run-1", "thread_id": "abc-123", "checkpoint_id": "01J9...", "correlation_id": "8f3a...", "detections": [{"detector": "checkpoint", "rule": "unsafe_pickle_opcode", "opcode": "REDUCE", "severity": "critical"}], "timestamp": 1769420402.1}
{"event": "tool_call_inspected", "run_id": "run-1", "thread_id": "abc-123", "tool": "shell", "correlation_id": "8f3a...", "detections": [{"detector": "tool", "rule": "shell_metachar", "match": "; rm -rf", "severity": "high"}], "timestamp": 1769420402.4}
{"event": "attack_surface_scan", "correlation_id": "boot-1769420400", "detections": [{"detector": "attack_surface", "rule": "pickle_checkpoint_backend", "saver": "PostgresSaver", "severity": "high"}], "timestamp": 1769420400.0}
{"event": "prompt_scan", "correlation_id": "load-1769420400", "prompt_path": "/prompts/system.jinja2", "detections": [{"detector": "supply_chain", "rule": "jinja_ssti", "file": "system.jinja2", "severity": "critical"}], "timestamp": 1769420400.2}
correlation_id is stable across every event from the same (run_id, thread_id) so the chain can be reconstructed. state_hash is a SHA-256 of the canonicalised state dict at the moment of inspection — useful for deduping retries and for matching against external audit logs without keeping the state contents themselves.
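For readers who want to match lens events against their own audit logs, the following sketch shows one way a canonical state hash and a stable correlation ID can be computed. The canonicalisation used here (JSON with sorted keys and compact separators) and the ID derivation are assumptions; the README only states that the hash is a SHA-256 of the canonicalised state and that the ID is derived from run_id and thread_id.

```python
import hashlib
import json


def state_hash(state: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the state (assumed encoding)."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"), default=str)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def correlation_id(run_id: str, thread_id: str) -> str:
    """Deterministic ID for a (run_id, thread_id) pair (assumed derivation)."""
    digest = hashlib.sha256(f"{run_id}\x00{thread_id}".encode("utf-8")).hexdigest()
    return digest[:16]


a = state_hash({"messages": ["hi"], "step": 1})
b = state_hash({"step": 1, "messages": ["hi"]})
print(a == b)  # True: key order does not change the hash
print(correlation_id("run-1", "abc-123"))
```

Sorting keys before hashing is what makes retries of the same state deduplicate to one hash.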
Inline signaling — Tier 2
When a Tier 2 intervention fires, the lens also signals to the caller inline:
| Action | Behaviour | Headers set on the decision |
|---|---|---|
| allow (Tier 1 detection only) | Pass through | X-Lens-Triggered: true, X-Lens-Reason: <detector>.<rule>,... (if audit_signaling.enabled) |
| redact (PII redactor) | decision.modified_state is the scrubbed state; caller forwards that instead | X-Lens-Triggered: true, X-Lens-Action: redact, X-Lens-Reason: pii_redactor.<type> |
| throttle (rate limiter) | decision.retry_after is set; caller sleeps and retries, or returns it to the user | X-Lens-Triggered: true, X-Lens-Action: throttle, Retry-After: <s> |
| block (allowlist, goal guard, circuit, checkpoint protector, rate limit in block mode) | LensBlockedError raised through the callback; decision.status_code is the HTTP-equivalent | X-Lens-Triggered: true, X-Lens-Action: block, X-Lens-Reason: <rule>, Retry-After: <s> (for rate limit / circuit) |
From a plain Python entry point, the headers live on decision.headers for the caller to use however they want — there is no built-in HTTP middleware in this release, so the caller is responsible for relaying them onto the outgoing response if they want HTTP-level signaling. With audit_signaling.stamp_state: true, the same fields are written into state["__lens__"] so downstream nodes can read them programmatically without HTTP at all.
Limitations
- Callbacks observe, they don't rewrite. Tier 2 redact requires wrap_node(lens, fn) or a manual lens.decide_node(...) call inside the node body; the LensCallback alone can't substitute a modified state.
- Checkpoint protection is structural. It refuses unsafe pickle opcodes and (optionally) HMAC-mismatched blobs. It does not validate the content of an otherwise-well-formed checkpoint against any schema beyond what Tier 1 already inspects.
- Goal-guard is heuristic. The underlying Tier 1 goal-hijack detector compares the originating user message to the current node's effective system prompt; it will produce false positives when an agent legitimately broadens its scope mid-run. The Tier 2 wrapper only blocks on system_prompt_drift and tool_call_drift by default; off_topic_subgoal (medium severity) is intentionally excluded.
- Rate limiting is in-process. The token bucket lives in the lens instance. In a multi-worker LangGraph Server deployment, each worker has its own bucket. For a shared limiter, run the lens behind a single ingress.
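The in-process limitation is easier to see with a token bucket in hand. The class below is a minimal sketch, not the lens's limiter: the class name, the injectable clock, and the return shape are assumptions, while capacity and refill_per_second mirror the keys in lens.yaml.

```python
import time
from typing import Callable


class TokenBucket:
    """Single-process token bucket; every worker would hold its own instance."""

    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def acquire(self, cost: float = 1.0) -> tuple[bool, float]:
        """Return (allowed, retry_after_seconds) for a call with the given cost."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that passed, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True, 0.0
        return False, (cost - self._tokens) / self.refill_per_second


bucket = TokenBucket(capacity=60, refill_per_second=1.0)
print(bucket.acquire())  # (True, 0.0) on a fresh bucket
```

Because the token count lives in instance state, two workers with two buckets admit up to twice the configured rate, which is the behaviour the limitation describes.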
Configuration
YAML config
Tier 1 stays at its defaults if you don't override. Tier 2 stays off if you don't override. The example below shows the shape of every block; see lens.yaml in the repo for the fully-commented version.
# lens.yaml
# Tier 1: observability (defaults shown)
attack_surface: { enabled: true }
checkpoint: { enabled: true, scan_on_write: true, scan_on_read: true }
supply_chain: { enabled: true, scan_on_load: true }
tool: { enabled: true }
memory: { enabled: true }
pii: { enabled: true, scan_ingress: true, scan_egress: true }
goal_hijack: { enabled: true, user_intent_similarity_threshold: 0.35 }
comms: { enabled: true }
sql_injection: { enabled: true }
prometheus: { enabled: true, port: 9092 }
logging: { enabled: true, destination: stderr, format: json }
alerts: { enabled: false, slack_webhook: "" }

# Tier 2: interventions (every block defaults to disabled)
tier2:
  pii_redaction:
    enabled: false
    mode: redact  # redact | block
    patterns:
      - type: ssn
      - type: credit_card
      - type: email
  tool_allowlist:
    enabled: false
    mode: block  # block | log
    allowed_tools: ["search", "calculator"]
    block_on_rules: ["shell_metachar", "ssrf_pattern", "oversized_args"]
  checkpoint_protector:
    enabled: false
    mode: enforce  # enforce | log
    block_on_rules: ["unsafe_pickle_opcode"]
    require_hmac: false
    signing_key: ""
  goal_guard:
    enabled: false
    mode: block  # block | log
    block_on_rules: ["system_prompt_drift", "tool_call_drift"]
  rate_limit:
    enabled: false
    mode: throttle  # throttle | block
    capacity: 60
    refill_per_second: 1.0
    key_by_tenant: true
    key_by_thread: true
    key_by_tool: false
  circuit_breaker:
    enabled: false
    window_seconds: 30
    min_samples: 20
    error_rate_threshold: 0.5
    cooldown_seconds: 30
    fail_closed_on_attack: false
  audit_signaling:
    enabled: false
    stamp_state: false
Inline config
from langgraph_lens.config import (
    LensConfig, Tier2Config,
    PIIRedactionConfig, PIIPattern,
    ToolAllowlistConfig,
    GoalGuardConfig,
)

config = LensConfig(
    tier2=Tier2Config(
        pii_redaction=PIIRedactionConfig(
            enabled=True,
            mode="redact",
            patterns=[PIIPattern(type="ssn"), PIIPattern(type="email")],
        ),
        tool_allowlist=ToolAllowlistConfig(
            enabled=True,
            mode="block",
            allowed_tools=["search", "calculator"],
        ),
        goal_guard=GoalGuardConfig(enabled=True, mode="block"),
    ),
)
One-line launches
# Zero-config Tier 1 only.
LANGGRAPH_LENS=1 langgraph dev
# Tier 2 enabled — every flag stays where you put it in lens.yaml.
LANGGRAPH_LENS=1 LANGGRAPH_LENS_CONFIG=lens.yaml langgraph up --port 2024
# Same lens.yaml for a script-mode agent.
LANGGRAPH_LENS=1 LANGGRAPH_LENS_CONFIG=lens.yaml python my_agent.py
Python — Tier 2 around a compiled graph:
from langgraph.checkpoint.memory import MemorySaver
from langgraph_lens import Lens, LensCallback, LensConfig, wrap_node, LensBlockedError

# `graph` (a StateGraph under construction), `act_node`, and `state` are
# assumed to be defined elsewhere in your application.
lens = Lens(LensConfig.from_yaml("lens.yaml"))
graph.add_node("act", wrap_node(lens, act_node, node="act"))  # for redaction
app = graph.compile(checkpointer=MemorySaver())
try:
    result = app.invoke(
        state,
        config={
            "configurable": {"thread_id": "abc-123"},
            "callbacks": [LensCallback(lens, enforce_blocks=True)],
        },
    )
except LensBlockedError as e:
    print(f"blocked: {e.decision.reason}", e.decision.headers)
PII patterns
Built-in patterns for common PII types. The same set is used by the Tier 1 detector and the Tier 2 redactor.
| Type | Example match |
|---|---|
| ssn | 123-45-6789 |
| credit_card | 4111 1111 1111 1111 (Luhn-validated) |
| phone_us | (555) 867-5309 |
| phone_intl | +44 7911 123456 |
| email | user@example.com |
| ip_address | 192.168.1.1 |
Limitations: detection is regex-based and runs on the decoded state dict, message list, and checkpoint blob (after the lens decodes msgpack/JSON-Plus). Binary tensors and BLOB columns are not scanned. A pattern that straddles a streaming-chunk boundary in astream_events is inspected at the next checkpoint, not per chunk.
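The sketch below illustrates regex-based detection with a Luhn check for card numbers and the [REDACTED:<type>] replacement format shown in this README. The two regular expressions and the helper names are assumptions for the example; the lens's built-in patterns may differ.

```python
import re

# Assumed patterns for illustration only.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_ok(number: str) -> bool:
    """Validate a card number with the Luhn checksum, ignoring separators."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return bool(digits) and total % 10 == 0


def redact(text: str) -> str:
    """Replace SSNs and Luhn-valid card numbers with typed placeholders."""
    text = SSN_RE.sub("[REDACTED:ssn]", text)

    def _card(match: re.Match) -> str:
        candidate = match.group()
        return "[REDACTED:credit_card]" if luhn_ok(candidate) else candidate

    return CARD_RE.sub(_card, text)


print(redact("My SSN is 123-45-6789"))          # My SSN is [REDACTED:ssn]
print(redact("card 4111 1111 1111 1111 ok"))    # card [REDACTED:credit_card] ok
```

The Luhn step is what keeps arbitrary 16-digit identifiers from being reported as card numbers.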
Observability
Prometheus metrics
Scrape at http://localhost:9092/metrics.
Tier 1:
langgraph_lens_attack_surface_detections_total{rule="pickle_checkpoint_backend|..."}
langgraph_lens_checkpoint_detections_total{rule="unsafe_pickle_opcode|schema_drift|..."}
langgraph_lens_supply_chain_detections_total{rule="jinja_ssti|path_traversal|unsafe_chat_template"}
langgraph_lens_tool_detections_total{rule="shell_metachar|enumeration|out_of_allowlist|..."}
langgraph_lens_memory_detections_total{rule="system_prompt_override|oversized_entry|..."}
langgraph_lens_pii_detections_total{type="ssn|email|...",direction="ingress|egress|checkpoint"}
langgraph_lens_goal_hijack_detections_total{rule="system_prompt_drift|tool_call_drift"}
langgraph_lens_comms_detections_total{rule="undeclared_edge|recursion_exceeded|..."}
langgraph_lens_sql_injection_detections_total{rule="union_select|comment_terminator|..."}
langgraph_lens_nodes_inspected_total
langgraph_lens_checkpoints_inspected_total
langgraph_lens_inspection_duration_seconds{stage="node_ingress|node_egress|checkpoint|tool|memory"}
Tier 2 (stays at zero unless an intervention is enabled):
langgraph_lens_tier2_blocked_total{reason="tool_blocked|rate_limited|goal_hijack|checkpoint_rejected|circuit_open|..."}
langgraph_lens_tier2_redacted_total{reason="pii_redactor|..."}
langgraph_lens_tier2_throttled_total{reason="rate_limited"}
langgraph_lens_circuit_state # 0=closed, 1=half_open, 2=open
Multiprocess server: if LangGraph Server forks workers, set PROMETHEUS_MULTIPROC_DIR before starting so metrics from all workers are merged:
mkdir -p /tmp/prometheus_multiproc
export PROMETHEUS_MULTIPROC_DIR=/tmp/prometheus_multiproc
OpenTelemetry
pip install "langgraph-lens[otel]"
otel:
  enabled: true
  endpoint: http://localhost:4318
  service_name: langgraph-agent
  export_traces: true
  export_metrics: true
Each emitted event becomes its own span (node_inspected, checkpoint_inspected, tool_call_inspected, prompt_scan, attack_surface_scan). Detections within an event are attached as span events on that span. The lens's correlation_id, run_id, thread_id, and node are set as span attributes (langgraph.correlation_id, langgraph.run_id, langgraph.thread_id, langgraph.node). OpenTelemetry's trace_id is generated by the SDK independently of the lens's correlation_id — use the attribute to join with the lens's structured-event log.
Verified end-to-end against a mock OTLP HTTP collector: the collector receives the POST, and the protobuf parses to a node_inspected span with langgraph.correlation_id / run_id / thread_id / node attributes set. Reproduce with the script in bench/verify_otel.py.
Slack / webhook alerts
alerts:
  enabled: true
  slack_webhook: https://hooks.slack.com/services/...
  cooldown_seconds: 300
  alert_on:
    - supply_chain
    - attack_surface
    - checkpoint
    - goal_hijack
Alerts default to supply_chain, attack_surface, and checkpoint only. PII and tool detections are intentionally excluded from default alerts because they fire often and create noise — log them, dashboard them, but don't page on them.
Verified end-to-end against a mock incoming webhook: the lens POSTs a Slack-shaped JSON body ({"text": "[langgraph-lens] supply_chain detection — rules: jinja_ssti | correlation_id: ..."}) to the configured URL, and the cooldown logic suppresses repeats within cooldown_seconds. Reproduce with bench/verify_slack.py. Whether Slack accepts the message is between you and your workspace's webhook configuration.
Performance
The lens adds a fixed per-invoke overhead — +0.39 ms for Tier 1 callback, +0.95 ms with all Tier 2 node-path features (Apple M2; +0.73 ms / +2.02 ms on Linux A100, both measured). That cost doesn't grow with how much real work your nodes do, so the percentage impact shrinks as the workload grows. Two measured rows:
| Per-node work | Invoke time | Tier 1 callback | All Tier 2 |
|---|---|---|---|
| Counter bump (synthetic, no-op) | 1.36 ms | +0.39 ms / +21.6% | +0.95 ms / +40.3% |
| 10 ms simulated work per node | 67.19 ms | +1.21 ms / +1.9% | +3.09 ms / +4.4% |
For a workload heavier than 10 ms / node, you can estimate the impact from the fixed cost (the lens adds roughly 0.4–1 ms per invoke regardless of node work). A 100 ms / node workload, typical for a small DB query or embedding call, works out to roughly +0.4% / +1% overhead. A 1 s / node workload, such as a single LLM call, works out to roughly +0.04% / +0.1%. Run bench/bench.py against your own graph if you need exact numbers; the rows above are the only ones with a measured baseline in this repo.
A real LangGraph deployment — anything that calls an LLM — sees the lens overhead disappear into the LLM round-trip. The synthetic worst case (+22%) measures the lens against nodes that do nothing.
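The estimates for heavier workloads are simple ratios of the fixed cost to the invoke time. The helper below reproduces that arithmetic under two assumptions taken from the text above: the fixed cost is 0.4 ms (Tier 1) or 1 ms (all Tier 2), and the invoke time is dominated by a single node's work. It is an estimate, not a measurement.

```python
def overhead_percent(fixed_overhead_ms: float, invoke_ms: float) -> float:
    """Relative overhead of a fixed per-invoke cost, in percent."""
    return 100.0 * fixed_overhead_ms / invoke_ms


for invoke_ms in (100.0, 1000.0):
    tier1 = overhead_percent(0.4, invoke_ms)
    tier2 = overhead_percent(1.0, invoke_ms)
    print(f"{invoke_ms:.0f} ms invoke: +{tier1:.2f}% (Tier 1), +{tier2:.2f}% (all Tier 2)")
```

This prints +0.40% / +1.00% for a 100 ms invoke and +0.04% / +0.10% for a 1 s invoke, matching the figures quoted above.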
Full per-rule numbers, microbenchmarks, and the test-rig spec are in bench/RESULTS.md. Reproduce with:
pip install -e ".[dev]"
python bench/bench.py --markdown
Integrations
When LANGGRAPH_LENS=1 is set, the package patches BaseCheckpointSaver and BaseStore subclasses at import time so existing graph.compile(checkpointer=PostgresSaver(...)) and BaseStore.put(...) calls flow through lens.decide_checkpoint(...) and lens.inspect_memory_write(...) automatically. No source-code changes required.
Usage with real savers
# Zero-config: auto-protected via the global install.
# `graph` is assumed to be a StateGraph you have already built.
import os
os.environ["LANGGRAPH_LENS"] = "1"
import langgraph_lens  # noqa: F401 -- import side-effect installs the patches
from langgraph.checkpoint.postgres import PostgresSaver

# from_conn_string is a context manager in current langgraph-checkpoint-postgres releases.
with PostgresSaver.from_conn_string("postgresql://...") as saver:
    # `saver` is now auto-protected. Every put/aput/get_tuple/aget_tuple
    # call goes through the lens. No further changes needed.
    app = graph.compile(checkpointer=saver)

# Explicit per-instance wrap: useful when you want a specific lens
# bound to a specific saver, or you don't want the global patch.
from langgraph_lens import Lens, LensConfig
from langgraph_lens.integrations import protect_saver
from langgraph.checkpoint.sqlite import SqliteSaver

lens = Lens(LensConfig.from_yaml("lens.yaml"))
# SqliteSaver.from_conn_string is also a context manager that yields the saver.
with SqliteSaver.from_conn_string("checkpoints.db") as base_saver:
    saver = protect_saver(base_saver, lens)
    app = graph.compile(checkpointer=saver)
To turn off auto-protection (e.g. you want manual control or you're seeing a class-patching incompatibility), set LANGGRAPH_LENS_AUTO_PROTECT=0.
Supply chain protection
The supply-chain detector now has two trigger paths:
- Automatic: rendered prompt at LLM call. LensCallback.on_llm_start scans the prompt text the LLM is about to see for Jinja2 SSTI signatures. Catches anything that survived template rendering.
- Manual: static prompt files at load time. lens.scan_prompt("./prompts/") or the CLI langgraph-lens scan-prompt ./prompts/. Catches malicious templates before they hit the runtime; this is the recommended path for prompt-registry intake / CI.
# In your prompt-registry-sync script
from langgraph_lens import Lens, LensConfig
lens = Lens(LensConfig.default())
event = lens.scan_prompt("./prompts/")
critical = [d for d in event.detections if d.severity.value == "critical"]
if critical:
    raise RuntimeError(f"refusing to sync prompts: {critical}")
Memory-store integration
# Zero-config — auto-protected via the global install.
from langgraph.store.memory import InMemoryStore # or any BaseStore subclass
store = InMemoryStore()
store.put(("agent", "memory"), "user_pref", {"text": "..."})
# That call already ran through lens.inspect_memory_write.
# Explicit per-instance wrap.
from langgraph_lens.integrations import protect_store
store = protect_store(InMemoryStore(), lens)
Topology checks for the comms detector
The undeclared_edge and send_to_undeclared_target rules need the static graph topology. One line opts in:
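from langgraph.checkpoint.memory import MemorySaver
from langgraph_lens import Lens, LensCallback, LensConfig

# `graph` is assumed to be a StateGraph you have already built.
lens = Lens(LensConfig.default())
app = graph.compile(checkpointer=MemorySaver())
lens.attach_graph(app)  # extract declared edges from the compiled graph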
After attach_graph, the comms rules fire automatically on every node entry.
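Conceptually, an undeclared-edge check is a set-membership test of each traversed edge against the declared topology. The sketch below shows that idea with edges represented as (source, target) tuples. The representation and the helper name are assumptions; how attach_graph extracts and stores edges internally is not described in this README.

```python
Edge = tuple[str, str]


def undeclared_edges(declared: list[Edge], traversed: list[Edge]) -> list[Edge]:
    """Return traversed edges that are missing from the declared topology."""
    declared_set = set(declared)
    return [edge for edge in traversed if edge not in declared_set]


declared = [("plan", "act"), ("act", "review"), ("review", "plan")]
traversed = [("plan", "act"), ("act", "exfiltrate")]
print(undeclared_edges(declared, traversed))  # [('act', 'exfiltrate')]
```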
Prometheus binding
Defaults to 127.0.0.1:9092. If you need to scrape from another host, set prometheus.bind_address: 0.0.0.0 in lens.yaml and put the port behind a reverse proxy or a network ACL — the exporter has no built-in auth.
CLI
langgraph-lens validate lens.yaml # validate config before deploying
langgraph-lens scan-prompt /path/to/prompts # one-shot supply-chain scan, no runtime needed
langgraph-lens scan-checkpoint thread.jsonl # one-shot checkpoint blob scan
langgraph-lens check # check that the lens is loaded and metrics are up
langgraph-lens version
scan-prompt is the most useful entry point during prompt-registry intake: point it at a freshly pulled prompt directory and get a structured event for anything suspicious before you wire the prompt into a graph.
scan-checkpoint accepts a JSON-lines export of a checkpoint table (one blob per line) and is useful for sweeping a database of existing threads before upgrading to a hardened serializer.
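Because scan-prompt exits non-zero when it finds something, it can gate a CI step or a registry-sync job. The helper below is a minimal sketch of such a gate; the function name is hypothetical, and it relies only on the exit-code behaviour described in this README.

```python
import subprocess
import sys


def scan_passes(command: list[str]) -> bool:
    """Run a scan command and report whether it exited with status 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface the scanner's output so the CI log shows the detection.
        sys.stderr.write(completed.stdout + completed.stderr)
    return completed.returncode == 0


# Example call (not executed here; "./prompts/" is a placeholder path):
#   if not scan_passes(["langgraph-lens", "scan-prompt", "./prompts/"]):
#       sys.exit(1)
```

Passing the command as a list avoids shell interpolation of the prompt path.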
Requirements
- Python ≥ 3.10 (tested locally on 3.13; CI runs 3.10 / 3.11 / 3.12)
- Verified against LangGraph 1.2.x + LangChain Core 1.4.0 on Python 3.10 / 3.11 / 3.12 / 3.13. The pyproject.toml constraint of langgraph>=1.0 reflects the tested range, not a verified compatibility floor; older 1.0 / 1.1 versions may work but are not exercised in CI.
- Optional: langgraph-checkpoint-postgres or langgraph-checkpoint-sqlite if you want the SQL-injection detector wired into the actual saver call. The detector is unit-tested against synthetic metadata; the real-saver path is not tested.
Maintenance and compatibility
This is a v0.3.0 release. The end-to-end paths verified are: the global LANGGRAPH_LENS=1 callback install on LangChain Core 1.4.0, the per-graph LensCallback(lens) attachment, and the wrap_node(lens, fn) redaction helper against a compiled StateGraph + MemorySaver. The Postgres/SQLite/Redis savers, LangGraph Server (langgraph dev, langgraph up), and multi-worker deployments are not exercised in CI or the benchmark.
If you find it works on other versions, PRs and issue reports are welcome. If you find it breaks, open an issue with the LangGraph version and error — but fixes depend on available time.
Development
git clone https://github.com/glenfmessenger/langgraph-lens
cd langgraph-lens
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest tests/ -v
ruff check src/
mypy src/langgraph_lens/
88 pytest cases in total. Coverage is uneven across rules — every detector has at least one positive test, but not every rule within a detector does. The honest breakdown:
- Tier 1 detectors: every detector module has positive tests for its most load-bearing rules. The following rules ship without an explicit positive test, exercised only via the static rule list in the detector code: unsafe_chat_template, unsigned_hub_pull, oversized_blob, unknown_serializer_kind, tool_call_drift, send_to_undeclared_target, oversized_state_growth, and three of the four SQL-injection rules (comment_terminator, stacked_query, metadata_escape). A contribution adding direct tests is welcome.
- Lens orchestrator (tests/test_lens.py, test_config.py): correlation IDs, state hashing, YAML roundtrip, defaults invariant.
- CLI surface (tests/test_cli.py): every subcommand, namely validate, version, scan-prompt (clean directory + the canary), scan-checkpoint (clean + pickle-tainted JSONL), and check (live HTTP stub + metrics-absent + unreachable-port).
- Tier 2 interventions (tests/interventions/): every intervention has positive tests for both modes (block / log or redact / throttle) and the disabled-passthrough case. The PII redactor specifically verifies multi-pattern messages and the deep-copy property (caller's state is not mutated). The checkpoint protector exercises the HMAC sign/verify roundtrip plus the mismatched-HMAC block path.
- Decision composition (tests/test_decide.py): the orchestration path through Lens.decide_node / decide_tool_call / decide_checkpoint, covering short-circuit on block, header merging, audit-headers-absent-when-nothing-fires, wrap_node redaction round-trip (including the context-var thread_id fallback), wrap_node raising LensBlockedError, and the attack-signal feed into the circuit breaker.
- Real-graph end-to-end: bench/bench.py builds an actual compiled StateGraph with MemorySaver and exercises the callback path, the direct inspect_node path, and the wrap_node redaction path for every Tier 2 feature. A full pass runs ~16 k app.invoke(...) calls (2000 timed + 200 warmup per synthetic row across 7 rows, plus 200 + 200 per realistic row across 3 rows).
- Real-LLM end-to-end: examples/with_real_llm.py verified against live OpenAI gpt-4o-mini. A user message containing an SSN reaches the wrapped chat node as [REDACTED:ssn], and the model's response confirms it could not see or echo the original value. Not in CI (no API key); reproducible with OPENAI_API_KEY=... python examples/with_real_llm.py.
License
Apache 2.0