<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>PyPI recent updates for llm-guard-kit</title>
    <link>https://pypi.org/project/llm-guard-kit/</link>
    <description>Recent updates to the Python Package Index for llm-guard-kit</description>
    <language>en</language>
    <item>
      <title>0.66.1</title>
      <link>https://pypi.org/project/llm-guard-kit/0.66.1/</link>
      <description>v0.66.0: sycophancy_score() direction corrected — AUROC 0.9296 [0.8754, 0.9703] on Perez 2022 (was 0.0704, wrong direction). AgentShield judge_risk field fix.</description>
      <pubDate>Wed, 25 Mar 2026 17:22:51 GMT</pubDate>
    </item>
    <item>
      <title>0.66.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.66.0/</link>
      <description>v0.66.0: sycophancy_score() direction corrected — AUROC 0.9296 [0.8754, 0.9703] on Perez 2022 (was 0.0704, wrong direction). AgentShield judge_risk field fix.</description>
      <pubDate>Wed, 25 Mar 2026 16:09:20 GMT</pubDate>
    </item>
    <item>
      <title>0.65.2</title>
      <link>https://pypi.org/project/llm-guard-kit/0.65.2/</link>
      <description>v0.65.2: README corrected — SC_OLD HP 0.817→0.625 (live chains), J5 Haiku fresh-holdout numbers, DomainRouter and cross-arch ensemble added to validated AUROC table.</description>
      <pubDate>Wed, 25 Mar 2026 10:18:17 GMT</pubDate>
    </item>
    <item>
      <title>0.65.1</title>
      <link>https://pypi.org/project/llm-guard-kit/0.65.1/</link>
      <description>v0.65.1: P(True) NQ corrected from 0.810→0.623 (tie-handling bug in custom auroc()); gate FAIL documented; DomainRouter NQ signal unchanged (mfe_sc_pair 0.667).</description>
      <pubDate>Wed, 25 Mar 2026 09:51:02 GMT</pubDate>
    </item>
    <item>
      <title>0.65.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.65.0/</link>
      <description>v0.65.0: Corrected AUROC claims — FARL-chain vs live-chain distinction; sc_old HP live=0.557-0.625 (not 0.817); MiniJudge transfer=0.752; all numbers sklearn-verified.</description>
      <pubDate>Tue, 24 Mar 2026 18:57:46 GMT</pubDate>
    </item>
    <item>
      <title>0.64.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.64.0/</link>
      <description>v0.64.0: DomainRouter — validated per-domain oracle signal routing (mean AUROC 0.728 vs 0.633 baseline), QuestionTypeClassifier, use_domain_routing in score_chain().</description>
      <pubDate>Tue, 24 Mar 2026 18:18:33 GMT</pubDate>
    </item>
    <item>
      <title>0.63.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.63.0/</link>
      <description>v0.63.0: Library reorganization — 69 modules into 11 domain subdirs, full unit test suite (1945 tests), module headers, exception narrowing.</description>
      <pubDate>Tue, 24 Mar 2026 11:34:31 GMT</pubDate>
    </item>
    <item>
      <title>0.62.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.62.0/</link>
      <description>v0.62.0: Cost Phase D C1-C3: AgentGuard.enable_3tier_routing(), ResponseCache (SQLite TTL cache), MiniJudge v3 retrain experiment.</description>
      <pubDate>Mon, 23 Mar 2026 20:32:26 GMT</pubDate>
    </item>
    <item>
      <title>0.58.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.58.0/</link>
      <description>v0.57.0: Wave 2 gap remediation: VisualCaptionBridge.score_risk() (correct AUROC convention); MultilingualSCScorer far-shuffle validated (AR=0.864, ES=0.897, ZH=0.873); VCB AUROC 0.927 on natural images.</description>
      <pubDate>Mon, 23 Mar 2026 17:09:03 GMT</pubDate>
    </item>
    <item>
      <title>0.57.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.57.0/</link>
      <description>v0.57.0: Wave 2 gap remediation: VisualCaptionBridge.score_risk() (correct AUROC convention); MultilingualSCScorer far-shuffle validated (AR=0.864, ES=0.897, ZH=0.873); VCB AUROC 0.927 on natural images.</description>
      <pubDate>Mon, 23 Mar 2026 16:13:37 GMT</pubDate>
    </item>
    <item>
      <title>0.56.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.56.0/</link>
      <description>v0.56.0: Research wave 1+2: DomainInvariantSelector (MMD), IntraChainDriftDetector (CUSUM), ProofStepVerifier (SymPy+Z3, 83.6% coverage MATH L3-5, AUROC 0.964), MultilingualSCScorer, VisualCaptionBridge (BLIP-2), MultiAgentSimulator, MinPerturbationAttacker.</description>
      <pubDate>Mon, 23 Mar 2026 06:37:09 GMT</pubDate>
    </item>
    <item>
      <title>0.55.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.55.0/</link>
      <description>v0.55.0: SelfHealer RESOLVE_CONFLICT+ANCHOR_TO_EVIDENCE; MFE fit_domain_calibration(); QppgMonitor add_label()+track_async()+RL auto-adapt; SQLSemanticValidator AUROC 0.89 (PASS); throughput 79 req/s post-warmup; FailureTaxonomist README sync.</description>
      <pubDate>Sun, 22 Mar 2026 12:33:43 GMT</pubDate>
    </item>
    <item>
      <title>0.54.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.54.0/</link>
      <description>v0.53.0: score_factual_turn() AUROC 0.687 (PASS); browser chain AUROC 0.636 (PASS); MFE Phase B Sonnet 0.467 (gap model-agnostic); SelfHealer heal_outcome MCP; FailureTaxonomist domain thresholds; A2A disclaimer; sycophancy premise-flip gated AUROC 0.480 FAIL.</description>
      <pubDate>Sun, 22 Mar 2026 09:38:39 GMT</pubDate>
    </item>
    <item>
      <title>0.53.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.53.0/</link>
      <description>v0.53.0: score_factual_turn() AUROC 0.687 (PASS); browser chain AUROC 0.636 (PASS); MFE Phase B Sonnet 0.467 (gap model-agnostic); SelfHealer heal_outcome MCP; FailureTaxonomist domain thresholds; A2A disclaimer; sycophancy premise-flip gated AUROC 0.480 FAIL.</description>
      <pubDate>Sun, 22 Mar 2026 08:52:21 GMT</pubDate>
    </item>
    <item>
      <title>0.52.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.52.0/</link>
      <description>v0.52.0: DeepChainScorer+RetrievalCascade for 5-8 step chains (AUROC 0.629 vs SC_OLD 0.545 on 2wiki); browser agent step adapter (Playwright/OpenHands/browser-use); MultiTurnGuard.score_factual_turn() P(True) factual path.</description>
      <pubDate>Sun, 22 Mar 2026 07:26:53 GMT</pubDate>
    </item>
    <item>
      <title>0.51.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.51.0/</link>
      <description>v0.51.0: Fix P(True) cross-model warning (tracks ptrue Haiku model, not Sonnet judge); MTG score_turn() surfaces quality_preference MT-Bench caveat as UserWarning.</description>
      <pubDate>Sun, 22 Mar 2026 07:10:40 GMT</pubDate>
    </item>
    <item>
      <title>0.50.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.50.0/</link>
      <description>v0.50.0: Mamba D-skip ablation experiment (exp_mamba_dskip_ablation.py), OpenAI-compatible proxy adapter (openai_proxy.py) for Open WebUI/LibreChat/Chatbot UI with X-LLMGuard-Risk/Alert headers.</description>
      <pubDate>Sun, 22 Mar 2026 06:58:03 GMT</pubDate>
    </item>
    <item>
      <title>0.49.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.49.0/</link>
      <description>v0.49.0: README fixes — changelog header updated to v0.49.0, Phase B AUROC table added (0.559 model-generated vs 0.964 curated, medical domain excluded at 0.412), latency c=50/100 populated (906ms/1697ms p50 at 55-59 req/s).</description>
      <pubDate>Sun, 22 Mar 2026 06:48:24 GMT</pubDate>
    </item>
    <item>
      <title>0.48.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.48.0/</link>
      <description>v0.48.0: sycophancy_score() gated with validation-fail warning (AUROC=0.0452 on real Perez 2022 data, anti-correlated). SelfHealer test fixed (NO_ACTION not NO_ACTION_VALIDATED). No API changes.</description>
      <pubDate>Sun, 22 Mar 2026 06:01:02 GMT</pubDate>
    </item>
    <item>
      <title>0.47.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.47.0/</link>
      <description>v0.47.0: Phase A extended to 8 datasets; DeBERTa-v3-small encoder option; EnsembleMechanismClassifier (mechanism+SC_OLD joint features); 3 new loaders (pubmed_qa, context_drift, arc_research). Finding: text signals detect hallucination (AUROC=0.96) not reasoning errors (AUROC~0.53). Mamba-CPC workshop paper draft.</description>
      <pubDate>Sat, 21 Mar 2026 22:20:48 GMT</pubDate>
    </item>
    <item>
      <title>0.46.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.46.0/</link>
      <description>v0.46.0: MechanismFeatureExtractor — 12 text-level signals (sem_sim, certainty, hedge, lexical overlap, polarity, TTR, numeric/entity density, context sim) + 3 Phase B SC signals. Phase A: 4/5 datasets gate PASS (halueval2_qa AUROC=0.964, summary=0.787, dialogue=0.702, truthfulqa=0.668). Bootstrap CI, stratified CV, cross-dataset transfer test. Phase B: f13-f15 + code execution oracle.</description>
      <pubDate>Sat, 21 Mar 2026 21:37:17 GMT</pubDate>
    </item>
    <item>
      <title>0.45.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.45.0/</link>
      <description>v0.39.0: AdversarialDetector natural-chain caveat (AUROC 0.3974 negatively correlated); FailureTaxonomist LLM agreement study (RETRIEVAL_FAILURE F1=0.701 only); P(True) cross-model AUROC 0.648 (below gate); MultiTurnGuard few-shot FAIL (zero-shot preferred); HumanEval scope-limited (execution-based verification needed); A2A routing requires judge signal; causal drift requires strong base signal.</description>
      <pubDate>Sat, 21 Mar 2026 20:07:47 GMT</pubDate>
    </item>
    <item>
      <title>0.44.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.44.0/</link>
      <description>v0.39.0: AdversarialDetector natural-chain caveat (AUROC 0.3974 negatively correlated); FailureTaxonomist LLM agreement study (RETRIEVAL_FAILURE F1=0.701 only); P(True) cross-model AUROC 0.648 (below gate); MultiTurnGuard few-shot FAIL (zero-shot preferred); HumanEval scope-limited (execution-based verification needed); A2A routing requires judge signal; causal drift requires strong base signal.</description>
      <pubDate>Sat, 21 Mar 2026 08:23:39 GMT</pubDate>
    </item>
    <item>
      <title>0.43.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.43.0/</link>
      <description>v0.39.0: AdversarialDetector natural-chain caveat (AUROC 0.3974 negatively correlated); FailureTaxonomist LLM agreement study (RETRIEVAL_FAILURE F1=0.701 only); P(True) cross-model AUROC 0.648 (below gate); MultiTurnGuard few-shot FAIL (zero-shot preferred); HumanEval scope-limited (execution-based verification needed); A2A routing requires judge signal; causal drift requires strong base signal.</description>
      <pubDate>Fri, 20 Mar 2026 19:00:19 GMT</pubDate>
    </item>
    <item>
      <title>0.42.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.42.0/</link>
      <description>v0.39.0: AdversarialDetector natural-chain caveat (AUROC 0.3974 negatively correlated); FailureTaxonomist LLM agreement study (RETRIEVAL_FAILURE F1=0.701 only); P(True) cross-model AUROC 0.648 (below gate); MultiTurnGuard few-shot FAIL (zero-shot preferred); HumanEval scope-limited (execution-based verification needed); A2A routing requires judge signal; causal drift requires strong base signal.</description>
      <pubDate>Fri, 20 Mar 2026 18:40:08 GMT</pubDate>
    </item>
    <item>
      <title>0.41.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.41.0/</link>
      <description>v0.39.0: AdversarialDetector natural-chain caveat (AUROC 0.3974 negatively correlated); FailureTaxonomist LLM agreement study (RETRIEVAL_FAILURE F1=0.701 only); P(True) cross-model AUROC 0.648 (below gate); MultiTurnGuard few-shot FAIL (zero-shot preferred); HumanEval scope-limited (execution-based verification needed); A2A routing requires judge signal; causal drift requires strong base signal.</description>
      <pubDate>Fri, 20 Mar 2026 18:11:06 GMT</pubDate>
    </item>
    <item>
      <title>0.39.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.39.0/</link>
      <description>v0.39.0: AdversarialDetector natural-chain caveat (AUROC 0.3974 negatively correlated); FailureTaxonomist LLM agreement study (RETRIEVAL_FAILURE F1=0.701 only); P(True) cross-model AUROC 0.648 (below gate); MultiTurnGuard few-shot FAIL (zero-shot preferred); HumanEval scope-limited (execution-based verification needed); A2A routing requires judge signal; causal drift requires strong base signal.</description>
      <pubDate>Fri, 20 Mar 2026 17:26:56 GMT</pubDate>
    </item>
    <item>
      <title>0.38.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.38.0/</link>
      <description>v0.38.0: SelfHealer validated (RETRIEVAL_FAILURE +11.1pp, EXCESSIVE_SEARCH +20pp); ModelTransferCalibrator Brier-only (cross-model AUROC not improved); UserWarnings on unvalidated components (MultiTurnGuard, AdversarialDetector, LabelFreeScorer, ToolCallExtractor); SC_OLD feature ablation; MT-Bench full n=2575 AUROC 0.667; ChatGuard methodology doc; ChatGuard judge AUROC 0.7138 (n=100, fixed seed).</description>
      <pubDate>Fri, 20 Mar 2026 16:24:37 GMT</pubDate>
    </item>
    <item>
      <title>0.37.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.37.0/</link>
      <description>v0.33.0: ToolCallExtractor, GuardPipeline, ModelTransferCalibrator, ActiveSamplingStrategy, ChatGuard unfitted guard + metadata. v0.32.0: ChatGuard.score_chat_with_judge() (AUROC 0.7704), GuardState (SQLite), MetricsExporter (Prometheus).</description>
      <pubDate>Fri, 20 Mar 2026 12:11:34 GMT</pubDate>
    </item>
    <item>
      <title>0.36.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.36.0/</link>
      <description>v0.33.0: ToolCallExtractor, GuardPipeline, ModelTransferCalibrator, ActiveSamplingStrategy, ChatGuard unfitted guard + metadata. v0.32.0: ChatGuard.score_chat_with_judge() (AUROC 0.7704), GuardState (SQLite), MetricsExporter (Prometheus).</description>
      <pubDate>Fri, 20 Mar 2026 09:20:27 GMT</pubDate>
    </item>
    <item>
      <title>0.35.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.35.0/</link>
      <description>v0.33.0: ToolCallExtractor, GuardPipeline, ModelTransferCalibrator, ActiveSamplingStrategy, ChatGuard unfitted guard + metadata. v0.32.0: ChatGuard.score_chat_with_judge() (AUROC 0.7704), GuardState (SQLite), MetricsExporter (Prometheus).</description>
      <pubDate>Fri, 20 Mar 2026 05:09:53 GMT</pubDate>
    </item>
    <item>
      <title>0.34.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.34.0/</link>
      <description>v0.33.0: ToolCallExtractor, GuardPipeline, ModelTransferCalibrator, ActiveSamplingStrategy, ChatGuard unfitted guard + metadata. v0.32.0: ChatGuard.score_chat_with_judge() (AUROC 0.7704), GuardState (SQLite), MetricsExporter (Prometheus).</description>
      <pubDate>Thu, 19 Mar 2026 23:18:54 GMT</pubDate>
    </item>
    <item>
      <title>0.33.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.33.0/</link>
      <description>v0.33.0: ToolCallExtractor, GuardPipeline, ModelTransferCalibrator, ActiveSamplingStrategy, ChatGuard unfitted guard + metadata. v0.32.0: ChatGuard.score_chat_with_judge() (AUROC 0.7704), GuardState (SQLite), MetricsExporter (Prometheus).</description>
      <pubDate>Thu, 19 Mar 2026 14:23:25 GMT</pubDate>
    </item>
    <item>
      <title>0.32.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.32.0/</link>
      <description>v0.31.0: AdversarialChainDetector — $0 behavioral detection of confident-wrong FARL chains (AUROC 0.996, Precision@FPR10 0.955). v0.30.0: FederatedCalibrator (Laplace DP, federated merge ≤5pp MAE). v0.29.0: ConformalRiskBudget, QuestionComplexityScorer (AUROC 0.79), DebateVerifier (precision 85.4%).</description>
      <pubDate>Thu, 19 Mar 2026 08:58:09 GMT</pubDate>
    </item>
    <item>
      <title>0.31.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.31.0/</link>
      <description>v0.31.0: AdversarialChainDetector — $0 behavioral detection of confident-wrong FARL chains (AUROC 0.996, Precision@FPR10 0.955). v0.30.0: FederatedCalibrator (Laplace DP, federated merge ≤5pp MAE). v0.29.0: ConformalRiskBudget, QuestionComplexityScorer (AUROC 0.79), DebateVerifier (precision 85.4%).</description>
      <pubDate>Thu, 19 Mar 2026 08:19:40 GMT</pubDate>
    </item>
    <item>
      <title>0.30.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.30.0/</link>
      <description>v0.30.0: FederatedCalibrator — privacy-preserving shared isotonic calibration with Laplace DP (ε=1.0); federated merge ≤5pp MAE vs centralised. v0.29.0: ConformalRiskBudget, QuestionComplexityScorer (AUROC 0.79), DebateVerifier (precision 85.4%), GuardPhaseController.</description>
      <pubDate>Thu, 19 Mar 2026 07:19:15 GMT</pubDate>
    </item>
    <item>
      <title>0.29.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.29.0/</link>
      <description>v0.29.0: ConformalRiskBudget (per-step risk allocation), QuestionComplexityScorer (cross-domain OOD, AUROC 0.79), DebateVerifier (multi-agent debate triage, precision 85.4%), GuardPhaseController (judge→guard bootstrap with starvation injection).</description>
      <pubDate>Wed, 18 Mar 2026 22:29:48 GMT</pubDate>
    </item>
    <item>
      <title>0.28.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.28.0/</link>
      <description>v0.28.0: ComprehensiveMathVerifier — programmatic math verification (SymPy + exec sandbox + step-pattern analysis), no LLM required.</description>
      <pubDate>Wed, 18 Mar 2026 19:56:36 GMT</pubDate>
    </item>
    <item>
      <title>0.27.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.27.0/</link>
      <description>v0.27.0: ChatGuard.auto_fit() zero-label bootstrapping (AUROC 0.8948 HaluEval), 16-feature CoV-extended fit() (AUROC 0.9930), ChatCoVFeatureExtractor (4 CoV risk features). Components A+B gate-validated.</description>
      <pubDate>Wed, 18 Mar 2026 17:04:16 GMT</pubDate>
    </item>
    <item>
      <title>0.26.0</title>
      <link>https://pypi.org/project/llm-guard-kit/0.26.0/</link>
      <description>v0.25.0: StepTransformerVerifier (2-layer Transformer, best HP+TV joint training AUROC 0.705), ZeroShotCalibrator (0-label conformal calibration precision 0.708), MathVerifier (TPR=100% FPR=15%), score_with_selfconsistency + score_with_selfcritique. NLI grounding negative result documented.</description>
      <pubDate>Tue, 17 Mar 2026 20:41:56 GMT</pubDate>
    </item>
  </channel>
</rss>