Guardrail regression testing for LLM agent tool calls

probitas

Guardrail regression testing for LLM agent tool calls. Think of it as pytest for guardrails: it runs deterministic policy rules against test cases and reports pass/fail results and guardrail coverage.

Named after the Latin word for proved quality or integrity. Companion to frenum (the guardrail enforcement engine). Probitas tests the bridle; frenum is the bridle.

Why

You have guardrail rules — SQL injection blocks, PII detection, role-based entitlements, cost limits. But how do you know they actually work? Model updates, new tools, and config changes can silently break enforcement. There's no pytest for guardrails, and no "guardrail coverage" metric.

Probitas fills that gap:

  • Deterministic testing: no LLM calls, runs in CI, sub-second execution
  • Guardrail coverage: rules_exercised / total_deterministic_rules, like code coverage for your policy
  • Evidence reports: HTML/JSON/text with SHA-256 tamper-evidence hashing
  • Frenum-compatible: same YAML policy schema, test your frenum rules without changes
  • Zero dependencies: stdlib only; pyyaml and jinja2 are optional

Quick Start

pip install probitas[yaml]

Write a policy (policy.yaml):

rules:
  - name: block_sql_injection
    type: regex_block
    applies_to: ["execute_sql"]
    params:
      fields: ["query"]
      patterns:
        - "(?i)(DROP|DELETE|TRUNCATE)\\s+TABLE"

  - name: detect_pii
    type: pii_detect
    applies_to: ["*"]
    params:
      detectors: [email, credit_card]
      action: block

Write test cases (tests/test_sql.yaml):

tests:
  - description: "Safe SELECT should be allowed"
    tool_call:
      name: execute_sql
      args:
        query: "SELECT * FROM accounts"
    expected: allow

  - description: "DROP TABLE should be blocked"
    tool_call:
      name: execute_sql
      args:
        query: "DROP TABLE users"
    expected: block
    expected_rule: block_sql_injection

Run:

probitas run --config policy.yaml --tests tests/
probitas — guardrail regression test report
==================================================
Results: 2/2 passed, 0 failed

  [PASS] Safe SELECT should be allowed
  [PASS] DROP TABLE should be blocked

Coverage: 50.0% (1/2 deterministic rules)
  Not exercised: detect_pii

Evidence hash: 7a3f9c1e2b4d...
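
To see what the block_sql_injection pattern from the policy above does and does not match, it can be checked directly with Python's re module. This is a minimal sketch, not probitas's own implementation: the helper name matches_block_pattern is hypothetical, and it assumes the rule searches anywhere in the field rather than anchoring at the start.

```python
import re

# Pattern copied from policy.yaml. In the YAML double-quoted string,
# "\\s+" is the two-character regex token \s+.
PATTERN = re.compile(r"(?i)(DROP|DELETE|TRUNCATE)\s+TABLE")


def matches_block_pattern(query: str) -> bool:
    """Return True if the pattern is found anywhere in the query.

    Assumption: a search (not a full match) is used.
    """
    return PATTERN.search(query) is not None


for query in (
    "SELECT * FROM accounts",
    "DROP TABLE users",
    "drop   table users",
    "DELETE FROM users",
):
    print(f"{query!r}: {matches_block_pattern(query)}")
```

Note that the pattern requires the keyword TABLE after the verb, so a statement such as DELETE FROM users is not matched by it. A test case for that query is one way to make such a gap visible.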

CLI

# Text output (default, terminal-friendly)
probitas run --config policy.yaml --tests tests/

# JSON for CI pipelines
probitas run --config policy.yaml --tests tests/ --format json

# HTML evidence report
probitas run --config policy.yaml --tests tests/ --format html --output report.html

Exit codes: 0 = all pass, 1 = failures, 2 = config error.
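
The exit codes make it straightforward to wrap the CLI in a CI step. The following is a minimal sketch that only relies on the command line and exit codes documented above. The function names describe_exit and run_guardrail_tests are hypothetical, and the sketch assumes the probitas executable is on PATH.

```python
import subprocess
import sys

# Meanings taken from the exit-code list above.
EXIT_MEANINGS = {
    0: "all tests passed",
    1: "test failures",
    2: "config error",
}


def describe_exit(code: int) -> str:
    """Translate a probitas exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_guardrail_tests(config: str, tests: str) -> int:
    """Run the CLI and return its exit code without raising on failure."""
    completed = subprocess.run(
        ["probitas", "run", "--config", config, "--tests", tests],
        check=False,  # let the caller decide how to react to 1 or 2
    )
    print(f"probitas: {describe_exit(completed.returncode)}", file=sys.stderr)
    return completed.returncode


# Typical use in a CI script (not executed here):
#     sys.exit(run_guardrail_tests("policy.yaml", "tests/"))
```

Keeping 1 and 2 distinct lets a pipeline treat a broken policy file differently from a guardrail regression.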

Rule Types

| Type | Purpose | Key Params |
|------|---------|------------|
| regex_block | Block if field matches pattern | fields, patterns |
| regex_require | Block if required field missing/invalid | fields, pattern |
| pii_detect | Scan args for PII (email, phone, HKID, etc.) | detectors, action |
| entitlement | Role-based tool access control | roles, default |
| budget | Block if estimated cost exceeds threshold | max_cost, cost_field |
| tool_allowlist | Block if tool not in approved list | allowed_tools |
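
As an illustration of how two of these rule types might decide, here is a minimal sketch of the budget and tool_allowlist checks. It is not probitas's implementation. The function names are hypothetical, a missing cost is assumed to count as 0, "exceeds" is read as strictly greater than, and the default cost_field of "estimated_cost" is an assumption based on the metadata key used elsewhere in this README.

```python
from typing import Any, Mapping, Sequence


def budget_blocks(
    metadata: Mapping[str, Any],
    max_cost: float,
    cost_field: str = "estimated_cost",  # assumed default
) -> bool:
    """Return True if the recorded cost is strictly above max_cost.

    Assumption: a call with no cost recorded is treated as cost 0.
    """
    cost = float(metadata.get(cost_field, 0.0))
    return cost > max_cost


def allowlist_blocks(tool_name: str, allowed_tools: Sequence[str]) -> bool:
    """Return True if the tool is not in the approved list."""
    return tool_name not in allowed_tools


print(budget_blocks({"estimated_cost": 5.0}, max_cost=1.00))
print(budget_blocks({"estimated_cost": 0.50}, max_cost=1.00))
print(allowlist_blocks("execute_sql", ["execute_sql", "search"]))
```

Both checks are pure functions of the tool call and the rule parameters, which is what makes them testable without any LLM call.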

Guardrail Coverage

Coverage tracks which deterministic rules were exercised by your test suite:

Coverage: 83.3% (5/6 deterministic rules)
  Not exercised: require_confirmation
  Semantic (manual validation required): response_tone_check

Rules tagged kind: semantic are excluded from the coverage calculation and listed separately, so the percentage reflects only what can be tested deterministically.
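
The arithmetic behind that report can be sketched in a few lines. This is an illustrative reconstruction, not the library's calculate_coverage: the Rule class and coverage_summary function are hypothetical, the rule_1 to rule_5 names are placeholders, and returning 0.0% for a policy with no deterministic rules is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass
class Rule:
    name: str
    kind: str = "deterministic"  # "deterministic" (default) or "semantic"


def coverage_summary(rules: Iterable[Rule], exercised: Set[str]) -> str:
    """Format coverage as rules_exercised / total_deterministic_rules."""
    rules = list(rules)
    deterministic = [r.name for r in rules if r.kind == "deterministic"]
    semantic = [r.name for r in rules if r.kind == "semantic"]
    hit = [name for name in deterministic if name in exercised]
    missed = [name for name in deterministic if name not in exercised]
    # Assumption: an all-semantic policy reports 0.0% instead of raising.
    pct = 100.0 * len(hit) / len(deterministic) if deterministic else 0.0
    lines = [
        f"Coverage: {pct:.1f}% ({len(hit)}/{len(deterministic)} deterministic rules)"
    ]
    if missed:
        lines.append("  Not exercised: " + ", ".join(missed))
    if semantic:
        lines.append("  Semantic (manual validation required): " + ", ".join(semantic))
    return "\n".join(lines)


# Five exercised placeholder rules, one unexercised rule, one semantic rule.
rules = [Rule(f"rule_{i}") for i in range(1, 6)]
rules.append(Rule("require_confirmation"))
rules.append(Rule("response_tone_check", kind="semantic"))
print(coverage_summary(rules, exercised={f"rule_{i}" for i in range(1, 6)}))
```

Because the semantic rule never enters the denominator, adding more semantic rules does not change the percentage in either direction.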

Test Case Schema

tests:
  - description: "Human-readable label"
    tool_call:
      name: tool_name
      args:
        key: value
      metadata:
        role: analyst
        estimated_cost: 0.50
    expected: allow | block
    expected_rule: rule_name  # optional: verify which rule triggers

Policy Schema

Same as frenum. Each rule can be tagged kind: deterministic (default) or kind: semantic:

rules:
  - name: block_sql_injection
    type: regex_block
    kind: deterministic  # tested automatically (default)
    applies_to: ["execute_sql"]
    params:
      fields: ["query"]
      patterns: ["(?i)DROP\\s+TABLE"]

  - name: response_tone_check
    kind: semantic  # listed as "manual validation required"
    type: regex_block
    applies_to: ["*"]
    params:
      fields: []
      patterns: []

Programmatic Use

from probitas import evaluate, run_tests, calculate_coverage
from probitas import RuleConfig, TestCase, ToolCall, Decision

# A single budget rule that applies to every tool.
rules = [
    RuleConfig(
        name="cost_limit",
        rule_type="budget",
        params={"max_cost": 1.00},
        applies_to=["*"],
    ),
]

# The estimated cost (5.0) exceeds max_cost (1.00), so the call is blocked.
result = evaluate(rules, ToolCall(name="gpt4", args={}, metadata={"estimated_cost": 5.0}))
assert result.decision == Decision.BLOCK

HTML Report

The HTML report includes:

  • Policy hash (SHA-256)
  • Pass/fail matrix with colour-coded status
  • Coverage percentage with progress bar
  • Rules not exercised
  • Semantic rules flagged for manual validation
  • Evidence bundle hash (SHA-256 of policy + results)

Generate with --format html --output report.html, or programmatically:

from probitas.report import generate_html
html = generate_html(results, coverage, policy_content)
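
To illustrate what an evidence bundle hash of "policy + results" can look like, here is a minimal sketch using hashlib. The exact serialization probitas uses is not documented here, so the evidence_hash function, the JSON bundle layout and the result field names are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Dict, List


def evidence_hash(policy_content: str, results: List[Dict[str, Any]]) -> str:
    """Return a SHA-256 hex digest over the policy text and the results.

    Assumption: the bundle is serialized as canonical JSON (sorted keys,
    fixed separators) so the same inputs always give the same digest.
    """
    bundle = json.dumps(
        {"policy": policy_content, "results": results},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(bundle.encode("utf-8")).hexdigest()


policy = "rules:\n  - name: block_sql_injection\n"
results = [{"description": "DROP TABLE should be blocked", "passed": True}]

original = evidence_hash(policy, results)
tampered = evidence_hash(policy, [{**results[0], "passed": False}])
print(original != tampered)
```

Any change to the policy text or to a single result produces a different digest, which is what lets a stored hash reveal later edits to the report.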

Design Philosophy

  • Deterministic by default. No LLM calls in CI. Fast, reproducible, cacheable.
  • Coverage is honest. Semantic rules are explicitly excluded — no inflated numbers.
  • SHA-256 for tamper-evidence. A single hash covers the whole evidence bundle, which captures most of the tamper-evidence benefit for very little added complexity.
  • Zero core deps. pyyaml for config loading, jinja2 for pretty HTML — both optional.
  • CI-native. Exit codes, JSON output, sub-second execution.

License

MIT
