Guardrail regression testing for LLM agent tool calls

probitas

Guardrail regression testing for LLM agent tool calls. Think of it as pytest for guardrails: it runs deterministic policy rules against test cases and reports pass/fail results and guardrail coverage.

Named after the Latin word for proved quality or integrity. Companion to frenum (the guardrail enforcement engine). Probitas tests the bridle; frenum is the bridle.

Why

You have guardrail rules — SQL injection blocks, PII detection, role-based entitlements, cost limits. But how do you know they actually work? Model updates, new tools, and config changes can silently break enforcement. There's no pytest for guardrails, and no "guardrail coverage" metric.

Probitas fills that gap:

  • Deterministic testing: no LLM calls, runs in CI, sub-second execution
  • Guardrail coverage: rules_exercised / total_deterministic_rules, like code coverage for your policy
  • Evidence reports: HTML/JSON/text with SHA-256 tamper-evidence hashing
  • Frenum-compatible: same YAML policy schema, so you can test your frenum rules without changes
  • Zero dependencies: stdlib only; pyyaml and jinja2 are optional

Quick Start

pip install "probitas[yaml]"

Write a policy (policy.yaml):

rules:
  - name: block_sql_injection
    type: regex_block
    applies_to: ["execute_sql"]
    params:
      fields: ["query"]
      patterns:
        - "(?i)(DROP|DELETE|TRUNCATE)\\s+TABLE"

  - name: detect_pii
    type: pii_detect
    applies_to: ["*"]
    params:
      detectors: [email, credit_card]
      action: block

Write test cases (tests/test_sql.yaml):

tests:
  - description: "Safe SELECT should be allowed"
    tool_call:
      name: execute_sql
      args:
        query: "SELECT * FROM accounts"
    expected: allow

  - description: "DROP TABLE should be blocked"
    tool_call:
      name: execute_sql
      args:
        query: "DROP TABLE users"
    expected: block
    expected_rule: block_sql_injection

Run:

probitas run --config policy.yaml --tests tests/

Output:

probitas — guardrail regression test report
==================================================
Results: 2/2 passed, 0 failed

  [PASS] Safe SELECT should be allowed
  [PASS] DROP TABLE should be blocked

Coverage: 50.0% (1/2 deterministic rules)
  Not exercised: detect_pii

Evidence hash: 7a3f9c1e2b4d...
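
To see why the two test cases above resolve the way they do, the pattern from policy.yaml can be checked directly with Python's re module. This is a standalone sketch, not probitas's internal implementation: the helper name is_blocked is hypothetical, and the use of re.search (match anywhere in the field) is an assumption about how regex_block behaves.

```python
import re

# Pattern copied from policy.yaml; the YAML string "\\s" decodes to the regex \s.
PATTERN = re.compile(r"(?i)(DROP|DELETE|TRUNCATE)\s+TABLE")


def is_blocked(query: str) -> bool:
    """Return True when the pattern matches anywhere in the query (assumed semantics)."""
    return PATTERN.search(query) is not None


print(is_blocked("SELECT * FROM accounts"))  # False
print(is_blocked("DROP TABLE users"))        # True
print(is_blocked("drop   table users"))      # True: (?i) ignores case, \s+ spans the spaces
print(is_blocked("DELETE FROM users"))       # False: the pattern requires the word TABLE
```

The last line shows the kind of gap a regression test can surface: the pattern as written does not match DELETE FROM statements.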

CLI

# Text output (default, terminal-friendly)
probitas run --config policy.yaml --tests tests/

# JSON for CI pipelines
probitas run --config policy.yaml --tests tests/ --format json

# HTML evidence report
probitas run --config policy.yaml --tests tests/ --format html --output report.html

Exit codes: 0 = all pass, 1 = failures, 2 = config error.
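
The exit codes make it straightforward to gate a pipeline step from a script. The sketch below wraps the documented command with subprocess; the function names run_guardrail_tests and describe_exit_code are hypothetical helpers, not part of probitas, and only the flags shown above are used.

```python
import subprocess

# Exit-code meanings as documented above.
EXIT_MEANINGS = {
    0: "all tests passed",
    1: "one or more tests failed",
    2: "configuration error",
}


def describe_exit_code(code: int) -> str:
    """Map a probitas exit code to a human-readable status."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_guardrail_tests(config: str = "policy.yaml", tests: str = "tests/") -> int:
    """Run the probitas CLI and return its exit code (hypothetical wrapper)."""
    completed = subprocess.run(
        ["probitas", "run", "--config", config, "--tests", tests],
        check=False,  # inspect the return code ourselves instead of raising
    )
    print(f"probitas: {describe_exit_code(completed.returncode)}")
    return completed.returncode
```

In a CI script, passing the return value of run_guardrail_tests() to sys.exit() propagates the result to the pipeline.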

Rule Types

| Type | Purpose | Key Params |
|---|---|---|
| regex_block | Block if field matches pattern | fields, patterns |
| regex_require | Block if required field missing/invalid | fields, pattern |
| pii_detect | Scan args for PII (email, phone, HKID, etc.) | detectors, action |
| entitlement | Role-based tool access control | roles, default |
| budget | Block if estimated cost exceeds threshold | max_cost, cost_field |
| tool_allowlist | Block if tool not in approved list | allowed_tools |
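
As a rough illustration of what two of these rule types decide, here is a minimal sketch of budget and tool_allowlist checks. The parameter names come from the table; everything else is an assumption for illustration, including the function names, the reading of cost_field as a metadata key that defaults to "estimated_cost", and treating a missing cost as 0.

```python
from typing import Any


def budget_blocks(params: dict[str, Any], metadata: dict[str, Any]) -> bool:
    """Block when the estimated cost exceeds max_cost.

    Assumption: cost_field names the metadata key holding the cost and
    defaults to "estimated_cost"; a missing cost is treated as 0.
    """
    cost_field = params.get("cost_field", "estimated_cost")
    cost = float(metadata.get(cost_field, 0.0))
    return cost > float(params["max_cost"])


def allowlist_blocks(params: dict[str, Any], tool_name: str) -> bool:
    """Block when the tool is not in allowed_tools."""
    return tool_name not in params["allowed_tools"]


print(budget_blocks({"max_cost": 1.00}, {"estimated_cost": 5.0}))          # True
print(budget_blocks({"max_cost": 1.00}, {"estimated_cost": 0.50}))         # False
print(allowlist_blocks({"allowed_tools": ["execute_sql"]}, "send_email"))  # True
```

The comparison is strict because the table says the rule blocks when the cost exceeds the threshold, so a cost exactly equal to max_cost is allowed in this sketch.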

Guardrail Coverage

Coverage tracks which deterministic rules were exercised by your test suite:

Coverage: 83.3% (5/6 deterministic rules)
  Not exercised: require_confirmation
  Semantic (manual validation required): response_tone_check

Rules tagged kind: semantic are excluded from the coverage calculation and listed separately, so the report stays explicit about what cannot be tested deterministically.
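
The calculation behind that report can be sketched in a few lines. This is an illustrative reimplementation, not probitas's calculate_coverage: the function name coverage_report is hypothetical, "exercised" is assumed to mean that at least one test triggered the rule, and the rule names tool_allowlist_check and role_entitlements are placeholders added to reach six deterministic rules.

```python
def coverage_report(rules: list[dict], exercised: set[str]) -> str:
    """Format guardrail coverage as rules_exercised / total_deterministic_rules."""
    # Rules without an explicit kind count as deterministic (the documented default).
    deterministic = [r["name"] for r in rules if r.get("kind", "deterministic") == "deterministic"]
    semantic = [r["name"] for r in rules if r.get("kind") == "semantic"]
    hit = [name for name in deterministic if name in exercised]
    missed = [name for name in deterministic if name not in exercised]
    # Assumption: a policy with no deterministic rules reports 0.0%.
    pct = 100.0 * len(hit) / len(deterministic) if deterministic else 0.0
    lines = [f"Coverage: {pct:.1f}% ({len(hit)}/{len(deterministic)} deterministic rules)"]
    if missed:
        lines.append("  Not exercised: " + ", ".join(missed))
    if semantic:
        lines.append("  Semantic (manual validation required): " + ", ".join(semantic))
    return "\n".join(lines)


RULES = [
    {"name": "block_sql_injection"},
    {"name": "detect_pii"},
    {"name": "cost_limit"},
    {"name": "tool_allowlist_check"},   # placeholder name
    {"name": "role_entitlements"},      # placeholder name
    {"name": "require_confirmation", "kind": "deterministic"},
    {"name": "response_tone_check", "kind": "semantic"},
]
EXERCISED = {
    "block_sql_injection", "detect_pii", "cost_limit",
    "tool_allowlist_check", "role_entitlements",
}

print(coverage_report(RULES, EXERCISED))
```

With these inputs the semantic rule is left out of the denominator, so the result is 5 of 6 rather than 5 of 7.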

Test Case Schema

tests:
  - description: "Human-readable label"
    tool_call:
      name: tool_name
      args:
        key: value
      metadata:
        role: analyst
        estimated_cost: 0.50
    expected: allow  # or: block
    expected_rule: rule_name  # optional: verify which rule triggers
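
How a test case is judged against an evaluation result can be sketched as follows. The Outcome class and case_passes function are hypothetical names for illustration and are not probitas's own types; the sketch assumes a case passes only when the decision matches and, if expected_rule is given, the triggering rule matches too.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    """Hypothetical result of evaluating one tool call against the policy."""
    decision: str                 # "allow" or "block"
    rule: Optional[str] = None    # name of the rule that triggered, if any


def case_passes(expected: str, actual: Outcome, expected_rule: Optional[str] = None) -> bool:
    """Compare the expected decision (and optional rule) with the actual outcome."""
    if actual.decision != expected:
        return False
    # expected_rule is optional; only check it when the test case specifies one.
    if expected_rule is not None and actual.rule != expected_rule:
        return False
    return True


blocked_by_sql = Outcome(decision="block", rule="block_sql_injection")
print(case_passes("block", blocked_by_sql, "block_sql_injection"))  # True
print(case_passes("block", blocked_by_sql, "detect_pii"))           # False: wrong rule fired
print(case_passes("allow", Outcome(decision="allow")))              # True
```

Checking expected_rule guards against a call being blocked for the wrong reason, which would otherwise hide a broken rule behind another rule's block.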

Policy Schema

Same as frenum. Each rule can be tagged kind: deterministic (default) or kind: semantic:

rules:
  - name: block_sql_injection
    type: regex_block
    kind: deterministic  # tested automatically (default)
    applies_to: ["execute_sql"]
    params:
      fields: ["query"]
      patterns: ["(?i)DROP\\s+TABLE"]

  - name: response_tone_check
    kind: semantic  # listed as "manual validation required"
    type: regex_block
    applies_to: ["*"]
    params:
      fields: []
      patterns: []
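
The applies_to field determines which rules are considered for a given tool call. A minimal sketch of that selection, assuming "*" matches every tool and any other entry matches by exact tool name (the helper names are hypothetical, and probitas may support richer matching):

```python
def rule_applies(applies_to: list[str], tool_name: str) -> bool:
    """Assumption: "*" matches every tool; other entries match by exact name."""
    return "*" in applies_to or tool_name in applies_to


def applicable_rules(rules: list[dict], tool_name: str) -> list[str]:
    """Return the names of the rules whose applies_to covers the tool."""
    return [r["name"] for r in rules if rule_applies(r["applies_to"], tool_name)]


POLICY = [
    {"name": "block_sql_injection", "applies_to": ["execute_sql"]},
    {"name": "response_tone_check", "applies_to": ["*"]},
]

print(applicable_rules(POLICY, "execute_sql"))  # ['block_sql_injection', 'response_tone_check']
print(applicable_rules(POLICY, "send_email"))   # ['response_tone_check']
```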

Programmatic Use

from probitas import evaluate, run_tests, calculate_coverage
from probitas import RuleConfig, TestCase, ToolCall, Decision

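# One budget rule that applies to every tool ("*").
rules = [
    RuleConfig(
        name="cost_limit",
        rule_type="budget",
        params={"max_cost": 1.00},
        applies_to=["*"],
    ),
]

# The estimated cost (5.0) exceeds max_cost (1.00), so the call should be blocked.
result = evaluate(rules, ToolCall(name="gpt4", args={}, metadata={"estimated_cost": 5.0}))
assert result.decision == Decision.BLOCK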

HTML Report

The HTML report includes:

  • Policy hash (SHA-256)
  • Pass/fail matrix with colour-coded status
  • Coverage percentage with progress bar
  • Rules not exercised
  • Semantic rules flagged for manual validation
  • Evidence bundle hash (SHA-256 of policy + results)

Generate with --format html --output report.html, or programmatically:

from probitas.report import generate_html
# results, coverage, and policy_content are assumed to be available from an earlier test run
html = generate_html(results, coverage, policy_content)
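
The evidence bundle hash is described above as a SHA-256 of policy + results. One way such a hash could be computed is sketched below with hashlib. The serialization is an assumption for illustration (results as plain dicts, canonical JSON with sorted keys, a NUL byte separating the two parts), and evidence_hash is a hypothetical name; probitas may bundle its evidence differently.

```python
import hashlib
import json


def evidence_hash(policy_content: str, results: list[dict]) -> str:
    """SHA-256 over the policy text plus a canonical JSON dump of the results."""
    # Sorted keys and fixed separators make the dump independent of dict ordering.
    canonical_results = json.dumps(results, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256()
    digest.update(policy_content.encode("utf-8"))
    digest.update(b"\x00")  # separator so policy and results cannot blur together
    digest.update(canonical_results.encode("utf-8"))
    return digest.hexdigest()


policy = "rules: []"
results = [{"description": "DROP TABLE should be blocked", "passed": True}]
print(evidence_hash(policy, results))
```

Because the digest covers both inputs, editing either the policy text or any recorded result changes the hash, which is what lets a reviewer detect tampering by recomputing it.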

Design Philosophy

  • Deterministic by default. No LLM calls in CI. Fast, reproducible, cacheable.
  • Coverage is honest. Semantic rules are explicitly excluded — no inflated numbers.
  • SHA-256 for tamper-evidence. Single hash of the evidence bundle. 90% of the tamper-evident story at 1% of the complexity.
  • Zero core deps. pyyaml for config loading, jinja2 for pretty HTML — both optional.
  • CI-native. Exit codes, JSON output, sub-second execution.

License

MIT
