Skip to main content

Formal verification framework for LLM tool-calling. Z3 + Lean 4.

Project description

VeriTool

Z3 + Lean 4 formal verification framework for LLM tool-calling.

LLM → Interceptor → Z3 Policy Check → UNSAT = permit | SAT = block + counterexample
                    ↑
              Auto-generator — NL descriptions → formal specs (Lean + Z3)

Quick Install

pip install z3-solver pytest
pip install -e ".[all]"        # includes dashboard, pandas, streamlit
lean --version && lean Lean/Policy.lean
veritool --help

CLI Usage

veritool create "no file named payroll.csv may be deleted"     # NL → formal policy
veritool check tahoe --model Tahoe --price 45000               # decision for a case
veritool test tahoe                                            # run round-trip test matrix
veritool run Tahoe 40000 --verbose                             # unified check+explain
veritool status                                                # active policies summary
veritool hot-reload                                            # reload without restart
veritool rollback tahoe                                        # undo last revision
veritool verify all                                            # full round-trip sweep
veritool dashboard                                             # launch Streamlit UI
veritool wrap "def fn(**kw): ..." --tool-name submit_order     # wrap function as tool

Demo

python demo_tahoe.py       # price floor checks
python demo_deletion.py    # file scope checks

Supported Policy Types

Type Description Example Violation
price_floor Minimum price per item price < floor_price(model)
file_access / deletion Allowed file scope ¬in_scope(target)
sql_safety Allowed query patterns ¬allowed_query_pattern(query)
rate_limit Max calls per API key current_count ≥ max_per_minute(key)
role_hours Admin action time windows role=admin ∧ hour > 22
api_access Endpoint allowlist ¬endpoint_allowed(path)
generic Custom constraints user-defined

Lean 4 Ground Truth

Policies are dual-encoded: Z3 for runtime checks, Lean 4 theorems (Lean/Policy.lean) as the authoritative spec. The Lean theorem uses Option Nat — unknown inputs return none, making violations structurally unprovable. Any discrepancy between Z3 and Lean is caught by round-trip tests.

def floor_price : String  Option Nat
  | "Tahoe"  => some 45000
  | "Malibu" => some 25000
  | _        => none     -- unknown model: structurally unprovable

Multi-Agent Coordination

from verifier.coordination_policy import CoordinationVerifier, Invariant

verifier = CoordinationVerifier()
verifier.add_invariant("seq", Invariant.SEQUENTIAL_ACCESS, resource="db")
verifier.add_invariant("lock", Invariant.LOCK_REQUIRED, resource="db")

Supported invariants: SEQUENTIAL_ACCESS, LOCK_REQUIRED, APPROVAL_REQUIRED, MONOTONIC_ACCESS, ROLE_BASED_ACCESS.

LLM Integration

from integrations.langchain_interceptor import LangChainVerifier
from integrations.crewaI_guard import CrewAIVerifier
from integrations.autogen_middleware import AutoGenMiddleware

All three wrappers intercept tool calls and block violations before runtime.

Tests

make test           # all tests
make test-fast      # stop at first failure
make verify         # Lean theorem compilation check
make dashboard      # Streamlit monitoring UI

183 tests covering all modules.

Project Structure

├── cli/                    # CLI commands + auto-generator + round-trip
├── bridge/                 # PolicySpec type system + Z3 encoder
├── verifier/               # Policy checkers + coordination verifier
├── policy_store/           # Versioned store + audit trail
├── dashboard/              # Streamlit monitoring UI
├── integrations/           # LangChain, CrewAI, AutoGen wrappers
├── Lean/Policy.lean        # Ground-truth theorems
├── Makefile                # test, verify, demo, dashboard targets
└── tests/                  # 183 tests — run with `make test`

Fail-Closed Design

Unknown inputs are rejected at every layer:

Layer Unknown Behavior
Schema validation Rejected — out-of-enum
Z3 policy check Rejected — ForAll default-false
Lean theorem Unprovable — Option Nat returns none
Coordination verifier Rejected — unregistered agents

Architecture

LLM output → Interceptor → Schema validation → Z3 check → Coordinator → [block/permit]
                                                                 ↓
                                                          Policy Store (versioned)
                                                                 ↓
                                                          Audit Trail (JSONL)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

veritool-1.0.1.tar.gz (37.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

veritool-1.0.1-py3-none-any.whl (33.4 kB view details)

Uploaded Python 3

File details

Details for the file veritool-1.0.1.tar.gz.

File metadata

  • Download URL: veritool-1.0.1.tar.gz
  • Upload date:
  • Size: 37.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for veritool-1.0.1.tar.gz
Algorithm Hash digest
SHA256 04f8b15ef0713add6af9bb93d879f4da4bb07d08b821ac6e7181678763ad04c4
MD5 ec8cc5b9fcf3cb978a1b0b1ecc2711c5
BLAKE2b-256 3f1eeafdb8be77b836e3dcdebd1d29c6dd97a761690c7a6c68c7a086e877e046

See more details on using hashes here.

Provenance

The following attestation bundles were made for veritool-1.0.1.tar.gz:

Publisher: publish.yml on Pranav-d33/veritool

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file veritool-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: veritool-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 33.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for veritool-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 4de06dd7f4d6922151bee431737711a415dc50dce3ff2143e8b4f8fc033f25c9
MD5 c23bc9a62c19245fedf8aabbdfb7c8af
BLAKE2b-256 b3c1a00af6dd1b7d623b68dc1be1d82c439309df96588bceb52900ddba6d6663

See more details on using hashes here.

Provenance

The following attestation bundles were made for veritool-1.0.1-py3-none-any.whl:

Publisher: publish.yml on Pranav-d33/veritool

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page