Skip to main content

Formal verification framework for LLM tool-calling. Z3 + Lean 4.

Project description

VeriTool

Formal verification framework for LLM tool-calling. Z3 + Lean 4.

LLM → Interceptor → Z3 Policy Check → UNSAT = permit | SAT = block + counterexample
                    ↑
              Auto-generator — NL descriptions → formal specs (Lean + Z3)

Quick Install

pip install z3-solver pytest
pip install -e ".[all]"        # includes dashboard, pandas, streamlit
lean --version && lean Lean/Policy.lean
veritool --help

CLI Usage

veritool create "no file named payroll.csv may be deleted"     # NL → formal policy
veritool check tahoe --model Tahoe --price 45000               # decision for a case
veritool test tahoe                                            # run round-trip test matrix
veritool run Tahoe 40000 --verbose                             # unified check+explain
veritool status                                                # active policies summary
veritool hot-reload                                            # reload without restart
veritool rollback tahoe                                        # undo last revision
veritool verify all                                            # full round-trip sweep
veritool dashboard                                             # launch Streamlit UI
veritool wrap "def fn(**kw): ..." --tool-name submit_order     # wrap function as tool

Demo

python demo_tahoe.py       # price floor checks
python demo_deletion.py    # file scope checks

Supported Policy Types

Type Description Example Violation
price_floor Minimum price per item price < floor_price(model)
file_access / deletion Allowed file scope ¬in_scope(target)
sql_safety Allowed query patterns ¬allowed_query_pattern(query)
rate_limit Max calls per API key current_count ≥ max_per_minute(key)
role_hours Admin action time windows role=admin ∧ hour > 22
api_access Endpoint allowlist ¬endpoint_allowed(path)
generic Custom constraints user-defined

Multi-Agent Coordination

from verifier.coordination_policy import CoordinationVerifier, Invariant

verifier = CoordinationVerifier()
verifier.add_invariant("seq", Invariant.SEQUENTIAL_ACCESS, resource="db")
verifier.add_invariant("lock", Invariant.LOCK_REQUIRED, resource="db")

Supported invariants: SEQUENTIAL_ACCESS, LOCK_REQUIRED, APPROVAL_REQUIRED, MONOTONIC_ACCESS, ROLE_BASED_ACCESS.

LLM Integration

from integrations.langchain_interceptor import LangChainVerifier
from integrations.crewaI_guard import CrewAIVerifier
from integrations.autogen_middleware import AutoGenMiddleware

All three wrappers intercept tool calls and block violations before runtime.

Tests

make test           # all tests
make test-fast      # stop at first failure
make verify         # Lean theorem compilation check
make dashboard      # Streamlit monitoring UI

183 tests covering all modules.

Project Structure

├── cli/                    # CLI commands + auto-generator + round-trip
├── bridge/                 # PolicySpec type system + Z3 encoder
├── verifier/               # Policy checkers + coordination verifier
├── policy_store/           # Versioned store + audit trail
├── dashboard/              # Streamlit monitoring UI
├── integrations/           # LangChain, CrewAI, AutoGen wrappers
├── Lean/Policy.lean        # Ground-truth theorems
├── Makefile                # test, verify, demo, dashboard targets
└── tests/                  # 183 tests — run with `make test`

Fail-Closed Design

Unknown inputs are rejected at every layer:

Layer Unknown Behavior
Schema validation Rejected — out-of-enum
Z3 policy check Rejected — ForAll default-false
Lean theorem Unprovable — Option Nat returns none
Coordination verifier Rejected — unregistered agents

Architecture

LLM output → Interceptor → Schema validation → Z3 check → Coordinator → [block/permit]
                                                                 ↓
                                                          Policy Store (versioned)
                                                                 ↓
                                                          Audit Trail (JSONL)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

veritool-1.0.0.tar.gz (37.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

veritool-1.0.0-py3-none-any.whl (33.2 kB view details)

Uploaded Python 3

File details

Details for the file veritool-1.0.0.tar.gz.

File metadata

  • Download URL: veritool-1.0.0.tar.gz
  • Upload date:
  • Size: 37.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for veritool-1.0.0.tar.gz
Algorithm Hash digest
SHA256 1b9798f136249ae0062545085742db2e8ff11f2114ac4908462a71c4f3164e3b
MD5 cb94509e46f31d3c07c6dc0df8c46bb1
BLAKE2b-256 8b0bf0051b5908d3a77021afa4bd1165d3c1292d985faf5971557b907272464b

See more details on using hashes here.

Provenance

The following attestation bundles were made for veritool-1.0.0.tar.gz:

Publisher: publish.yml on Pranav-d33/veritool

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file veritool-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: veritool-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 33.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for veritool-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cb3c7251c8c175b50485e004eecfae33be30174c297a8e98d18751c9c744094b
MD5 9940296872e6770eac393a8c9c7089b2
BLAKE2b-256 45d3de787b2f835ef312cb5838b3445bbb1e77dd757afe9e13e4dcd2c98c4a5a

See more details on using hashes here.

Provenance

The following attestation bundles were made for veritool-1.0.0-py3-none-any.whl:

Publisher: publish.yml on Pranav-d33/veritool

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page