
๐Ÿ›ก๏ธ ToolGuard

Reliability testing for AI agent tool chains.

Catch cascading failures before production. Make agent tool calling as dependable as unit-tested software.



🧠 What ToolGuard Actually Solves

Right now, many developers hold back from deploying AI agents because agents are fundamentally unstable: they crash.

There are two layers to AI:

  1. Layer 1: Intelligence (evals, reasoning, accurate answers)
  2. Layer 2: Execution (tool calls, chaining, JSON payloads, APIs)

ToolGuard does not test Layer 1. We do not care if your AI is "smart" or makes good decisions. That is what eval frameworks are for.

ToolGuard systematically stress-tests Layer 2. We solve the problem of agents crashing at 3 AM because the LLM hallucinated a JSON key, passed a string instead of an int, or an external API timed out.

"We don't make AI smarter. We make AI systems not break."

The Solution

Test your agent's tools against edge cases before you deploy them. ToolGuard acts like unit tests for AI execution.

from toolguard import create_tool, test_chain, score_chain

@create_tool(schema="auto")
def parse_csv(raw_csv: str) -> dict:
    lines = raw_csv.strip().split("\n")
    headers = lines[0].split(",")
    records = [dict(zip(headers, line.split(","))) for line in lines[1:]]
    return {"headers": headers, "records": records, "row_count": len(records)}

@create_tool(schema="auto")
def compute_statistics(headers: list, records: list, row_count: int) -> dict:
    # Real computation: mean, median, std dev
    ...

@create_tool(schema="auto")
def generate_report(total_rows: int, stats: dict) -> dict:
    # Real report generation
    ...

# One line. Full visibility.
report = test_chain(
    [parse_csv, compute_statistics, generate_report],
    base_input={"raw_csv": "name,age,salary\nAlice,30,75000\nBob,35,92000"},
    test_cases=["happy_path", "null_handling", "malformed_data"],
)

score = score_chain(report)
print(score.summary())

Real Output (not mocked):

╔═══════════════════════════════════════════════════════════════════╗
║  Reliability Score: parse_csv → compute_statistics → generate_report
╠═══════════════════════════════════════════════════════════════════╣
║  Score:       50.0%                                               ║
║  Risk Level: 🟠 HIGH                                              ║
║  Deploy:     🚫 BLOCK                                             ║
║  Confidence:  45.1%                                               ║
╠═══════════════════════════════════════════════════════════════════╣
║  ⚠️  Top Risk: Schema validation failures                         ║
╠═══════════════════════════════════════════════════════════════════╣
║  Failure Distribution:                                            ║
║    schema_violation   █████████████░░░░░░░   4 (67%)              ║
║    type_mismatch      ██████░░░░░░░░░░░░░░   2 (33%)              ║
╠═══════════════════════════════════════════════════════════════════╣
║  ⚠️  Bottleneck Tools:                                            ║
║    → parse_csv       (50% success)                                ║
╚═══════════════════════════════════════════════════════════════════╝

💡 Suggestion:
Agent hallucinated payload. Schema mismatch:
  - Field 'age': Input should be a valid integer (Got: 'thirty' | Type: str)
  - Field 'salary': Field required (Got: <unknown> | Type: None)

---

## Quick Start

```bash
pip install toolguard
```

```python
from toolguard import create_tool, test_chain

@create_tool(schema="auto")
def my_tool(query: str) -> dict:
    return {"result": query.upper()}

report = test_chain(
    [my_tool],
    base_input={"query": "hello"},
    test_cases=["happy_path", "null_handling", "malformed_data"],
    assert_reliability=0.80,
)
```

Or scaffold a full project:

```bash
toolguard init --name my_agent
```

Time to value: < 3 minutes.


Features

๐Ÿ” Schema Validation

Automatic Pydantic input/output validation from type hints. No manual schemas needed.

@create_tool(schema="auto")
def fetch_price(ticker: str) -> dict:
    ...
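The core idea behind auto schemas can be sketched with only the standard library: read the function's annotations and turn them into a validation spec. This is an illustration of the mechanism, not ToolGuard's actual implementation (which generates Pydantic models):

```python
from typing import get_type_hints

def sketch_auto_schema(func):
    """Illustrative only: map a tool's annotations to an input/output spec."""
    hints = get_type_hints(func)
    output_type = hints.pop("return", None)  # remaining hints are the inputs
    return {"input": hints, "output": output_type}

# Redefining the example tool so this sketch is self-contained.
def fetch_price(ticker: str) -> dict:
    ...

schema = sketch_auto_schema(fetch_price)
# schema["input"] maps each argument name to its annotated type
```

From a mapping like this, a library can build real validators that reject a string where an int was annotated, which is the failure mode the report above flags.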

🔗 Chain Testing

Test multi-tool chains against 8 edge-case categories: null handling, type mismatches, missing fields, malformed data, large payloads, and more.

report = test_chain(
    [fetch_price, calculate_position, generate_alert],
    base_input={"ticker": "AAPL"},
    test_cases=["happy_path", "null_handling", "type_mismatch"],
)

⚡ Async Support

Works with both def and async def tools transparently. No special flags needed.

@create_tool(schema="auto")
async def fetch_from_api(url: str) -> dict:
    async with httpx.AsyncClient() as client:
        resp = await client.get(url)
        return resp.json()

# Same API: ToolGuard handles the async automatically
report = test_chain([fetch_from_api, process_data], assert_reliability=0.95)
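The general pattern behind transparent sync/async support can be sketched in plain asyncio (a conceptual illustration, not ToolGuard's internals): detect coroutine functions and route every tool through one awaitable code path.

```python
import asyncio
import inspect

async def call_tool(tool, **kwargs):
    # Route sync and async tools through one awaitable path.
    if inspect.iscoroutinefunction(tool):
        return await tool(**kwargs)
    # Run blocking tools in a worker thread so they don't stall the event loop.
    return await asyncio.to_thread(tool, **kwargs)

def shout(text: str) -> str:          # plain def
    return text.upper()

async def echo(text: str) -> str:     # async def
    return text

async def main():
    return await asyncio.gather(call_tool(shout, text="hi"),
                                call_tool(echo, text="ok"))

results = asyncio.run(main())  # ['HI', 'ok']
```

With a dispatcher like this, the calling code never needs a special flag to distinguish the two kinds of tool.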

📊 Reliability Scoring

Quantified trust with risk levels and deployment gates.

import sys

score = score_chain(report)
if score.deploy_recommendation.value == "BLOCK":
    sys.exit(1)  # CI/CD gate

🔄 Retry & Circuit Breaker

Production-grade resilience patterns, built in.

from toolguard import with_retry, RetryPolicy, CircuitBreaker, with_circuit_breaker

@with_retry(RetryPolicy(max_retries=3, backoff_base=0.5))
def call_api(data: dict) -> dict: ...

breaker = CircuitBreaker(failure_threshold=5, reset_timeout=60)

@with_circuit_breaker(breaker)
def call_flaky_service(data: dict) -> dict: ...
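For context, the circuit-breaker pattern itself fits in a few lines. The sketch below is a conceptual illustration under simplified assumptions, not ToolGuard's implementation: after a streak of consecutive failures the breaker "opens" and fails fast until a cooldown elapses.

```python
import time

class SketchBreaker:
    """Illustrative circuit breaker: open after N consecutive failures."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 60.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success resets the failure streak
        return result
```

The point of failing fast is that a flaky downstream service stops eating retries and timeouts from every call in the chain while it recovers.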

🖥️ CLI

toolguard test --chain my_chain.yaml           # Run chain tests
toolguard test --chain my_chain.yaml --html report.html  # HTML report
toolguard check --tools my_tools.py            # Check compatibility
toolguard observe --tools my_tools.py          # View tool stats
toolguard init --name my_project               # Scaffold project

🔌 Native Framework Integrations

If you are already using LangChain or CrewAI, you do not need to rewrite your tools to use ToolGuard.

ToolGuard provides native adapters that instantly convert your existing framework tools into GuardedTools so you can stress-test them immediately.

# 🦜🔗 LangChain
from toolguard.integrations.langchain import guard_langchain_tool
from my_app import my_langchain_tool

guarded_tool = guard_langchain_tool(my_langchain_tool)
report = test_chain([guarded_tool], ...)

# ⚙️ CrewAI
from toolguard.integrations.crewai import guard_crewai_tool
from my_app import my_crew_tool

guarded_tool = guard_crewai_tool(my_crew_tool)
report = test_chain([guarded_tool], ...)

# 🤖 OpenAI Function Calling
from toolguard.integrations.openai_func import to_openai_function
from my_app import my_python_tool

# Instantly export any ToolGuard tool to the strict OpenAI JSON schema format
openai_schema = to_openai_function(my_python_tool)
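For reference, the target format can be hand-rolled from type hints. This is a rough sketch of the kind of JSON structure an exporter like `to_openai_function` has to produce (the helper name and the type mapping here are illustrative, not ToolGuard code):

```python
from typing import get_type_hints

# Illustrative subset of the Python-type -> JSON Schema type mapping.
JSON_TYPES = {str: "string", int: "integer", float: "number",
              bool: "boolean", list: "array", dict: "object"}

def sketch_openai_function(func) -> dict:
    """Build an OpenAI function-calling style schema from annotations."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    properties = {name: {"type": JSON_TYPES.get(tp, "object")}
                  for name, tp in hints.items()}
    return {
        "name": func.__name__,
        "description": (func.__doc__ or "").strip(),
        "parameters": {"type": "object",
                       "properties": properties,
                       "required": list(properties)},
    }

def fetch_price(ticker: str) -> dict:
    """Fetch the latest price for a ticker."""
    ...

spec = sketch_openai_function(fetch_price)
```

The `name` / `description` / `parameters` shape is what OpenAI's function-calling API expects; an automated exporter saves you from keeping this JSON in sync with your signatures by hand.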

📡 Observability

OpenTelemetry tracing out of the box: works with Jaeger, Zipkin, Datadog, and more.

from toolguard.core.tracer import init_tracing, trace_tool

init_tracing(service_name="my-agent")

@trace_tool
def my_tool(data: dict) -> dict: ...

Architecture

toolguard/
├── core/
│   ├── validator.py      # @create_tool decorator + GuardedTool (sync + async)
│   ├── chain.py          # Chain testing engine (8 test types, async-aware)
│   ├── schema.py         # Auto Pydantic model generation
│   ├── scoring.py        # Reliability scoring + deploy gates
│   ├── report.py         # Failure analysis + suggestions
│   ├── errors.py         # Exception hierarchy + correlation IDs
│   ├── retry.py          # RetryPolicy + CircuitBreaker
│   ├── tracer.py         # OpenTelemetry integration
│   └── compatibility.py  # Schema conflict detection
├── cli/
│   └── commands/         # init, test, check, observe
├── reporters/
│   ├── console.py        # Rich terminal output
│   └── html.py           # Standalone HTML reports
├── integrations/
│   ├── langchain.py      # LangChain adapter
│   ├── crewai.py         # CrewAI adapter
│   └── openai_func.py    # OpenAI function calling
├── tests/                # 43 tests (sync + async + storage)
└── examples/
    ├── weather_chain/              # Working 3-tool example
    ├── demo_failing_chain/         # Intentionally buggy (aha moment)
    └── real_world_validation/      # Real CSV pipeline validation

Why ToolGuard?

| | Without ToolGuard | With ToolGuard |
|---|---|---|
| Failure detection | Stack trace at 3 AM | Caught before deploy |
| Root cause | "TypeError in line 47" | "Tool A returned null for 'price'" |
| Fix guidance | None | "Add default value OR validate response" |
| Confidence | "It works on my machine" | "92% reliability, LOW risk" |
| CI/CD | Manual testing | `toolguard test` in your pipeline |

Tech Stack

| Component | Technology | Why |
|---|---|---|
| Core Language | Python 3.11 - 3.13 | Agent ecosystem standard |
| Schema Validation | Pydantic v2 | 3.5× faster than JSON Schema |
| Async | Native asyncio | Enterprise-grade concurrency |
| Testing | pytest (43 tests) | CI/CD native |
| Observability | OpenTelemetry | Vendor-neutral |
| CLI | Click + Rich | Beautiful terminal UX |
| Distribution | PyPI | `pip install toolguard` |

License

MIT: use it, fork it, ship it.


Built to make AI agents actually work in production.
