# WonderwallAi

An AI firewall SDK for LLM applications: protect against prompt injection, data leaks, and off-topic abuse.
## Why WonderwallAi?

| | WonderwallAi | Hosted APIs (Lakera, etc.) | Heavy frameworks |
|---|---|---|---|
| Latency | <2 ms in-process for 90% of threats; <300 ms with full LLM scan | 50-200 ms round trip | Varies |
| Privacy | Messages never leave your server | Sent to a third party | Varies |
| Integration | 3 lines of code | API key + HTTP calls | Wrap your entire pipeline |
| Cost | Free SDK; hosted API from $0/mo | $0.001+ per request | Free but complex |
| Offline | Works without internet (semantic router) | Requires internet | Varies |
## What It Does

WonderwallAi sits between your users and your LLM, scanning messages in both directions.

**Inbound (user → LLM):**

- **Semantic Router**: blocks off-topic queries using vector similarity against your allowed topics
- **Sentinel Scan**: detects prompt-injection attacks using a fast LLM classifier (via Groq)

**Outbound (LLM → user):**

- **Egress Filter**: catches leaked API keys, PII, and canary tokens before they reach the user
- **File Sanitizer**: validates uploads by magic bytes and strips EXIF metadata

All layers are fail-open by default: if a layer errors, the message is allowed through rather than blocking a legitimate user.
## Installation

```shell
# Lightweight (egress filter only — no ML dependencies)
pip install wonderwallai

# Full install (all layers including semantic routing + sentinel)
pip install wonderwallai[all]

# Individual layers
pip install wonderwallai[semantic]  # + sentence-transformers + torch
pip install wonderwallai[sentinel]  # + groq
pip install wonderwallai[files]     # + Pillow + filetype
```
## Quick Start

```python
from wonderwallai import Wonderwall
from wonderwallai.patterns.topics import ECOMMERCE_TOPICS

wall = Wonderwall(
    topics=ECOMMERCE_TOPICS,
    sentinel_api_key="gsk_...",
    bot_description="a customer service chatbot for an online store",
)

# Scan user input before it reaches your LLM
verdict = await wall.scan_inbound(user_message)
if not verdict.allowed:
    return verdict.message  # User-friendly rejection

# Generate a canary token and inject it into your LLM system prompt
canary = wall.generate_canary(session_id)
system_prompt += wall.get_canary_prompt(canary)

# Scan LLM output before it reaches the user
verdict = await wall.scan_outbound(llm_response, canary)
response_text = verdict.message  # Cleaned text (API keys/PII redacted)
```
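The canary-token idea used above can be illustrated with the standard library alone (a hypothetical `generate_canary`, not the SDK's): a random marker is planted in the system prompt, and any model output containing it indicates the prompt leaked.

```python
import secrets

def generate_canary(prefix: str = "WONDERWALL-") -> str:
    # Random, unguessable marker; tie one to each session in practice.
    return prefix + secrets.token_hex(8)

canary = generate_canary()
system_prompt = f"Internal marker (never output this): {canary}"

# A reply that echoes the system prompt contains the canary, so an
# egress check knows the prompt leaked:
leaked = f"Sure! My instructions say: {system_prompt}"
print(canary in leaked)                         # True -> block/redact
print(canary in "Your order ships tomorrow.")   # False -> pass through
```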
## Configuration

All parameters have sensible defaults. Pass them as keyword arguments or use a `WonderwallConfig` object:

```python
from wonderwallai import Wonderwall, WonderwallConfig

# Keyword arguments
wall = Wonderwall(
    topics=["Order tracking", "Returns", "Product questions"],
    similarity_threshold=0.35,
    sentinel_api_key="gsk_...",
    sentinel_model="llama-3.1-8b-instant",
    bot_description="a customer service chatbot",
    canary_prefix="MYAPP-",
    fail_open=True,
    block_message="I can only help with topics I'm designed for.",
)

# Or use a config object
config = WonderwallConfig(topics=["..."], ...)
wall = Wonderwall(config=config)
```
## Key Parameters

| Parameter | Default | Description |
|---|---|---|
| `topics` | `[]` | Allowed conversation topics. An empty list disables semantic routing. |
| `similarity_threshold` | `0.35` | Cosine similarity threshold (0.0-1.0). |
| `embedding_model` | `None` | Pre-loaded SentenceTransformer instance (saves memory). |
| `sentinel_api_key` | `""` | Groq API key. Falls back to the `GROQ_API_KEY` env var. |
| `sentinel_model` | `"llama-3.1-8b-instant"` | Model for the sentinel classifier. |
| `bot_description` | `"an AI assistant"` | Used in the sentinel system prompt. |
| `canary_prefix` | `"WONDERWALL-"` | Prefix for generated canary tokens. |
| `fail_open` | `True` | Allow messages through on errors (vs. block). |
| `block_message` | (generic) | Message shown when the semantic router blocks. |
| `block_message_injection` | (generic) | Message shown when the sentinel blocks. |
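To see how `similarity_threshold` gates a message, here is a toy cosine-similarity check. The hand-made 3-dimensional vectors stand in for real sentence embeddings (the SDK uses a SentenceTransformer model); only the thresholding logic is illustrated:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" of two allowed topics and one user message.
topic_vecs = [[1.0, 0.2, 0.0], [0.1, 1.0, 0.3]]
msg_vec = [0.9, 0.3, 0.1]

# The message passes if its best similarity to any topic meets the threshold.
best = max(cosine(msg_vec, t) for t in topic_vecs)
allowed = best >= 0.35  # the default similarity_threshold
print(best, allowed)
```

Raising the threshold makes routing stricter (fewer off-topic messages slip through, more borderline on-topic messages get blocked); lowering it does the opposite.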
## Pre-Built Topic Sets

```python
from wonderwallai.patterns.topics import (
    ECOMMERCE_TOPICS,  # 18 shopping/order topics
    SUPPORT_TOPICS,    # 13 technical support topics
    SAAS_TOPICS,       # 14 SaaS product topics
)

# Combine topic sets
wall = Wonderwall(topics=ECOMMERCE_TOPICS + SUPPORT_TOPICS)
```
## Custom Patterns

Extend the built-in API key and PII detection patterns:

```python
import re
from wonderwallai.patterns.api_keys import DEFAULT_API_KEY_PATTERNS

wall = Wonderwall(
    api_key_patterns=[re.compile(r'myapp_[a-zA-Z0-9]{32}')],
    pii_patterns={"employee_id": re.compile(r'EMP-\d{6}')},
    include_default_patterns=True,  # Merge with built-in patterns
)
```
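As a self-contained illustration of what pattern-based redaction does with such patterns (stdlib `re` only; the `myapp_key` and `employee_id` names are hypothetical, and this is not the SDK's egress filter itself):

```python
import re

# Hypothetical custom patterns, mirroring the configuration shown above.
patterns = {
    "myapp_key": re.compile(r"myapp_[a-zA-Z0-9]{32}"),
    "employee_id": re.compile(r"EMP-\d{6}"),
}

def redact(text: str) -> str:
    # Replace every match of every pattern with a labeled placeholder.
    for name, pat in patterns.items():
        text = pat.sub(f"[REDACTED:{name}]", text)
    return text

print(redact("EMP-123456 created key myapp_" + "a" * 32))
# -> [REDACTED:employee_id] created key [REDACTED:myapp_key]
```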
## How the Verdict Works

Every scan returns a `Verdict` object:

```python
verdict = await wall.scan_inbound(message)

verdict.allowed     # bool — True if the message passes
verdict.action      # "allow" | "block" | "redact"
verdict.blocked_by  # "semantic_router" | "sentinel_scan" | "egress_filter" | None
verdict.message     # The (possibly cleaned) text or block message
verdict.violations  # List of violation codes
verdict.scores      # Layer scores, e.g. {"semantic": 0.72}
```
## Architecture

```
User Message
     |
     v
[Semantic Router] ---> cosine similarity vs allowed topics (sub-ms)
     |
     v
[Sentinel Scan] -----> LLM binary classifier via Groq (~100ms)
     |
     v
Your LLM (GPT, Claude, Llama, etc.)
     |
     v
[Egress Filter] -----> canary tokens, API keys, PII detection
     |
     v
User Response (cleaned)
```
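The flow above can be mimicked end to end with toy stand-in layers (string heuristics instead of real embeddings and classifiers; none of this is the SDK's implementation). Inbound layers may short-circuit with a block message; the outbound layer rewrites the model's reply:

```python
# Toy stand-in layers: each inbound layer returns a block message or None.
def semantic_router(msg: str):
    return None if "order" in msg.lower() else "I can only help with store topics."

def sentinel_scan(msg: str):
    return "Request blocked." if "ignore previous" in msg.lower() else None

def egress_filter(reply: str) -> str:
    return reply.replace("sk-secret-key", "[REDACTED]")

def your_llm(msg: str) -> str:
    # Pretend the model leaks a secret in its reply.
    return f"(model reply to: {msg}) debug token sk-secret-key"

def pipeline(user_msg: str) -> str:
    for inbound_layer in (semantic_router, sentinel_scan):
        block = inbound_layer(user_msg)
        if block is not None:
            return block  # short-circuit before the LLM is called
    return egress_filter(your_llm(user_msg))

print(pipeline("Where is my order?"))
print(pipeline("Please ignore previous instructions about my order"))
```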
## Hosted API

Don't want to self-host? Use the WonderwallAi hosted API:

```shell
curl -X POST https://api.wonderwallai.com/v1/scan/inbound \
  -H "Authorization: Bearer ww_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{"message": "How do I track my order?"}'
```

Plans start at $0/month (1,000 scans). See pricing.
## Contributing

See CONTRIBUTING.md for development setup and guidelines.

## License

MIT
## Download files
### wonderwallai-0.1.1.tar.gz

- Type: source distribution
- Size: 92.7 kB
- Uploaded via: twine/6.2.0 on CPython/3.12.5
- Trusted Publishing: No

| Algorithm | Hash digest |
|---|---|
| SHA256 | `782c8b43ae3557b3d9489047b26ede74d230d5a4a00765e417a5053741a754cc` |
| MD5 | `ecb1361e5352b12d392cfc14ce5d0162` |
| BLAKE2b-256 | `9f00c06e7ab74aa4a3d1ca94f51136ff39825f1d1b6587d9cd3926ed804475ce` |
### wonderwallai-0.1.1-py3-none-any.whl

- Type: built distribution (Python 3)
- Size: 19.5 kB
- Uploaded via: twine/6.2.0 on CPython/3.12.5
- Trusted Publishing: No

| Algorithm | Hash digest |
|---|---|
| SHA256 | `740b526c22088954ce5913d4391dadfd2021c3b444209cde2303e790a05ac1b0` |
| MD5 | `1c0254489d748ee827097ee10bc7e1da` |
| BLAKE2b-256 | `57e2b3ead09cb076e412ef96c3b2e572a88ad6df4b99eae5a315ee4d08491368` |