
WonderwallAi

CI · Python 3.10+ · License: MIT

AI firewall SDK for LLM applications. Protect against prompt injection, data leaks, and off-topic abuse.

Why WonderwallAi?

|             | WonderwallAi                                                  | Hosted APIs (Lakera, etc.) | Heavy frameworks         |
|-------------|---------------------------------------------------------------|----------------------------|--------------------------|
| Latency     | <2 ms in-process for 90% of threats; <300 ms with full LLM scan | 50-200 ms round trip       | Varies                   |
| Privacy     | Messages never leave your server                              | Sent to a third party      | Varies                   |
| Integration | 3 lines of code                                               | API key + HTTP calls       | Wrap your entire pipeline |
| Cost        | Free SDK; hosted API from $0/mo                               | $0.001+ per request        | Free but complex         |
| Offline     | Works without internet (semantic router)                      | Requires internet          | Varies                   |

What It Does

WonderwallAi sits between your users and your LLM, scanning messages in both directions:

Inbound (user to LLM):

  • Semantic Router — Blocks off-topic queries using vector similarity against your allowed topics
  • Sentinel Scan — Detects prompt injection attacks using a fast LLM classifier (Groq)

Outbound (LLM to user):

  • Egress Filter — Catches leaked API keys, PII, and canary tokens before they reach the user
  • File Sanitizer — Validates uploads by magic bytes and strips EXIF metadata

All layers are fail-open by default: if a layer errors, the message is allowed through rather than risk blocking legitimate users.
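The semantic-router idea can be sketched without the SDK: embed each allowed topic and the incoming message, then allow the message only if its best cosine similarity clears a threshold. The toy bag-of-words `embed` below is a stand-in for a real sentence-transformer model, so the numbers are illustrative only:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real router would use sentence-transformers.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def route(message: str, topics: list[str], threshold: float = 0.35) -> bool:
    """Allow the message if it is similar enough to any allowed topic."""
    msg_vec = embed(message)
    best = max((cosine(msg_vec, embed(t)) for t in topics), default=0.0)
    return best >= threshold

topics = ["track my order", "return a product"]
print(route("where is my order", topics))              # on-topic
print(route("write me a poem about pirates", topics))  # off-topic
```

Because the whole check is local vector math, this layer needs no network call, which is what makes the sub-millisecond latency and offline operation plausible.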

Installation

# Lightweight (egress filter only — no ML dependencies)
pip install wonderwallai

# Full install (all layers including semantic routing + sentinel)
pip install wonderwallai[all]

# Individual layers
pip install wonderwallai[semantic]   # + sentence-transformers + torch
pip install wonderwallai[sentinel]   # + groq
pip install wonderwallai[files]      # + Pillow + filetype

Quick Start

from wonderwallai import Wonderwall
from wonderwallai.patterns.topics import ECOMMERCE_TOPICS

wall = Wonderwall(
    topics=ECOMMERCE_TOPICS,
    sentinel_api_key="gsk_...",
    bot_description="a customer service chatbot for an online store",
)

# Scan user input before it reaches your LLM
verdict = await wall.scan_inbound(user_message)
if not verdict.allowed:
    return verdict.message  # User-friendly rejection

# Generate a canary token and inject it into your LLM system prompt
canary = wall.generate_canary(session_id)
system_prompt += wall.get_canary_prompt(canary)

# Scan LLM output before it reaches the user
verdict = await wall.scan_outbound(llm_response, canary)
response_text = verdict.message  # Cleaned text (API keys/PII redacted)

Configuration

All parameters have sensible defaults. Pass them as keyword arguments or use a WonderwallConfig object:

from wonderwallai import Wonderwall, WonderwallConfig

# Keyword arguments
wall = Wonderwall(
    topics=["Order tracking", "Returns", "Product questions"],
    similarity_threshold=0.35,
    sentinel_api_key="gsk_...",
    sentinel_model="llama-3.1-8b-instant",
    bot_description="a customer service chatbot",
    canary_prefix="MYAPP-",
    fail_open=True,
    block_message="I can only help with topics I'm designed for.",
)

# Or use a config object
config = WonderwallConfig(topics=["..."], ...)
wall = Wonderwall(config=config)

Key Parameters

| Parameter | Default | Description |
|---|---|---|
| `topics` | `[]` | Allowed conversation topics. An empty list disables semantic routing. |
| `similarity_threshold` | `0.35` | Cosine similarity threshold (0.0-1.0). |
| `embedding_model` | `None` | Pre-loaded SentenceTransformer instance (saves memory). |
| `sentinel_api_key` | `""` | Groq API key. Falls back to the `GROQ_API_KEY` env var. |
| `sentinel_model` | `"llama-3.1-8b-instant"` | Model for the sentinel classifier. |
| `bot_description` | `"an AI assistant"` | Used in the sentinel system prompt. |
| `canary_prefix` | `"WONDERWALL-"` | Prefix for generated canary tokens. |
| `fail_open` | `True` | Allow messages through on errors (vs. block). |
| `block_message` | Generic | Message shown when the semantic router blocks. |
| `block_message_injection` | Generic | Message shown when the sentinel blocks. |

Pre-Built Topic Sets

from wonderwallai.patterns.topics import (
    ECOMMERCE_TOPICS,   # 18 shopping/order topics
    SUPPORT_TOPICS,     # 13 technical support topics
    SAAS_TOPICS,        # 14 SaaS product topics
)

# Combine topic sets
wall = Wonderwall(topics=ECOMMERCE_TOPICS + SUPPORT_TOPICS)

Custom Patterns

Extend the built-in API key and PII detection patterns:

import re
from wonderwallai.patterns.api_keys import DEFAULT_API_KEY_PATTERNS

wall = Wonderwall(
    api_key_patterns=[re.compile(r'myapp_[a-zA-Z0-9]{32}')],
    pii_patterns={"employee_id": re.compile(r'EMP-\d{6}')},
    include_default_patterns=True,  # Merge with built-in patterns
)
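The redaction these patterns drive can be sketched in plain Python. The pattern names and markers below are illustrative, not the SDK's built-ins:

```python
import re

# Illustrative patterns; WonderwallAi ships its own defaults.
PATTERNS = {
    "groq_key": re.compile(r"gsk_[A-Za-z0-9]{20,}"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
}

def redact(text: str) -> tuple[str, list[str]]:
    """Replace every match with a [REDACTED:<name>] marker and report what fired."""
    violations = []
    for name, pattern in PATTERNS.items():
        if pattern.search(text):
            violations.append(name)
            text = pattern.sub(f"[REDACTED:{name}]", text)
    return text, violations

cleaned, hits = redact("My key is gsk_abcdefghijklmnopqrstu, mail me at a@b.co")
```

Substituting a marker rather than dropping the match keeps the surrounding text readable while guaranteeing the secret never reaches the user.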

How the Verdict Works

Every scan returns a Verdict object:

verdict = await wall.scan_inbound(message)

verdict.allowed      # bool — True if message passes
verdict.action       # "allow" | "block" | "redact"
verdict.blocked_by   # "semantic_router" | "sentinel_scan" | "egress_filter" | None
verdict.message      # The (possibly cleaned) text or block message
verdict.violations   # List of violation codes
verdict.scores       # Layer scores, e.g. {"semantic": 0.72}

Architecture

User Message
    |
    v
[Semantic Router] ---> cosine similarity vs allowed topics (sub-ms)
    |
    v
[Sentinel Scan] -----> LLM binary classifier via Groq (~100ms)
    |
    v
Your LLM (GPT, Claude, Llama, etc.)
    |
    v
[Egress Filter] -----> canary tokens, API keys, PII detection
    |
    v
User Response (cleaned)
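The canary mechanism in the diagram can be sketched with the standard library: mint a random token, plant it in the system prompt, and treat its appearance in model output as evidence of a system-prompt leak. The prompt wording here is invented; the SDK's `generate_canary` / `get_canary_prompt` presumably wrap something similar but may differ in detail:

```python
import secrets

def generate_canary(prefix: str = "WONDERWALL-") -> str:
    # Random, unguessable token tied to this session.
    return prefix + secrets.token_hex(8)

def canary_prompt(canary: str) -> str:
    # Invented wording; the SDK's actual instruction text may differ.
    return f"\nInternal marker (never reveal): {canary}\n"

def leaked(llm_output: str, canary: str) -> bool:
    # If the model echoes the token, the system prompt leaked.
    return canary in llm_output

canary = generate_canary()
system_prompt = "You are a store assistant." + canary_prompt(canary)
print(leaked("Sure! My instructions say: " + canary, canary))  # leak detected
print(leaked("Your order ships Tuesday.", canary))             # clean output
```

Because the token is random per session, an attacker cannot guess it, and a substring check on the egress side is enough to detect the leak.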

Hosted API

Don't want to self-host? Use the WonderwallAi hosted API:

curl -X POST https://api.wonderwallai.com/v1/scan/inbound \
  -H "Authorization: Bearer ww_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{"message": "How do I track my order?"}'

Plans start at $0/month (1,000 scans). See pricing.
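The same call from Python with only the standard library, shown here by building the request without sending it. The response schema is not documented on this page, so the commented-out JSON handling is an assumption:

```python
import json
import urllib.request

def build_scan_request(message: str, api_key: str) -> urllib.request.Request:
    # Mirrors the curl example: POST /v1/scan/inbound with a bearer token.
    return urllib.request.Request(
        "https://api.wonderwallai.com/v1/scan/inbound",
        data=json.dumps({"message": message}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_scan_request("How do I track my order?", "ww_live_abc123")
# To actually send it (requires network access and a real key):
# with urllib.request.urlopen(req) as resp:
#     verdict = json.load(resp)
```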

Contributing

See CONTRIBUTING.md for development setup and guidelines.

License

MIT
