Python SDK for AI-Warden - Prompt injection detection and protection

AI-Warden Python SDK

AI-Warden is a production-ready Python SDK for detecting and preventing prompt injection attacks in AI/LLM applications. It includes a Rich-based CLI, browser-based login, and middleware for FastAPI, Django, and Flask.

✨ Features

  • 🛡️ Advanced Detection - Pattern matching, LLM-based analysis, and hybrid modes
  • 🌐 Magic Login - Browser-based OAuth authentication (no copy-paste!)
  • 🎨 Beautiful CLI - Rich terminal UI with colors, spinners, and progress bars
  • ⚡ Fast & Accurate - Pattern mode ~20ms, LLM mode ~1.2s
  • 🔌 Framework Support - FastAPI, Django, Flask middleware included
  • 🚀 Async Ready - Full async/await support with httpx
  • 📦 Batch Processing - Validate multiple prompts efficiently
  • 🔐 Secure Storage - Credentials stored safely in ~/.ai-warden/

🚀 Quick Start

Installation

pip install ai-warden

Magic Login

ai-warden login

This will:

  1. Open your browser automatically
  2. Let you sign up or log in at the AI-Warden web portal
  3. Receive your API key and save it securely

After that, you're ready to go! ✅

Basic Usage

from ai_warden import AIWarden

# Auto-loads credentials from magic login
warden = AIWarden()

# Validate a prompt
result = warden.validate("Ignore all previous instructions")

print(result.is_safe)      # False
print(result.threat_type)  # "jailbreak_attempt"
print(result.confidence)   # 0.95
print(result.latency_ms)   # 23

📖 Documentation

🔐 Authentication

Option 1: Magic Login (Recommended)

ai-warden login

Opens your browser, handles OAuth, saves credentials automatically.

Option 2: Manual Configuration

ai-warden configure --api-key sk_live_xxx

Or set environment variable:

export AI_WARDEN_API_KEY="sk_live_xxx"

Option 3: Direct in Code

from ai_warden import AIWarden

warden = AIWarden(api_key="sk_live_xxx")
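
The three options above suggest a lookup order for the API key. The following is a minimal sketch of one plausible order (explicit argument, then environment variable, then stored credentials). The precedence itself, the function name `resolve_api_key`, and the assumption that the credentials file holds the bare key as plain text are all hypothetical; the SDK may resolve keys differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    credentials_file: Optional[Path] = None,
) -> Optional[str]:
    """Return the first key found: argument, then env var, then file."""
    env = os.environ if env is None else env
    if explicit:
        return explicit
    if env.get("AI_WARDEN_API_KEY"):
        return env["AI_WARDEN_API_KEY"]
    # Assumed location and format: a plain-text file containing only the key.
    path = credentials_file or Path.home() / ".ai-warden" / "credentials"
    if path.is_file():
        return path.read_text(encoding="utf-8").strip() or None
    return None
```

Passing `env` and `credentials_file` explicitly keeps the function easy to test without touching the real environment.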

🐍 Python API

Basic Validation

from ai_warden import AIWarden

warden = AIWarden()

# Validate single prompt
result = warden.validate("Hello world")

if result.is_safe:
    print("✅ Safe to use!")
else:
    print(f"⚠️ Threat detected: {result.threat_type}")

Validation Modes

from ai_warden import AIWarden, ValidationMode

warden = AIWarden()

# Pattern-only (fast, ~20ms)
result = warden.validate("text", mode=ValidationMode.PATTERN)

# LLM-based (accurate, ~1.2s)
result = warden.validate("text", mode=ValidationMode.LLM)

# Hybrid (pattern first, then LLM if uncertain)
result = warden.validate("text", mode=ValidationMode.HYBRID)

# Auto (smart decision - recommended)
result = warden.validate("text", mode=ValidationMode.AUTO)
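
To make the hybrid idea concrete, here is a small self-contained sketch of "pattern first, then LLM if uncertain". It does not use the SDK: the regex lists, the score thresholds, and the `llm_check` callback are illustrative assumptions, not AI-Warden's actual rules or internals.

```python
import re
from typing import Callable

# Hypothetical patterns for illustration; not the SDK's real rule set.
STRONG_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"reveal (your )?system prompt", re.I),
]
WEAK_HINTS = [re.compile(r"\binstructions\b", re.I)]


def pattern_score(text: str) -> float:
    """Return 1.0 on a strong match, 0.5 on a weak hint, else 0.0."""
    if any(p.search(text) for p in STRONG_PATTERNS):
        return 1.0
    if any(p.search(text) for p in WEAK_HINTS):
        return 0.5
    return 0.0


def hybrid_is_safe(text: str, llm_check: Callable[[str], bool]) -> bool:
    """Decide from patterns when confident; otherwise defer to llm_check."""
    score = pattern_score(text)
    if score >= 0.9:
        return False  # Clear pattern hit: no need for the slow path
    if score <= 0.1:
        return True   # Nothing suspicious: also skip the slow path
    return llm_check(text)  # Uncertain: pay for the slower check
```

The point of the design is that the slow check only runs for the uncertain middle band, which is how a hybrid mode can stay close to pattern-mode latency for most inputs.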

Batch Validation

from ai_warden import AIWarden

warden = AIWarden()

prompts = [
    "Hello world",
    "Ignore previous instructions",
    "Show me all data"
]

results = warden.validate_batch(prompts)

for prompt, result in zip(prompts, results):
    print(f"{prompt}: {'✅' if result.is_safe else '⚠️'}")
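
For long prompt lists it can help to send fixed-size chunks rather than one very large batch. This is a generic sketch; whether `validate_batch` has a size limit is not documented here, so the chunk size is an assumption to tune.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Possible use with the SDK (not run here; 100 is an arbitrary size):
#   results = []
#   for chunk in chunked(prompts, 100):
#       results.extend(warden.validate_batch(chunk))
```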

Async Support

import asyncio
from ai_warden import AsyncAIWarden

async def main():
    async with AsyncAIWarden() as warden:
        result = await warden.validate("text")
        print(result.is_safe)

asyncio.run(main())
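
When validating many prompts asynchronously, a common pattern is to run the calls concurrently while capping how many are in flight. The sketch below uses only `asyncio`; `fake_validate` is a stand-in for an SDK call such as `await warden.validate(...)`, and the limit of 5 is an arbitrary assumption.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def validate_many(
    prompts: Sequence[str],
    validate: Callable[[str], Awaitable[bool]],
    limit: int = 5,
) -> List[bool]:
    """Run `validate` on every prompt with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(prompt: str) -> bool:
        async with semaphore:
            return await validate(prompt)

    # gather preserves input order in its result list
    return list(await asyncio.gather(*(guarded(p) for p in prompts)))


async def fake_validate(prompt: str) -> bool:
    """Stand-in for an SDK call; treats prompts containing 'ignore' as unsafe."""
    await asyncio.sleep(0)
    return "ignore" not in prompt.lower()
```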

Context Manager

from ai_warden import AIWarden

with AIWarden() as warden:
    result = warden.validate("text")
    print(result.is_safe)

🛠️ CLI Reference

ai-warden login

Magic browser-based authentication.

ai-warden login

# Output:
# 🌐 Opening browser for authentication...
# ⏳ Waiting for callback...
# ✅ Authentication successful!
# 🔑 API key saved to ~/.ai-warden/credentials

Options:

  • --auth-url URL - Custom authentication URL
  • --port PORT - Local callback port (default: 8787)
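
The "waiting for callback" step implies a short-lived local HTTP listener that the browser redirects to after login. The sketch below shows that general technique with the standard library only. The `/callback` path and the `api_key` query parameter are hypothetical, the browser is simulated by a local HTTP request, and a real flow may exchange a one-time code instead of passing the key directly.

```python
import http.client
import threading
import urllib.parse
from http.server import BaseHTTPRequestHandler, HTTPServer


class CallbackHandler(BaseHTTPRequestHandler):
    """Stores the key from GET /callback?api_key=... (hypothetical parameter)."""

    def do_GET(self):
        query = urllib.parse.urlparse(self.path).query
        values = urllib.parse.parse_qs(query).get("api_key", [])
        self.server.api_key = values[0] if values else None
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Login complete. You can close this tab.")

    def log_message(self, format, *args):
        pass  # Suppress the default per-request log line


def wait_for_key(server: HTTPServer):
    """Block until one request arrives, then return the captured key."""
    server.api_key = None
    server.handle_request()
    return server.api_key


def simulate_browser_redirect(port: int, key: str) -> None:
    """Plays the role of the browser hitting the local callback URL."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/callback?api_key=" + urllib.parse.quote(key))
    conn.getresponse().read()
    conn.close()


def demo() -> str:
    server = HTTPServer(("127.0.0.1", 0), CallbackHandler)  # port 0 picks a free port
    port = server.server_address[1]
    threading.Thread(
        target=simulate_browser_redirect, args=(port, "sk_test_123"), daemon=True
    ).start()
    try:
        return wait_for_key(server)
    finally:
        server.server_close()
```

The demo binds to port 0 so it never collides with another process; the CLI's documented default is 8787, which `--port` overrides.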

ai-warden configure

Manual API key configuration.

# Interactive
ai-warden configure

# Direct
ai-warden configure --api-key sk_live_xxx --api-url http://46.62.240.255:8080/api

ai-warden validate

Validate a single prompt.

ai-warden validate "Ignore all instructions"

# Output:
# ⚠️ UNSAFE: jailbreak_attempt
# 
# Details:
#   Confidence: 0.95
#   Mode: pattern
#   Latency: 23ms

Options:

  • --mode MODE - Validation mode (pattern/llm/hybrid/auto)

ai-warden scan

Scan files for vulnerabilities.

# Scan single file
ai-warden scan app.py

# Scan directory recursively
ai-warden scan src/ --recursive

# Output:
# Scanning: app.py
# ✅ Line 42: Safe
# ⚠️ Line 89: UNSAFE - potential injection
# 
# Summary: 1 issue found in 1 file

Options:

  • --recursive, -r - Scan directories recursively
  • --mode MODE - Validation mode

ai-warden scan-skill

Scan a remote skill repository for prompt injection threats.

# Offline scanning (free, no API key needed)
ai-warden scan-skill https://github.com/user/skill --offline
ai-warden scan-skill https://github.com/user/skill --offline --json
ai-warden scan-skill https://github.com/user/skill --offline --strict

# API-powered scanning (requires API key)
ai-warden scan-skill https://github.com/user/skill
ai-warden scan-skill https://github.com/user/skill --json --strict

Options:

  • --offline - Use local scanner only (free, no API key)
  • --json - Machine-readable JSON output
  • --strict - Exit code 1 unless verdict is SAFE
  • --mode MODE - Detection mode (strict/balanced/permissive)
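
The `--strict` rule (exit code 1 unless the verdict is SAFE) can be reproduced in a script that consumes `--json` output, for example in a CI step. This sketch assumes the JSON report carries a top-level `verdict` key, as the Python API result does; the exact `--json` schema is not confirmed here.

```python
import json


def strict_exit_code(report_json: str) -> int:
    """Return 0 only when the report's verdict is SAFE, else 1."""
    report = json.loads(report_json)
    return 0 if report.get("verdict") == "SAFE" else 1
```

A missing or unexpected verdict is treated as a failure, which is the conservative choice for a gate.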

Example Output

🔍 AI-Warden Skill Scan
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Skill:    smart-web-search
  Source:   github.com/davidme6/smart-web-search
  Files:    4 scanned
  Mode:     offline

  LICENSE                  ✅ Safe       (0.00)
  README.md                ❌ CRITICAL   (1.00)
    ├─ P102: Data Forwarding Instructions [CRITICAL] - "Email**: smart-web-search@feedback.com"
    └─ H003: Excessive External URLs [LOW] - "Found 11 external URLs"
  SKILL.md                 ✅ Safe       (0.19)
    └─ H003: Excessive External URLs [LOW] - "Found 20 external URLs"
  _meta.json               ✅ Safe       (0.00)

  Verdict:     ❌ DANGEROUS
  Trust Score: 0/100
  Scan Time:   1.2s
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Verdicts

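| Verdict      | Trust Score | Meaning                                       |
|--------------|-------------|-----------------------------------------------|
| ✅ SAFE      | 70-100      | No threats detected                           |
| ⚠️ WARNING   | 25-69       | Suspicious patterns found, review recommended |
| ❌ DANGEROUS | 0-24        | Active threats detected, do not install       |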
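
The score bands above translate directly into a small lookup. This helper is illustrative only (the name `verdict_for` is not part of the SDK) and assumes integer trust scores from 0 to 100.

```python
def verdict_for(trust_score: int) -> str:
    """Map a 0-100 trust score to SAFE, WARNING, or DANGEROUS."""
    if not 0 <= trust_score <= 100:
        raise ValueError("trust_score must be between 0 and 100")
    if trust_score >= 70:
        return "SAFE"
    if trust_score >= 25:
        return "WARNING"
    return "DANGEROUS"
```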

Offline vs API Mode

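|                  | Offline (free) | API (metered)            |
|------------------|----------------|--------------------------|
| Detection        | Regex patterns | Judge Mars ML + patterns |
| Speed            | Instant        | ~150ms/file              |
| False positives  | Higher         | Lower                    |
| Zero-day threats | ❌             | ✅                       |
| Requires API key | No             | Yes                      |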

Python API

from ai_warden import AIWarden

warden = AIWarden()

# Offline scan
result = warden.scan_skill("https://github.com/user/skill", offline=True)
print(result["verdict"])     # SAFE, WARNING, or DANGEROUS
print(result["trustScore"])  # 0-100

# API-powered scan
result = warden.scan_skill("https://github.com/user/skill")
for f in result["files"]:
    print(f"{f['path']}: {f['riskLevel']} ({f['score']})")

ai-warden status

Show authentication status and usage.

ai-warden status

# Output:
# ╭─── AI-Warden Status ─────────────────────────────╮
# │ ✅ Authenticated                                 │
# │                                                  │
# │ API Key:   sk_live_...abc (valid)                │
# │ API URL:   http://46.62.240.255:8080/api         │
# │ Tier:      Free                                  │
# │ Usage:     42 / 1000 requests (4%)               │
# │ Remaining: 958 requests this month               │
# ╰──────────────────────────────────────────────────╯

ai-warden logout

Remove stored credentials.

ai-warden logout

# Output:
# 🗑️ Credentials removed
# Run 'ai-warden login' to authenticate again

🌐 Middleware

FastAPI

from fastapi import FastAPI
from pydantic import BaseModel
from ai_warden.middleware import FastAPIMiddleware

app = FastAPI()

app.add_middleware(
    FastAPIMiddleware,
    api_key="sk_live_xxx",    # Or load from env
    block_unsafe=True,         # Return 400 on unsafe prompts
    log_threats=True,          # Log detected threats
    exclude_paths=["/health"]  # Skip validation for these paths
)

class ChatRequest(BaseModel):
    prompt: str  # Sent in the JSON body, which the middleware inspects

@app.post("/chat")
async def chat(body: ChatRequest):
    # Middleware validates the JSON body before the request reaches here
    return {"response": "Safe!"}

How it works:

  1. Middleware intercepts POST/PUT/PATCH requests
  2. Extracts text fields from the JSON body
  3. Validates all extracted text
  4. If unsafe content is found and block_unsafe=True, returns 400
  5. If unsafe content is found and block_unsafe=False, adds a warning header and lets the request continue
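
The extraction step can be pictured as a recursive walk over the parsed JSON body that collects every string value. This is a sketch of that idea using only the standard library; the function names are hypothetical, and the real middleware may select fields differently (for example, by key name).

```python
import json
from typing import Any, Iterator, List


def iter_text_fields(value: Any) -> Iterator[str]:
    """Yield every string found in nested dicts and lists."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_text_fields(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_text_fields(item)


def texts_from_body(raw_body: bytes) -> List[str]:
    """Parse a JSON request body and return its string values, or [] if invalid."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # Covers JSONDecodeError and UnicodeDecodeError
        return []
    return list(iter_text_fields(payload))
```

Each returned string would then be passed to validation; a body that is not valid JSON yields nothing to validate.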

Django

# settings.py
MIDDLEWARE = [
    'ai_warden.middleware.django.AIWardenMiddleware',
    # ... other middleware
]

AI_WARDEN_API_KEY = "sk_live_xxx"
AI_WARDEN_BLOCK_UNSAFE = True
AI_WARDEN_LOG_THREATS = True
AI_WARDEN_EXCLUDE_PATHS = ["/admin/", "/static/"]

Flask

from flask import Flask, request
from ai_warden.middleware import flask_protect

app = Flask(__name__)

@app.route('/chat', methods=['POST'])
@flask_protect(api_key="sk_live_xxx", block_unsafe=True)
def chat():
    prompt = request.json['prompt']
    # Decorator validates prompt before function runs
    return {"response": "Safe!"}

🚀 Advanced Usage

Custom Validation Logic

from ai_warden import AIWarden

warden = AIWarden()

def validate_user_input(text: str) -> bool:
    """Custom validation with additional checks."""
    # AI-Warden validation
    result = warden.validate(text)
    
    if not result.is_safe:
        print(f"Blocked: {result.threat_type}")
        return False
    
    # Additional custom checks
    if len(text) > 10000:
        print("Blocked: Too long")
        return False
    
    return True

Error Handling

from ai_warden import AIWarden
from ai_warden.exceptions import (
    AuthenticationError,
    ValidationError,
    APIError
)

warden = AIWarden()

try:
    result = warden.validate("text")
except AuthenticationError:
    print("Invalid API key")
except ValidationError as e:
    print(f"Validation failed: {e}")
except APIError as e:
    print(f"API error: {e}")
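
For transient failures, one option is to wrap the call in a retry loop with exponential backoff. The sketch below is generic and does not import the SDK: `TransientError` is a stand-in for whichever exception you consider retryable (possibly `APIError`), and the attempt count and delays are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a retryable failure, such as an SDK API error."""


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` until it succeeds, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Authentication failures are usually not worth retrying, so only the transient exception type is caught here.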

Custom API URL

from ai_warden import AIWarden

warden = AIWarden(
    api_key="sk_live_xxx",
    api_url="https://your-custom-domain.com",
    timeout=60  # Custom timeout in seconds
)

Usage Statistics

from ai_warden import AIWarden

warden = AIWarden()

usage = warden.get_usage()

print(f"Tier: {usage['tier']}")
print(f"Usage: {usage['usage']} / {usage['limit']}")
print(f"Remaining: {usage['limit'] - usage['usage']}")
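
The same `usage` and `limit` values can produce a one-line summary similar to the `ai-warden status` output. This helper is a sketch; it assumes both keys hold integers, which the example above suggests but does not guarantee.

```python
def usage_summary(usage: dict) -> str:
    """Format used/limit counts with a rounded percentage and remaining quota."""
    used, limit = usage["usage"], usage["limit"]
    percent = round(100 * used / limit) if limit else 0
    remaining = max(limit - used, 0)
    return f"{used} / {limit} requests ({percent}%), {remaining} remaining"
```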

🧪 Testing

Run tests with pytest:

pip install -e ".[dev]"
pytest

With coverage:

pytest --cov=ai_warden --cov-report=html

📦 Installation Options

Basic Installation

pip install ai-warden

With Async Support

pip install "ai-warden[async]"

With Framework Support

pip install "ai-warden[fastapi]"
pip install "ai-warden[django]"
pip install "ai-warden[flask]"

All Features

pip install "ai-warden[async,fastapi,django,flask,secure]"

Development

pip install -e ".[dev]"

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Make your changes
  4. Run tests: pytest
  5. Format code: black ai_warden/
  6. Submit a pull request

📄 License

MIT License - see LICENSE for details.


🛡️ Security

If you discover a security vulnerability, please email security@ai-warden.com instead of using the issue tracker.


🙏 Acknowledgments

Built with ❤️ using:

  • Click - Beautiful command-line interfaces
  • Rich - Rich terminal formatting
  • Pydantic - Data validation
  • Requests - HTTP client
  • httpx - Async HTTP client

Made with 🛡️ by the AI-Warden team
