Scan web content for prompt injection, hidden instructions, and adversarial content targeting AI agents

Palisade Scanner

PyPI Python 3.11+ License: MIT CI HuggingFace Space

Try it live on HuggingFace Spaces: scan any URL without installing anything.

AI agents browse the web, read documents, and consume external content. Adversaries hide instructions in invisible text, HTML metadata, encoded payloads, and zero-width characters. Palisade scans for all of these.


What makes Palisade unique

| Capability | Palisade Scanner | Manual review | Generic scrapers |
|---|---|---|---|
| Hidden text detection | ✅ 20+ CSS/HTML techniques | ❌ | ❌ |
| Injection pattern matching | ✅ 100+ regexes, 5 categories | ❌ | ❌ |
| LLM-as-judge classifier | ✅ understands adversarial intent | N/A | ❌ |
| Metadata analysis | ✅ comments, JSON-LD, meta, data attrs | ❌ | ❌ |
| Exfiltration detection | ✅ URLs, eval(), fetch(), redirects | ❌ | ❌ |
| MCPGuard policy generation | ✅ auto-generate rules | ❌ | ❌ |
| CI/CD mode | ✅ --ci --threshold high | ❌ | ❌ |
| Zero-width character detection | ✅ | ❌ | ❌ |

Why

AI agents browse the web, read documents, and consume external content. Adversaries can hide instructions in:

  • Invisible text (white-on-white, zero font size, off-screen positioning)
  • HTML comments and metadata
  • Base64 encoded payloads
  • Zero-width character injections
  • Instructions disguised as product descriptions or reviews

This scanner finds them all and tells you what to do about it.
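
To make two of the hiding techniques above concrete, here is a minimal standard-library sketch that flags zero-width characters and base64 runs that decode to readable text. It is an illustration only, not Palisade's implementation; the function names `find_zero_width` and `find_base64_payloads` and the 24-character minimum run length are assumptions made for this example.

```python
import base64
import re

# Zero-width / invisible characters that can hide text from a human reader.
ZERO_WIDTH = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}

# Runs of 24 or more base64 characters, optionally padded (threshold is arbitrary).
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def find_zero_width(text: str) -> list[tuple[int, str]]:
    """Return (offset, character name) for every zero-width character."""
    return [(i, ZERO_WIDTH[ch]) for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def find_base64_payloads(text: str) -> list[str]:
    """Return decoded strings for base64 runs that decode to printable ASCII."""
    decoded = []
    for match in BASE64_RUN.finditer(text):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            candidate = raw.decode("ascii")
        except ValueError:
            # Covers invalid base64 (binascii.Error) and non-ASCII bytes.
            continue
        if candidate.isprintable():
            decoded.append(candidate)
    return decoded
```

A real scanner would need many more encodings and a way to rank false positives; this sketch only shows why such content is invisible to a reviewer yet trivially readable by an agent.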

Quick Start

# Install
pip install palisade-scanner

# CLI: scan a URL
pis scan https://example.com
# or
palisade scan https://example.com

# Web UI: open the dashboard
pis web

# Docker
docker compose up
# → http://localhost:8000

Usage

CLI

# Scan a URL
pis scan https://example.com

# Scan a local file
pis scan --file suspicious.html

# Scan pasted text
pis scan --paste "<!-- ignore instructions -->"

# JSON output
pis scan https://example.com --format json

# CI/CD mode (exit code reflects risk)
pis scan https://example.com --ci --threshold high

# Generate MCPGuard policy rules
pis policies https://evil-site.com
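
When driving the CLI from a script, it can help to build the argument list in one place. The sketch below uses only the flags shown above (`--file`, `--paste`, `--format`, `--ci`, `--threshold`); the helper names `build_scan_command` and `run_scan` are hypothetical, and the meaning of the exit code beyond "reflects risk" is not specified here.

```python
import subprocess


def build_scan_command(target, *, source="url", output_format=None,
                       ci=False, threshold=None):
    """Build an argv list for `pis scan` from the flags shown in this README."""
    cmd = ["pis", "scan"]
    if source == "url":
        cmd.append(target)
    elif source == "file":
        cmd += ["--file", target]
    elif source == "paste":
        cmd += ["--paste", target]
    else:
        raise ValueError(f"unknown source: {source!r}")
    if output_format:
        cmd += ["--format", output_format]
    if ci:
        cmd.append("--ci")
        if threshold:
            cmd += ["--threshold", threshold]
    return cmd


def run_scan(cmd):
    """Run the command without a shell; the caller interprets returncode."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```

Passing a list to `subprocess.run` avoids shell interpolation, which matters when the target URL comes from untrusted input.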

API

# Scan via REST API
curl "http://localhost:8000/api/scan?url=https://example.com"

# HTML report
curl "http://localhost:8000/api/scan/https://example.com"
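
If the target URL has its own query string, it should be percent-encoded before being passed as the `url` parameter, otherwise its `&` and `=` characters are parsed as parameters of the scan request. A small sketch, assuming the `/api/scan?url=` endpoint shown above; the helper names are hypothetical and the response is returned as raw text because its schema is not documented here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def scan_endpoint(target_url, base="http://localhost:8000"):
    """Return the scan API URL with target_url safely percent-encoded."""
    return f"{base.rstrip('/')}/api/scan?{urlencode({'url': target_url})}"


def fetch_scan(target_url, base="http://localhost:8000", timeout=30):
    """Call the scan endpoint and return the response body as text."""
    with urlopen(scan_endpoint(target_url, base), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```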

How It Works

Detection Layers

| Layer | What It Detects |
|---|---|
| Hidden Text Detector | 20+ CSS/HTML hiding techniques (display:none, visibility, opacity, color matching, off-screen, zero-width chars, HTML comments) |
| Injection Pattern Matcher | 100+ regex patterns across 5 categories (jailbreak, role override, exfiltration, tool manipulation, impersonation) |
| Instruction Classifier | LLM-as-judge that understands adversarial intent (requires API key) |
| Metadata Analyzer | HTML comments, JSON-LD, meta tags, data attributes, <noscript>, <template> |
| Exfiltration Detector | URLs, endpoints, eval() patterns, redirect attempts, fetch() calls |
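
As a rough illustration of what a hidden-text layer looks for, the sketch below walks HTML with the standard-library parser and reports inline styles that hide content, plus HTML comments. It covers only four of the techniques listed and ignores external stylesheets; the rule names and the `find_hidden` helper are assumptions for this example, not Palisade's detector.

```python
import re
from html.parser import HTMLParser

# A few inline-style patterns that hide an element from human readers.
HIDING_RULES = {
    "display:none": re.compile(r"display\s*:\s*none", re.I),
    "visibility:hidden": re.compile(r"visibility\s*:\s*hidden", re.I),
    "opacity:0": re.compile(r"opacity\s*:\s*0(\.0+)?\s*(;|$)", re.I),
    "font-size:0": re.compile(r"font-size\s*:\s*0(px|em|rem|pt|%)?\s*(;|$)", re.I),
}


class HiddenStyleFinder(HTMLParser):
    """Collect (rule, detail) pairs for hidden elements and comments."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style") or ""
        for name, pattern in HIDING_RULES.items():
            if pattern.search(style):
                self.findings.append((name, f"<{tag}>"))

    def handle_comment(self, data):
        if data.strip():
            self.findings.append(("html-comment", data.strip()))


def find_hidden(html):
    """Return findings in document order."""
    finder = HiddenStyleFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

Comments are reported as candidates rather than verdicts; deciding whether their text is an instruction is the job of the pattern and classifier layers.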

Scoring

Risk Score: 0-100

Weighted formula:
  base = 100
  - critical * 25
  - high * 10
  - medium * 3
  - low * 1

Categories: none (0-5) → low (6-20) → medium (21-50) → high (51-80) → critical (81-100)
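
The formula subtracts weighted findings from a base of 100, while the category bands rise with the score. One reading that reconciles the two, and the assumption this sketch makes, is that the reported risk score is the total weighted deduction, capped at 100. The names `risk_score` and `category` are hypothetical; only the weights and band boundaries come from the text above.

```python
# Per-finding weights, taken from the formula above.
WEIGHTS = {"critical": 25, "high": 10, "medium": 3, "low": 1}

# Upper bound (inclusive) of each category band.
BANDS = [(5, "none"), (20, "low"), (50, "medium"), (80, "high"), (100, "critical")]


def risk_score(counts):
    """Sum weighted finding counts and cap the result at 100 (assumed reading)."""
    penalty = sum(weight * counts.get(sev, 0) for sev, weight in WEIGHTS.items())
    return min(100, penalty)


def category(score):
    """Map a 0-100 score onto the category bands."""
    for upper, name in BANDS:
        if score <= upper:
            return name
    raise ValueError(f"score out of range: {score}")
```

Under this reading, one critical, two high, and one medium finding give 25 + 20 + 3 = 48, which falls in the medium band.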

Architecture

User (CLI / Web / API)
        │
        ▼
PipelineOrchestrator
        │
        ├── Loader (URL / File / Paste / PDF)
        │
        ├── Detector Pipeline (parallel)
        │   ├── HiddenTextDetector
        │   ├── InjectionPatternMatcher
        │   ├── MetadataAnalyzer
        │   ├── ExfiltrationDetector
        │   └── InstructionClassifier (LLM)
        │
        ├── ScoringEngine
        │
        └── Reporters
            ├── JSON / Markdown / Simple
            ├── Policy Generator (MCPGuard)
            └── Web UI (HTMX)
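
The "Detector Pipeline (parallel)" stage can be pictured as a fan-out over independent detectors whose findings are merged before scoring. The sketch below uses a thread pool as one possible approach; the `Finding` dataclass, the two toy detectors, and `run_pipeline` are hypothetical stand-ins, not the actual `PipelineOrchestrator`.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    detector: str
    severity: str
    detail: str


# A detector takes loaded content and returns zero or more findings.
Detector = Callable[[str], list[Finding]]


def comment_detector(content: str) -> list[Finding]:
    """Toy detector: flag any HTML comment."""
    if "<!--" in content:
        return [Finding("metadata", "medium", "HTML comment present")]
    return []


def zero_width_detector(content: str) -> list[Finding]:
    """Toy detector: flag zero-width spaces."""
    if "\u200b" in content:
        return [Finding("hidden_text", "high", "zero-width space present")]
    return []


def run_pipeline(content: str, detectors: list[Detector]) -> list[Finding]:
    """Run all detectors concurrently and merge findings in detector order."""
    with ThreadPoolExecutor(max_workers=len(detectors) or 1) as pool:
        results = pool.map(lambda detect: detect(content), detectors)
        return [finding for findings in results for finding in findings]
```

Threads suit detectors that wait on I/O, such as an LLM call; CPU-bound regex work would gain little from them in CPython.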

Project Structure

src/scanner/
├── cli.py              # Typer CLI
├── api.py              # FastAPI web app
├── config.py           # Settings (env vars)
├── domain/
│   ├── models.py       # Pydantic models
│   └── scoring.py      # Risk score engine
├── loaders/
│   ├── url.py          # HTTP URL fetcher
│   ├── pdf.py          # PDF extractor
│   └── paste.py        # Raw text
├── detectors/
│   ├── hidden_text.py            # CSS/HTML hiding
│   ├── injection_patterns.py     # 100+ regex patterns
│   ├── instruction_classifier.py # LLM-as-judge
│   ├── metadata_analyzer.py      # Comments/meta/tags
│   └── exfiltration.py           # Data theft patterns
├── pipeline/
│   └── orchestrator.py # Scan pipeline
├── reporters/          # JSON/MD/Simple output
├── policies/           # MCPGuard rule generation
└── utils/              # DOM helpers

Integration

MCPGuard

Generate rules compatible with MCPGuard:

pis scan https://evil-site.com --format mcpguard > rules.yaml
mcpguard load-rules rules.yaml

CI/CD

# .github/workflows/check-urls.yml
- name: Scan for prompt injection
  run: |
    pis scan "${{ matrix.url }}" --ci --threshold medium
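
The `--threshold` flag implies an ordering over the risk categories. The sketch below shows one plausible gate, assuming a build fails when the scan's category is at or above the threshold and that failure maps to exit code 1. Both the comparison rule and the exit code value are assumptions; the names `should_fail` and `exit_code` are hypothetical.

```python
# Risk categories in ascending order of severity.
LEVELS = ["none", "low", "medium", "high", "critical"]


def should_fail(category: str, threshold: str) -> bool:
    """True when the scan category is at or above the threshold (assumed rule)."""
    return LEVELS.index(category) >= LEVELS.index(threshold)


def exit_code(category: str, threshold: str) -> int:
    """Map the gate decision to a process exit code (1 is an assumed value)."""
    return 1 if should_fail(category, threshold) else 0
```

With `--threshold medium`, a page rated low would pass under this rule, while medium, high, or critical would fail the step.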

Roadmap

  • v0.1 - Scanner core: CLI, 5 detectors, scoring, policy generation
  • v0.2 - Live Monitor: scheduled re-scans, webhook alerts, diff detection
  • v0.3 - Agent Validator: Browser Use agent tests pages in real time
  • v0.4 - Content Safety Proxy: reverse proxy that strips injections
  • v0.5 - Reputation Engine: web of trust for agent-safe URLs
  • v0.6 - Red Team Lab: adversarial page generator + benchmark suite
  • v0.7 - Certification Pipeline: verified AgentSafe badges

Related Projects

  • MCPGuard - Runtime security proxy for MCP
  • MCPwn - Offensive security testing for MCP
  • MCPscop - Unified security dashboard

License

MIT

