Scan web content for prompt injection, hidden instructions, and adversarial content targeting AI agents

Palisade Scanner

PyPI | Python 3.11+ | License: MIT | CI | HuggingFace Space

Try it live on HuggingFace Spaces: scan any URL without installing anything.

AI agents browse the web, read documents, and consume external content. Adversaries hide instructions in invisible text, HTML metadata, encoded payloads, and zero-width characters; Palisade is built to detect them.


What makes Palisade unique

| Capability | Palisade Scanner | Manual review | Generic scrapers |
| --- | --- | --- | --- |
| Hidden text detection | ✅ 20+ CSS/HTML techniques | ❌ | ❌ |
| Injection pattern matching | ✅ 100+ regexes, 5 categories | ❌ | ❌ |
| LLM-as-judge classifier | ✅ understands adversarial intent | N/A | ❌ |
| Metadata analysis | ✅ comments, JSON-LD, meta, data attrs | ❌ | ❌ |
| Exfiltration detection | ✅ URLs, eval(), fetch(), redirects | ❌ | ❌ |
| MCPGuard policy generation | ✅ auto-generate rules | ❌ | ❌ |
| CI/CD mode | ✅ --ci --threshold high | ❌ | ❌ |
| Zero-width character detection | ✅ | ❌ | ❌ |

Why

AI agents browse the web, read documents, and consume external content. Adversaries can hide instructions in:

  • Invisible text (white-on-white, zero font size, off-screen positioning)
  • HTML comments and metadata
  • Base64 encoded payloads
  • Zero-width character injections
  • Instructions disguised as product descriptions or reviews

This scanner looks for each of these and tells you what to do about them.
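To make two of the hiding techniques above concrete, here is a minimal Python sketch of spotting zero-width characters and Base64-encoded text. It is illustrative only and is not Palisade's implementation: the names `ZERO_WIDTH`, `BASE64_RUN`, `find_zero_width`, and `find_base64_payloads` are hypothetical, and the 24-character minimum run length is an arbitrary assumption.

```python
import base64
import re

# Code points commonly treated as zero-width or invisible (illustrative, not exhaustive).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Runs of 24+ Base64 alphabet characters, optionally padded (threshold is an assumption).
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def find_zero_width(text: str) -> list[int]:
    """Return the offsets of zero-width characters in text."""
    return [i for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def find_base64_payloads(text: str) -> list[str]:
    """Return decoded Base64 runs that turn out to be printable ASCII text."""
    decoded = []
    for match in BASE64_RUN.finditer(text):
        candidate = match.group(0)
        try:
            raw = base64.b64decode(candidate, validate=True)
            plain = raw.decode("ascii")
        except ValueError:
            # Covers invalid Base64 (binascii.Error) and non-ASCII bytes (UnicodeDecodeError).
            continue
        if plain.isprintable():
            decoded.append(plain)
    return decoded
```

A decoded payload that reads as natural-language instructions would then be a candidate for the pattern and classifier checks described under Detection Layers.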

Quick Start

# Install
pip install palisade-scanner

# CLI: scan a URL
pis scan https://example.com
# or
palisade scan https://example.com

# Web UI: open the dashboard
pis web

# Docker
docker compose up
# → http://localhost:8000

Usage

CLI

# Scan a URL
pis scan https://example.com

# Scan a local file
pis scan --file suspicious.html

# Scan pasted text
pis scan --paste "<!-- ignore instructions -->"

# JSON output
pis scan https://example.com --format json

# CI/CD mode (exit code reflects risk)
pis scan https://example.com --ci --threshold high

# Generate MCPGuard policy rules
pis policies https://evil-site.com

API

# Scan via REST API
curl "http://localhost:8000/api/scan?url=https://example.com"

# HTML report
curl "http://localhost:8000/api/scan/https://example.com"
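The same request can be made from Python. The sketch below uses only the standard library and percent-encodes the target URL so that query strings inside it survive the round trip. The `/api/scan?url=` endpoint comes from the curl example above; the helper names `build_scan_url` and `scan` are hypothetical, and treating the response body as JSON is an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_scan_url(target: str, base: str = "http://localhost:8000") -> str:
    """Build the /api/scan request URL with the target percent-encoded."""
    return f"{base}/api/scan?{urlencode({'url': target})}"


def scan(target: str, base: str = "http://localhost:8000") -> dict:
    """Call the scan endpoint and parse the body as JSON (assumed response format)."""
    with urlopen(build_scan_url(target, base), timeout=30) as response:
        return json.load(response)
```

Calling `scan("https://example.com")` requires a running server, for example one started with `pis web` or `docker compose up`.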

How It Works

Detection Layers

| Layer | What It Detects |
| --- | --- |
| Hidden Text Detector | 20+ CSS/HTML hiding techniques (display:none, visibility, opacity, color matching, off-screen, zero-width chars, HTML comments) |
| Injection Pattern Matcher | 100+ regex patterns across 5 categories (jailbreak, role override, exfiltration, tool manipulation, impersonation) |
| Instruction Classifier | LLM-as-judge that understands adversarial intent (requires API key) |
| Metadata Analyzer | HTML comments, JSON-LD, meta tags, data attributes, <noscript>, <template> |
| Exfiltration Detector | URLs, endpoints, eval() patterns, redirect attempts, fetch() calls |
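As a rough illustration of how a regex-based layer such as the Injection Pattern Matcher could be organized, the sketch below keeps one pattern per category and reports each match with its category. The regexes, the `PATTERNS` table, the `Finding` class, and `match_patterns` are all hypothetical stand-ins; the scanner's actual patterns are not shown in this README.

```python
import re
from dataclasses import dataclass

# One illustrative pattern per category named above; a real pattern set would be far larger.
PATTERNS: dict[str, re.Pattern[str]] = {
    "jailbreak": re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    "role_override": re.compile(r"you are now (a|an|the) ", re.I),
    "exfiltration": re.compile(r"send .{0,40}(api key|password|token)", re.I),
    "tool_manipulation": re.compile(r"(call|invoke|run) the \w+ tool", re.I),
    "impersonation": re.compile(r"(message|notice) from (the )?(system|developer)", re.I),
}


@dataclass(frozen=True)
class Finding:
    category: str
    excerpt: str


def match_patterns(text: str) -> list[Finding]:
    """Return one Finding per regex match, in category order."""
    findings = []
    for category, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            findings.append(Finding(category, m.group(0)))
    return findings
```

Pure regex matching is cheap but easy to evade with paraphrase, which is presumably why the table pairs it with an LLM-based classifier.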

Scoring

Risk Score: 0-100

Weighted formula:
  base = 100
  - critical * 25
  - high * 10
  - medium * 3
  - low * 1

Categories: none (0-5) → low (6-20) → medium (21-50) → high (51-80) → critical (81-100)
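The formula above is written as deductions from a base of 100, while the category bands treat a higher score as riskier. The sketch below resolves this with an explicit assumption: the risk score is the total of the weighted deductions, capped at 100, so that zero findings map to "none". The names `WEIGHTS`, `BANDS`, `risk_score`, and `risk_category` are hypothetical, and the scanner's real scoring engine may differ.

```python
WEIGHTS = {"critical": 25, "high": 10, "medium": 3, "low": 1}

# Upper bound of each band, in ascending order, as listed above.
BANDS = [(5, "none"), (20, "low"), (50, "medium"), (80, "high"), (100, "critical")]


def risk_score(counts: dict[str, int]) -> int:
    """Sum weighted finding counts and cap at 100 (assumed direction: higher is riskier)."""
    total = sum(WEIGHTS[severity] * n for severity, n in counts.items())
    return min(100, total)


def risk_category(score: int) -> str:
    """Map a 0-100 score to the first band whose upper bound it does not exceed."""
    for upper, name in BANDS:
        if score <= upper:
            return name
    raise ValueError(f"score out of range: {score}")
```

Under this reading, one critical, two high, one medium, and two low findings give 25 + 20 + 3 + 2 = 50, which falls in the medium band.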

Architecture

User (CLI / Web / API)
        │
        ▼
PipelineOrchestrator
        │
        ├── Loader (URL / File / Paste / PDF)
        │
        ├── Detector Pipeline (parallel)
        │   ├── HiddenTextDetector
        │   ├── InjectionPatternMatcher
        │   ├── MetadataAnalyzer
        │   ├── ExfiltrationDetector
        │   └── InstructionClassifier (LLM)
        │
        ├── ScoringEngine
        │
        └── Reporters
            ├── JSON / Markdown / Simple
            ├── Policy Generator (MCPGuard)
            └── Web UI (HTMX)
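The parallel detector stage in the diagram can be sketched as a fan-out over independent callables that all receive the same loaded content. This is a simplified illustration, not the actual `PipelineOrchestrator`: the `Detector` type, the two toy detectors, and `run_pipeline` are hypothetical, and a thread pool is just one possible way to get parallelism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A detector takes the loaded content and returns a list of finding labels.
Detector = Callable[[str], list[str]]


def detect_html_comments(content: str) -> list[str]:
    return ["html-comment"] if "<!--" in content else []


def detect_eval_calls(content: str) -> list[str]:
    return ["eval-call"] if "eval(" in content else []


def run_pipeline(content: str, detectors: list[Detector]) -> list[str]:
    """Run every detector on the same content in parallel and merge the findings."""
    with ThreadPoolExecutor(max_workers=len(detectors) or 1) as pool:
        # map() yields results in the order the detectors were given.
        results = pool.map(lambda detect: detect(content), detectors)
        return [finding for result in results for finding in result]
```

Because detectors share no state, adding a new one only means appending another callable to the list; the merged findings would then feed the scoring stage.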

Project Structure

src/scanner/
├── cli.py              # Typer CLI
├── api.py              # FastAPI web app
├── config.py           # Settings (env vars)
├── domain/
│   ├── models.py       # Pydantic models
│   └── scoring.py      # Risk score engine
├── loaders/
│   ├── url.py          # HTTP URL fetcher
│   ├── pdf.py          # PDF extractor
│   └── paste.py        # Raw text
├── detectors/
│   ├── hidden_text.py       # CSS/HTML hiding
│   ├── injection_patterns.py # 100+ regex patterns
│   ├── instruction_classifier.py  # LLM-as-judge
│   ├── metadata_analyzer.py # Comments/meta/tags
│   └── exfiltration.py     # Data theft patterns
├── pipeline/
│   └── orchestrator.py # Scan pipeline
├── reporters/          # JSON/MD/Simple output
├── policies/           # MCPGuard rule generation
└── utils/              # DOM helpers

Integration

MCPGuard

Generate rules compatible with MCPGuard:

pis scan https://evil-site.com --format mcpguard > rules.yaml
mcpguard load-rules rules.yaml

CI/CD

# .github/workflows/check-urls.yml
- name: Scan for prompt injection
  run: |
    pis scan ${{ matrix.url }} --ci --threshold medium
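The `--ci --threshold` behavior can be pictured as comparing the scan's risk category against the threshold and failing the job when it is met or exceeded. The sketch below assumes that reading and assumes exit code 1 for failure; the actual exit codes are not documented here, and `SEVERITY_ORDER` and `ci_exit_code` are hypothetical names.

```python
# Risk categories in ascending order, as listed under Scoring.
SEVERITY_ORDER = ["none", "low", "medium", "high", "critical"]


def ci_exit_code(category: str, threshold: str) -> int:
    """Return 1 when the scan's category meets or exceeds the threshold, else 0."""
    return int(SEVERITY_ORDER.index(category) >= SEVERITY_ORDER.index(threshold))
```

With `--threshold medium`, a page rated high would fail the step under this assumption, while a page rated low would pass.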

Roadmap

  • v0.1 - Scanner core: CLI, 5 detectors, scoring, policy generation
  • v0.2 - Live Monitor: scheduled re-scans, webhook alerts, diff detection
  • v0.3 - Agent Validator: Browser Use agent tests pages in real time
  • v0.4 - Content Safety Proxy: reverse proxy that strips injections
  • v0.5 - Reputation Engine: web of trust for agent-safe URLs
  • v0.6 - Red Team Lab: adversarial page generator + benchmark suite
  • v0.7 - Certification Pipeline: verified AgentSafe badges

Related Projects

  • MCPGuard - Runtime security proxy for MCP
  • MCPwn - Offensive security testing for MCP
  • MCPscop - Unified security dashboard

License

MIT
