Open-source static AI security scanner — prompt injection, broken LLM-as-judge, AI SBOM.
Project description
Whitney
Open-source static AI security scanner — finds prompt injection patterns that commodity scanners miss, without burning LLM API credits on every run.
Whitney is a curated Semgrep ruleset plus a thin Python wrapper plus an opt-in LLM-as-judge triage layer plus an AI dependency SBOM extractor. Zero custom SAST. Zero LLM calls in the default path. Python-only for now (TS/JS/Go on the roadmap).
pip install whitney
whitney scan ./your-repo
SEVERITY FILE :LINE TITLE
critical app/handlers/chat.py :42 LangChain SQLDatabaseChain with user-controlled question (P2SQL)
critical app/main.py :89 pandas_dataframe_agent.run with user input — arbitrary code exec
high app/api.py :17 Flask handler interpolates request.json into LLM prompt
high app/rag.py :55 WebBaseLoader fetched content flows into LLM chain unfiltered
What it finds
Prompt injection across 15 source types:
| Class | Source types covered |
|---|---|
| Direct | direct_http (Flask/FastAPI/Django), direct_cli (argparse/click/stdin), direct_voice (Whisper/Twilio SpeechResult) |
| Indirect fetched | indirect_rag (Chroma/Pinecone/Weaviate/pgvector), indirect_web_fetch (requests/WebBaseLoader/SeleniumURLLoader), indirect_file_upload (PyPDFLoader/UnstructuredFileLoader), indirect_email (SES/SNS), indirect_search (Tavily/SerpAPI/Brave/Google CSE) |
| Indirect agent | indirect_tool_response (LangChain tool return values), indirect_mcp (MCP call_tool responses), indirect_a2a (CrewAI/LangGraph agent-to-agent context handoff) |
| Indirect stored | indirect_db_stored (DB query results into prompts), indirect_memory_stored (Mem0/Zep/LangChain memory replay) |
| Cross-modal | cross_modal_image_ocr (pytesseract/easyocr), cross_modal_unicode (tag block/ZWJ/homoglyphs) |
Plus critical sinks by presence alone: LangChain PALChain / PythonAstREPLTool (CVE-2023-36258 class) and SQLDatabaseChain / create_sql_agent / NLSQLTableQueryEngine (P2SQL class).
Plus an AI dependency SBOM (whitney sbom) — a CycloneDX 1.5 inventory of the AI SDKs and models discovered in requirements.txt, pyproject.toml, package.json, and model=... source assignments. Includes a small known-vulnerable SDK version table.
Why it exists
The commodity static AI rulesets (Semgrep p/ai-best-practices, Bandit, Agent Audit) only catch ~30–50% of real prompt-injection patterns and are completely blind to indirect injection (RAG retrievals, web fetches, tool responses, agent-to-agent context handoffs). Whitney targets that gap.
| Scanner | Corpus recall | Corpus precision | Corpus F1 | Findings on 6 real-world AI repos |
|---|---|---|---|---|
| Whitney default (no LLM) | 1.000 | 0.897 | 0.945 | 47 |
| Whitney triage-on (opt-in) | 1.000 | 1.000 | 1.000 | 47 |
Semgrep p/ai-best-practices |
0.500 | 0.867 | 0.634 | 14 |
| Agent Audit 0.18.2 | 0.308 | 0.571 | 0.400 | — |
Bandit / Semgrep p/security-audit |
0.000 | — | — | 0 |
The corpus is 35 hand-labelled fixtures (26 positives + 9 negatives) spanning all 15 source types, shipped in tests/corpus/. Every fixture has a YAML sidecar with source_url, source_commit, vuln_subtype, and labelling rationale. Reproduce locally with python -m tests.corpus.eval.
On 5 blind-test repositories that were never used to develop the rules — aimaster-dev/chatbot-using-rag-and-langchain, Lizhecheng02/RAG-ChatBot, SachinSamuel01/rag-langchain-streamlit, streamlit/example-app-langchain-rag, Vigneshmaradiya/ai-agent-comparison — Whitney produces 11 findings, of which 9 are true positives and 2 are false positives in developer main() test harnesses. 81.8% precision, hand-audited. Full audit table in tests/corpus/DIFFERENTIAL.md.
Recognised defences
Whitney suppresses findings only when a vendor guardrail or correct LLM-as-judge is called on the untrusted content before it reaches the LLM:
- AWS Bedrock Guardrails (
apply_guardrail,GuardrailIdentifier=oninvoke_model) - Azure AI Content Safety / Prompt Shields (
ContentSafetyClient.detect_jailbreak) - Lakera Guard (
api.lakera.aior SDK calls) - NeMo Guardrails (
LLMRails.generate, wrapping-style) - DeepKeep AI firewall (
dk_request_filter) - OpenAI Moderation (
client.moderations.create) - Correct LLM-as-judge (classified via the opt-in triage layer — see
docs/TRIAGE.md)
Weak defences are explicitly not counted: regex/Pydantic string validation, length caps, keyword blocklists, system-prompt admonitions. All bypassable via Unicode, homoglyphs, Base64, language switching, or paraphrase. Whitney still records their presence in details["defense_present"] so remediation messages can point the developer at a stronger replacement.
Architecture
whitney/
├── scanner.py # public scan_repository(path) entry point
├── semgrep_runner.py # subprocess wrapper around Semgrep CLI
├── llm_triage.py # opt-in LLM-as-judge classifier (Claude Opus or mock)
├── sbom.py # AI dependency SBOM scanner (CycloneDX 1.5)
├── models.py # stdlib @dataclass Finding (no pydantic)
├── cli.py # `whitney scan|sbom|version` command-line interface
└── rules/
├── prompt_injection_taint.yaml # Semgrep taint mode — flow detection
├── prompt_injection_critical_sinks.yaml # PAL chain + SQL chain — presence alone
└── prompt_injection_structural.yaml # CrewAI / LLMChain / WebBaseLoader idioms
Three Semgrep rule files, each with a distinct detection philosophy:
-
prompt_injection_taint.yaml— single consolidated taint rule. 50+ pattern-sources, 25+ pattern-sanitizers, 40+ pattern-sinks, surgicalpattern-notexclusions for UI/storage shapes that aren't real LLM sinks. Catches direct and indirect prompt injection where the vulnerability is a data flow. -
prompt_injection_critical_sinks.yaml— AST pattern rules for sinks where presence alone is critical (no taint flow required): PAL chains, SQL chains, tool-calling executors with arbitrary code paths. -
prompt_injection_structural.yaml— AST pattern rules for code shapes where the vulnerability is the structure, not the data flow: CrewAITask(..., context=[upstream_task])agent handoff, LangChainLLMChainidiom,WebBaseLoader+ chain,PdfReader+ LLM.
Each rule has function-level guardrail suppression via pattern-not-inside: def $F(...): ... $BEDROCK.apply_guardrail(...) ... for recognised defences.
Total Python: ~800 lines across scanner.py, semgrep_runner.py, llm_triage.py, sbom.py, models.py, cli.py. No custom taint engine, no tree-sitter walker, no custom AST analysis. Semgrep does all the work.
Zero-LLM default, opt-in triage
The default scan path (whitney scan ./repo or scan_repository(path) without setting any env var) has zero LLM calls and produces byte-identical output across runs. Two opt-in modes layer on top:
# Default — Semgrep only, no API calls, fully reproducible
whitney scan ./my-repo
# Mock triage — structural heuristic for LLM-as-judge correctness, no API calls
WHITNEY_STRICT_JUDGE_PROMPTS=1 WHITNEY_TRIAGE_MOCK=1 whitney scan ./my-repo
# Real triage — calls Claude Opus to classify LLM-as-judge prompts
export ANTHROPIC_API_KEY=sk-ant-...
WHITNEY_STRICT_JUDGE_PROMPTS=1 whitney scan ./my-repo
Real-mode triage uses claude-opus-4-6 at temperature=0 with cached verdicts keyed by (model_id, prompt_version, code_hash), so repeat scans of unchanged code cost nothing. Cost cap of 50 classifier calls per scan, fail-open on any error. See docs/TRIAGE.md for full operator instructions, cost estimates, and troubleshooting.
Path excludes
Path excludes are applied automatically: venv, .venv, env, __pycache__, node_modules, tests, test_*, fixtures, examples, docs, dist, build, site-packages. A finding inside a test fixture is a false positive from the developer's perspective, regardless of its technical correctness.
What gets emitted
from whitney import scan_repository
findings = scan_repository("./my-repo")
for f in findings:
print(f.severity.value, f.check_id, f.details["file_path"], f.details["line_number"])
print(" CWE:", f.details["cwe"])
print(" OWASP LLM Top 10:", f.details["owasp"])
print(" OWASP Agentic:", f.details["owasp_agentic"])
Each finding carries:
check_id,title,description,severity,remediationdetails.file_path,details.line_number,details.end_line,details.code_snippetdetails.cwe— e.g.["CWE-94"]details.owasp— OWASP LLM Top 10 year-specific tag, e.g.["LLM01:2025"]details.owasp_agentic— OWASP Agentic Top 10, e.g.["AA01:2026"]details.confidence—HIGH/MEDIUM/LOWdetails.technology— frameworks recognised in the rule
CWE and the two OWASP families ship in the rule YAML metadata directly — they're free with every finding. Regulatory framework enrichment (ISO 42001, EU AI Act, NIST AI RMF, MITRE ATLAS) is the responsibility of downstream tooling that consumes Whitney's findings; the Shasta compliance package is one such consumer.
Known limitations
- Intra-file only. Semgrep OSS taint is intraprocedural and intra-file. Cross-file flows (taint source in
handlers/chat.py, LLM sink inservices/llm.py) are not tracked. This was empirically validated against 3 real-world repos — every vulnerable flow was intra-file — but larger monorepos may need Semgrep Pro or a future structural-rule extension. - Python only. TypeScript / JavaScript / Go support is on the roadmap. The rule authoring approach transfers directly once the source/sink taxonomies are filled in.
- Guardrail policy validation is out of scope. If a developer calls
bedrock.apply_guardrail(GuardrailIdentifier="xxx"), Whitney trusts that the policy "xxx" actually covers prompt injection. Validating the policy content would require pulling the policy definition from AWS at scan time. - 2 known FP patterns in blind tests. Developer
main()test harnesses with hardcoded queries inside a helper file can produce false positives if the helper imports RAG retrievers. Surfaces at 81.8% precision on never-previously-scanned real-world code — the cost of keepingdef main():entry points in scope so legitimate CLI apps are still caught.
See also
- docs/TRIAGE.md — operator guide for the opt-in LLM-as-judge triage layer
- docs/SCANNER.md — deeper architectural notes on the rule files
- tests/corpus/DIFFERENTIAL.md — full benchmark against every commodity scanner tested
- tests/corpus/README.md — how the labelled corpus is structured and how to add a fixture
- Shasta — sibling project covering cloud AI governance (AWS Bedrock / SageMaker, Azure OpenAI / ML), the regulatory framework mappings (ISO 42001, EU AI Act, NIST AI RMF, MITRE ATLAS), policy generation, dashboards, and reports
License
Apache-2.0. See LICENSE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file whitney-0.1.0.tar.gz.
File metadata
- Download URL: whitney-0.1.0.tar.gz
- Upload date:
- Size: 47.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d7a1e1719be21dad2008a1e184ca0221a3ac45e191a932d3b439f1f0534282e6
|
|
| MD5 |
74af23825396a3e9065f13b87182492c
|
|
| BLAKE2b-256 |
b3ddb3705362a73addbcc6259532d5e10919e444589b26dc5a1cef3173a0ef7a
|
File details
Details for the file whitney-0.1.0-py3-none-any.whl.
File metadata
- Download URL: whitney-0.1.0-py3-none-any.whl
- Upload date:
- Size: 46.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0068b7a171e3c1c76fb85da743bd71d1ace2cfab8aaa8e83b84b675789415fd3
|
|
| MD5 |
125a880fc9778bab1ec6512aeb08bb8c
|
|
| BLAKE2b-256 |
97b73575e5f6caa2ea94dbd5859fcab39829cc54d6b7574de5ab266ad8ee3d57
|