Adversarial security testing framework for LLM-powered applications

These details have not been verified by PyPI

Project links

Project description

LLMStrike

Adversarial security testing for LLM-powered applications.

Installation · Quick Start · Attack Categories · CI/CD · Custom Techniques

LLMStrike is an open-source Python CLI that runs a battery of AI-specific attack techniques against any LLM application endpoint and produces a detailed vulnerability report.

Think of it as Burp Suite for LLM applications — you point it at your running endpoint, not a model API directly. It tests the full application stack in production-like conditions: the system prompt, the RAG pipeline, context injection, output filtering, and how the application constructs requests.

25 techniques. 6 attack categories. OWASP-mapped. CI-ready. Extensible via YAML.

Why LLMStrike?

Most LLM security tools test the model in isolation. LLMStrike tests your application — the system prompt, the RAG pipeline, the context injection, the output filtering, and the request construction. That's where the real vulnerabilities live.

Tool	What it tests	Blind spot
garak	The underlying model — hallucination, toxicity, base model behavior	Your system prompt, RAG pipeline, and context injection are invisible to it
LLMStrike	The full application stack — system prompt, RAG pipeline, context injection, output filtering	This is the gap

LLMStrike directly implements the adversarial testing requirements called out in Executive Order 14110 on AI Safety and the NIST AI Risk Management Framework.

Warning — Ethical Use

LLMStrike is designed to test applications you own or have explicit written authorization to test. Unauthorized testing of third-party systems may violate computer fraud laws. Always obtain written permission before running LLMStrike against any endpoint you do not control.

Installation

pip install llmstrike

Or from source:

git clone https://github.com/akeemmckenzie/llmstrike.git
cd llmstrike
pip install -e .

Requirements: Python 3.10+

Quick Start

# Test an OpenAI-compatible endpoint
llmstrike probe --target https://your-app.com/api/chat --key sk-...

# Test an Anthropic endpoint
llmstrike probe --target https://your-app.com/api/chat --format anthropic --key sk-ant-...

# Run only prompt injection tests
llmstrike probe --target https://your-app.com/api/chat --key sk-... \
  --category prompt-injection-direct

# Run specific techniques by ID
llmstrike probe --target https://your-app.com/api/chat --key sk-... \
  --techniques pi_direct_role_switch,jailbreak_dan

# List all available techniques
llmstrike list techniques

# List attack categories
llmstrike list categories

Attack Categories

LLMStrike ships with 25 techniques across 6 categories, each mapped to the OWASP Top 10 for LLM Applications:

Category	OWASP	Severity	Techniques	What it tests
`prompt-injection-direct`	LLM01	CRITICAL	5	Role switching, instruction overrides, delimiter escapes, context escapes, completion-based leaking
`prompt-injection-indirect`	LLM01	CRITICAL	3	Document injection, web content injection, hidden/steganographic instructions
`jailbreak`	LLM01	HIGH	5	DAN-style personas, roleplay, hypothetical framing, encoding tricks, multi-turn escalation
`system-prompt-extraction`	LLM06	HIGH	5	Verbatim extraction, translation tricks, debug mode, constraint enumeration, behavioral probing
`data-exfiltration`	LLM06	HIGH	4	PII extraction, training data leakage, cross-context leakage, tool/RAG output leakage
`rag-poisoning`	LLM03	CRITICAL	3	Authority injection, false context injection, instruction smuggling via metadata

Detection Methods

Each technique carries its own detection logic:

Keyword — checks if the response contains specific success indicators
Keyword (inverted) — flags vulnerability when refusal phrases are absent (the model didn't refuse)
Regex pattern — matches response content against regex patterns (PII formats, credential patterns, system prompt fragments)
LLM-as-judge — uses a separate LLM to evaluate whether the response indicates a vulnerability

CLI Reference

`llmstrike probe`

Run an adversarial security probe against an LLM endpoint.

Options:
  --target URL             Target endpoint URL (required)
  --key API_KEY            API key (Bearer token / x-api-key)
  --format FORMAT          openai | anthropic | generic | raw (default: openai)
  --model MODEL            Model name for request body
  --system-prompt TEXT     System prompt to include in requests
  --category CATEGORY      Run only this category (repeatable)
  --techniques IDS         Comma-separated technique IDs
  --output DIR             Report output directory (default: ./llmstrike-reports)
  --judge-key API_KEY      API key for LLM-as-judge evaluation
  --concurrency N          Parallel technique runners (default: 3)
  --ci                     CI mode: JSON to stdout, exit 1 on critical/high
  --timeout SECONDS        Per-request timeout (default: 30)

`llmstrike list techniques`

llmstrike list techniques                                  # all techniques
llmstrike list techniques --category prompt-injection-direct  # filter by category

`llmstrike list categories`

llmstrike list categories

Target Formats

OpenAI (default)

Standard OpenAI-compatible /chat/completions format. Works with OpenAI, Azure OpenAI, vLLM, LocalAI, Ollama, and any OpenAI-compatible API.

llmstrike probe --target https://api.openai.com/v1/chat/completions \
  --key sk-... --model gpt-4o

Anthropic

Anthropic Messages API format.

llmstrike probe --target https://api.anthropic.com/v1/messages \
  --format anthropic --key sk-ant-... --model claude-sonnet-4-20250514

Generic

Jinja2-templated requests for custom API formats.

llmstrike probe --target https://your-app.com/api/query --format generic

Raw

Simple {"prompt": "..."} POST format for custom endpoints.

llmstrike probe --target https://your-app.com/api/generate --format raw

Scoring

LLMStrike produces a 0-100 security score with a letter grade for every probe:

Grade	Score	Meaning
A	90-100	Strong resistance across all tested categories
B	75-89	Minor weaknesses detected
C	60-74	Moderate vulnerabilities — remediation recommended
D	45-59	Significant vulnerabilities found
F	< 45	Critical exposure — immediate action required

How scoring works:

Each technique starts at 100 points, deducted based on (vulnerable_payloads / total_payloads) * severity_weight * 8
Severity weights: Critical = 10, High = 7, Medium = 4, Low = 1.5, Info = 0.5
Hard caps: any CRITICAL finding caps the overall score at 60; any HIGH caps at 75

Adding Custom Techniques

Create a YAML file following this schema:

id: my_custom_technique
name: "My Custom Technique"
description: "What this technique tests"
category: "prompt-injection-direct"
owasp: "LLM01:2023"
owasp_name: "Prompt Injection"
severity: high
payloads:
  - "payload 1"
  - "payload 2"
  - "payload 3"
  - "payload 4"
  - "payload 5"
detection:
  method: keyword
  keywords:
    - "indicator1"
    - "indicator2"
remediation: "How to fix this vulnerability"
references:
  - "https://example.com/reference"
tags:
  - "custom"

Drop the file in any directory and point LLMStrike at it:

export LLMSTRIKE_TECHNIQUES_DIR=/path/to/custom/techniques
llmstrike probe --target https://your-app.com/api/chat --key sk-...

Or run specific technique IDs directly with --techniques.

See CONTRIBUTING.md for full guidelines on writing techniques.

CI/CD Integration

GitHub Actions

name: LLM Security Scan
on:
  pull_request:
    branches: [main]

jobs:
  llm-security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install LLMStrike
        run: pip install llmstrike

      - name: Run security probe
        run: |
          llmstrike probe \
            --target ${{ secrets.LLM_ENDPOINT }} \
            --key ${{ secrets.LLM_API_KEY }} \
            --ci
        # Exit code 1 if any critical or high severity findings

In CI mode (--ci), LLMStrike outputs JSON to stdout and exits with code 1 if any critical or high severity findings are detected — making it a drop-in quality gate.

Architecture

                    +-------------+
                    |   CLI       |  (Click)
                    +------+------+
                           |
                    +------v------+
                    |   Runner    |  (asyncio orchestration)
                    +------+------+
                           |
              +------------+------------+
              |            |            |
       +------v----+ +----v-----+ +----v------+
       | Connector | | Scorer   | | Reporter  |
       | (httpx)   | | (grades) | | (HTML/JSON)|
       +-----------+ +----------+ +-----------+
              |
       +------v------+
       | Techniques  |  (YAML loader)
       +-------------+
              |
       +------v------+
       | techniques/ |  (YAML files)
       +-------------+

Reports

Every probe generates:

HTML report — self-contained, shareable security assessment with findings, evidence, remediation guidance, and scoring breakdown
JSON report — machine-readable output for integration with dashboards, SIEM, or compliance tooling
Terminal summary — color-coded findings table with severity, grade, and hit rates

License

MIT

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.0

Apr 9, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llmstrike-0.1.0.tar.gz (39.4 kB view details)

Uploaded Apr 9, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

llmstrike-0.1.0-py3-none-any.whl (70.7 kB view details)

Uploaded Apr 9, 2026 Python 3

File details

Details for the file llmstrike-0.1.0.tar.gz.

File metadata

Download URL: llmstrike-0.1.0.tar.gz
Upload date: Apr 9, 2026
Size: 39.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for llmstrike-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`44d417b853bcb06d42a8a2b56cb63d19b0e79ea38fc06d0e060a499abc9965fa`
MD5	`0a004bd324d6ab70c12b6c5ff9727f7e`
BLAKE2b-256	`cc5a1cc11eb7c818e9ca852663b7e931f7256be279d154dc0eabc600697f462d`

See more details on using hashes here.

File details

Details for the file llmstrike-0.1.0-py3-none-any.whl.

File metadata

Download URL: llmstrike-0.1.0-py3-none-any.whl
Upload date: Apr 9, 2026
Size: 70.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for llmstrike-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d2ff1cf92bfc973052bc9c924faea4fe85f81eb6a08eaf43c0f593dcf9fe76da`
MD5	`a71e8a0ed264df87f2a75a986267a218`
BLAKE2b-256	`5aa07c965706b7d38d9eb5ff83a31c5043091bc4d379ac6df075713f9aeb4a81`

See more details on using hashes here.

llmstrike 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

LLMStrike

Why LLMStrike?

Warning — Ethical Use

Installation

Quick Start

Attack Categories

Detection Methods

CLI Reference

llmstrike probe

llmstrike list techniques

llmstrike list categories

Target Formats

OpenAI (default)

Anthropic

Generic

Raw

Scoring

Adding Custom Techniques

CI/CD Integration

GitHub Actions

Architecture

Reports

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

`llmstrike probe`

`llmstrike list techniques`

`llmstrike list categories`