Skip to main content

Adversarial security testing framework for LLM applications

Project description

PromptFuzz

Adversarial security testing for LLM applications.

Find prompt injection, jailbreak, and data extraction vulnerabilities before attackers do.

PyPI version Python 3.10+ License: AGPL-3.0 Tests


You ship an LLM-powered product. You add a system prompt. You think it's secure. It isn't. PromptFuzz finds out before your users do.

PromptFuzz fires 165+ real adversarial attack prompts — jailbreaks, prompt injections, data extraction attempts, goal hijacking, and edge cases — at your application and generates a professional vulnerability report in seconds.


Install

pip install promptfuzz

Optional extras:

pip install "promptfuzz[openai]"     # if your target uses OpenAI
pip install "promptfuzz[anthropic]"  # if your target uses Anthropic

Quick start

1. Interactive wizard (recommended for first-time users)

Wizard UI

Just run promptfuzz with no arguments:

$ promptfuzz

  PromptFuzz v0.1.0 — LLM Security Testing

  ? What is your target?
    > HTTP/HTTPS endpoint (URL)
      Python function (module:function)

  ? Enter target URL: http://localhost:8000/chat

  ? Select attack categories:
   ◉ data_extraction  — System prompt leaking, credential extraction
   ◉ injection        — Prompt override, delimiter attacks
   ◉ jailbreak        — Persona switches, DAN, roleplay bypasses
   ○ edge_cases       — Unicode, long inputs, encoding attacks
   ○ goal_hijacking   — Purpose redirection attacks

  ? Output format:
    > Terminal + HTML report  (report.html)

  ? Minimum severity to report:
    > low  (show everything)

  ─────────────────────────────────────────
  Target   : http://localhost:8000/chat
  Attacks  : 110  (data_extraction + injection + jailbreak)
  Output   : Terminal + report.html
  Severity : low+
  ─────────────────────────────────────────
  Press ENTER to start scan (Ctrl+C to cancel)

2. promptfuzz test — quickest way to run from the terminal

# Test any URL directly — no flags needed
promptfuzz test https://api.mychatbot.com/chat

# Save an HTML report
promptfuzz test https://api.mychatbot.com/chat --output report.html

# Test a local Python function
promptfuzz test myapp:chat_handler --categories jailbreak injection

# CI mode — exit code 1 if any high/critical vulnerability found
promptfuzz test https://api.mychatbot.com/chat --fail-on high

Sample terminal output:

Results

3. promptfuzz scan — full power CLI with config file support

promptfuzz scan --target http://localhost:8000/chat --output report.html
promptfuzz scan --config promptfuzz.yaml --fail-on high

3. Python API

from promptfuzz import Fuzzer

def my_chatbot(message: str) -> str:
    # your LLM call here
    return response

fuzzer = Fuzzer(
    target=my_chatbot,
    context="customer support chatbot",
    categories=["jailbreak", "injection", "data_extraction"],
)
result = fuzzer.run()
result.report()           # rich terminal output
result.save("report.html")  # HTML report

Testing a FastAPI endpoint

URL/Endpoint

PromptFuzz works with any HTTP endpoint that accepts POST requests. No code changes needed.

# your_app.py
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
async def chat(req: ChatRequest):
    reply = your_llm_call(req.message)
    return {"response": reply}
# Start your app
uvicorn your_app:app

# Test it
promptfuzz scan --target http://localhost:8000/chat

The runner auto-detects http:// targets, sends {"message": "..."} as the request body, and reads the "response" field from the reply. Both field names are configurable.


Attack categories

Category Count What it tests
jailbreak 40 DAN variants, persona switches, roleplay bypasses, encoding tricks
injection 40 Classic overrides, delimiter attacks, role elevation, instruction smuggling
data_extraction 30 System prompt leakage, credential extraction, reflection attacks
goal_hijacking 25 Competitor promotion, purpose replacement, loyalty switches
edge_cases 30 Unicode abuse, long inputs, encoding edge cases, null bytes
Total 165
promptfuzz list-attacks   # view full table with severity breakdown

CLI reference

promptfuzz scan [OPTIONS]

  --target, -t       URL or module:function path
  --config, -c       YAML config file (mutually exclusive with --target)
  --context          Description of the target application
  --categories, -C   Attack categories to run (repeatable)
  --output, -o       Save HTML report to path
  --json             Save JSON report to path
  --severity, -s     Minimum severity to display [low|medium|high|critical]
  --fail-on, -f      Exit code 1 if vulns at/above this severity are found
  --max-workers, -w  Concurrent request workers (default: 5)
  --timeout, -T      Per-attack timeout seconds (default: 30)
  --verbose, -v      Enable verbose output

Config file

# promptfuzz.yaml
target: "http://localhost:8000/chat"
context: "customer support chatbot"
categories:
  - jailbreak
  - injection
  - data_extraction
max_workers: 5
timeout: 30
headers:
  Authorization: "Bearer YOUR_TOKEN"
input_field: message
output_field: response
promptfuzz scan --config promptfuzz.yaml --output report.html
promptfuzz validate --config promptfuzz.yaml   # validate before running

CI/CD integration

# .github/workflows/security.yml
name: LLM Security
on: [push, pull_request]

jobs:
  promptfuzz:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install promptfuzz
      - name: Start app
        run: uvicorn myapp:app &
      - name: Run security scan
        run: |
          promptfuzz scan \
            --target http://localhost:8000/chat \
            --categories jailbreak injection \
            --fail-on high \
            --output report.html
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: security-report
          path: report.html

--fail-on high exits with code 1 if any high or critical vulnerability is found, blocking the merge.


Security score

Score Risk level
80–100 Low risk
50–79 Medium risk
20–49 High risk
0–19 Critical risk

Formula: max(0, 100 − (critical×25 + high×10 + medium×5 + low×2))


Contributing

Adding new attacks is the easiest way to contribute. Each attack is a JSON object in one of the five files under promptfuzz/attacks/. Copy an existing entry, update the id, name, prompt, and detection fields, and open a PR.

{
  "id": "JB-041",
  "name": "My new jailbreak",
  "category": "jailbreak",
  "severity": "high",
  "description": "What this attack does and why it matters.",
  "prompt": "The actual adversarial prompt sent to the LLM.",
  "detection": {
    "method": "refusal",
    "indicators": [],
    "success_if": "refusal_absent"
  },
  "tags": ["jailbreak", "persona"],
  "remediation": "Add explicit system prompt instruction to refuse persona changes."
}

Run ruff check . and pytest tests/ -v before opening a PR.


License

AGPL-3.0 © PromptFuzz Contributors

Free to use for personal projects, security research, and open-source software. Commercial use in closed-source products requires a commercial license — open an issue to discuss.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

promptfuzz-0.2.0.tar.gz (96.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

promptfuzz-0.2.0-py3-none-any.whl (100.7 kB view details)

Uploaded Python 3

File details

Details for the file promptfuzz-0.2.0.tar.gz.

File metadata

  • Download URL: promptfuzz-0.2.0.tar.gz
  • Upload date:
  • Size: 96.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for promptfuzz-0.2.0.tar.gz
Algorithm Hash digest
SHA256 6817de240b998938a68a8036f60380966e516649e8ad8dc8073bc4005e240a5a
MD5 65f675edeeaa266a1cd8b2df4b9bc813
BLAKE2b-256 aabf5390296d1a0cbf8a98f6cf334cf72509ddc2e7d47c3df12f41cad2e81505

See more details on using hashes here.

Provenance

The following attestation bundles were made for promptfuzz-0.2.0.tar.gz:

Publisher: publish.yml on VaradDurge/PromptFuzz

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file promptfuzz-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: promptfuzz-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 100.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for promptfuzz-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4c0208c351d6e664d623087066f190fee8e73b8d4b102cc3d5946fcfc7bf4d42
MD5 7cfd3cf8d5d064092773aa4fa7e79c60
BLAKE2b-256 9f5d09eb915306e7b21af659462a92b9d30cbfb3e294bcca0e31fa610b87baee

See more details on using hashes here.

Provenance

The following attestation bundles were made for promptfuzz-0.2.0-py3-none-any.whl:

Publisher: publish.yml on VaradDurge/PromptFuzz

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page