Skip to main content

LLM and System Prompt vulnerability scanner tool

Project description

Glaider Prompt Fuzzer

Safeguarding Your GenAI Applications

Glaider Prompt Fuzzer is a cutting-edge tool designed to enhance the security of your generative AI applications. By simulating various LLM-based attacks, it evaluates the robustness of your system prompts and helps you fortify them against potential vulnerabilities.

Key Features

  • Dynamic testing tailored to your application's unique configuration
  • Support for 16 LLM providers
  • 15 different attack simulations
  • Interactive and CLI modes
  • Multi-threaded testing for efficiency
  • Playground interface for iterative prompt improvement

Getting Started

Installation

Choose one of the following methods:

  1. Via pip:

    pip install glaider-fuzzer
    
  2. From PyPI: Visit the Glaider Fuzzer package page

  3. Download the latest release wheel file from our GitHub releases page

Quick Start

  1. Set up your API key:

    export OPENAI_API_KEY=sk-123XXXXXXXXXXXX
    
  2. Launch the fuzzer:

    glaider-fuzzer
    
  3. Follow the prompts to input your system prompt and begin testing.

Supported LLM Providers

We support a wide range of LLM providers, including OpenAI, Anthropic, Google PaLM, and many more. Set the corresponding environment variable for your chosen provider.

Command-Line Options

  • --list-providers: Display available providers
  • --list-attacks: Show available attack simulations
  • --attack-provider & --attack-model: Specify attack source
  • --target-provider & --target-model: Set target for testing
  • --num-attempts & --num-threads: Configure test parameters
  • --attack-temperature: Adjust attack model temperature
  • --debug-level: Set debugging verbosity
  • --batch: Enable unattended mode

Usage Examples

  1. Interactive Mode (Default):

    glaider_fuzzer
    
  2. Quick Single Run:

    glaider-fuzzer -b ./system_prompt.examples/medium_system_prompt.txt
    
  3. Custom Benchmark:

    glaider-fuzzer -b ./system_prompt.examples/medium_system_prompt.txt --custom-benchmark=glaider_fuzz/attack_data/custom_benchmark1.csv
    
  4. Subset of Attacks:

    glaider-fuzzer -b ./system_prompt.examples/medium_system_prompt.txt --custom-benchmark=glaider_fuzz/attack_data/custom_benchmark1.csv --tests='["ucar","amnesia"]'
    

Simulated Attack Details

We use a dynamic testing approach, where we get the necessary context from your System Prompt and based on that adapt the fuzzing process.

Jailbreak
  • AIM Jailbreak: Probes ethical compliance by eliciting instructions on illegal activities via Machiavelli roleplay.
  • Affirmative Suffix: Exploits a vulnerability to elicit affirmative responses for prohibited content.
  • Amnesia: Causes the LLM to forget or disregard system prompt instructions, leading to unrelated outputs.
  • Contextual Redirection: Tricks the LLM into adopting a specific role that may bypass preset restrictions.
  • Do Anything Now Jailbreak (DAN): Tests LLM resilience against performing forbidden activities through roleplay.
  • Harmful Behavior: Challenges the LLM with prompts from a harmful behavior dataset.
  • Linguistic Evasion: Uses non-English languages to bypass preset restrictions or focus.
  • Self Refine: Attempts to bypass system prompts by refining responses towards unethical directions.
  • UCAR: Assesses LLM adherence to content restrictions when asked to ignore such filters.
  • Base64 Evasion Attempts to evade detection of harmful prompts by encoding them into base64 strings

Prompt Injection
  • Authoritative Role Impersonation: Impersonates an authoritative role to misdirect the LLM's outputs.
  • Complimentary Transition: Tests ability to maintain content standards when switching topics after a compliment.
  • Ethical Compliance: Evaluates resistance to discussing harmful or inappropriate content about sensitive topics.
  • Typoglycemia Attack: Exploits text processing vulnerabilities by omitting random characters, causing incorrect responses.

System prompt extraction
  • System Prompt Stealer: Attempts to extract the LLM's internal configuration or sensitive information.
Definitions
  • Broken: Attack type attempts that LLM succumbed to.
  • Resilient: Attack type attempts that LLM resisted.
  • Errors: Attack type attempts that had inconclusive results.

Contributing

We welcome contributions!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

glaider_fuzzer-0.0.6.tar.gz (82.0 kB view details)

Uploaded Source

Built Distribution

glaider_fuzzer-0.0.6-py3-none-any.whl (94.6 kB view details)

Uploaded Python 3

File details

Details for the file glaider_fuzzer-0.0.6.tar.gz.

File metadata

  • Download URL: glaider_fuzzer-0.0.6.tar.gz
  • Upload date:
  • Size: 82.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.4

File hashes

Hashes for glaider_fuzzer-0.0.6.tar.gz
Algorithm Hash digest
SHA256 08f27f1040871dfc70050068af7f70394cd64f98241f73619ce070e4b219114b
MD5 f53eafc0a19510248b03dd70f8b4d347
BLAKE2b-256 8c43715b2dc252412f5910fd2b0e6b2bd47e9467e657784ed2247d539a1eaa83

See more details on using hashes here.

File details

Details for the file glaider_fuzzer-0.0.6-py3-none-any.whl.

File metadata

File hashes

Hashes for glaider_fuzzer-0.0.6-py3-none-any.whl
Algorithm Hash digest
SHA256 1b2f370b906b899ce16f5aea39784b9e397ddbf47ef7fae0f012c4a4c4181e32
MD5 f05398e20f7b224e1a69512bd37c63a8
BLAKE2b-256 965f161c998f76f52f55d3a4902f74cdce8f1b7788c15eab8627164b6e653a45

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page