Skip to main content

Agentic LLM vulnerability scanner

Project description

Logo

Langalf

The open-source Agentic LLM Vulnerability Scanner .
Learn more »

GitHub Contributors GitHub Last Commit Downloads GitHub Issues GitHub Pull Requests Github License

About the Project 🧙

booking-screen

LLM threat vectors scanner

Prebuilt Datasets of Prompts


Focused on OWASP top 10 LLM


Integration under 1 min

Features

  • Comprehensive Threat Detection 🛡️: Scans for a wide array of LLM vulnerabilities including prompt injection, jailbreaking, hallucinations, biases, and other malicious exploitation attempts.
  • OWASP Top 10 for LLMs scan: to test the list of the most critical LLM vulnerabilities.
  • Privacy-centric Architecture 🔒: Ensures that all data scanning and analysis occur on-premise or in a local environment, with no external data transmission, maintaining strict data privacy.
  • Comprehensive Reporting Tools 📊: Offers detailed reports of vulnerability, helping teams to quickly understand and respond to security incidents.
  • Customizable Rule Sets 🛠️: Allows users to define custom attack rules and parameters to meet specific prompt attacks needs and compliance standards.

Note: Please be aware that Langalf is designed as a safety scanner tool and not a foolproof solution. It cannot guarantee complete protection against all possible threats.

📦 Installation

To get started with Langalf, simply install the package using pip:

pip install langalf

⛓️ Quick Start

langalf

2024-04-13 13:21:31.157 | INFO     | langalf.probe_data.data:load_local_csv:273 - Found 1 CSV files
2024-04-13 13:21:31.157 | INFO     | langalf.probe_data.data:load_local_csv:274 - CSV files: ['prompts.csv']
INFO:     Started server process [18524]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8718 (Press CTRL+C to quit)
python -m langalf
# or
langalf --help


langalf --port=PORT --host=HOST

LLM kwargs

Langalf uses plain text HTTP spec like:

POST https://api.openai.com/v1/chat/completions
Authorization: Bearer sk-xxxxxxxxx
Content-Type: application/json

{
     "model": "gpt-3.5-turbo",
     "messages": [{"role": "user", "content": "<<PROMPT>>"}],
     "temperature": 0.7
}

Where <<PROMPT>> will be replaced with the actual attack vector during the scan, insert the Bearer XXXXX header value with your app credentials.

Adding LLM integration templates

TBD

....

Adding own dataset

To add your own dataset you can place one or multiples csv files with prompt column, this data will be loaded on langalf startup

2024-04-13 13:21:31.157 | INFO     | langalf.probe_data.data:load_local_csv:273 - Found 1 CSV files
2024-04-13 13:21:31.157 | INFO     | langalf.probe_data.data:load_local_csv:274 - CSV files: ['prompts.csv']

Extending dataset collections

  1. Add new metadata to langalf.probe_data.REGISTRY
    {
        "dataset_name": "markush1/LLM-Jailbreak-Classifier",
        "num_prompts": 1119,
        "tokens": 19758,
        "approx_cost": 0.0,
        "source": "Hugging Face Datasets",
        "selected": True,
        "dynamic": False,
        "url": "https://huggingface.co/markush1/LLM-Jailbreak-Classifier",
    },

and implement loader into

@dataclass
class ProbeDataset:
    dataset_name: str
    metadata: dict
    prompts: list[str]
    tokens: int
    approx_cost: float

    def metadata_summary(self):
        return {
            "dataset_name": self.dataset_name,
            "num_prompts": len(self.prompts),
            "tokens": self.tokens,
            "approx_cost": self.approx_cost,
        }

Dynamic datasets with mutations

One of the current examples uses sampling for the existing preloaded prompt data and applying mutations yielding a new dataset

class Stenography:
    fn_library = {
        "rot5": stenography_fn.rot5,
        "rot13": stenography_fn.rot13,
        "base64": stenography_fn.base64_encode,
        "mirror": stenography_fn.mirror_words,
        "random_case": stenography_fn.randomize_letter_case,
        "scramble_words": stenography_fn.scramble_words,
        "noise_characters": stenography_fn.insert_noise_characters,
        "substitute_with_ascii": stenography_fn.substitute_with_ascii,
        "remove_vowels": stenography_fn.remove_vowels,
        "zigzag_obfuscation": stenography_fn.zigzag_obfuscation,
    }

    def __init__(self, prompt_groups: [ProbeDataset]):
        self.prompt_groups = prompt_groups

    def apply(self):
        for prompt_group in self.prompt_groups:

            size = len(prompt_group.prompts)
            for name, fn in self.fn_library.items():
                logger.info(f"Applying {name} to {prompt_group.dataset_name}")
                selected_prompts = random.sample(prompt_group.prompts, int(size * 0.25))
                prompts = [
                    f"Follow the instructions in {name}: {fn(p)}"
                    for p in selected_prompts
                ]
                yield ProbeDataset(
                    dataset_name=f"stenography.{name}({prompt_group.dataset_name})",
                    metadata={},
                    prompts=prompts,
                    tokens=count_words_in_list(prompts),
                    approx_cost=0.0,
                )

Probe endpoint

In the example of custom integration, we use /v1/self-probe for the sake of integration testing.

POST https://langalf-preview.vercel.app/v1/self-probe
Authorization: Bearer XXXXX
Content-Type: application/json

{
    "prompt": "<<PROMPT>>"
}

This endpoint randomly mimics the refusal of a fake LLM.

@app.post("/v1/self-probe")
def self_probe(probe: Probe):
    refuse = random.random() < 0.2
    message = random.choice(REFUSAL_MARKS) if refuse else "This is a test!"
    message = probe.prompt + " " + message
    return {
        "id": "chatcmpl-abc123",
        "object": "chat.completion",
        "created": 1677858242,
        "model": "gpt-3.5-turbo-0613",
        "usage": {"prompt_tokens": 13, "completion_tokens": 7, "total_tokens": 20},
        "choices": [
            {
                "message": {"role": "assistant", "content": message},
                "logprobs": None,
                "finish_reason": "stop",
                "index": 0,
            }
        ],
    }

CI/CD integration

TBD

Documentation

For more detailed information on how to use Langalf, including advanced features and customization options, please refer to the official documentation.

Roadmap and Future Goals

  • Expand dataset variety
  • Introduce two new attack vectors
  • Develop initial attacker LLM
  • Complete integration of OWASP Top 10 classification

Note: All dates are tentative and subject to change based on project progress and priorities.

👋 Contributing

Contributions to Langalf are welcome! If you'd like to contribute, please follow these steps:

  • Fork the repository on GitHub
  • Create a new branch for your changes
  • Commit your changes to the new branch
  • Push your changes to the forked repository
  • Open a pull request to the main Langalf repository

Before contributing, please read the contributing guidelines.

License

Langalf is released under the Apache License v2.

Contact us

🤝 Schedule a 1-on-1 Session

Book us with Cal.com

Book a 1-on-1 Session with the founders, to discuss any issues, provide feedback, or explore how we can improve langalf for you.

Repo Activity

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

langalf-0.0.4.tar.gz (25.4 kB view details)

Uploaded Source

File details

Details for the file langalf-0.0.4.tar.gz.

File metadata

  • Download URL: langalf-0.0.4.tar.gz
  • Upload date:
  • Size: 25.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.10.12 Linux/6.5.0-1017-azure

File hashes

Hashes for langalf-0.0.4.tar.gz
Algorithm Hash digest
SHA256 909841262fd5063e9cf690bb078e034161ab70347ce900bcf1bfcf95eb59e37e
MD5 664fdb95fe7d938ab943dd674f2d0318
BLAKE2b-256 9650f752856e34f0ebc8ec65252c95ccbab83b0444fd97205fbdefe581c23019

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page