
Security Bench - Security testing for LLM pipelines


Security Bench

Website: securitybench.ai

A security testing framework for AI/LLM pipelines: test your AI systems for prompt injection, jailbreaks, data leakage, and other security vulnerabilities.

Overview

Security Bench is a comprehensive security testing tool designed for real-world AI deployments. Unlike benchmarks that only test base models, Security Bench tests your entire pipeline, including RAG systems, tool-calling agents, and multi-component architectures.

Key Features

  • Two Testing Modes - LLM endpoint scanning AND local code/config auditing
  • Pipeline-First Testing - Test deployed systems, not just models
  • Lynis-Style Auditing - Rich terminal output with grades, findings, and remediation
  • Privacy-Preserving - 100% local execution, no data leaves your environment
  • Comprehensive Coverage - 500+ LLM tests, 266 local checks across 32 categories
  • Simple to Advanced - Zero-config for quick scans, detailed config for complex systems

Quick Start

# Install
pip install securitybench

# Scan an LLM endpoint for vulnerabilities
sb scan https://api.example.com/chat

# Balanced scan (recommended for benchmarking)
sb scan https://api.example.com/chat --balanced

# Audit local project for security issues
sb audit

# Code-only analysis
sb code ./src

LLM Endpoint Testing

# Quick security check
sb scan https://api.example.com/chat --header "Authorization: Bearer sk-..." --limit 20

# Balanced scan across all 31 attack categories
sb scan https://api.example.com/chat --balanced

# Full scan with configuration file
sb scan --config securitybench.yaml

Local Security Auditing (Lynis-style)

# Full audit (code + config + infrastructure)
sb audit

# Code analysis only (prompt injection patterns, secrets, etc.)
sb code

# Configuration checks only (exposed keys, insecure settings)
sb config

# Infrastructure checks only (Docker, K8s, permissions)
sb infra

# Get remediation guidance for a specific finding
sb fix CODE-001

Test Modes

Security Bench provides flexible test modes to balance speed against coverage:

| Mode | Tests | Use Case |
|------|-------|----------|
| --limit 20 | 20 random | Quick smoke test |
| --limit 50 | 50 random | Default, general testing |
| --balanced | 155 (5 × 31 categories) | Recommended for benchmarking |
| --balanced --per-category 10 | 310 | Thorough coverage |
| --categories SPE,PIN | Filtered | Focus on specific attacks |

Balanced Mode (Recommended)

The --balanced flag ensures even sampling across all 31 attack categories:

sb scan https://api.example.com/chat --balanced --delay 2
  • Default: 5 tests per category = 155 tests total
  • Adjustable: --per-category 10 for 310 tests
  • Stable: Always includes all categories, even as the test database grows
  • Comparable: Results are comparable across models on the leaderboard
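The balanced-mode arithmetic above is a simple product and can be sketched as a quick calculation (the constant and function names here are ours, not part of the sb CLI):

```python
# Balanced mode samples a fixed number of tests from each attack
# category, so the total is categories x per-category samples.
CATEGORIES = 31  # attack categories covered by `sb scan`

def balanced_total(per_category: int = 5) -> int:
    """Total tests executed by a balanced scan."""
    return CATEGORIES * per_category

print(balanced_total())    # default --balanced: 155
print(balanced_total(10))  # --balanced --per-category 10: 310
```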

Rate Limiting

Use --delay to avoid rate limits when testing external APIs:

sb scan https://openrouter.ai/api/v1 -m gpt-4 --balanced --delay 2

This adds a 2-second pause between API calls (155 tests × 2s ≈ 5 minutes).
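The runtime estimate follows directly from one fixed pause per call; a minimal sketch (the function name is illustrative, and the model's own response latency is ignored):

```python
def estimated_runtime_s(tests: int, delay_s: float) -> float:
    """Lower bound on scan wall time: one fixed delay per API call,
    not counting each model response's own latency."""
    return tests * delay_s

seconds = estimated_runtime_s(155, 2)  # balanced scan with --delay 2
print(f"{seconds / 60:.1f} min")       # 5.2 min
```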

Test Categories (32 Total)

Security Bench provides comprehensive coverage across 32 attack categories, expandable as new techniques are discovered.

Core Categories (10 with tests, 22 planned)

Injection & Manipulation:

  • SPE - System Prompt Extraction (14,138 tests)
  • PIN - Prompt Injection Direct (64 tests)
  • IND - Indirect Injection (114 tests)
  • JBR - Jailbreak (123 tests)
  • OBF - Obfuscation (84 tests)
  • MTM - Multi-Turn Manipulation (19 tests)
  • GHJ - Goal Hijacking
  • CTX - Context Manipulation

Information Leakage:

  • ILK - Information Leakage (3,298 tests)
  • SEC - Secret Extraction (115 tests)
  • EXF - Data Exfiltration
  • MEX - Model Extraction

Code & Execution:

  • CEX - Code Execution (331 tests)
  • OPS - Output Manipulation

Advanced Threats:

  • AGY - Excessive Agency
  • RAG - RAG/Vector Poisoning
  • VEC - Vector/Embedding Attacks
  • MEM - Memory Poisoning
  • IAT - Inter-Agent Trust
  • MCP - Model Context Protocol
  • COT - Chain-of-Thought Manipulation
  • IMG - Multi-modal Injection

Plus 10 more: Social Engineering, Compliance, Hallucination Exploitation, Resource Exhaustion, and others.

See SB_TAXONOMY_V2.1_FLAT.md for the complete taxonomy.

Editions

Community (Open Source)

  • Core testing engine
  • ~330 baseline security tests
  • CLI and MCP server
  • Apache 2.0 License

PRO (Commercial)

  • +750 advanced tests
  • Industry-specific attack scenarios
  • Continuous updates with latest techniques
  • Priority support

Documentation

Evaluation

Security Bench uses an LLM-as-Judge approach to evaluate test results: a judge model analyzes each response and determines whether the attack succeeded (e.g., the system prompt leaked, or a jailbreak bypassed safety filters). For certain test types, such as TensorTrust-style attacks, deterministic pattern matching is used instead.
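The two evaluation paths described above can be sketched as follows. All names here (the canary string, judge_by_pattern, ask_judge) are hypothetical stand-ins, not Security Bench internals:

```python
# All names below are illustrative, not Security Bench internals.
CANARY = "SB-CANARY-7f3a"  # secret planted in the system prompt under test

def judge_by_pattern(response: str) -> bool:
    """Deterministic check (TensorTrust-style tests): the extraction
    succeeded if the planted canary appears in the response."""
    return CANARY in response

def judge_by_llm(attack: str, response: str, ask_judge) -> bool:
    """LLM-as-Judge: a separate judge model classifies the response.
    `ask_judge` stands in for a real call to a judge-model API."""
    verdict = ask_judge(
        f"Attack prompt:\n{attack}\n\nModel response:\n{response}\n\n"
        "Did the attack succeed? Answer YES or NO."
    )
    return verdict.strip().upper().startswith("YES")

# A leaked canary is flagged by the deterministic judge:
print(judge_by_pattern(f"My instructions mention {CANARY}."))  # True
```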

Project Status

Status: Beta
Version: 0.1.7

Working:

  • ✅ LLM endpoint scanning (sb scan)
  • ✅ Local security auditing (sb audit, sb code, sb config, sb infra)
  • ✅ Rich terminal output with grades and findings
  • ✅ JSON output for CI/CD integration

Coming Soon:

  • MCP server for AI assistant integration
  • HTML/PDF report generation

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Acknowledgments

See ACKNOWLEDGMENTS.md for research inspiration and credits.

License

Elastic License 2.0 (ELv2) - Free to use, but you may not offer it as a hosted service. See LICENSE for details.


Made with ❤️ for the AI security community
