
AI Safety Testing


LLM Security Testing Framework with CVE-style severity scoring and multi-model benchmarking

⚡ Quick Start (30 seconds)

pip install ai-safety-tester

from ai_safety_tester import SimpleAITester

tester = SimpleAITester(model="llama3.2:1b")
results = tester.run_all_tests()

Output:

==================================================
AI Safety Testing Results
==================================================
basic_response       ✅ PASS
refusal              ✅ PASS
math                 ✅ PASS
==================================================
Total: 3/3 tests passed
==================================================
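
The return value can also drive automation, such as a CI gate. A minimal sketch, assuming run_all_tests() returns a mapping of test name to a truthy pass flag (the exact return shape isn't documented on this page):

import sys
from ai_safety_tester import SimpleAITester

results = SimpleAITester(model="llama3.2:1b").run_all_tests()

# Fail the build if any safety test did not pass.
# Assumption: results behaves like {"basic_response": True, ...}.
if not all(results.values()):
    sys.exit(1)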

🎯 Features

  • Real benchmarks: MMLU, TruthfulQA, and HellaSwag (24K+ questions)
  • CVE-style severity scoring (CRITICAL/HIGH/MEDIUM/LOW); see the sketch after this list
  • Multi-provider (Ollama local, OpenAI cloud)
  • Multi-model comparison with HTML dashboards
  • Semantic similarity detection (optional)
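
The severity buckets mirror the familiar CVSS v3 ranges, which also explains the 0-10 security scores in the comparison table below (higher = riskier). The helper here is an illustrative sketch of that mapping, not the package's actual API:

def cvss_severity(score: float) -> str:
    """Map a 0-10 risk score to CVSS v3-style buckets (illustrative only)."""
    if score >= 9.0:
        return "CRITICAL"
    if score >= 7.0:
        return "HIGH"
    if score >= 4.0:
        return "MEDIUM"
    return "LOW"

print(cvss_severity(4.8))  # MEDIUM, consistent with the "Moderate" rating below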

📊 Compare Models

from ai_safety_tester import SimpleAITester
from ai_safety_tester.benchmark import BenchmarkDashboard

# Test multiple models
results_llama = SimpleAITester(model="llama3.2:1b").run_all_tests()
results_mistral = SimpleAITester(model="mistral:7b").run_all_tests()

# Generate comparison
bench_llama = BenchmarkDashboard.run_benchmark("llama3.2:1b", results_llama)
bench_mistral = BenchmarkDashboard.run_benchmark("mistral:7b", results_mistral)

print(BenchmarkDashboard.generate_comparison_table([bench_llama, bench_mistral]))

Output:

| Rank | Model         | Pass Rate | Security Score | Status     |
|------|---------------|-----------|----------------|------------|
| 1    | mistral:7b    | 95.8%     | 1.2/10         | ✅ Secure  |
| 2    | llama3.2:1b   | 83.3%     | 4.8/10         | ⚠️ Moderate |
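
Note that Security Score is a risk score on the CVSS-like 0-10 scale, so lower is better. The feature list also mentions HTML dashboards; presumably these come from the same BenchmarkDashboard class. The method name below is an assumption, so verify it against the package's own docs:

# Hypothetical API: write the comparison out as an HTML report.
# generate_html_dashboard is an assumed name, not confirmed by this page.
html = BenchmarkDashboard.generate_html_dashboard([bench_llama, bench_mistral])
with open("comparison.html", "w") as f:
    f.write(html)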

🔬 Run Academic Benchmarks

from ai_safety_tester import SimpleAITester
from ai_safety_tester.benchmark import BenchmarkRunner

tester = SimpleAITester(model="llama3.2:1b")

# Quick sample (100 questions, ~5 min)
runner = BenchmarkRunner(tester, use_full_datasets=True, sample_size=100)
results = runner.run_all()

print(f"MMLU: {results['mmlu']['accuracy']:.1%}")
print(f"TruthfulQA: {results['truthfulqa']['truthfulness_score']:.1%}")
print(f"HellaSwag: {results['hellaswag']['accuracy']:.1%}")

🔐 OpenAI Support

pip install "ai-safety-tester[openai]"

from ai_safety_tester.providers import OpenAIProvider

provider = OpenAIProvider(model="gpt-3.5-turbo")  # Uses OPENAI_API_KEY env var
result = provider.generate("Test prompt")
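
Since the key is read from the environment, nothing sensitive needs to appear in code. A minimal sketch (gpt-4o-mini and the in-process env assignment are illustrative choices, not package requirements):

import os
from ai_safety_tester.providers import OpenAIProvider

# Prefer exporting OPENAI_API_KEY in your shell; setting it here is for illustration only.
os.environ.setdefault("OPENAI_API_KEY", "sk-...")

provider = OpenAIProvider(model="gpt-4o-mini")
print(provider.generate("Reply with the single word: pong"))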

📖 Documentation

🔗 Requirements

  • Python 3.11+
  • Ollama (for local models)
  • Models: ollama pull llama3.2:1b

📝 License

MIT


Author: Nahuel | Date: November 2025
