LLM security testing framework with CVE-style severity scoring and multi-model benchmarking

AI Safety Testing

⚡ Quick Start (30 seconds)

Install the package:

pip install ai-safety-tester

Then run the built-in tests from Python:

from ai_safety_tester import SimpleAITester

tester = SimpleAITester(model="llama3.2:1b")
results = tester.run_all_tests()

Output:

==================================================
AI Safety Testing Results
==================================================
basic_response       ✅ PASS
refusal              ✅ PASS
math                 ✅ PASS
==================================================
Total: 3/3 tests passed
==================================================

🎯 Features

  • Standard academic benchmarks (MMLU, TruthfulQA, HellaSwag; 24K+ questions)
  • CVE-style severity scoring (CRITICAL/HIGH/MEDIUM/LOW)
  • Multi-provider (Ollama local, OpenAI cloud)
  • Multi-model comparison with HTML dashboards
  • Semantic similarity detection (optional)
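To make the CVE-style severity labels concrete, here is a minimal sketch that maps a 0-10 score to a label using the CVSS v3.x qualitative rating scale. The helper name `severity_label` is hypothetical, and the thresholds are those of CVSS; this README does not confirm that ai-safety-tester uses the same cut-offs internally.

```python
def severity_label(score: float) -> str:
    """Map a 0-10 score to a CVSS v3.x qualitative severity label.

    Hypothetical helper: ai-safety-tester may use different thresholds.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score must be in [0, 10], got {score}")
    if score == 0.0:
        return "NONE"
    if score < 4.0:
        return "LOW"       # 0.1 - 3.9
    if score < 7.0:
        return "MEDIUM"    # 4.0 - 6.9
    if score < 9.0:
        return "HIGH"      # 7.0 - 8.9
    return "CRITICAL"      # 9.0 - 10.0


for s in (1.2, 4.8, 7.5, 9.8):
    print(f"{s:>4} -> {severity_label(s)}")
```

Under this scale a lower score means a less severe finding, so a model with a low aggregate score would be the safer one.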

📊 Compare Models

from ai_safety_tester import SimpleAITester
from ai_safety_tester.benchmark import BenchmarkDashboard

# Test multiple models
results_llama = SimpleAITester(model="llama3.2:1b").run_all_tests()
results_mistral = SimpleAITester(model="mistral:7b").run_all_tests()

# Generate comparison
bench_llama = BenchmarkDashboard.run_benchmark("llama3.2:1b", results_llama)
bench_mistral = BenchmarkDashboard.run_benchmark("mistral:7b", results_mistral)

print(BenchmarkDashboard.generate_comparison_table([bench_llama, bench_mistral]))

Output:

| Rank | Model         | Pass Rate | Security Score | Status     |
|------|---------------|-----------|----------------|------------|
| 1    | mistral:7b    | 95.8%     | 1.2/10         | ✅ Secure  |
| 2    | llama3.2:1b   | 83.3%     | 4.8/10         | ⚠️ Moderate |
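The Pass Rate column is presumably the share of tests that passed. The sketch below shows that arithmetic and the ranking step, assuming results shaped as a dict of test name to boolean. That shape, the helper names `pass_rate` and `rank_models`, and the example data are all hypothetical; the actual return type of `run_all_tests()` is not documented in this README.

```python
def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of tests that passed (0.0 for an empty result set)."""
    if not results:
        return 0.0
    return sum(results.values()) / len(results)


def rank_models(all_results: dict[str, dict[str, bool]]) -> list[tuple[str, float]]:
    """Sort models by pass rate, best first."""
    rates = {model: pass_rate(r) for model, r in all_results.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Illustrative data only: 24 hypothetical tests per model, chosen so the
# ratios match the percentages in the sample table (20/24 and 23/24).
example = {
    "llama3.2:1b": {f"test_{i}": i < 20 for i in range(24)},
    "mistral:7b": {f"test_{i}": i < 23 for i in range(24)},
}

for rank, (model, rate) in enumerate(rank_models(example), start=1):
    print(f"{rank}. {model}: {rate:.1%}")
```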

🔬 Run Academic Benchmarks

from ai_safety_tester import SimpleAITester
from ai_safety_tester.benchmark import BenchmarkRunner

tester = SimpleAITester(model="llama3.2:1b")

# Quick sample (100 questions, ~5 min)
runner = BenchmarkRunner(tester, use_full_datasets=True, sample_size=100)
results = runner.run_all()

print(f"MMLU: {results['mmlu']['accuracy']:.1%}")
print(f"TruthfulQA: {results['truthfulqa']['truthfulness_score']:.1%}")
print(f"HellaSwag: {results['hellaswag']['accuracy']:.1%}")
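Scores from a small sample carry sampling noise, which matters when comparing models on 100 questions. As a rough guide, the sketch below computes the normal-approximation 95% margin of error for an accuracy measured on `n` questions. The helper `margin_of_error` is hypothetical and not part of ai-safety-tester.

```python
import math


def margin_of_error(accuracy: float, n: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a proportion.

    z=1.96 corresponds to a 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be in [0, 1]")
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n)


# Worst case is accuracy = 0.5, where the variance of a proportion peaks.
for n in (100, 1000):
    print(f"n={n}: +/- {margin_of_error(0.5, n):.1%}")
```

At 50% accuracy, a 100-question sample gives roughly ±9.8 percentage points, so differences of a few points between models on a quick run should be read with caution.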

🔐 OpenAI Support

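Install the OpenAI extra (the quotes keep shells such as zsh from expanding the brackets):

pip install "ai-safety-tester[openai]"

Then create a provider:

from ai_safety_tester.providers import OpenAIProvider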

provider = OpenAIProvider(model="gpt-3.5-turbo")  # Uses OPENAI_API_KEY env var
result = provider.generate("Test prompt")
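Because the provider reads the key from `OPENAI_API_KEY`, it can help to fail early with a clear message when the variable is missing. A minimal sketch using only the standard library; `require_api_key` is a hypothetical helper, and the provider may already perform its own validation.

```python
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment or raise a clear error."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before creating the provider"
        )
    return key
```

Calling `require_api_key()` just before constructing `OpenAIProvider` surfaces a configuration problem before any request is made.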

🔗 Requirements

  • Python 3.11+
  • Ollama (for local models)
  • At least one pulled model, e.g.: ollama pull llama3.2:1b
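The requirements above can be checked up front with a short preflight script. This is a sketch using only the standard library; `check_environment` is a hypothetical helper, and it only verifies that an `ollama` executable is on `PATH`, not that the server is running or that a model has been pulled.

```python
import shutil
import sys


def check_environment(min_python: tuple[int, int] = (3, 11)) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted}+ required, found {found}")
    if shutil.which("ollama") is None:
        problems.append("ollama executable not found on PATH")
    return problems


for problem in check_environment():
    print("WARNING:", problem)
```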

📝 License

MIT


Author: Nahuel | Date: November 2025
