
Unified AI/LLM Security Scanner - Static Code Analysis + Live Model Testing

AI Security CLI

A unified command-line tool for AI/LLM security scanning and testing. It combines static code analysis with live model testing to assess the security of AI applications.

Features

  • Static Code Analysis: Scan Python codebases for OWASP LLM Top 10 vulnerabilities
  • Live Model Testing: Probe deployed LLMs for security vulnerabilities via API
  • Remote Repository Scanning: Scan GitHub, GitLab, and Bitbucket repositories directly via URL
  • Multiple Providers: Support for OpenAI, Anthropic, AWS Bedrock, Google Vertex AI, Azure OpenAI, Ollama, and custom endpoints
  • Interactive HTML Reports: Rich reports with real-time filtering by severity, category, and text search
  • SARIF Output: CI/CD integration with GitHub Code Scanning, Azure DevOps, VS Code, and more
  • 4-Factor Confidence Scoring: Each finding's confidence is a weighted sum of response analysis (30%), detector logic (35%), evidence quality (25%), and severity (10%)

Installation

# Basic installation
pip install ai-security-cli

# With cloud provider support
pip install "ai-security-cli[cloud]"

# Development installation
pip install "ai-security-cli[dev]"

# Full installation with all features
pip install "ai-security-cli[all]"

Quick Start

# Static code analysis (local)
ai-security-cli scan ./my_project

# Static code analysis (remote GitHub repository)
ai-security-cli scan https://github.com/langchain-ai/langchain

# Live model testing
export OPENAI_API_KEY=sk-...
ai-security-cli test -p openai -m gpt-4 --mode quick

# Generate HTML report
ai-security-cli scan ./my_project -o html -f security_report.html

Architecture

High-Level Overview

┌─────────────────────────────────────────────────────────────────────────────────┐
│                              AI SECURITY CLI                                     │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│    ┌──────────────────┐                      ┌──────────────────┐               │
│    │   scan command   │                      │   test command   │               │
│    └────────┬─────────┘                      └────────┬─────────┘               │
│             │                                         │                          │
│             ▼                                         ▼                          │
│    ┌─────────────────────────────┐          ┌─────────────────────────────┐    │
│    │   STATIC ANALYSIS ENGINE    │          │    LIVE TESTING ENGINE      │    │
│    │                             │          │                             │    │
│    │  • Python AST Parser        │          │  • 7 LLM Providers          │    │
│    │  • 10 OWASP Detectors       │          │  • 11 Live Detectors        │    │
│    │  • 7 Security Scorers       │          │  • 4-Factor Confidence      │    │
│    └──────────────┬──────────────┘          └──────────────┬──────────────┘    │
│                   │                                         │                    │
│                   └─────────────────┬───────────────────────┘                    │
│                                     ▼                                            │
│                        ┌─────────────────────────┐                              │
│                        │    REPORT GENERATION    │                              │
│                        │  JSON | HTML | SARIF    │                              │
│                        └─────────────────────────┘                              │
└─────────────────────────────────────────────────────────────────────────────────┘

Static Analysis Flow

┌──────────────────────────────────────────────────────────────────────────────────┐
│                           STATIC ANALYSIS PIPELINE                                │
└──────────────────────────────────────────────────────────────────────────────────┘

  ┌─────────┐      ┌─────────────┐      ┌────────────────────────────────────────┐
  │ Python  │      │  AST Parser │      │         10 OWASP DETECTORS             │
  │  Code   │─────▶│  & Pattern  │─────▶│                                        │
  │ (.py)   │      │  Extractor  │      │  ┌──────────┐ ┌──────────┐ ┌────────┐ │
  └─────────┘      └─────────────┘      │  │  LLM01   │ │  LLM02   │ │ LLM03  │ │
                                        │  │  Prompt  │ │ Insecure │ │Training│ │
                                        │  │ Injection│ │  Output  │ │Poison  │ │
                                        │  └──────────┘ └──────────┘ └────────┘ │
                                        │  ┌──────────┐ ┌──────────┐ ┌────────┐ │
                                        │  │  LLM04   │ │  LLM05   │ │ LLM06  │ │
                                        │  │Model DoS │ │  Supply  │ │Secrets │ │
                                        │  │          │ │  Chain   │ │        │ │
                                        │  └──────────┘ └──────────┘ └────────┘ │
                                        │  ┌──────────┐ ┌──────────┐ ┌────────┐ │
                                        │  │  LLM07   │ │  LLM08   │ │ LLM09  │ │
                                        │  │ Insecure │ │Excessive │ │  Over  │ │
                                        │  │  Plugin  │ │ Agency   │ │reliance│ │
                                        │  └──────────┘ └──────────┘ └────────┘ │
                                        │  ┌──────────┐                         │
                                        │  │  LLM10   │                         │
                                        │  │  Model   │                         │
                                        │  │  Theft   │                         │
                                        │  └──────────┘                         │
                                        └───────────────────┬────────────────────┘
                                                            │
                                                            ▼
  ┌────────────────────────────────────────────────────────────────────────────────┐
  │                            7 SECURITY SCORERS                                   │
  │                                                                                 │
  │   ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐            │
  │   │  Prompt  │ │  Model   │ │   Data   │ │Hallucin- │ │ Ethical  │            │
  │   │ Security │ │ Security │ │ Privacy  │ │  ation   │ │    AI    │            │
  │   └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘            │
  │   ┌──────────┐ ┌──────────┐                                                    │
  │   │Governance│ │  OWASP   │                                                    │
  │   │          │ │  Score   │                                                    │
  │   └──────────┘ └──────────┘                                                    │
  └───────────────────────────────────────────┬────────────────────────────────────┘
                                              │
                                              ▼
                            ┌─────────────────────────────────┐
                            │          SCAN RESULT            │
                            │  • Findings    • Category Scores│
                            │  • Overall Score  • Confidence  │
                            └─────────────────────────────────┘
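The "AST Parser & Pattern Extractor" stage in the diagram above can be illustrated with Python's standard `ast` module. The following is a minimal sketch, not the tool's actual detector: the `LLM_CALL_NAMES` set and the `find_fstring_prompts` function are hypothetical names chosen for illustration, and the rule (flag any LLM-like call that receives an f-string) is a deliberately simplified stand-in for an LLM01 prompt-injection check.

```python
import ast

# Hypothetical list of method names treated as "LLM calls" in this sketch.
LLM_CALL_NAMES = {"create", "complete", "invoke", "generate"}


def find_fstring_prompts(source: str) -> list[int]:
    """Return line numbers of LLM-like calls that receive an f-string argument."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handles both client.generate(...) and generate(...).
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name not in LLM_CALL_NAMES:
            continue
        args = list(node.args) + [kw.value for kw in node.keywords]
        # ast.JoinedStr is the AST node type for f-strings.
        if any(isinstance(arg, ast.JoinedStr) for arg in args):
            hits.append(node.lineno)
    return sorted(hits)


sample = '''
def ask(client, user_input):
    return client.generate(prompt=f"Answer this: {user_input}")

def safe(client):
    return client.generate(prompt="Say hello")
'''

print(find_fstring_prompts(sample))
```

A real detector would also need to track where the interpolated values come from; this sketch only shows how a syntactic pattern can be located without executing the scanned code.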

Live Testing Flow

┌──────────────────────────────────────────────────────────────────────────────────┐
│                            LIVE TESTING PIPELINE                                  │
└──────────────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────────────┐
│                              7 LLM PROVIDERS                                    │
│  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌───────┐ ┌─────┐│
│  │ OpenAI  │ │Anthropic│ │ AWS     │ │ Google  │ │  Azure  │ │Ollama │ │Cust-││
│  │         │ │         │ │ Bedrock │ │ Vertex  │ │ OpenAI  │ │(local)│ │ om  ││
│  └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └───┬───┘ └──┬──┘│
└───────┴──────────┴──────────┴──────────┴──────────┴─────────┴────────┴──────┘
                                        │
                                        ▼
                          ┌──────────────────────────┐
                          │    BASELINE QUERIES      │
                          └────────────┬─────────────┘
                                       │
                                       ▼
┌────────────────────────────────────────────────────────────────────────────────┐
│                            11 LIVE DETECTORS                                    │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐│
│  │  Prompt  │ │Jailbreak │ │   Data   │ │ Halluc-  │ │   DoS    │ │   Bias   ││
│  │Injection │ │          │ │ Leakage  │ │ ination  │ │          │ │Detection ││
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘│
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐            │
│  │  Model   │ │Adversar- │ │  Output  │ │  Supply  │ │Behavioral│            │
│  │Extraction│ │   ial    │ │  Manip.  │ │  Chain   │ │ Anomaly  │            │
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘            │
└───────────────────────────────────────────┬────────────────────────────────────┘
                                            │
                                            ▼
┌────────────────────────────────────────────────────────────────────────────────┐
│                        4-FACTOR CONFIDENCE CALCULATION                          │
│                                                                                 │
│    Response Analysis (30%) + Detector Logic (35%) +                            │
│    Evidence Quality (25%) + Severity Factor (10%) = Confidence Score           │
└───────────────────────────────────────────┬────────────────────────────────────┘
                                            │
                                            ▼
                          ┌─────────────────────────────────┐
                          │          TEST RESULT            │
                          │  • Vulnerabilities  • Score     │
                          │  • Tests Passed   • Confidence  │
                          └─────────────────────────────────┘
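The 4-factor calculation in the diagram above is a weighted sum. A minimal sketch follows, using the weights from the diagram; the function name `confidence_score`, the factor keys, and the assumption that each factor is normalized to the range 0.0-1.0 are illustrative and not taken from the package.

```python
# Weights as stated in the live testing pipeline diagram.
WEIGHTS = {
    "response_analysis": 0.30,
    "detector_logic": 0.35,
    "evidence_quality": 0.25,
    "severity_factor": 0.10,
}


def confidence_score(factors: dict[str, float]) -> float:
    """Weighted sum of the four factors (each assumed to lie in [0, 1])."""
    missing = WEIGHTS.keys() - factors.keys()
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= factors[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


example = {
    "response_analysis": 0.8,
    "detector_logic": 0.9,
    "evidence_quality": 0.6,
    "severity_factor": 1.0,
}
# 0.30*0.8 + 0.35*0.9 + 0.25*0.6 + 0.10*1.0 = 0.805
score = confidence_score(example)
print(f"{score:.3f}")
```

Because the weights sum to 1.0, the result stays in the same 0.0-1.0 range as the `--confidence` threshold of the scan command.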

Component Architecture

┌─────────────────────────────────────────────────────────────────────────────────┐
│                           ai_security package                                    │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  ┌─────────────────────────────────────────────────────────────────────────┐    │
│  │                           CLI LAYER (cli.py)                             │    │
│  │    scan command ─────────────────────────── test command                 │    │
│  └─────────┬───────────────────────────────────────────┬───────────────────┘    │
│            │                                           │                         │
│            ▼                                           ▼                         │
│  ┌──────────────────────────┐            ┌──────────────────────────┐           │
│  │      scanner.py          │            │       tester.py          │           │
│  └────────────┬─────────────┘            └────────────┬─────────────┘           │
│               │                                       │                          │
│      ┌────────┴────────┐                    ┌─────────┴─────────┐               │
│      ▼                 ▼                    ▼                   ▼               │
│  ┌────────────┐  ┌────────────┐      ┌────────────┐    ┌────────────┐          │
│  │  STATIC    │  │  SCORERS   │      │   LIVE     │    │ PROVIDERS  │          │
│  │ DETECTORS  │  │            │      │ DETECTORS  │    │            │          │
│  │ LLM01-10   │  │ 7 scorers  │      │ 11 detects │    │ 7 providers│          │
│  └────────────┘  └────────────┘      └────────────┘    └────────────┘          │
│                                                                                  │
│  ┌─────────────────────────────────────────────────────────────────────────┐    │
│  │  REPORTERS: base | json | html | sarif                                   │    │
│  └─────────────────────────────────────────────────────────────────────────┘    │
│  ┌─────────────────────────────────────────────────────────────────────────┐    │
│  │  MODELS: finding.py | vulnerability.py | result.py                       │    │
│  └─────────────────────────────────────────────────────────────────────────┘    │
│  ┌─────────────────────────────────────────────────────────────────────────┐    │
│  │  UTILS: markov_chain | entropy | scoring | statistical                   │    │
│  └─────────────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────────────┘

CLI Commands

Static Code Analysis (scan)

Scan Python code for OWASP LLM Top 10 vulnerabilities. Supports local files/directories and remote Git repositories.

ai-security-cli scan <path> [OPTIONS]

Path Options:

Path Type         Example
Local file        ./app.py
Local directory   ./my_project
GitHub URL        https://github.com/user/repo
GitLab URL        https://gitlab.com/user/repo
Bitbucket URL     https://bitbucket.org/user/repo
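To make the distinction between the path types above concrete, here is a small sketch of how a target string could be classified before scanning. It is an assumption about one possible approach, not the tool's implementation; `classify_target` and `KNOWN_HOSTS` are hypothetical names.

```python
from urllib.parse import urlparse

# Hosts taken from the path table above; the labels are illustrative.
KNOWN_HOSTS = {
    "github.com": "github",
    "gitlab.com": "gitlab",
    "bitbucket.org": "bitbucket",
}


def classify_target(target: str) -> str:
    """Classify a scan target as a known Git host, another remote, or a local path."""
    parsed = urlparse(target)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        return KNOWN_HOSTS.get(host, "unknown-remote")
    # Anything without an http(s) scheme is treated as a local file or directory.
    return "local"


for target in ("./app.py", "https://github.com/user/repo", "https://gitlab.com/user/repo"):
    print(target, "->", classify_target(target))
```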

Options:

Option              Description                                            Default
-o, --output        Output format: text, json, html, sarif                 text
-f, --output-file   Write output to file                                   -
-s, --severity      Minimum severity: critical, high, medium, low, info    info
-c, --confidence    Minimum confidence threshold (0.0-1.0)                 0.7
--category          Filter by OWASP category (LLM01-LLM10)                 all
-v, --verbose       Enable verbose output                                  false
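The severity, confidence, and category options act as filters on the findings. The sketch below shows one way such filtering could work, assuming that severities are ordered from info (lowest) to critical (highest) and that a finding passes when it meets every threshold. The `Finding` dataclass and `filter_findings` function are hypothetical and do not reflect the package's internal models.

```python
from dataclasses import dataclass

# Assumed ordering, lowest to highest, using the values listed for --severity.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


@dataclass
class Finding:
    category: str      # e.g. "LLM01"
    severity: str      # one of SEVERITY_ORDER
    confidence: float  # 0.0-1.0


def filter_findings(findings, min_severity="info", min_confidence=0.7, categories=None):
    """Keep findings at or above both thresholds, optionally limited to given categories."""
    floor = SEVERITY_ORDER.index(min_severity)
    return [
        f for f in findings
        if SEVERITY_ORDER.index(f.severity) >= floor
        and f.confidence >= min_confidence
        and (not categories or f.category in categories)
    ]


findings = [
    Finding("LLM01", "critical", 0.92),
    Finding("LLM02", "medium", 0.85),
    Finding("LLM06", "high", 0.55),
]

# Roughly equivalent in spirit to: ai-security-cli scan ./project -s high
for f in filter_findings(findings, min_severity="high"):
    print(f.category, f.severity, f.confidence)
```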

Examples:

# Scan a local project directory
ai-security-cli scan ./my_llm_app

# Scan with JSON output
ai-security-cli scan ./app.py -o json -f results.json

# Scan for high severity issues only
ai-security-cli scan ./project -s high

# Scan specific OWASP categories
ai-security-cli scan ./project --category LLM01 --category LLM02

# Generate HTML report
ai-security-cli scan ./project -o html -f security_report.html

# Scan a GitHub repository directly
ai-security-cli scan https://github.com/langchain-ai/langchain

Live Model Testing (test)

Test deployed LLMs for security vulnerabilities through a provider API.

ai-security-cli test [OPTIONS]

Options:

Option              Description                                      Default
-p, --provider      LLM provider (required)                          -
-m, --model         Model name (required)                            -
-e, --endpoint      Custom endpoint URL                              -
-t, --tests         Specific tests to run                            all
--mode              Testing depth: quick, standard, comprehensive    standard
-o, --output        Output format: text, json, html, sarif           text
-f, --output-file   Write output to file                             -
--timeout           Timeout per test in seconds                      30
-v, --verbose       Enable verbose output                            false

Supported Providers:

Provider    Environment Variables
openai      OPENAI_API_KEY
anthropic   ANTHROPIC_API_KEY
bedrock     AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
vertex      GOOGLE_APPLICATION_CREDENTIALS
azure       AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT
ollama      None (local)
custom      CUSTOM_API_KEY (optional)
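Before starting a live test run, it can help to check that the variables in the table above are set. The following pre-flight sketch encodes that table; the `missing_env` helper is hypothetical and is not part of the CLI.

```python
import os

# Required variables per provider, copied from the table above.
# CUSTOM_API_KEY is optional, so "custom" has no required entries.
REQUIRED_ENV = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
    "vertex": ["GOOGLE_APPLICATION_CREDENTIALS"],
    "azure": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"],
    "ollama": [],
    "custom": [],
}


def missing_env(provider, env=None):
    """Return the required variables that are unset or empty for a provider."""
    env = os.environ if env is None else env
    if provider not in REQUIRED_ENV:
        raise ValueError(f"unknown provider: {provider}")
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]


# Passing an explicit mapping keeps the example independent of the real environment.
print(missing_env("azure", {"AZURE_OPENAI_API_KEY": "placeholder"}))
```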

Examples:

# Quick test with OpenAI
export OPENAI_API_KEY=sk-...
ai-security-cli test -p openai -m gpt-4 --mode quick

# Comprehensive test with Anthropic
export ANTHROPIC_API_KEY=...
ai-security-cli test -p anthropic -m claude-3-opus --mode comprehensive

# Test specific vulnerabilities
ai-security-cli test -p openai -m gpt-4 -t prompt-injection -t jailbreak

# Test with Ollama (local)
ai-security-cli test -p ollama -m llama2 --mode standard

OWASP LLM Top 10 Coverage

Static Analysis Detectors

ID      Vulnerability                      Description
LLM01   Prompt Injection                   Detects unsanitized user input in prompts
LLM02   Insecure Output Handling           Identifies unvalidated LLM output
LLM03   Training Data Poisoning            Finds unsafe data loading
LLM04   Model Denial of Service            Detects missing rate limiting
LLM05   Supply Chain Vulnerabilities       Identifies unsafe model loading
LLM06   Sensitive Information Disclosure   Finds hardcoded secrets
LLM07   Insecure Plugin Design             Detects unsafe plugin loading
LLM08   Excessive Agency                   Identifies autonomous actions
LLM09   Overreliance                       Finds missing output validation
LLM10   Model Theft                        Detects exposed model artifacts
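The component diagram lists an entropy utility, and LLM06 targets hardcoded secrets, but this README does not say how the two relate. One common technique in secret scanners generally is to flag string literals with high Shannon entropy. The sketch below illustrates that technique only; the `shannon_entropy` and `looks_like_secret` helpers, the 20-character minimum, and the 4.0-bit threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_like_secret(value: str, min_length: int = 20, threshold: float = 4.0) -> bool:
    """Heuristic: long strings with high per-character entropy are suspicious."""
    return len(value) >= min_length and shannon_entropy(value) >= threshold


print(looks_like_secret("hello hello hello hello"))
print(looks_like_secret("aB3dE5gH7jK9mN1pQ3sT5vX7"))
```

Entropy alone produces false positives (hashes, UUIDs, encoded assets), so real scanners usually combine it with variable-name and format checks.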

Live Testing Detectors

ID     Detector              Description
PI     Prompt Injection      Tests for injection vulnerabilities
JB     Jailbreak             Tests for instruction bypass attacks
DL     Data Leakage          Tests for PII exposure
HAL    Hallucination         Tests for factual accuracy
DOS    Denial of Service     Tests for resource exhaustion
BIAS   Bias Detection        Tests for demographic bias
ME     Model Extraction      Tests for architecture disclosure
ADV    Adversarial Inputs    Tests for encoding attacks
OM     Output Manipulation   Tests for response injection
SC     Supply Chain          Tests for unsafe code generation
BA     Behavioral Anomaly    Tests for unexpected behavior

Output Formats

  • Text: Human-readable terminal output
  • JSON: Machine-readable format for CI/CD
  • HTML: Interactive reports with filtering
  • SARIF: GitHub Code Scanning, Azure DevOps, VS Code integration
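For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 document from a list of findings. It shows the general shape that consumers such as GitHub Code Scanning expect, not the exact output of this tool. The input dictionary keys (`rule_id`, `title`, `severity`, `message`, `file`, `line`), the `to_sarif` function, and the severity-to-level mapping are assumptions.

```python
import json

# Assumed mapping from the tool's severities to SARIF result levels.
SEVERITY_TO_LEVEL = {
    "critical": "error",
    "high": "error",
    "medium": "warning",
    "low": "note",
    "info": "note",
}


def to_sarif(findings: list[dict]) -> dict:
    """Build a minimal SARIF 2.1.0 log with one run from simple finding dicts."""
    rules = {}
    results = []
    for f in findings:
        # Each distinct rule is declared once under tool.driver.rules.
        rules.setdefault(f["rule_id"], {"id": f["rule_id"], "name": f["title"]})
        results.append({
            "ruleId": f["rule_id"],
            "level": SEVERITY_TO_LEVEL.get(f["severity"], "warning"),
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["file"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "ai-security-cli", "rules": list(rules.values())}},
            "results": results,
        }],
    }


doc = to_sarif([
    {"rule_id": "LLM01", "title": "PromptInjection", "severity": "high",
     "message": "Unsanitized user input in prompt", "file": "app.py", "line": 12},
    {"rule_id": "LLM01", "title": "PromptInjection", "severity": "medium",
     "message": "Unsanitized user input in prompt", "file": "chat.py", "line": 40},
])
print(json.dumps(doc, indent=2))
```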

Integration

GitHub Actions

name: AI Security Scan
on: [push, pull_request]
permissions:
  contents: read
  security-events: write  # needed by upload-sarif to publish results
jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install ai-security-cli
      - run: ai-security-cli scan . -o sarif -f results.sarif
      - uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif

Pre-commit Hook

repos:
  - repo: local
    hooks:
      - id: ai-security-scan
        name: AI Security Scan
        entry: ai-security-cli scan
        language: system
        types: [python]
        args: ['-s', 'high']

Development

git clone https://github.com/deosha/ai-security-cli.git
cd ai-security-cli
pip install -e ".[dev]"
pytest tests/ -v --cov=ai_security

License

MIT License - see LICENSE for details.
