Community-driven AI security audit tool using interpretability techniques
Community AI Security Audit Tool
A plug-and-play, community-driven AI security audit framework. Scan any model (local, HF Hub, OpenAI, Anthropic, AWS Bedrock, Ollama) for vulnerabilities, interpret decisions, and push findings to your SIEM (Splunk, Elastic, Datadog, Sentinel).
Features
- Model Adapters: HuggingFace, OpenAI, Anthropic, AWS Bedrock, Local (PyTorch/ONNX/SafeTensors), Ollama
- Vulnerability Scanners: Backdoor/Trojan detection (activation clustering), Adversarial robustness (FGSM/PGD)
- Interpretability: Integrated Gradients, LIME
- SIEM Connectors: Splunk HEC, Elastic/Elasticsearch, Datadog Logs, Microsoft Sentinel
- Reporting: Markdown, JSON, HTML
- CLI: discover, scan, interpret, audit commands
- Extensible: Plugin system for custom adapters, scanners, interpreters, connectors
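The activation-clustering idea behind the backdoor scanner can be illustrated with a deliberately small sketch. It assumes each sample of a single predicted class has already been reduced to one activation summary value, splits those values into two clusters, and flags the smaller cluster when it is unusually small. The function names, the sample values, and the `max_minority_fraction` cutoff are hypothetical and are not taken from this package; a real scanner would cluster high-dimensional activations after dimensionality reduction.

```python
from statistics import mean


def two_means_1d(values, iters=20):
    """Split 1-D values into two clusters with Lloyd's algorithm (k=2)."""
    lo, hi = min(values), max(values)  # deterministic initial centroids
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to the nearer centroid (ties go to cluster 0).
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        left = [v for v, lab in zip(values, labels) if lab == 0]
        right = [v for v, lab in zip(values, labels) if lab == 1]
        if not left or not right:
            break  # everything collapsed into one cluster
        lo, hi = mean(left), mean(right)
    return labels


def flag_suspicious(activations, max_minority_fraction=0.35):
    """Return indices of the smaller cluster if it is small enough to be suspicious."""
    labels = two_means_1d(activations)
    ones = sum(labels)
    zeros = len(labels) - ones
    minority = 1 if ones <= zeros else 0
    size = ones if minority == 1 else zeros
    if size == 0 or size / len(labels) > max_minority_fraction:
        return []
    return [i for i, lab in enumerate(labels) if lab == minority]


# Hypothetical activation summaries for ONE predicted class.
clean = [0.10, 0.12, 0.09, 0.11, 0.13, 0.10, 0.08, 0.12]
poisoned = [0.91, 0.95]
suspects = flag_suspicious(clean + poisoned)
print(suspects)
```

A small, well-separated minority cluster is only a signal worth investigating, not proof of a backdoor.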
Quickstart
# Install
pip install "community-ai-audit[hf]" # with HuggingFace support
# Discover available plugins
community-ai-audit discover
# Scan a HuggingFace model
community-ai-audit scan distilgpt2 --provider huggingface --profile quick
# Full audit (scan + interpret) with SIEM push
community-ai-audit audit meta-llama/Llama-3-8B-Instruct \
  --provider huggingface \
  --profile standard \
  --scanners backdoor adversarial \
  --interpreters integrated-gradients lime \
  --connectors splunk elastic \
  --config config/my_connectors.yaml
Architecture
┌───────────────────────────────────────────────────────────┐
│                        AuditEngine                        │
│  (orchestrates: load → scan → interpret → report → push)  │
└───────────────────────────────────────────────────────────┘
         │                    │                    │
         ▼                    ▼                    ▼
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│    Adapters     │  │    Scanners     │  │  Interpreters   │
│  (load_model)   │  │  (scan model)   │  │ (explain model) │
├─────────────────┤  ├─────────────────┤  ├─────────────────┤
│ • HuggingFace   │  │ • Backdoor      │  │ • Integrated    │
│ • OpenAI        │  │ • Adversarial   │  │   Gradients     │
│ • Anthropic     │  │ • (custom)      │  │ • LIME          │
│ • AWS Bedrock   │  │                 │  │ • (custom)      │
│ • Local         │  │                 │  │                 │
│ • Ollama        │  │                 │  │                 │
└─────────────────┘  └─────────────────┘  └─────────────────┘
         │                    │                    │
         └────────────────────┼────────────────────┘
                              ▼
                   ┌─────────────────────┐
                   │      Reporters      │
                   │  (markdown, json,   │
                   │   html)             │
                   └─────────────────────┘
                              │
                              ▼
                   ┌─────────────────────┐
                   │   SIEM Connectors   │
                   │   (push findings)   │
                   ├─────────────────────┤
                   │ • Splunk HEC        │
                   │ • Elastic           │
                   │ • Datadog           │
                   │ • Sentinel          │
                   │ • (custom)          │
                   └─────────────────────┘
Provider Matrix
| Provider | Text | Image | Multimodal | Embedding | Auth |
|---|---|---|---|---|---|
| HuggingFace | ✅ | ✅ | ✅ | ✅ | HF_TOKEN |
| OpenAI | ✅ | ✅ | ✅ | ✅ | OPENAI_API_KEY |
| Anthropic | ✅ | ❌ | ❌ | ❌ | ANTHROPIC_API_KEY |
| AWS Bedrock | ✅ | ✅ | ✅ | ✅ | AWS creds |
| Local (PyTorch) | ✅ | ✅ | ✅ | ✅ | None |
| Ollama | ✅ | ❌ | ❌ | ❌ | Local server |
Installation
# Core only
pip install community-ai-audit
# With HuggingFace transformers
pip install "community-ai-audit[hf]"
# With TensorFlow support
pip install "community-ai-audit[tf]"
# All optional dependencies
pip install "community-ai-audit[all]"
# Development install
git clone https://github.com/anomalyco/community-ai-audit
cd community-ai-audit
pip install -e ".[dev]"
Configuration
Create config/my_config.yaml:
model:
  device: auto
  dtype: auto
scanners:
  backdoor:
    enabled: true
    num_clusters: 5
    activation_threshold: 0.85
  adversarial:
    enabled: true
    epsilon: 0.1
    pgd_steps: 10
connectors:
  splunk:
    hec_url: https://splunk.example.com:8088
    hec_token: ${SPLUNK_HEC_TOKEN}
    index: security
  elastic:
    url: https://es.example.com:9243
    api_key: ${ELASTICSEARCH_API_KEY}
Use with --config config/my_config.yaml.
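The `${SPLUNK_HEC_TOKEN}` and `${ELASTICSEARCH_API_KEY}` values in the config read as environment-variable placeholders, which keeps secrets out of the YAML file. How this package resolves them is not documented here, so the following is only a sketch of one way such placeholders could be expanded after the file has been parsed into a dictionary. The `expand_env` helper is hypothetical, and the token value is a dummy.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(node, env=None):
    """Recursively replace ${NAME} placeholders in a parsed config (hypothetical helper)."""
    env = os.environ if env is None else env
    if isinstance(node, dict):
        return {key: expand_env(value, env) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_env(value, env) for value in node]
    if isinstance(node, str):
        def replace(match):
            name = match.group(1)
            if name not in env:
                # Fail loudly instead of sending a literal "${...}" to the SIEM.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(replace, node)
    return node  # numbers, booleans, None pass through unchanged


# Stand-in for the parsed connectors section of the YAML file.
config = {
    "connectors": {
        "splunk": {
            "hec_url": "https://splunk.example.com:8088",
            "hec_token": "${SPLUNK_HEC_TOKEN}",
            "index": "security",
        }
    }
}
resolved = expand_env(config, env={"SPLUNK_HEC_TOKEN": "example-token"})
print(resolved["connectors"]["splunk"]["hec_token"])
```

Raising on a missing variable is a design choice in this sketch: a silent fallback would let an unexpanded placeholder reach the connector as a credential.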
CLI Commands
Discover
community-ai-audit discover
community-ai-audit discover --format json
Scan
# Quick scan with defaults
community-ai-audit scan distilgpt2 --provider huggingface
# Custom profile and scanners
community-ai-audit scan model.pt --provider local \
  --profile deep \
  --scanners backdoor adversarial \
  --input-shape '[32, 768]' \
  --output markdown --save report.md
Interpret
community-ai-audit interpret distilgpt2 --provider huggingface \
  --interpreters integrated-gradients lime \
  --input "The model should classify this as positive."
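To give a sense of what the integrated-gradients interpreter computes, here is a minimal sketch on a toy logistic model rather than a language model. It approximates the path integral from a baseline to the input with a midpoint Riemann sum and checks the completeness property (attributions sum to the change in model output). The weights, input, baseline, and function names are illustrative assumptions and are not this package's implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy model: logistic regression over a continuous input vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def grad(x, w, b):
    """Analytic gradient of the model output with respect to each input."""
    s = model(x, w, b)
    return [s * (1.0 - s) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint-rule approximation of Integrated Gradients along a straight path."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        total = [t + g for t, g in zip(total, grad(point, w, b))]
    # Scale the averaged gradient by the input-minus-baseline difference.
    return [(xi - bi) * t / steps for xi, bi, t in zip(x, baseline, total)]


w, b = [2.0, -1.0, 0.5], 0.1
x = [1.0, 0.5, -1.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w, b)
# Completeness: attributions should sum to f(x) - f(baseline).
completeness_gap = abs(
    sum(attributions) - (model(x, w, b) - model(baseline, w, b))
)
print(attributions, completeness_gap)
```

For text models the same procedure runs over embedding vectors, which is why access to the embedding layer matters.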
Audit (Full Pipeline)
community-ai-audit audit meta-llama/Llama-3-8B-Instruct \
  --provider huggingface \
  --profile standard \
  --scanners backdoor adversarial \
  --interpreters integrated-gradients \
  --connectors splunk elastic \
  --config config/my_connectors.yaml
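For the `splunk` connector, the push step presumably ends in an HTTP Event Collector request. The sketch below builds such a request with the standard library and does not send it. The `/services/collector/event` endpoint and the `Authorization: Splunk <token>` header follow Splunk's HEC conventions; the `build_hec_request` helper and the fields of the `finding` dictionary are hypothetical and do not reflect this package's actual finding schema.

```python
import json
import urllib.request


def build_hec_request(hec_url, hec_token, index, finding):
    """Build (but do not send) a Splunk HEC request for one finding."""
    payload = {
        "event": finding,       # the finding itself, as structured JSON
        "index": index,         # target index from the connector config
        "sourcetype": "_json",  # let Splunk parse the event as JSON
    }
    return urllib.request.Request(
        url=hec_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {hec_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical finding; field names are illustrative only.
finding = {
    "model": "distilgpt2",
    "scanner": "adversarial",
    "severity": "medium",
    "summary": "Output changed under a small input perturbation",
}
request = build_hec_request(
    hec_url="https://splunk.example.com:8088",
    hec_token="example-token",
    index="security",
    finding=finding,
)
print(request.full_url)
# Passing `request` to urllib.request.urlopen would send it; omitted here.
```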
Extending
See the Plugin Guide for writing custom adapters, scanners, interpreters, and connectors.
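The real plugin interface is defined in the Plugin Guide and is not reproduced on this page. Purely to illustrate the general shape a custom scanner plugin might take, here is a sketch built around a decorator-based registry. Every name in it (`Finding`, `SCANNERS`, `register_scanner`, `weight_nan_check`, `ToyModel`) is hypothetical and should not be read as this package's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    """Hypothetical result record produced by a scanner."""
    scanner: str
    severity: str
    message: str


# Hypothetical registry mapping scanner names to callables.
SCANNERS: Dict[str, Callable[[object], List[Finding]]] = {}


def register_scanner(name):
    """Decorator that adds a scanner function to the registry under `name`."""
    def decorator(func):
        SCANNERS[name] = func
        return func
    return decorator


@register_scanner("weight-nan-check")
def weight_nan_check(model):
    """Toy scanner: report NaN values in a model's flat weight list."""
    weights = getattr(model, "weights", [])
    bad = [i for i, w in enumerate(weights) if w != w]  # NaN is never equal to itself
    if not bad:
        return []
    return [Finding("weight-nan-check", "high", f"NaN weights at indices {bad}")]


class ToyModel:
    """Stand-in for a loaded model; only exposes a flat list of weights."""
    weights = [0.1, float("nan"), 0.3]


findings = SCANNERS["weight-nan-check"](ToyModel())
print(findings)
```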
Known Limitations
- Scanners require white-box access (gradients/activations), so they work best with provider=local or HuggingFace local models
- Text model adversarial attacks need embedding-space perturbations (token IDs are discrete)
- Integrated Gradients on text requires access to the embedding layer
- Large model audits can be slow; batch mode coming in v0.2.0
- TensorFlow support planned
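The embedding-space point above can be made concrete with a small FGSM sketch. Because token IDs are discrete, the perturbation is applied to a continuous vector standing in for an embedding, and a toy logistic classifier stands in for the model so the input gradient can be written analytically. All values and function names are illustrative assumptions; `epsilon` simply mirrors the 0.1 used in the configuration example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def bce_loss(x, y, w, b):
    """Binary cross-entropy of the toy classifier on one example."""
    p = predict(x, w, b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def loss_grad_wrt_input(x, y, w, b):
    """Gradient of the BCE loss with respect to the input: (p - y) * w."""
    p = predict(x, w, b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, epsilon):
    """One FGSM step: move each coordinate by epsilon in the loss-increasing direction."""
    g = loss_grad_wrt_input(x, y, w, b)
    return [xi + epsilon * sign(gi) for xi, gi in zip(x, g)]


w, b = [1.5, -2.0, 0.5], 0.0
embedding = [0.2, -0.1, 0.4]  # continuous stand-in for a token embedding
label = 1

adversarial = fgsm(embedding, label, w, b, epsilon=0.1)
loss_before = bce_loss(embedding, label, w, b)
loss_after = bce_loss(adversarial, label, w, b)
print(loss_before, loss_after)
```

PGD repeats this step several times with a smaller step size and projects back into the epsilon-ball after each step.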
Contributing
See CONTRIBUTING.md and Plugin Guide.
License
MIT License — see LICENSE.