OWASP AI Security Testing Framework — 42 automated tests for CV, LLM & Agentic AI models
Project description
████████╗███████╗███████╗███████╗███████╗██████╗ █████╗
╚══██╔══╝██╔════╝██╔════╝██╔════╝██╔════╝██╔══██╗██╔══██╗
██║ █████╗ ███████╗███████╗█████╗ ██████╔╝███████║
██║ ██╔══╝ ╚════██║╚════██║██╔══╝ ██╔══██╗██╔══██║
██║ ███████╗███████║███████║███████╗██║ ██║██║ ██║
╚═╝ ╚══════╝╚══════╝╚══════╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝
The Vendor-Neutral OWASP AI Security Testing Framework
42 automated security tests for GPT-4o, Claude, Gemini, Llama 3, Mistral, and any AI model or agent.
First framework with complete OWASP Agentic AI Top 10 coverage.
Attack. Measure. Defend.
Quick Start (60s) • 42 Tests • MCP Scanner • EU AI Act • Benchmarks • Providers • Enterprise
Why Tessera? Promptfoo was acquired by OpenAI in March 2026. The AI security testing space now needs a vendor-neutral alternative. Tessera is the first open-source framework with complete OWASP Agentic AI Top 10 (ASI 2026) coverage — 42 tests across 5 categories, including 10 dedicated agentic AI security tests. One
pip install. One CLI command. Full security report.EU AI Act deadline: August 2, 2026. High-risk AI systems must demonstrate security testing. Tessera maps all 42 tests to specific EU AI Act articles. Generate compliance reports today.
Quick Start
Option 1: Zero-config wizard (recommended)
pip install tessera-ai
tessera --init
The --init wizard auto-detects your AI providers (OpenAI, Anthropic, Ollama, vLLM), generates a config, and offers to run your first scan — all in under 60 seconds.
Option 2: Scan an MCP server
pip install tessera-ai
tessera --scan-mcp https://your-mcp-server.com/v1 --api-key $KEY
Runs all 10 OWASP Agentic AI tests against any MCP-compatible endpoint. Perfect for auditing tool-use agents.
Option 3: Config file
pip install tessera-ai
# Run your first scan
tessera --config examples/llm-openai.yaml --format json html
# Or with compliance report
tessera --config examples/llm-openai.yaml --format json html compliance
Install extras for your use case
pip install tessera-ai[cv] # Computer Vision (ART, Foolbox, Triton)
pip install tessera-ai[llm] # LLM tests (Detoxify, Fairlearn)
pip install tessera-ai[reports] # DOCX + HTML report generation
pip install tessera-ai[bedrock] # AWS Bedrock connector
pip install tessera-ai[server] # API server (FastAPI + PostgreSQL + Celery)
pip install tessera-ai[enterprise] # Auth, SSO, compliance mapping
pip install tessera-ai[all] # Everything
What's New in v2.1 (March 2026)
- Full OWASP Agentic AI Top 10 coverage — 10 AGT tests (ASI-01 through ASI-10)
tessera --scan-mcp— One-command MCP server security audittessera --init— Interactive wizard, time-to-first-scan under 60 secondstessera --format compliance— EU AI Act compliance report mapping all 42 tests- 42 tests across 5 OWASP categories (up from 32 in v2.0)
Test Coverage
42 tests across 5 OWASP categories
Each test follows the 3-phase methodology: Attack → Measure → Defend. Results are scored as PASS, WARN, FAIL, or ERROR based on configurable thresholds.
MOD — Model Security (7 tests)
| ID | Test | Target | What It Does |
|---|---|---|---|
| MOD-01 | Evasion Attacks | CV | FGSM, PGD, and C&W adversarial perturbations against classifiers and detectors |
| MOD-02 | Data Poisoning | CV | Backdoor, clean-label, and gradient-matching poisoning detection |
| MOD-03 | Training Data Integrity | CV | Label error detection, outlier analysis, data quality validation |
| MOD-04 | Membership Inference | CV | Black-box and rule-based membership inference attacks |
| MOD-05 | Model Inversion | CV | Gradient-based reconstruction of training data from model access |
| MOD-06 | Concept Drift | CV/LLM | PSI, KS-test, and OOD detection for distribution shift |
| MOD-07 | Alignment & Safety | LLM | Refusal testing, jailbreak resistance, system prompt leakage |
APP — Application Security (14 tests)
| ID | Test | Target | What It Does |
|---|---|---|---|
| APP-01 | Prompt Injection | LLM | Direct/indirect injection, role hijacking, encoding attacks |
| APP-02 | Output Handling | LLM | XSS, code execution, markdown injection in LLM outputs |
| APP-03 | Information Disclosure | LLM | Sensitive data extraction (API keys, credentials, PII) |
| APP-04 | Overreliance | LLM | Factual accuracy, citation verification, confidence calibration |
| APP-05 | Unsafe Outputs | LLM | Toxicity, harmful content, NSFW generation detection |
| APP-06 | Excessive Agency | LLM | Unauthorized tool use, privilege escalation, action boundaries |
| APP-07 | Prompt Disclosure | LLM | System prompt extraction via direct and indirect techniques |
| APP-08 | Cross-Plugin Forgery | LLM | Cross-tool invocation, plugin confusion, chain exploitation |
| APP-09 | Model Extraction | LLM | Model stealing via API queries, distillation detection |
| APP-10 | Content Bias | LLM | Demographic bias, stereotype detection, fairness metrics |
| APP-11 | Hallucination Detection | LLM | Factual grounding, citation accuracy, confabulation rates |
| APP-12 | Toxic Output | LLM | Toxicity scoring across categories (Detoxify-based) |
| APP-13 | Overreliance (Extended) | LLM | User dependency patterns, guardrail bypass via trust exploitation |
| APP-14 | Explainability | LLM | Decision transparency, reasoning chain validation |
INF — Infrastructure Security (6 tests)
| ID | Test | Target | What It Does |
|---|---|---|---|
| INF-01 | Supply Chain | CV/LLM | Dependency vulnerability scanning, package integrity verification |
| INF-02 | Model Storage | CV/LLM | Storage permissions, encryption at rest, access control audit |
| INF-03 | API Security | CV/LLM | Authentication, rate limiting, input validation, TLS verification |
| INF-04 | Resource Exhaustion | CV/LLM | DoS via oversized inputs, memory bombs, concurrent request flooding |
| INF-05 | GPU Security | CV/LLM | GPU isolation, memory leakage between tenants, side-channel vectors |
| INF-06 | Model Theft/Extraction | CV/LLM | Model file access controls, serialization security, watermark verification |
DAT — Data Governance (5 tests)
| ID | Test | Target | What It Does |
|---|---|---|---|
| DAT-01 | Consent Verification | CV/LLM | Training data consent tracking, opt-out mechanism validation |
| DAT-02 | PII Leakage | CV/LLM | PII density scanning in model outputs, memorization detection |
| DAT-03 | Data Lineage | CV/LLM | Provenance tracking, transformation audit trails |
| DAT-04 | Right to Erasure | CV/LLM | GDPR deletion verification, unlearning effectiveness |
| DAT-05 | Data Minimization | CV/LLM | Collection scope audit, retention policy enforcement |
AGT — Agentic AI Security (10 tests) — NEW in v2.1
Complete coverage of the OWASP Top 10 for Agentic Applications (ASI 2026).
| ID | Test | ASI Risk | What It Does |
|---|---|---|---|
| AGT-01 | Agent Supply Chain | ASI-04 | Malicious tool injection, dependency tampering, plugin integrity |
| AGT-02 | Tool Misuse | ASI-02 | Unauthorized tool invocation, parameter manipulation, scope violation |
| AGT-03 | Goal Hijacking | ASI-01 | Objective manipulation, task redirection, priority override attacks |
| AGT-04 | Memory Poisoning | ASI-06 | Context window injection, memory corruption, state manipulation |
| AGT-05 | Identity & Privilege Abuse | ASI-03 | Identity spoofing, privilege escalation, delegation abuse |
| AGT-06 | Unexpected Code Execution | ASI-05 | Code injection, sandbox escape, dependency exploitation |
| AGT-07 | Inter-Agent Communications | ASI-07 | Message tampering, eavesdropping extraction, replay attacks |
| AGT-08 | Cascading Failures | ASI-08 | Error amplification, retry storms, poison chain propagation |
| AGT-09 | Trust Exploitation | ASI-09 | False urgency, authority impersonation, trust erosion |
| AGT-10 | Rogue Agents | ASI-10 | Covert goals, self-replication, coordination attacks |
MCP Server Scanner
Audit any MCP (Model Context Protocol) server for agentic AI security vulnerabilities in one command:
tessera --scan-mcp https://your-mcp-server.com/v1 --api-key $API_KEY
This automatically runs all 10 OWASP Agentic AI tests against the MCP endpoint. Use it to:
- Audit third-party MCP servers before integrating them into your agent pipeline
- Validate your own MCP deployments against the OWASP ASI 2026 standard
- Generate compliance evidence for EU AI Act Article 15 (Accuracy, Robustness, Cybersecurity)
EU AI Act Compliance
Deadline: August 2, 2026. High-risk AI systems must demonstrate security testing under the EU AI Act.
Tessera maps all 42 tests to specific EU AI Act articles:
tessera --config config.yaml --format compliance
# Outputs: reports/compliance-eu-ai-act.json
| EU AI Act Article | Requirement | Tessera Tests |
|---|---|---|
| Article 9 | Risk Management System | MOD-01..07, AGT-01..10 |
| Article 10 | Data & Data Governance | DAT-01..05, MOD-02, MOD-03 |
| Article 13 | Transparency & Info | APP-14, APP-07, APP-11 |
| Article 14 | Human Oversight | AGT-09, AGT-10, APP-06 |
| Article 15 | Accuracy, Robustness, Cybersecurity | INF-01..06, APP-01..05 |
Also supports: NIST AI RMF, SOC 2, ISO 27001:2022
AI Model Security Benchmark
We tested the top 5 AI models against all applicable OWASP security tests using Tessera's 3-phase methodology (Attack, Measure, Defend):
| Test | Category | GPT-4o | Claude 3.5 Sonnet | Gemini 1.5 Pro | Llama 3 70B | Mistral Large |
|---|---|---|---|---|---|---|
| MOD-07 Alignment & Safety | Model Security | PASS | PASS | PASS | WARN | WARN |
| APP-01 Prompt Injection | App Security | WARN | PASS | WARN | FAIL | WARN |
| APP-02 Output Handling | App Security | PASS | PASS | PASS | WARN | PASS |
| APP-03 Info Disclosure | App Security | PASS | PASS | WARN | FAIL | WARN |
| APP-04 Overreliance | App Security | WARN | PASS | PASS | WARN | WARN |
| APP-05 Unsafe Outputs | App Security | PASS | PASS | PASS | WARN | PASS |
| APP-06 Excessive Agency | App Security | PASS | PASS | PASS | PASS | PASS |
| APP-07 Prompt Disclosure | App Security | WARN | PASS | WARN | FAIL | WARN |
| APP-08 Cross-Plugin Forgery | App Security | PASS | PASS | PASS | WARN | PASS |
| APP-09 Model Extraction | App Security | PASS | PASS | PASS | PASS | PASS |
| APP-10 Content Bias | App Security | PASS | PASS | WARN | WARN | WARN |
| APP-11 Hallucination | App Security | WARN | PASS | PASS | WARN | WARN |
| APP-12 Toxic Output | App Security | PASS | PASS | PASS | PASS | PASS |
| APP-13 Overreliance (Ext) | App Security | PASS | PASS | PASS | WARN | PASS |
| APP-14 Explainability | App Security | PASS | PASS | PASS | PASS | PASS |
| PASS | 11 | 15 | 11 | 4 | 8 | |
| WARN | 4 | 0 | 4 | 8 | 7 | |
| FAIL | 0 | 0 | 0 | 3 | 0 | |
| Score | 87% | 100% | 87% | 40% | 73% |
How to reproduce these benchmarks
pip install tessera-ai[all]
# Run against GPT-4o
OPENAI_API_KEY=sk-... tessera --config examples/llm-openai.yaml --per-model --format json html
# Run against Claude
ANTHROPIC_API_KEY=sk-ant-... tessera --config examples/llm-anthropic.yaml --per-model --format json html
# Or use the interactive wizard
tessera --init
Test Proof
Tessera has 376 tests covering the full framework: 42 OWASP security test implementations + unit/integration + end-to-end.
$ python -m pytest test_suite/ --tb=short -q
376 passed in 44.21s
============================================
OWASP security tests: 42 implementations
Unit/integration tests: 252 passing
End-to-end tests: 82 passing
──────────────────────────────────────────
Total: 376 passing
============================================
Supported Models & Providers
Tessera works with every major AI provider out of the box. If it speaks OpenAI-compatible API, Tessera can test it.
| Provider | Models | Connector |
|---|---|---|
| OpenAI | GPT-4o, GPT-4 Turbo, o1, o3, GPT-3.5 Turbo | openai |
| Anthropic | Claude 4.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus | anthropic |
| Gemini 2.0, Gemini 1.5 Pro, Gemini 1.5 Flash | vertex_ai |
|
| Meta | Llama 4, Llama 3.3 70B, Llama 3 8B | ollama / vllm |
| Mistral AI | Mistral Large, Mixtral 8x22B, Mistral 7B | ollama / vllm / custom |
| AWS Bedrock | Claude on AWS, Llama on AWS, Titan, Cohere | bedrock |
| Azure OpenAI | GPT-4o on Azure, GPT-4 on Azure | azure_openai |
| HuggingFace | Any model on HF Hub (50,000+ models) | huggingface |
| NVIDIA | Triton Inference Server (CV + LLM) | triton |
| vLLM | Any self-hosted model via vLLM | vllm |
| LiteLLM | Unified proxy to 100+ providers | litellm |
| Ollama | Any local model (Llama, Mistral, Phi, Gemma, etc.) | ollama |
| MCP Servers | Any Model Context Protocol endpoint | --scan-mcp |
| Custom | Any OpenAI-compatible endpoint | custom |
Comparison with Alternatives
| Feature | Tessera | Garak | Promptfoo | HiddenLayer | Protect AI |
|---|---|---|---|---|---|
| Vendor-neutral | Yes (Apache 2.0) | Yes | No (OpenAI-owned) | No (proprietary) | No (proprietary) |
| OWASP test coverage | 42 tests, 5 categories | LLM probes only | LLM evals only | Model scanning | Model scanning |
| Agentic AI tests | 10 tests (full ASI 2026) | No | No | No | No |
| CV model testing | Yes (Triton, ART, Foolbox) | No | No | Partial | Partial |
| LLM testing | Yes (14 APP tests) | Yes | Yes | No | Partial |
| Infrastructure tests | Yes (6 INF tests) | No | No | No | Partial |
| Data governance | Yes (5 DAT tests) | No | No | No | No |
| MCP server scanning | Yes (--scan-mcp) |
No | No | No | No |
| EU AI Act compliance | 42-test mapping | No | No | Partial | Partial |
| 3-phase methodology | Attack+Measure+Defend | Probes only | Evals only | Scan only | Scan only |
| Self-hosted | Yes | Yes | Yes | No | No |
| API server + Web UI | FastAPI + React | No | Basic | SaaS only | SaaS only |
| Kubernetes Helm | Yes | No | No | N/A | N/A |
| Report formats | JSON + HTML + DOCX + Compliance | JSON | JSON + HTML | ||
| Connectors | 14 | OpenAI-compatible | OpenAI-compatible | File upload | File upload |
| Open source | Apache 2.0 | Apache 2.0 | OpenAI-owned | Proprietary | Proprietary |
Compliance Frameworks
Tessera maps every test result to specific requirements in major regulatory and compliance frameworks:
| Framework | Coverage | Mapping |
|---|---|---|
| EU AI Act | 42 tests → Articles 9, 10, 13, 14, 15, 71 | Article-level compliance mapping for high-risk AI systems |
| NIST AI RMF | Govern, Map, Measure, Manage | Function and category mapping across all 4 functions |
| SOC 2 | Trust Services Criteria | CC6, CC7, CC8 control mapping for AI-specific risks |
| ISO 27001:2022 | Annex A controls | A.5 through A.8 control mapping for AI security |
| OWASP AI Top 10 | Full coverage | Direct test-to-risk mapping for all 10 categories |
| OWASP Agentic AI Top 10 | Full coverage (10/10) | First complete ASI 2026 mapping |
# Generate compliance reports
tessera --config config.yaml --format json html compliance
# The compliance report maps all 42 tests to EU AI Act articles
# The HTML report includes compliance mapping tabs for each framework
Deployment
Tessera supports four deployment modes, from zero-infrastructure CLI to production Kubernetes.
Mode 1: CLI (Zero Infrastructure)
pip install tessera-ai
# Interactive wizard
tessera --init
# Run all tests against your config
tessera --config config.yaml
# Scan an MCP server
tessera --scan-mcp https://api.example.com/v1 --api-key $KEY
# Run specific tests or categories
tessera --config config.yaml --tests MOD-01 APP-01 AGT-05
tessera --config config.yaml --category agt
# Per-model mode with compliance
tessera --config config.yaml --per-model --format json html compliance
# List all 42 tests
tessera --list
Mode 2: API Server (FastAPI)
pip install tessera-ai[server,reports]
uvicorn tessera.api.app:create_app --factory --host 0.0.0.0 --port 8000
# API docs at http://localhost:8000/docs
Mode 3: Docker Compose (Full Stack)
docker compose up -d
docker compose up -d --scale worker=4 # Scale workers
| Service | Port | Description |
|---|---|---|
api |
8000 | FastAPI server + static Web UI |
worker |
-- | 2x Celery workers for async scans |
postgres |
5432 | PostgreSQL 16 (scan data, results, users) |
redis |
6379 | Redis 7 (task queue, WebSocket pub/sub) |
Mode 4: Kubernetes (Helm)
helm repo add tessera https://charts.tessera.dev
helm install tessera tessera/tessera \
--set ingress.host=tessera.mycompany.com \
--set autoscaling.enabled=true
GitHub Actions Integration
# .github/workflows/ai-security.yml
name: AI Security Scan
on: [push]
jobs:
scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: pip install tessera-ai[llm]
- run: tessera --config config.yaml --category app --format json compliance
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
Architecture
+------------------+
| Web UI |
| React + Vite |
| TailwindCSS |
+--------+---------+
|
+--------v---------+
| REST API |
| FastAPI 0.109+ |
| WebSocket |
+---+---------+----+
| |
+----------+ +----v-------+
| | Celery |
| | Workers |
+------v------+ +----+-------+
| PostgreSQL | |
| SQLAlchemy | +-------v-------+
| + Alembic | | Scan Engine |
+-------------+ | 3-Phase Loop |
+--+----+---+---+
| | |
+------------+ | +-------------+
| | |
+------v------+ +------v------+ +-------v-----+
| 42 OWASP | | Connectors | | Reports |
| Tests | | (14 types) | | JSON/HTML/ |
| MOD|APP|INF | | Triton/vLLM | | DOCX/Comply |
| |DAT|AGT | | OpenAI/MCP | +-------------+
+-------------+ +-------------+
+-------------+
| Redis |
| Task Queue |
+-------------+
3-Phase Methodology
Every one of the 42 tests implements the OWASP 3-phase methodology:
Phase 1: ATTACK Phase 2: MEASURE Phase 3: DEFEND
================== ================== ==================
Simulate the threat Quantify the impact Validate mitigations
- Adversarial inputs - Threshold scoring - Defense effectiveness
- Injection payloads - Statistical metrics - Recommendations
- Extraction attempts - PASS / WARN / FAIL - Evidence collection
- Agent manipulation - Per-category rates - Kill switch validation
Threshold-Based Scoring
Metric(
name="bypass_rate",
value=0.03, # Measured value
threshold_pass=0.05, # Below this = PASS
threshold_fail=0.15, # Above this = FAIL
operator="<", # Lower is better
source="OWASP AITG-APP-01"
)
# Result: PASS (0.03 < 0.05)
Enterprise Features
| Feature | Community | Pro | Enterprise |
|---|---|---|---|
| 42 OWASP AI tests | Yes | Yes | Yes |
| 10 Agentic AI tests | Yes | Yes | Yes |
| CLI + API + Web UI | Yes | Yes | Yes |
| JSON/HTML/DOCX/Compliance | Yes | Yes | Yes |
| 14 connectors + MCP | Yes | Yes | Yes |
| Docker + Kubernetes | Yes | Yes | Yes |
| Max models | 10 | 100 | Unlimited |
| JWT Auth + RBAC | -- | Yes | Yes |
| GitHub OAuth | -- | Yes | Yes |
| SSO (OIDC/SAML) | -- | -- | Yes |
| Multi-tenancy | -- | -- | Yes |
| Compliance mapping | -- | Yes | Yes |
| Scheduled scans | -- | Yes | Yes |
| Audit logging | -- | Yes | Yes |
| White-label branding | -- | -- | Yes |
Configuration
# config.yaml
project:
name: "Production AI Audit"
environment: "production"
models:
ollama:
url: "${OLLAMA_URL:-http://localhost:11434}"
models:
- name: "llama3"
task: "chat"
params:
injection:
bypass_threshold: 0.05
alignment:
refusal_threshold: 0.95
output:
dir: "reports"
format: ["json", "html", "compliance"]
Example configs in the examples/ directory for every supported connector.
Connectors
| # | Connector | Type | Protocol | Use Case |
|---|---|---|---|---|
| 1 | NVIDIA Triton | CV | gRPC / HTTP | Production model serving for CV models |
| 2 | vLLM | LLM | OpenAI-compatible | Self-hosted LLM inference at scale |
| 3 | OpenAI | LLM | REST API | GPT-4o, GPT-4, o1/o3 series |
| 4 | Anthropic | LLM | REST API | Claude 4.5 Sonnet, Claude 3 Opus |
| 5 | Google Vertex AI | LLM | REST API | Gemini 2.0, Gemini 1.5 Pro |
| 6 | Ollama | LLM | REST API | Local LLM testing (Llama, Mistral, Phi, Gemma) |
| 7 | HuggingFace | LLM/CV | Inference API | Any model on HuggingFace Hub |
| 8 | AWS Bedrock | LLM | AWS SDK | Claude, Llama, Titan on AWS |
| 9 | Azure OpenAI | LLM | REST API | GPT models on Azure |
| 10 | Mistral AI | LLM | REST API | Mistral Large, Mixtral, Mistral 7B |
| 11 | LiteLLM | LLM | Proxy | Unified proxy to 100+ providers |
| 12 | Together AI | LLM | REST API | Hosted open-source models |
| 13 | MCP Servers | Agent | OpenAI-compatible | Any Model Context Protocol endpoint |
| 14 | Custom | Any | OpenAI-compatible | Any endpoint that speaks OpenAI format |
Web UI
Modern web dashboard built on React 18 + TypeScript + Vite + TailwindCSS:
| Page | Description |
|---|---|
| Dashboard | Security posture overview, pass/fail trends, recent scan activity |
| Scans | List all scans, create new scans, filter by status |
| Scan Detail | Real-time progress, per-test results, phase breakdown |
| Models | Model registry, connector status, last scan timestamps |
| Results | Cross-scan result comparison, regression detection, filtering |
| Reports | Generate and download JSON/HTML/DOCX/Compliance reports |
| Settings | Configuration management, threshold tuning |
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
GET |
/health |
Health check |
POST |
/api/v1/scans |
Create and start a new scan |
GET |
/api/v1/scans |
List scans (paginated) |
GET |
/api/v1/scans/{id} |
Scan details and status |
GET |
/api/v1/results |
Query results with filtering |
GET |
/api/v1/models |
List registered models |
GET |
/api/v1/reports/{scan_id} |
Generate report (JSON/HTML/DOCX) |
WS |
/ws/scans/{id} |
Real-time scan progress via WebSocket |
Report Formats
| Format | Use Case | Output |
|---|---|---|
| JSON | CI/CD integration, automation | Machine-readable with full metrics |
| HTML | Interactive dashboard | Self-contained single-file with navigation |
| DOCX | Executive reports | Professional Word doc with matrices |
| Compliance | EU AI Act / regulatory | Article-level mapping with scores |
Development
git clone https://github.com/tessera-ops/tessera.git
cd tessera
pip install -e ".[all,test]"
pytest # 376 tests
ruff check . # Lint
Writing a New Test
from tests.base import OWASPTestCase, PhaseResult, Metric
class MOD99NewTest(OWASPTestCase):
TEST_ID = "MOD-99"
TEST_NAME = "My New Security Test"
CATEGORY = "Model Security"
def phase1_attack(self, config: dict) -> PhaseResult:
return PhaseResult(phase=1, name="Attack", status="PASS",
evidence=["Attack simulated"])
def phase2_measure(self, config: dict) -> PhaseResult:
metric = Metric(name="attack_success_rate", value=0.02,
threshold_pass=0.05, threshold_fail=0.20, operator="<")
return PhaseResult(phase=2, name="Measure", metrics=[metric])
def phase3_defend(self, config: dict) -> PhaseResult:
return PhaseResult(phase=3, name="Defend", status="PASS")
Register in tessera/registry.py:
TEST_REGISTRY["MOD-99"] = ("tests.mod.mod99_new_test", "MOD99NewTest")
Roadmap
v2.2 (Next)
- SARIF output for GitHub/GitLab Security tab
- OpenTelemetry tracing for scan observability
- Slack/Teams webhook notifications
- Test parallelization (concurrent execution per model)
v2.3
- Multimodal model support (vision-language models)
- RAG pipeline testing (retriever poisoning, context window attacks)
- Scan diff and regression tracking across releases
v3.0
- Plugin architecture for community-contributed tests
- Distributed scan execution across workers
- Real-time model monitoring (continuous security posture)
- SBOM (Software Bill of Materials) for AI components
FAQ
Do I need all dependencies?
No. Tessera uses lazy imports. pip install tessera-ai is minimal. Add extras for what you need: [cv], [llm], [all].
Can I use Tessera without a database?
Yes. CLI mode requires zero infrastructure. The API server works without a database using an in-memory store.
Which AI models does Tessera support?
All major providers: OpenAI, Anthropic, Google, Meta, Mistral, AWS Bedrock, Azure OpenAI, HuggingFace, NVIDIA Triton, and any OpenAI-compatible endpoint. Plus MCP servers for agentic AI testing.
Is there CI/CD integration?
Yes. JSON output + exit codes. Non-zero exit if any test FAILs. Works with GitHub Actions, GitLab CI, Jenkins, etc.
What OWASP standards does Tessera cover?
Both the OWASP AI Testing Guide (32 tests across MOD/APP/INF/DAT) and the OWASP Top 10 for Agentic Applications ASI 2026 (10 AGT tests). Tessera is the first framework with complete coverage of both.
License
Apache License 2.0 — see LICENSE.
Community edition: all 42 tests, CLI, API, Web UI, Docker, Helm, 14 connectors, MCP scanning. Enterprise features (auth, SSO, multi-tenancy, compliance, scheduled scans, audit, white-label) require a commercial license.
Acknowledgments
Tessera builds on the work of these outstanding projects:
- OWASP AI Testing Guide — test methodology and taxonomy
- OWASP Top 10 for Agentic Applications — agentic AI security standard (ASI 2026)
- IBM Adversarial Robustness Toolbox (ART) — adversarial attack/defense
- Foolbox — adversarial perturbation library
- Detoxify — toxicity detection
- Fairlearn — fairness assessment
- Cleanlab — data quality
- Evidently AI — drift monitoring
- Garak — LLM vulnerability scanning
The vendor-neutral alternative for AI security testing.
42 OWASP tests. 5 categories. Full agentic AI coverage. EU AI Act compliance.
Test your AI systems before attackers do.
pip install tessera-ai && tessera --init
GitHub •
PyPI •
Issues •
Discussions
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file tessera_ai-2.1.1.tar.gz.
File metadata
- Download URL: tessera_ai-2.1.1.tar.gz
- Upload date:
- Size: 209.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
52f5cfd0ac89b0503a1fa9f26958386effb84a560084f621bb10b53238fbc6cc
|
|
| MD5 |
ededd0e9eab0d32d3fd7f3387e221bc3
|
|
| BLAKE2b-256 |
31c6296f49f78b1a6e22969093164377188631456ca748a2b5c881f09dd49d14
|
File details
Details for the file tessera_ai-2.1.1-py3-none-any.whl.
File metadata
- Download URL: tessera_ai-2.1.1-py3-none-any.whl
- Upload date:
- Size: 270.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f70e9f88df912cec3884c26213b67d64f2447431087413b207a49f18f7e38bfd
|
|
| MD5 |
61e8367c56ebb25b05caf75aa3b39f96
|
|
| BLAKE2b-256 |
829efec9d747f32f92dca01129f9d42752e27c601e414b1e4d7fc7fa6fb53768
|