# TrustyAI Garak: LLM Red Teaming for Llama Stack
Automated vulnerability scanning and red teaming for Large Language Models using garak. This project implements garak as an external (out-of-tree) evaluation provider for Llama Stack.
## What It Does
- 🔍 **Vulnerability Assessment**: Red-team LLMs for prompt injection, jailbreaks, toxicity, bias, and other vulnerabilities
- 📋 **Compliance**: Benchmarks aligned with the OWASP LLM Top 10 and the AVID taxonomy
- 🛡️ **Shield Testing**: Measure how effectively guardrails block attacks
- ☁️ **Cloud-Native**: Runs on OpenShift AI / Kubernetes
- 📊 **Detailed Reports**: JSONL and HTML reports for every scan
## Pick Your Deployment
| # | Mode | Llama Stack Server Runs On | Scans Run On | Use Case | Guide |
|---|---|---|---|---|---|
| 1 | Total Remote | OpenShift AI | Data Science Pipelines | Production | → Setup |
| 2 | Partial Remote | Local laptop | Data Science Pipelines | Development | → Setup |
| 3 | Total Inline | Local laptop | Local laptop | Testing only | → Setup |
## Installation
```bash
# For Deployment 1 (Total Remote): no installation needed!

# For Deployment 2 (Partial Remote)
pip install llama-stack-provider-trustyai-garak

# For Deployment 3 (Total Inline, local scans): requires the "inline" extra
pip install "llama-stack-provider-trustyai-garak[inline]"
```
## Quick Example
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Run a quick security scan (~5 minutes)
job = client.alpha.eval.run_eval(
    benchmark_id="trustyai_garak::quick",
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "your-model-name",
            "sampling_params": {"max_tokens": 100},
        }
    },
)

# Check status
status = client.alpha.eval.jobs.status(
    job_id=job.job_id, benchmark_id="trustyai_garak::quick"
)
print(f"Status: {status.status}")

# Get results once the job has completed
if status.status == "completed":
    results = client.alpha.eval.get_eval_job_result(
        job_id=job.job_id, benchmark_id="trustyai_garak::quick"
    )
```
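Scans run asynchronously, so in practice you may want to poll until the job reaches a terminal state. A minimal sketch using only the status call shown above; the exact set of terminal state names is an assumption (the example above confirms only `completed`):

```python
import time

# Poll until the scan reaches a terminal state. "completed" appears in the
# example above; "failed" and "cancelled" are assumed terminal states.
while True:
    status = client.alpha.eval.jobs.status(
        job_id=job.job_id, benchmark_id="trustyai_garak::quick"
    )
    if status.status in ("completed", "failed", "cancelled"):
        break
    time.sleep(30)  # the quick benchmark takes ~5 minutes

print(f"Final status: {status.status}")
```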
## Available Benchmarks
| Benchmark ID | Tests | Duration |
|---|---|---|
| `trustyai_garak::owasp_llm_top10` | OWASP Top 10 | ~2 hrs |
| `trustyai_garak::avid_security` | AVID Security | ~2 hrs |
| `trustyai_garak::avid_ethics` | AVID Ethics | ~10 min |
| `trustyai_garak::avid_performance` | AVID Performance | ~10 min |
| `trustyai_garak::quick` | 3 test probes | ~5 min |
Or register custom benchmarks with specific garak probes, as sketched below.
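A custom registration uses the same `client.benchmarks.register` call as the shield-testing example below; the benchmark ID and probe selection here are illustrative (probe names follow garak's `module.ProbeName` convention):

```python
# Hypothetical custom benchmark scanning with hand-picked garak probes
client.benchmarks.register(
    benchmark_id="my_prompt_injection_scan",  # illustrative ID
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_id="trustyai_garak_remote",  # or trustyai_garak_inline
    provider_benchmark_id="my_prompt_injection_scan",
    metadata={
        # Illustrative probe selection; see the garak docs for the full list
        "probes": ["promptinject.HijackHateHumans", "dan.Dan_11_0"],
    },
)
```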
## Shield Testing Example
```python
# Test how well guardrails (shields) block attacks
client.benchmarks.register(
    benchmark_id="with_shield",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_id="trustyai_garak_remote",  # or trustyai_garak_inline
    provider_benchmark_id="with_shield",
    metadata={
        "probes": ["promptinject.HijackHateHumans"],
        "shield_ids": ["Prompt-Guard-86M"],  # shield to test
    },
)

job = client.alpha.eval.run_eval(
    benchmark_id="with_shield",
    benchmark_config={"eval_candidate": {"type": "model", "model": "your-model"}},
)
```
Compare results from runs with and without shields to measure shield effectiveness.
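One way to do that comparison is to register a baseline benchmark that is identical except for `shield_ids`, run both, and compare scores. A sketch reusing only the calls shown above:

```python
# Baseline: same probe, no shield, for comparison with "with_shield"
client.benchmarks.register(
    benchmark_id="without_shield",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_id="trustyai_garak_remote",
    provider_benchmark_id="without_shield",
    metadata={"probes": ["promptinject.HijackHateHumans"]},  # no shield_ids
)

baseline_job = client.alpha.eval.run_eval(
    benchmark_id="without_shield",
    benchmark_config={"eval_candidate": {"type": "model", "model": "your-model"}},
)
# Lower vulnerability scores on "with_shield" than on "without_shield"
# for the same probes indicate the shield is blocking attacks.
```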
## Understanding Results

### Vulnerability Score

- **0.0** = Secure (model refused attack)
- **0.5** = Threshold (concerning)
- **1.0** = Vulnerable (model was compromised)
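For quick triage, the scale can be turned into labels. A hypothetical helper based solely on the thresholds above; it is not part of the provider's API:

```python
def interpret_score(score: float) -> str:
    """Map a vulnerability score (0.0-1.0) to a rough label.

    Hypothetical helper based on the scale above, not a provider API.
    """
    if score >= 1.0:
        return "vulnerable: model was compromised"
    if score >= 0.5:
        return "concerning: at or above threshold"
    return "secure: model refused the attack"
```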
### Reports Available

Access these via `job.metadata`:

- `scan.log`: Detailed log of the scan.
- `scan.report.jsonl`: Report containing information about each attempt (prompt) of each garak probe.
- `scan.hitlog.jsonl`: Report containing only the attempts that the model was found vulnerable to.
- `scan.avid.jsonl`: AVID (AI Vulnerability Database) format of `scan.report.jsonl`. See the AVID project for more information.
- `scan.report.html`: Visual representation of the scan. In remote mode, this is logged as an HTML artifact of the pipeline.
```python
# Download the HTML report
html_id = job.metadata[f"{job.job_id}_scan.report.html"]
content = client.files.content(html_id)
with open("report.html", "w") as f:
    f.write(content)
```
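The other artifacts can be fetched the same way. A sketch that assumes every report is keyed in `job.metadata` as `{job_id}_<filename>`, following the pattern of the HTML example above:

```python
# Download every scan artifact listed above. The metadata key format
# is assumed from the HTML example: f"{job.job_id}_<filename>".
for name in ("scan.log", "scan.report.jsonl", "scan.hitlog.jsonl",
             "scan.avid.jsonl", "scan.report.html"):
    file_id = job.metadata.get(f"{job.job_id}_{name}")
    if file_id:
        with open(name, "w") as f:
            f.write(client.files.content(file_id))
```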
## Support & Documentation
- 📚 Tutorial: https://trustyai.org/docs/main/red-teaming-introduction
- 💬 Issues: https://github.com/trustyai-explainability/llama-stack-provider-trustyai-garak/issues
- 🦙 Llama Stack Docs: https://llamastack.github.io/
- 📖 Garak Docs: https://reference.garak.ai/en/latest/index.html