# TrustyAI Garak: LLM Red Teaming for Llama Stack

An out-of-tree Llama Stack provider for automated vulnerability scanning and red teaming of Large Language Models using Garak. This project implements Garak as an external evaluation provider for Llama Stack.
## What It Does

- 🔍 Vulnerability Assessment: Red-team LLMs for prompt injection, jailbreaks, toxicity, bias, and other vulnerabilities
- 📋 Compliance: OWASP LLM Top 10 and AVID taxonomy benchmarks
- 🛡️ Shield Testing: Measure guardrail effectiveness
- ☁️ Cloud-Native: Runs on OpenShift AI / Kubernetes
- 📊 Detailed Reports: JSON and HTML reports
## Pick Your Deployment
| Mode | Llama Stack server | Garak scans | Typical use case | Guide |
|---|---|---|---|---|
| Total Remote | OpenShift/Kubernetes | KFP pipelines | Production | → Setup |
| Partial Remote | Local machine | KFP pipelines | Development | → Setup |
| Total Inline | Local machine | Local machine | Fast local testing | → Setup |
- Feature notebook: `demos/guide.ipynb`
- Metadata reference: `BENCHMARK_METADATA_REFERENCE.md`
## Installation

```bash
# Deployment 1 (Total Remote): no installation needed

# Deployment 2 (Partial Remote)
pip install llama-stack-provider-trustyai-garak

# Deployment 3 (Total Inline, local scans): requires the "inline" extra
pip install "llama-stack-provider-trustyai-garak[inline]"
```
## Quick Workflow

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Discover the Garak provider
garak_provider = next(
    p for p in client.providers.list()
    if p.provider_type.endswith("trustyai_garak")
)
garak_provider_id = garak_provider.provider_id

# List predefined benchmarks
benchmarks = client.alpha.benchmarks.list()
print([b.identifier for b in benchmarks if b.identifier.startswith("trustyai_garak::")])

# Run a predefined benchmark
benchmark_id = "trustyai_garak::quick"
job = client.alpha.eval.run_eval(
    benchmark_id=benchmark_id,
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "your-model-id",
            "sampling_params": {"max_tokens": 100},
        }
    },
)

# Poll status
status = client.alpha.eval.jobs.status(job_id=job.job_id, benchmark_id=benchmark_id)
print(status.status)

# Retrieve the final result
if status.status == "completed":
    job_result = client.alpha.eval.jobs.retrieve(job_id=job.job_id, benchmark_id=benchmark_id)
```
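Rather than checking the status once, you will usually poll until the job reaches a terminal state. A minimal sketch of such a loop; the `wait_for_job` helper, its `interval`/`timeout` parameters, and the set of terminal state strings are illustrative, not part of the client API:

```python
import time


def wait_for_job(fetch_status, interval=5.0, timeout=1800):
    """Poll fetch_status() until it returns a terminal state string.

    fetch_status is any zero-argument callable, for example:
        lambda: client.alpha.eval.jobs.status(
            job_id=job.job_id, benchmark_id=benchmark_id
        ).status
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in ("completed", "failed", "cancelled"):
            return status
        time.sleep(interval)
    raise TimeoutError("job did not finish within the timeout")
```

Passing a callable keeps the loop decoupled from the client, so the same helper works for any benchmark or job.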
## Custom Benchmark Schema

Use `metadata.garak_config` for the Garak command configuration. Provider-level runtime parameters (for example `timeout`, `shield_ids`) stay at the top level of `metadata`.
```python
client.alpha.benchmarks.register(
    benchmark_id="custom_promptinject",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_id=garak_provider_id,
    provider_benchmark_id="custom_promptinject",
    metadata={
        "garak_config": {
            "plugins": {
                "probe_spec": ["promptinject"]
            },
            "reporting": {
                "taxonomy": "owasp"
            }
        },
        "timeout": 900
    }
)
```
## Update and Deep-Merge Behavior

- To create a tuned variant of a predefined (or existing custom) benchmark, set `provider_benchmark_id` to that benchmark's ID and pass overrides in `metadata`.
- Provider metadata is deep-merged, so you can override only the parts you care about.
- Predefined benchmarks are comprehensive by design. For faster exploratory runs, lower `garak_config.run.soft_probe_prompt_cap` to reduce prompts per probe.
```python
client.alpha.benchmarks.register(
    benchmark_id="quick_promptinject_tuned",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_id=garak_provider_id,
    provider_benchmark_id="trustyai_garak::quick",
    metadata={
        "garak_config": {
            "plugins": {"probe_spec": ["promptinject"]},
            "system": {"parallel_attempts": 20}
        },
        "timeout": 1200
    }
)

# Faster (less comprehensive) variant of a predefined benchmark
client.alpha.benchmarks.register(
    benchmark_id="owasp_fast",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_id=garak_provider_id,
    provider_benchmark_id="trustyai_garak::owasp_llm_top10",
    metadata={
        "garak_config": {
            "run": {"soft_probe_prompt_cap": 100}
        }
    }
)
```
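The deep-merge behavior can be pictured as a recursive dictionary merge: nested dicts are merged key by key, while any non-dict leaf in the override replaces the base value. A minimal sketch (an illustrative model of the behavior, not the provider's actual implementation; the sample config values are made up):

```python
def deep_merge(base: dict, overrides: dict) -> dict:
    """Recursively merge overrides into base; non-dict leaves are replaced."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend and merge key by key
            merged[key] = deep_merge(merged[key], value)
        else:
            # Leaf (or type mismatch): the override wins outright
            merged[key] = value
    return merged


base = {
    "garak_config": {
        "plugins": {"probe_spec": ["dan"]},
        "run": {"soft_probe_prompt_cap": 256},
    }
}
override = {"garak_config": {"run": {"soft_probe_prompt_cap": 100}}}

merged = deep_merge(base, override)
# Only the overridden leaf changes; sibling keys like "plugins" survive.
```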
## Shield Testing

Use either `shield_ids` (all treated as input shields) or `shield_config` (explicit input/output mapping).
```python
client.alpha.benchmarks.register(
    benchmark_id="with_shields",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_id=garak_provider_id,
    provider_benchmark_id="with_shields",
    metadata={
        "garak_config": {
            "plugins": {"probe_spec": ["promptinject.HijackHateHumans"]}
        },
        "shield_config": {
            "input": ["Prompt-Guard-86M"],
            "output": ["Llama-Guard-3-8B"]
        },
        "timeout": 600
    }
)
```
## Understanding Results (`_overall` and TBSA)

`job_result.scores` contains:

- probe-level entries (for example `promptinject.HijackHateHumans`)
- a synthetic `_overall` aggregate entry across all probes

`_overall.aggregated_results` can include:

- `total_attempts`
- `vulnerable_responses`
- `attack_success_rate`
- `probe_count`
- `tbsa` (Tier-Based Security Aggregate, 1.0 to 5.0, higher is better)
- `version_probe_hash`
- `probe_detector_pairs_contributing`

TBSA is derived from probe:detector pass rates and z-score DEFCON grades, with tier-aware aggregation and weighting, to give a more meaningful overall security posture than a plain pass/fail metric.
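A sketch of pulling the aggregate out of the scores mapping, assuming the dict-like shape described above; the probe names and all numeric values below are made-up sample data, not real scan output:

```python
# Illustrative shape of job_result.scores: probe-level entries keyed by
# probe name, plus the synthetic "_overall" aggregate entry.
scores = {
    "promptinject.HijackHateHumans": {
        "aggregated_results": {"attack_success_rate": 0.12}
    },
    "_overall": {
        "aggregated_results": {
            "total_attempts": 200,
            "vulnerable_responses": 24,
            "attack_success_rate": 0.12,
            "probe_count": 4,
            "tbsa": 3.8,
        }
    },
}

overall = scores["_overall"]["aggregated_results"]
summary = f"TBSA {overall['tbsa']:.1f}/5.0, ASR {overall['attack_success_rate']:.0%}"
```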
## Scan Artifacts

Access scan files from job metadata:

- `scan.log`
- `scan.report.jsonl`
- `scan.hitlog.jsonl`
- `scan.avid.jsonl`
- `scan.report.html`

Remote mode stores prefixed keys in metadata (for example `{job_id}_scan.report.html`).
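Given that remote-mode keys are prefixed with the job ID, collecting one job's files is a matter of filtering and stripping the prefix. A minimal sketch; the `scan_artifacts` helper and the sample metadata dict are illustrative, not provider API:

```python
def scan_artifacts(metadata: dict, job_id: str) -> dict:
    """Collect scan files for one job from remote-mode metadata,
    where keys look like "{job_id}_scan.report.html"."""
    prefix = f"{job_id}_"
    # Keep only this job's keys, with the prefix stripped off
    return {k[len(prefix):]: v for k, v in metadata.items() if k.startswith(prefix)}


meta = {
    "job-1_scan.log": "...",
    "job-1_scan.report.html": "<html>...</html>",
    "job-2_scan.log": "...",
}
files = scan_artifacts(meta, "job-1")
```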
## Notes on Remote Cluster Resources

- Partial remote mode needs KFP resources only.
- Total remote mode needs full stack resources (KFP, LlamaStackDistribution, RBAC, secrets, and Postgres manifests).
- See `lsd_remote/` for full reference manifests.
## Support & Documentation
- 📚 Tutorial: https://trustyai.org/docs/main/red-teaming-introduction
- 💬 Issues: https://github.com/trustyai-explainability/llama-stack-provider-trustyai-garak/issues
- 🦙 Llama Stack Docs: https://llamastack.github.io/
- 📖 Garak Docs: https://reference.garak.ai/en/latest/index.html