Score AI agent architectures against the AWAF open specification

awaf-cli

The reference implementation of the AWAF open specification. Catch agent architecture regressions before they ship.

Runs in CI on every PR that touches agent code. Scores across 10 architectural pillars defined by the AWAF open specification. Fails the build when something regresses.

No dashboards that need a legend. No compliance jargon. One number per pillar, one finding per issue, one fix per finding.


Install

pip install awaf

Requires Python 3.11+. Bring your own model and API key.


Provider Support

awaf-cli is model-agnostic. Use any supported LLM provider — no vendor lock-in.

| Provider | Models | Key Env Var |
|----------|--------|-------------|
| anthropic | claude-opus-4-5, claude-sonnet-4-5, claude-haiku-4-5 | ANTHROPIC_API_KEY |
| openai | gpt-4o, gpt-4o-mini, o3, o4-mini | OPENAI_API_KEY |
| azure | Any Azure OpenAI deployment | AZURE_OPENAI_API_KEY |
| google | gemini-2.0-flash, gemini-1.5-pro | GOOGLE_API_KEY |
| litellm | Any LiteLLM-compatible model | Provider-specific |

Default provider: anthropic with claude-opus-4-5. Scores are calibrated on Claude, so the same code may score slightly differently under other providers.


API Keys from .env

awaf automatically loads a .env file in the current directory at startup. Keys already set in the environment take precedence.

Create a .env file next to your project:

ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-...
# GOOGLE_API_KEY=...

Then run normally — no export needed:

awaf run
awaf run --pillar foundation

If you prefer to load .env manually before running:

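# bash / zsh (set -a exports every variable the sourced file defines)
set -a; . ./.env; set +a; awaf run
# PowerShell (skips blank lines and comments)
Get-Content .env | Where-Object { $_ -match '^\s*[^#\s].*=' } | ForEach-Object { $k,$v = $_ -split '=',2; [System.Environment]::SetEnvironmentVariable($k.Trim(),$v) }; awaf run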

Quickstart

# Default: Anthropic (.env or export)
export ANTHROPIC_API_KEY=sk-ant-...
awaf run

# OpenAI
export OPENAI_API_KEY=sk-...
awaf run --provider openai --model gpt-4o

# Azure / GitHub Copilot
export AZURE_OPENAI_API_KEY=...
awaf run --provider azure --model gpt-4o --azure-endpoint https://your-resource.openai.azure.com --azure-deployment gpt-4o

# LiteLLM (Bedrock, Groq, Ollama, etc.)
awaf run --provider litellm --model bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0

Example output:

AWAF Assessment: my-agent
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  Overall Score    78  Near Ready
  Provider         openai / gpt-4o

  TIER 0: FOUNDATION
  Foundation        85  verified

  TIER 1: CLOUD WAF ADAPTED
  Op. Excellence    74  verified
  Security          82  verified
  Reliability       71  verified
  Performance       80  verified
  Cost Optim.       65  partial
  Sustainability    79  verified

  TIER 2: AGENT-NATIVE  (1.5x weight)
  Reasoning Integ.  71  partial
  Controllability   78  verified
  Context Integrity 80  verified

  FILES ANALYZED     12 files
  FILES NOT SCANNED  infra/iam.yaml  Cost score confidence: partial

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Per-Project Config

# awaf.toml
[project]
name = "my-agent"

[provider]
name = "openai"              # anthropic | openai | azure | google | litellm
model = "gpt-4o"
api_key_env = "OPENAI_API_KEY"   # defaults to provider standard env var

# Azure / Copilot specific
# name = "azure"
# model = "gpt-4o"
# api_key_env = "AZURE_OPENAI_API_KEY"
# azure_endpoint = "https://your-resource.openai.azure.com"
# azure_deployment = "gpt-4o"
# azure_api_version = "2025-01-01-preview"

# LiteLLM — any model string LiteLLM supports
# name = "litellm"
# model = "bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0"

[thresholds]
overall_fail = 60
tier2_fail = 50
regression_limit = 10
warn_only = false

[files]
agent_patterns = ["agents/**/*.py", "tools/**/*.py", "pipelines/**"]
exclude = ["tests/**", "docs/**"]

[reporting]
post_pr_comment = true
terminal_format = "compact"    # compact | full | json

CI Integration

GitHub Actions

name: AWAF Assessment
on:
  pull_request:
    paths:
      - 'agents/**'
      - 'tools/**'
      - 'pipelines/**'

jobs:
  awaf:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: YogirajA/awaf-action@v1
        with:
          # Use whichever provider key you have
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
          # openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          # azure-openai-api-key: ${{ secrets.AZURE_OPENAI_API_KEY }}
          provider: anthropic           # anthropic | openai | azure | google | litellm
          model: claude-opus-4-5        # optional, uses provider default if omitted
          project-name: my-agent
          fail-threshold: 60
          tier2-fail-threshold: 50
          score-regression-limit: 10
          post-pr-comment: true

With the paths filter above, the workflow only runs when agent files change. When awaf itself finds no changed agent files, it skips the assessment and exits with code 3.

GitLab CI

include:
  - remote: 'https://raw.githubusercontent.com/YogirajA/awaf-cli/main/integrations/gitlab/awaf-gitlab-ci.yml'

awaf:
  variables:
    ANTHROPIC_API_KEY: $ANTHROPIC_API_KEY
    AWAF_PROVIDER: anthropic
    AWAF_PROJECT_NAME: my-agent
    AWAF_FAIL_THRESHOLD: "60"

Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Passed all thresholds |
| 1 | Score below threshold or regression exceeded |
| 2 | Assessment failed (API error, ingest error) |
| 3 | No agent files changed, skipped |
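
A wrapper script can branch on these codes. The sketch below is one possible way to do it: the `AwafExit` names and the `should_block_merge` policy are hypothetical, and only the numeric codes and the `awaf run --ci` command come from this README.

```python
import subprocess
from enum import IntEnum


class AwafExit(IntEnum):
    """Exit codes as listed in the table above (member names are illustrative)."""
    PASSED = 0
    BELOW_THRESHOLD = 1
    ASSESSMENT_FAILED = 2
    SKIPPED = 3


def should_block_merge(code: int) -> bool:
    """Hypothetical policy: block on 1 and 2, let 0 and 3 through."""
    return AwafExit(code) in (AwafExit.BELOW_THRESHOLD, AwafExit.ASSESSMENT_FAILED)


def run_awaf_ci() -> AwafExit:
    """Run the CLI in CI mode; needs awaf on PATH and a provider key set."""
    result = subprocess.run(["awaf", "run", "--ci"], check=False)
    return AwafExit(result.returncode)


if __name__ == "__main__":
    code = run_awaf_ci()
    print(code.name)
    raise SystemExit(1 if should_block_merge(code) else 0)
```

Treating code 2 as blocking is a design choice. A team that prefers not to fail builds on provider outages could map it to a warning instead.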

CLI Reference

awaf run                                         # assess current directory
awaf run --paths agents/ tools/                  # specific paths
awaf run --ci                                    # CI mode with git context
awaf run --pillar foundation                     # single pillar only
awaf run --provider openai --model gpt-4o        # override provider
awaf run --provider litellm --model ollama/llama3 # local model via LiteLLM
awaf run --sequential                            # one pillar at a time (avoids rate limits)
awaf run --sequential --delay 10                 # sequential with 10s pause between pillars
awaf history                                     # score history for current project
awaf compare <id1> <id2>                         # diff two assessments
awaf report --format json                        # JSON output for CI artifact upload
awaf report --coverage                           # show files analyzed and skipped
awaf providers                                   # list configured providers and status

No color codes when stdout is not a TTY. No spinners in CI mode.

Running pillars one at a time

Useful on free-tier API plans or when debugging a specific pillar. Each run saves to awaf.db and contributes to score history.

awaf run --pillar foundation
awaf run --pillar security
awaf run --pillar controllability
# ... pick the pillars you care about

To score all 10 pillars sequentially with a pause between each call:

awaf run --sequential --delay 15

What Gets Scored

awaf-cli implements AWAF v1.0 across 10 pillars in 3 tiers. Full pillar definitions and scoring questions are in the specification repo.

Tier 0: Foundation. Can this agent run independently?

Tier 1: Cloud WAF Adapted (1.0x weight). Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, Sustainability

Tier 2: Agent-Native (1.5x weight). Reasoning Integrity, Controllability, Context Integrity

The agent-native pillars are what make AWAF distinct. Cloud infrastructure has no equivalent for them; they exist because agents are not servers. See aradhye.com for the original thinking behind this.
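
To make the 1.5x weight concrete, here is a plain weighted mean over made-up scores. This is only an illustration of what tier weighting means. The actual aggregation is defined by the AWAF specification and may differ, so this sketch is not expected to reproduce the overall score the CLI prints. The Tier 0 weight of 1.0 is an assumption, since this README does not state one.

```python
# Tier 1 and Tier 2 weights come from the text above; Tier 0 is assumed.
TIER_WEIGHTS = {0: 1.0, 1: 1.0, 2: 1.5}


def weighted_overall(scores: dict[str, tuple[int, float]]) -> float:
    """Weighted mean of pillar scores; each value is (tier, score)."""
    total = sum(TIER_WEIGHTS[tier] * score for tier, score in scores.values())
    weight = sum(TIER_WEIGHTS[tier] for tier, _ in scores.values())
    return total / weight


# Made-up scores, one pillar per tier, chosen to keep the arithmetic visible.
example = {
    "Foundation": (0, 80),
    "Security": (1, 70),
    "Controllability": (2, 60),
}
overall = weighted_overall(example)
print(round(overall, 1))  # (80 + 70 + 1.5 * 60) / 3.5 → 68.6
```

Under this scheme a low Tier 2 score pulls the result down more than the same score in Tier 1 would.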


What It Analyzes

awaf-cli reads what is in your repository: Python, TypeScript, Go, YAML, JSON, TOML, Markdown, and PDF files.

It can verify: trust tier enforcement in code, kill switch and cancel implementations, loop detection and budget guards, eval framework presence, sanitization at input boundaries, slice boundary documentation.

It cannot verify (flagged as partial confidence): cloud resource configs not in the repo, whether SLOs are being met in production, runtime hallucination rates, whether circuit breakers are actually firing.

When something cannot be verified, the output says so explicitly. Partial confidence with clear coverage gaps is more useful than a confident score built on assumptions.


Score History

Every assessment is stored locally in awaf.db. Score history is tracked per project, per branch, per commit, and per provider/model.

awaf history

my-agent  last 5 assessments
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  2026-02-27  a3f9c12  PR #47   72  -6   openai/gpt-4o       Controllability regression
  2026-02-24  8bc1a33  main     78  +3   anthropic/claude-opus-4-5  Context Integrity improved
  2026-02-21  4de92f1  main     75  +0   anthropic/claude-opus-4-5
  2026-02-18  2ab77c4  main     75  +8   openai/gpt-4o       Security and Reliability up
  2026-02-12  9ff3e21  main     67  —    anthropic/claude-opus-4-5

Six months of CI runs become your architectural changelog.
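
The delta column in the history view is the difference between each assessment and the one before it. The sketch below reproduces that column from the five rows shown above. It uses an in-memory SQLite database with a hypothetical `assessments` table; the real schema of awaf.db is not documented here and is likely different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the real tool writes to awaf.db
conn.execute(
    "CREATE TABLE assessments "
    "(run_date TEXT, commit_sha TEXT, branch TEXT, provider TEXT, score INTEGER)"
)
rows = [
    ("2026-02-12", "9ff3e21", "main", "anthropic/claude-opus-4-5", 67),
    ("2026-02-18", "2ab77c4", "main", "openai/gpt-4o", 75),
    ("2026-02-21", "4de92f1", "main", "anthropic/claude-opus-4-5", 75),
    ("2026-02-24", "8bc1a33", "main", "anthropic/claude-opus-4-5", 78),
    ("2026-02-27", "a3f9c12", "PR #47", "openai/gpt-4o", 72),
]
conn.executemany("INSERT INTO assessments VALUES (?, ?, ?, ?, ?)", rows)

history = []
previous = None
# Walk the runs oldest first so each delta compares against the prior run.
for run_date, sha, score in conn.execute(
    "SELECT run_date, commit_sha, score FROM assessments ORDER BY run_date"
):
    delta = None if previous is None else score - previous
    history.append((run_date, sha, score, delta))
    previous = score

# Print newest first, like `awaf history`; the first run has no delta.
for run_date, sha, score, delta in reversed(history):
    shown = "-" if delta is None else f"{delta:+d}"
    print(f"{run_date}  {sha}  {score}  {shown}")
```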


How It Works

awaf-cli sends your architecture artifacts to the LLM provider of your choice. Each of the 10 AWAF pillars is evaluated by a separate model call running concurrently. Results are written to a local SQLite database. No central coordinator. No shared state between pillar evaluations.

Artifacts → Ingestor → Event Bus → [10 Pillar Agents concurrently] → SQLite → Terminal
                                          ↑
                              Provider Abstraction Layer
                         (Anthropic | OpenAI | Azure | Google | LiteLLM)

The tool is built to be AWAF-compliant itself: choreography over orchestration, vertical slice per pillar, blast radius bounded. See ARCHITECTURE.md.


Environment Variables

# Provider selection (can also be set in awaf.toml)
AWAF_PROVIDER=anthropic          # anthropic | openai | azure | google | litellm
AWAF_MODEL=claude-opus-4-5       # optional model override

# API keys — use whichever provider you're running
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=gpt-4o
GOOGLE_API_KEY=...

# Session controls
AWAF_DB_URL=sqlite:///./awaf.db
AWAF_MAX_ARTIFACTS_TOKENS=40000
AWAF_SESSION_BUDGET_USD=1.00     # approximate; pricing varies by provider
AWAF_LOG_LEVEL=INFO

Deployment Modes

| Mode | Setup | Data | Right For |
|------|-------|------|-----------|
| Local | pip install awaf + API key | awaf.db on your machine | Solo developers, OSS projects |
| Cloud | API key + AWAF_MODE=cloud | awaf.dev (coming) | Teams, dashboards, benchmarks |
| On-Prem | Docker Compose / Helm | Your PostgreSQL | Enterprise, regulated industries |

The local mode is fully functional with no account required. Cloud and on-prem add team dashboards, cross-project score history, and industry benchmarks. On-prem: no artifacts leave your network. All model API calls use your own API key. No telemetry unless opted in.


Score Badge

[![AWAF Score](https://img.shields.io/badge/AWAF%20Score-78%20Near%20Ready-2563EB?style=flat-square)](https://github.com/YogirajA/AWAF)

Live badge (cloud mode):

[![AWAF Score](https://awaf.dev/badge/your-project)](https://awaf.dev/your-project)

Contributing

Bug reports, feature requests, and PRs welcome. Provider adapter contributions especially welcome — see PROVIDER_SPEC.md for the interface contract.

For changes to the AWAF specification itself (pillar definitions, scoring questions, methodology), open an issue in the AWAF specification repo. This repo is for the implementation.


License

Apache 2.0. See LICENSE.

