AI-powered code review with RAG-grounded explanations, 10 static analysers, and ECDSA-signed provenance

🛡️ ACR-QA

Automated Code Review & Quality Assurance Platform

10 static analysis tools. One canonical schema. RAG-enhanced AI explanations. $0 recurring cost.


What It Is

ACR-QA is a provenance-first, AI-augmented code review platform built as a graduation thesis at KSIU. It targets three problems that developers commonly hit when using static analysis tools:

Problem | What ACR-QA does
Alert fatigue: 10 tools dump raw JSON in incompatible schemas, full of duplicates | Normalises all output into one canonical schema, deduplicates across tools, ranks by severity
LLM hallucination: AI assistants give confident but wrong security advice | RAG: the LLM can only explain rules it can cite from a curated 66-rule knowledge base; semantic entropy (3× runs) detects contradictions
Invisible test gaps: a code coverage percentage doesn't tell you which complex functions have no test | AST-based Test Gap Analyzer ranks untested symbols by cyclomatic complexity
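
The test-gap idea in the last row can be sketched with the standard `ast` module. This is a minimal illustration, not ACR-QA's implementation: it assumes complexity can be approximated by counting branching nodes, and that a function counts as tested when any test calls it by name. The helper names `complexity`, `called_names`, and `rank_untested` are hypothetical.

```python
import ast

# Node types that add a decision point (a rough stand-in for cyclomatic complexity).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def complexity(func: ast.AST) -> int:
    """1 + the number of branching nodes found inside the function."""
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(func))


def called_names(test_source: str) -> set[str]:
    """Names of everything the test code calls, e.g. foo() or mod.foo()."""
    names = set()
    for node in ast.walk(ast.parse(test_source)):
        if isinstance(node, ast.Call):
            target = node.func
            if isinstance(target, ast.Name):
                names.add(target.id)
            elif isinstance(target, ast.Attribute):
                names.add(target.attr)
    return names


def rank_untested(source: str, test_source: str) -> list[tuple[str, int]]:
    """Untested functions as (name, complexity), most complex first."""
    tested = called_names(test_source)
    gaps = [
        (node.name, complexity(node))
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name not in tested
    ]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


SOURCE = """
def simple(x):
    return x + 1

def risky(x):
    if x > 0:
        for i in range(x):
            if i % 2:
                return i
    return None

def covered(x):
    return x
"""
TESTS = """
def test_covered():
    assert covered(1) == 1
"""

print(rank_untested(SOURCE, TESTS))  # [('risky', 4), ('simple', 1)]
```

Matching by name is deliberately crude: it misses indirect calls and treats same-named functions in different modules as one symbol.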

Key numbers: 97.1% precision · 9/10 OWASP Top 10 · 2,339 tests (2,274 Python + 65 TypeScript) · 36 async API endpoints · $0 recurring cost


Architecture

C4Container
    title ACR-QA — System Overview

    Person(dev, "Developer")

    Container_Boundary(sys, "ACR-QA") {
        Container(cli, "CLI / GitHub Action", "Python", "Entry point. Detects language, routes to adapter.")
        Container(core, "Analysis Engine", "Python 3.11", "10 tools → normalise → score → dedup → AI explain → quality gate")
        Container(api, "Async REST API", "FastAPI :8000", "36 /v1/ endpoints · JWT + API key auth · Swagger at /docs")
        Container(worker, "Background Worker", "Celery + Redis", "Async scan execution — POST /v1/scans returns 202 + job_id")
        ContainerDb(pg, "PostgreSQL 15", "", "Provenance: runs · findings · LLM calls · feedback · users · api_keys")
        ContainerDb(redis, "Redis 5.2", "", "Rate limiting · explanation cache · Celery broker + result backend")
    }

    Container_Ext(groq, "Groq API", "LLM", "Llama 3.3-70b explanations · Llama 3.1-8b feasibility · free tier")
    Container_Ext(gh, "GitHub / GitLab", "CI", "PR comments · SARIF upload · merge blocking")

    Rel(dev, cli, "runs analysis")
    Rel(cli, core, "invokes pipeline")
    Rel(core, pg, "stores provenance")
    Rel(core, redis, "rate limit / cache")
    Rel(core, groq, "RAG explanations")
    Rel(core, gh, "PR comments + SARIF")

Full C4 diagrams: C1 Context · C2 Containers · C3 Components · C4 Code


Quick Start

Option A — Docker (one command)

git clone https://github.com/ahmed-145/acr-qa.git && cd acr-qa
cp .env.example .env          # add your GROQ_API_KEY_1..4
make up
Service URL
FastAPI http://localhost:8000
Grafana http://localhost:3005 (admin/admin)
Prometheus http://localhost:9091

Option B — Local

pip install -r requirements.txt
createdb acrqa && psql -d acrqa -f DATABASE/schema.sql
cp .env.example .env && source .env
uvicorn FRONTEND.api.main:app --port 8000    # → http://localhost:8000/docs

Run your first analysis

# Python project
python3 CORE/main.py --target-dir ./myproject --rich

# JavaScript / TypeScript project
python3 CORE/main.py --target-dir ./my-express-app --lang javascript --no-ai

# Go project
python3 CORE/main.py --target-dir ./my-go-api --lang go

# JSON output for CI pipelines
python3 CORE/main.py --target-dir . --json --no-ai > findings.json

What Makes It Different

Feature CodeRabbit SonarQube ACR-QA
Multi-tool normalisation ✅ 10 tools
AI explanations ✅ RAG + entropy
Hallucination detection ✅ semantic entropy (3×)
Test gap analysis ✅ AST-based
Confidence per finding ✅ 0–100 score
Feedback-driven tuning ✅ auto suppression
Cost-benefit analysis ✅ ROI per finding
Call-graph reachability ✅ AST-based, 0% FP rate
Path feasibility (FP reduction) ✅ LLM4PFA approach
Cross-language vuln chains ✅ CHARON-inspired
CBoM (quantum-safety) ✅ NIST FIPS 203/204
Quality gates (CI blocking) ✅ configurable
SARIF export ✅ v2.1.0
OWASP compliance report ✅ 9/10 + CWE IDs
Recurring cost $$$ $$$ $0

Features

Detection Pipeline

10 tools run in parallel, all output normalised into one CanonicalFinding schema:

Tool Language What It Catches
Ruff Python Style, imports, unused code, PEP8
Bandit Python Security anti-patterns (33 rules)
Semgrep Python / JS / Go OWASP Top 10 patterns, custom rules
Vulture Python Dead code, unreachable branches
Radon Python Cyclomatic complexity, maintainability
Secrets Detector All API keys, passwords, JWTs, tokens
SCA Scanner Python Known-vulnerable dependency versions
ESLint JS / TS Security plugin — 20 rules
gosec Go Go security vulnerabilities
staticcheck Go Go static analysis and bug detection
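
To make the normalisation step concrete, here is a minimal sketch that maps Ruff and Bandit JSON entries into one record type and then deduplicates. The field names of `CanonicalFinding`, the default severity for Ruff, and the "one finding per file and line" dedup key are all assumptions made for illustration; the real schema and dedup logic may differ.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass(frozen=True)
class CanonicalFinding:
    # Illustrative fields only; not the project's actual schema.
    tool: str
    rule_id: str
    file: str
    line: int
    severity: str
    message: str


def from_ruff(item: dict) -> CanonicalFinding:
    # Entry shape of `ruff check --output-format json`. Ruff reports no
    # severity, so "low" is an assumed default.
    return CanonicalFinding("ruff", item["code"], item["filename"],
                            item["location"]["row"], "low", item["message"])


def from_bandit(item: dict) -> CanonicalFinding:
    # Entry shape under "results" in `bandit -f json` output.
    return CanonicalFinding("bandit", item["test_id"], item["filename"],
                            item["line_number"], item["issue_severity"].lower(),
                            item["issue_text"])


def dedup_and_rank(findings: list[CanonicalFinding]) -> list[CanonicalFinding]:
    """Keep the most severe finding per (file, line), then sort by severity."""
    best: dict[tuple[str, int], CanonicalFinding] = {}
    for f in findings:
        key = (f.file, f.line)
        if key not in best or SEVERITY_ORDER[f.severity] < SEVERITY_ORDER[best[key].severity]:
            best[key] = f
    return sorted(best.values(), key=lambda f: (SEVERITY_ORDER[f.severity], f.file, f.line))


ruff_output = [
    {"code": "S307", "filename": "app.py", "location": {"row": 10, "column": 5},
     "message": "Use of possibly insecure function"},
    {"code": "F401", "filename": "app.py", "location": {"row": 1, "column": 8},
     "message": "`os` imported but unused"},
]
bandit_output = {"results": [
    {"test_id": "B307", "filename": "app.py", "line_number": 10,
     "issue_severity": "MEDIUM", "issue_text": "Use of possibly insecure function"},
]}

findings = [from_ruff(i) for i in ruff_output]
findings += [from_bandit(i) for i in bandit_output["results"]]
ranked = dedup_and_rank(findings)
for f in ranked:
    print(f.severity, f.tool, f.rule_id, f"{f.file}:{f.line}")
```

Here the Ruff and Bandit reports of the same `eval` call on line 10 collapse into one finding, and the more severe Bandit record is the one that survives.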

AI Explanation (RAG-Enhanced)

  • 66-rule knowledge base (config/rules.yml) — every rule has description, rationale, remediation, and code examples
  • Evidence-grounded prompts — the LLM is given the rule text; it cannot invent advice for rules it can't cite
  • Semantic entropy — 3× LLM runs with varying temperature; contradictions lower the confidence score
  • Self-evaluation — LLM rates its own output 1–5 on relevance/accuracy/clarity
  • Path feasibility (Feature 7) — second AI call validates whether a HIGH finding's code path is actually reachable in production (LLM4PFA approach)

Quality Gate

# .acrqa.yml
quality_gate:
  mode: block         # block = fail CI + prevent merge | warn = comment only
  max_high: 0         # zero tolerance for HIGH severity
  max_medium: 5
  max_security: 0

When a threshold is exceeded in block mode, the run exits with code 1 and CI fails. The GitHub Action posts a severity table as a PR comment.
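
A minimal sketch of how such a gate could be evaluated, assuming the YAML above has already been parsed into a dict and that each finding carries `severity` and `category` keys. The function name `evaluate_gate` and the finding shape are hypothetical.

```python
from collections import Counter

# Mirrors the quality_gate block above, already parsed into a dict.
GATE = {"mode": "block", "max_high": 0, "max_medium": 5, "max_security": 0}


def evaluate_gate(findings: list[dict], gate: dict) -> tuple[int, list[str]]:
    """Return (exit_code, violations) for a list of findings."""
    severities = Counter(f["severity"] for f in findings)
    security = sum(1 for f in findings if f["category"] == "security")
    counts = {
        "max_high": severities["high"],
        "max_medium": severities["medium"],
        "max_security": security,
    }
    violations = [
        f"{name}: {count} > {gate[name]}"
        for name, count in counts.items()
        if name in gate and count > gate[name]
    ]
    # Assumption: only block mode turns violations into a failing exit code.
    exit_code = 1 if violations and gate.get("mode") == "block" else 0
    return exit_code, violations


sample = [{"severity": "high", "category": "security"}]
print(evaluate_gate(sample, GATE))
print(evaluate_gate(sample, {**GATE, "mode": "warn"}))
```

With `mode: warn` the same violations are still reported, but the exit code stays 0 so the merge is not blocked.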

Per-Repo Policy (.acrqa.yml)

rules:
  disabled_rules: [IMPORT-001, STYLE-007]
  severity_overrides: {COMPLEXITY-001: low}

analysis:
  ignore_paths: [.venv, migrations/, node_modules]

ai:
  max_explanations: 50
  model: llama-3.1-8b-instant
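
As a rough sketch of what applying this policy could look like, the snippet below filters and re-grades findings using a dict that mirrors the `rules` and `analysis` sections. It assumes an ignore entry matches any path component of the same name; the function names and the finding shape are hypothetical.

```python
from pathlib import PurePosixPath

# The rules and analysis sections above, already parsed into a dict.
POLICY = {
    "rules": {
        "disabled_rules": ["IMPORT-001", "STYLE-007"],
        "severity_overrides": {"COMPLEXITY-001": "low"},
    },
    "analysis": {"ignore_paths": [".venv", "migrations/", "node_modules"]},
}


def is_ignored(path: str, ignore_paths: list[str]) -> bool:
    """True when any directory or file name in the path matches an ignore entry."""
    parts = PurePosixPath(path).parts
    return any(entry.strip("/") in parts for entry in ignore_paths)


def apply_policy(findings: list[dict], policy: dict) -> list[dict]:
    rules = policy.get("rules", {})
    disabled = set(rules.get("disabled_rules", []))
    overrides = rules.get("severity_overrides", {})
    ignore = policy.get("analysis", {}).get("ignore_paths", [])
    kept = []
    for f in findings:
        if f["rule_id"] in disabled or is_ignored(f["file"], ignore):
            continue  # dropped by policy
        kept.append({**f, "severity": overrides.get(f["rule_id"], f["severity"])})
    return kept


findings = [
    {"rule_id": "IMPORT-001", "file": "app/main.py", "severity": "low"},
    {"rule_id": "COMPLEXITY-001", "file": "app/main.py", "severity": "medium"},
    {"rule_id": "SECURITY-005", "file": "migrations/0001_init.py", "severity": "high"},
]
print(apply_policy(findings, POLICY))
```

Only the COMPLEXITY-001 finding survives, downgraded to low: the first is a disabled rule and the third sits under an ignored path.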

Inline Suppression

result = eval(user_input)      # acr-qa:ignore
password = "secret123"         # acrqa:disable SECURITY-005
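
One way such comments could be recognised is with a single regular expression. The sketch below assumes that `acr-qa:ignore` suppresses every rule on its line and that `acrqa:disable RULE-ID` suppresses only the named rule; both the semantics and the `is_suppressed` helper are assumptions for illustration.

```python
import re

# Matches "# acr-qa:ignore" and "# acrqa:disable RULE-ID", the two forms shown above.
SUPPRESS = re.compile(r"#\s*(?:acr-qa:ignore|acrqa:disable\s+(?P<rule>[A-Z]+-\d+))")


def is_suppressed(source_line: str, rule_id: str) -> bool:
    """True when the line carries a suppression comment covering rule_id."""
    match = SUPPRESS.search(source_line)
    if match is None:
        return False
    rule = match.group("rule")
    # No rule captured means the blanket form, which covers every rule.
    return rule is None or rule == rule_id


print(is_suppressed("result = eval(user_input)      # acr-qa:ignore", "SECURITY-001"))
print(is_suppressed('password = "secret123"         # acrqa:disable SECURITY-005', "SECURITY-005"))
print(is_suppressed('password = "secret123"         # acrqa:disable SECURITY-005', "SECURITY-001"))
```

A regex on raw lines can be fooled by a `#` inside a string literal; a tokenizer-based check would be more robust.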

FastAPI Dashboard (http://localhost:8000/docs)

  • Severity counters with live counts
  • Finding cards with collapsible AI explanations + 🎯 confidence badge
  • Cost-benefit widget: analysis cost, hours saved, ROI ratio
  • Trend charts across last 30 runs
  • False-positive feedback (👍/👎) — feeds triage memory for future suppression
  • Filters by severity, category, rule ID, full-text search
  • Export: SARIF, provenance trace, Markdown reports

CLI Reference

python3 CORE/main.py [options]

  --target-dir DIR     Directory to analyse (default: samples/realistic-issues)
  --repo-name NAME     Repository name for provenance tracking
  --pr-number N        PR number (enables GitHub PR comment posting)
  --limit N            Max findings to AI-explain (default: 50)
  --diff-only          Analyse only files changed in git diff
  --diff-base BRANCH   Base branch for diff (default: main)
  --auto-fix           Generate auto-fix suggestions for fixable rules
  --rich               Rich terminal output with colour-coded tables
  --lang LANG          auto (default) | python | javascript | typescript | go
  --no-ai              Skip AI explanation step (faster, no API key needed)
  --json               Output findings as JSON to stdout (pipe-friendly)
  --version            Print version and exit

Language Support

Python (v1.0+)

Ruff · Bandit · Semgrep · Vulture · Radon · Secrets · SCA · CBoM

JavaScript / TypeScript (v3.0.1+)

python3 CORE/main.py --target-dir ./my-react-app          # auto-detect
python3 CORE/main.py --target-dir ./my-express-api --lang javascript

ESLint (security plugin) · Semgrep JS rules · npm audit · 56 rule mappings · 15 HIGH-severity security rules covered

Go (v3.2.0+)

python3 CORE/main.py --target-dir ./my-go-api --lang go

Prerequisites: go install github.com/securego/gosec/v2/cmd/gosec@latest && go install honnef.co/go/tools/cmd/staticcheck@latest

gosec · staticcheck · Semgrep Go rules · 45+ rule mappings


CI/CD Integration

GitHub Actions

# .github/workflows/acr-qa.yml (already in repo)
# Triggers on every PR:
# 1. Runs all 10 tools on changed files (--diff-only)
# 2. Normalises, scores, generates AI explanations
# 3. Posts severity-sorted PR comment with code suggestions
# 4. Uploads findings.sarif to GitHub Security Tab
# 5. Fails merge if quality gate violated (exit code 1)

Required secrets: GROQ_API_KEY_1..4 · GITHUB_TOKEN (auto-provided)
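
For readers unfamiliar with the SARIF upload in step 4, here is a minimal sketch of what a SARIF v2.1.0 document for one finding looks like. The input finding shape and the severity-to-level mapping are assumptions; the output keys (`runs`, `tool.driver`, `results`, `physicalLocation`) follow the SARIF 2.1.0 format.

```python
import json

# Assumed mapping from finding severity to SARIF result levels.
LEVELS = {"high": "error", "medium": "warning", "low": "note"}


def to_sarif(findings: list[dict]) -> dict:
    """Build a minimal SARIF 2.1.0 log from a list of findings."""
    rule_ids = sorted({f["rule_id"] for f in findings})
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "ACR-QA",
                                "rules": [{"id": rule_id} for rule_id in rule_ids]}},
            "results": [
                {
                    "ruleId": f["rule_id"],
                    "level": LEVELS[f["severity"]],
                    "message": {"text": f["message"]},
                    "locations": [{"physicalLocation": {
                        "artifactLocation": {"uri": f["file"]},
                        "region": {"startLine": f["line"]},
                    }}],
                }
                for f in findings
            ],
        }],
    }


sarif = to_sarif([{
    "rule_id": "SECURITY-005", "severity": "high",
    "file": "app/settings.py", "line": 12, "message": "Hardcoded password",
}])
print(json.dumps(sarif, indent=2))
```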

Manual trigger on PR

Comment on any PR:

acr-qa review

GitLab CI

.gitlab-ci.yml included. Set GROQ_API_KEY_1 and GITLAB_TOKEN in CI/CD Variables.


Monitoring

Prometheus scrapes /metrics every 15 s. The Grafana dashboard at http://localhost:3005 (admin/admin) shows these panels:

Panel Metric
Request Rate rate(acrqa_http_requests_total[5m])
P95 Latency histogram_quantile(0.95, ...)
HTTP Success Rate 2xx / total * 100
LLM Latency avg(explain endpoint duration)
Error Rate rate(5xx[1m])

Testing

make test-all          # 2,339 tests (full suite)
make test              # acceptance tests only
make run               # pipeline on sample files
Test File Tests What It Covers
test_acceptance.py 4 Pipeline E2E with mocked LLM
test_api.py 9 FastAPI endpoints
test_normalizer.py 7 Ruff / Bandit / Semgrep normalisation
test_new_engines.py 117 Secrets, SCA, CBoM, autofix, quality gate, KeyPool
test_deep_coverage.py 100 12-module deep coverage
test_god_mode.py 96 All features + regression + edge cases
test_js_adapter.py 63 JS/TS adapter, E2E pipeline, CLI routing
test_reachability.py 74 Call-graph engine, fixtures, enrich_findings
test_taint_analyzer.py 65+ Inter-procedural taint, sanitizer recognition
test_supply_chain.py 62 Lockfile parsers, OSV CVE, risk scoring, SBOM
test_attestation.py 60 AttestationEngine, ECDSA-P256, Dilithium3 PQ
test_exploit_verifier.py 59 Docker sandbox, 3-tier verdict
test_fastapi_app.py 32 FastAPI TestClient — all v1 endpoints
test_chaos.py 13 Postgres/Redis failure injection
test_week1_completion.py 42 Fuzz/snapshot/perf-gate/mutation-killing
(+ 31 more files) 1,536 Additional coverage
TypeScript (Vitest) 65 Button, Badge, Card, ScanCard, FindingsTable…

Thesis Evaluation

Research Questions

RQ | Question | Implementation | Metric
RQ1 | Can RAG reduce LLM hallucination? | 66-rule KB + evidence-grounded prompts + entropy | consistency_score (0–1)
RQ2 | How to ensure provenance? | Full PostgreSQL audit trail per LLM call | llm_explanations table
RQ3 | What confidence scoring works? | score = severity × category × tool × rule × fix_validated | confidence_score (0–100)
RQ4 | Does it match industry tools? | 10-tool pipeline vs CodeRabbit / SonarQube | Feature parity table above
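
The RQ3 formula multiplies five factors into a 0–100 score. The sketch below shows one way that product could be computed; every weight in it is a made-up placeholder, since the actual factor values are not given here, and `confidence_score` is a hypothetical function name.

```python
# Placeholder weights in (0, 1]; the real values are not documented here.
SEVERITY = {"high": 1.0, "medium": 0.8, "low": 0.6}
CATEGORY = {"security": 1.0, "style": 0.7}
TOOL = {"bandit": 0.9, "ruff": 0.95}


def confidence_score(severity: str, category: str, tool: str,
                     rule_weight: float = 1.0, fix_validated: bool = False) -> int:
    """score = severity x category x tool x rule x fix_validated, scaled to 0-100."""
    fix_factor = 1.0 if fix_validated else 0.9  # assumed penalty for unvalidated fixes
    raw = SEVERITY[severity] * CATEGORY[category] * TOOL[tool] * rule_weight * fix_factor
    return round(100 * raw)


print(confidence_score("high", "security", "bandit", fix_validated=True))  # 90
print(confidence_score("low", "style", "ruff"))                            # 36
```

Because the factors multiply, a single weak factor (for example an unreliable rule) pulls the whole score down, which is the usual motivation for a product over a sum.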

Benchmark Results (4 repositories)

Repository Findings Precision Security Precision Recall
DVPWA 44 81.8% 100% 50%
Pygoat 440 96.4% 100% 100%
VulPy 293 100% 100% 100%
DSVW 59 100% 100% 100%
Overall 836 97.1% 100%

DVPWA recall is 50% because 3 of 6 known vulnerabilities were detected. The other three (CSRF, which is runtime-only; hardcoded debug mode; and one abstracted credential) require DAST. This is a documented limitation, not a bug.
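
The overall precision is consistent with a findings-weighted average rather than a mean of the four percentages. The check below makes that arithmetic explicit. The true-positive counts are inferred by back-solving from the reported precision figures and are an assumption, not data taken from the evaluation report.

```python
# (findings, true positives) per repository. True-positive counts are inferred
# from the reported precision percentages, not copied from the evaluation data.
BENCHMARKS = {
    "DVPWA": (44, 36),
    "Pygoat": (440, 424),
    "VulPy": (293, 293),
    "DSVW": (59, 59),
}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


for name, (total, true_pos) in BENCHMARKS.items():
    print(f"{name}: precision {pct(true_pos, total)}%")

total_findings = sum(total for total, _ in BENCHMARKS.values())
total_true_pos = sum(tp for _, tp in BENCHMARKS.values())
print(f"Overall: {total_findings} findings, precision {pct(total_true_pos, total_findings)}%")
print(f"DVPWA recall: {pct(3, 6)}%")  # 3 of 6 known vulnerabilities detected
```

Under these inferred counts, 812 of 836 findings are true positives, which reproduces the 97.1% headline figure.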


Architecture Decision Records

Key design decisions are documented in docs/adr/:

ADR Decision
0001 Scope: self-hosted thesis tool, not SaaS
0002 LanguageAdapter pattern for multi-language support
0003 RAG + semantic entropy over generic LLM prompts
0004 Groq free tier with 4-key rotation pool
0005 PostgreSQL for provenance storage

Technology Stack

Layer Technology
Language Python 3.11+
Web Framework FastAPI 0.115 (async)
Frontend React 18 + TypeScript + Vite 5 + shadcn/ui
Database PostgreSQL 15
Cache / Rate Limiting Redis 7
AI Model Groq Llama 3.3-70b (explanations) · Llama 3.1-8b (feasibility)
Static Analysis Ruff, Semgrep, Bandit, Vulture, Radon, gosec, staticcheck, ESLint
Terminal UI Rich
Schema Validation Pydantic v2
Containerisation Docker + Docker Compose
Monitoring Prometheus + Grafana
CI/CD GitHub Actions + GitLab CI
Export Formats SARIF v2.1.0, Markdown, JSON

Interactive Demo Notebooks

Run these Marimo notebooks locally or view the exported HTML:

Notebook Description HTML Preview
notebooks/walkthrough.py 12-cell full pipeline walkthrough walkthrough.html
notebooks/engine_demos/taint.py Taint analyzer — source→sink data-flow demo_taint.html
notebooks/engine_demos/exploit.py Exploit verifier — DAST-in-Docker demo_exploit.html
notebooks/engine_demos/attestation.py Attestation + post-quantum hybrid signatures demo_attestation.html
notebooks/engine_demos/offline.py Zero-egress / Ollama mode proof demo_offline.html
# Run as a read-only app
pip install marimo
marimo run notebooks/walkthrough.py

# Or open in edit mode (edit cells live) for the defense demo
marimo edit notebooks/walkthrough.py

Documentation

Document Description
C1–C4 Architecture Full C4 model: context, containers, components, code flow
ADRs Architecture decision records — why each major choice was made
API Reference All 36 REST endpoints
Policy Engine .acrqa.yml full reference
Evaluation Report Precision/recall, OWASP coverage, competitive analysis
Per-Tool Evaluation Per-engine accuracy across benchmark repos
User Study Protocol 20-minute structured study protocol
Cloud Deployment AWS / GCP / Railway deployment guide
Changelog Full version history
Contributing Development setup and contribution guide

Academic Context

Student Ahmed Mahmoud Abbas
Supervisor Dr. Samy AbdelNabi
Institution King Salman International University (KSIU)
Timeline October 2025 – June 2026
Status v4.6.0 · Distribution release · Evaluation complete · Cloud-deployed · 39/39 tasks done

License

MIT — see LICENSE


Built with ❤️ at King Salman International University · ⭐ Star this repo
