AI-powered code review with RAG-grounded explanations, 10 static analysers, and ECDSA-signed provenance
🛡️ ACR-QA
Automated Code Review & Quality Assurance Platform
10 static analysis tools. One canonical schema. RAG-enhanced AI explanations. $0 recurring cost.
What It Is
ACR-QA is a provenance-first, AI-augmented code review platform built as a graduation thesis at King Salman International University (KSIU). It targets three problems that developers commonly hit when using static analysis tools:
| Problem | What ACR-QA does |
|---|---|
| Alert fatigue — 7 tools dump raw JSON in incompatible schemas, full of duplicates | Normalises all output into one canonical schema, deduplicates cross-tool, ranks by severity |
| LLM hallucination — AI assistants give confident but wrong security advice | RAG: the LLM can only explain rules it can cite from a curated 66-rule knowledge base; semantic entropy (3× runs) detects contradictions |
| Invisible test gaps — code coverage % doesn't tell you which complex functions have no test | AST-based Test Gap Analyzer ranks untested symbols by cyclomatic complexity |
Key numbers: 97.1% precision · 9/10 OWASP Top 10 · 2,339 tests (2,274 Python + 65 TypeScript) · 36 async API endpoints · $0 recurring cost
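The Test Gap Analyzer row in the table above can be illustrated with a small sketch. This is not the project's implementation: the function names, the set of AST nodes counted as branches, and the "a function is tested if its name appears in the test source" heuristic are all assumptions chosen to keep the example short.

```python
import ast

# Node types counted as decision points. This is a common approximation of
# cyclomatic complexity, not necessarily the metric ACR-QA uses.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def complexity(func: ast.AST) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(func))


def find_test_gaps(source: str, test_source: str) -> list[tuple[str, int]]:
    """Return (function name, complexity) for functions the tests never reference."""
    referenced = set()
    for node in ast.walk(ast.parse(test_source)):
        if isinstance(node, ast.Name):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)
    functions = [
        node for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    gaps = [(f.name, complexity(f)) for f in functions if f.name not in referenced]
    # Most complex untested functions first.
    return sorted(gaps, key=lambda gap: gap[1], reverse=True)


SOURCE = '''
def simple(x):
    return x

def branchy(x):
    if x > 0:
        for i in range(x):
            if i % 2:
                return i
    return 0

def covered(x):
    return x + 1
'''

TESTS = '''
from mod import covered

def test_covered():
    assert covered(1) == 2
'''

gaps = find_test_gaps(SOURCE, TESTS)
print(gaps)
```

Ranking by complexity rather than by line count puts the branch-heavy, untested function at the top of the list, which is the ordering the table describes.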
Architecture
C4Container
title ACR-QA — System Overview
Person(dev, "Developer")
Container_Boundary(sys, "ACR-QA") {
Container(cli, "CLI / GitHub Action", "Python", "Entry point. Detects language, routes to adapter.")
Container(core, "Analysis Engine", "Python 3.11", "10 tools → normalise → score → dedup → AI explain → quality gate")
Container(api, "Async REST API", "FastAPI :8000", "36 /v1/ endpoints · JWT + API key auth · Swagger at /docs")
Container(worker, "Background Worker", "Celery + Redis", "Async scan execution — POST /v1/scans returns 202 + job_id")
ContainerDb(pg, "PostgreSQL 15", "", "Provenance: runs · findings · LLM calls · feedback · users · api_keys")
ContainerDb(redis, "Redis 5.2", "", "Rate limiting · explanation cache · Celery broker + result backend")
}
Container_Ext(groq, "Groq API", "LLM", "Llama 3.3-70b explanations · Llama 3.1-8b feasibility · free tier")
Container_Ext(gh, "GitHub / GitLab", "CI", "PR comments · SARIF upload · merge blocking")
Rel(dev, cli, "runs analysis")
Rel(cli, core, "invokes pipeline")
Rel(core, pg, "stores provenance")
Rel(core, redis, "rate limit / cache")
Rel(core, groq, "RAG explanations")
Rel(core, gh, "PR comments + SARIF")
Full C4 diagrams: C1 Context · C2 Containers · C3 Components · C4 Code
Quick Start
Option A — Docker (one command)
git clone https://github.com/ahmed-145/acr-qa.git && cd acr-qa
cp .env.example .env # add your GROQ_API_KEY_1..4
make up
| Service | URL |
|---|---|
| FastAPI | http://localhost:8000 |
| Grafana | http://localhost:3005 (admin/admin) |
| Prometheus | http://localhost:9091 |
Option B — Local
pip install -r requirements.txt
createdb acrqa && psql -d acrqa -f DATABASE/schema.sql
cp .env.example .env && source .env
uvicorn FRONTEND.api.main:app --port 8000 # → http://localhost:8000/docs
Run your first analysis
# Python project
python3 CORE/main.py --target-dir ./myproject --rich
# JavaScript / TypeScript project
python3 CORE/main.py --target-dir ./my-express-app --lang javascript --no-ai
# Go project
python3 CORE/main.py --target-dir ./my-go-api --lang go
# JSON output for CI pipelines
python3 CORE/main.py --target-dir . --json --no-ai > findings.json
What Makes It Different
| Feature | CodeRabbit | SonarQube | ACR-QA |
|---|---|---|---|
| Multi-tool normalisation | ✅ | ✅ | ✅ 10 tools |
| AI explanations | ✅ | ✅ | ✅ RAG + entropy |
| Hallucination detection | ❌ | ❌ | ✅ semantic entropy (3×) |
| Test gap analysis | ❌ | ❌ | ✅ AST-based |
| Confidence per finding | ❌ | ❌ | ✅ 0–100 score |
| Feedback-driven tuning | ❌ | ❌ | ✅ auto suppression |
| Cost-benefit analysis | ❌ | ❌ | ✅ ROI per finding |
| Call-graph reachability | ❌ | ❌ | ✅ AST-based, 0% FP rate |
| Path feasibility (FP reduction) | ❌ | ❌ | ✅ LLM4PFA approach |
| Cross-language vuln chains | ❌ | ❌ | ✅ CHARON-inspired |
| CBoM (quantum-safety) | ❌ | ❌ | ✅ NIST FIPS 203/204 |
| Quality gates (CI blocking) | ✅ | ✅ | ✅ configurable |
| SARIF export | ✅ | ✅ | ✅ v2.1.0 |
| OWASP compliance report | ✅ | ✅ | ✅ 9/10 + CWE IDs |
| Recurring cost | $$$ | $$$ | $0 |
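To make the "call-graph reachability" row concrete, here is a minimal sketch of the general technique: build a call graph from the AST, then walk it from an entry point. It assumes a single module and direct calls by name only; the function names `build_call_graph` and `reachable` are illustrative and are not taken from the project.

```python
import ast
from collections import deque


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each function name to the names it calls directly."""
    graph: dict[str, set[str]] = {}
    for func in ast.walk(ast.parse(source)):
        if isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            graph[func.name] = {
                node.func.id
                for node in ast.walk(func)
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            }
    return graph


def reachable(graph: dict[str, set[str]], entry: str) -> set[str]:
    """Breadth-first search over the call graph starting at `entry`."""
    seen, queue = {entry}, deque([entry])
    while queue:
        for callee in graph.get(queue.popleft(), ()):
            if callee in graph and callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


SOURCE = '''
def main():
    handler()

def handler():
    run_query()

def run_query():
    pass

def dead_code():
    run_unsafe()

def run_unsafe():
    pass
'''

graph = build_call_graph(SOURCE)
live = reachable(graph, "main")
print(sorted(live))
```

A finding located in `run_unsafe` could then be down-ranked because no path from `main` reaches it. Method calls, dynamic dispatch, and cross-module imports are deliberately out of scope for this sketch.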
Features
Detection Pipeline
10 tools run in parallel, all output normalised into one CanonicalFinding schema:
| Tool | Language | What It Catches |
|---|---|---|
| Ruff | Python | Style, imports, unused code, PEP8 |
| Bandit | Python | Security anti-patterns (33 rules) |
| Semgrep | Python / JS / Go | OWASP Top 10 patterns, custom rules |
| Vulture | Python | Dead code, unreachable branches |
| Radon | Python | Cyclomatic complexity, maintainability |
| Secrets Detector | All | API keys, passwords, JWTs, tokens |
| SCA Scanner | Python | Known-vulnerable dependency versions |
| ESLint | JS / TS | Security plugin — 20 rules |
| gosec | Go | Go security vulnerabilities |
| staticcheck | Go | Go static analysis and bug detection |
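The normalise, deduplicate, and rank steps can be sketched as follows. The field names, severity labels, and the `(file, line, rule_id)` deduplication key are assumptions made for illustration; the project's actual `CanonicalFinding` schema may differ.

```python
from dataclasses import dataclass

# Assumed severity labels; lower rank means more severe.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


@dataclass(frozen=True)
class CanonicalFinding:
    """Hypothetical minimal shape shared by all tool outputs."""
    tool: str
    rule_id: str  # canonical rule ID, e.g. "SECURITY-001" (assumed format)
    file: str
    line: int
    severity: str


def dedup_and_rank(findings: list[CanonicalFinding]) -> list[CanonicalFinding]:
    """Keep the most severe finding per (file, line, rule_id), then sort by severity."""
    best: dict[tuple[str, int, str], CanonicalFinding] = {}
    for finding in findings:
        key = (finding.file, finding.line, finding.rule_id)
        kept = best.get(key)
        if kept is None or SEVERITY_RANK[finding.severity] < SEVERITY_RANK[kept.severity]:
            best[key] = finding
    return sorted(
        best.values(),
        key=lambda f: (SEVERITY_RANK[f.severity], f.file, f.line),
    )


findings = [
    CanonicalFinding("bandit", "SECURITY-001", "app.py", 10, "medium"),
    CanonicalFinding("semgrep", "SECURITY-001", "app.py", 10, "high"),
    CanonicalFinding("ruff", "STYLE-007", "app.py", 3, "low"),
]
ranked = dedup_and_rank(findings)
for finding in ranked:
    print(finding.severity, finding.rule_id, finding.tool)
```

Here the Bandit and Semgrep reports collapse into one entry because both map to the same canonical rule at the same location, and the higher severity wins.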
AI Explanation (RAG-Enhanced)
- 66-rule knowledge base (config/rules.yml): every rule has a description, rationale, remediation, and code examples
- Evidence-grounded prompts: the LLM is given the rule text; it cannot invent advice for rules it can't cite
- Semantic entropy — 3× LLM runs with varying temperature; contradictions lower the confidence score
- Self-evaluation — LLM rates its own output 1–5 on relevance/accuracy/clarity
- Path feasibility (Feature 7) — second AI call validates whether a HIGH finding's code path is actually reachable in production (LLM4PFA approach)
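The semantic-entropy idea in the list above can be sketched in a few lines: group the sampled explanations into clusters of equivalent answers, compute the entropy of the cluster sizes, and turn it into a 0 to 1 consistency value. This sketch uses token-level Jaccard similarity as a stand-in for semantic equivalence, which is a simplification; the threshold, function names, and normalisation are assumptions, not the project's method.

```python
import math


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity, used here as a crude proxy for 'same meaning'."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 1.0


def cluster(answers: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering: join the first cluster whose representative is similar."""
    clusters: list[list[str]] = []
    for answer in answers:
        for group in clusters:
            if jaccard(answer, group[0]) >= threshold:
                group.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def consistency_score(answers: list[str], threshold: float = 0.5) -> float:
    """1.0 when all answers agree; 0.0 when every answer is its own cluster."""
    n = len(answers)
    if n < 2:
        return 1.0
    probs = [len(group) / n for group in cluster(answers, threshold)]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(n)


agree = ["use parameterised queries to avoid sql injection"] * 3
mixed = agree[:2] + ["this code is safe and needs no change"]
print(consistency_score(agree), consistency_score(mixed))
```

A production version would replace `jaccard` with an entailment or embedding model, since two explanations can share few tokens and still say the same thing.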
Quality Gate
# .acrqa.yml
quality_gate:
mode: block # block = fail CI + prevent merge | warn = comment only
max_high: 0 # zero tolerance for HIGH severity
max_medium: 5
max_security: 0
In block mode, a gate violation fails CI with exit code 1. The GitHub Action posts a severity table as a PR comment.
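A minimal sketch of how such a gate could be evaluated is shown below. The `evaluate_gate` function and the shape of the `counts` dictionary are hypothetical; the sketch also assumes that `warn` mode reports violations but still exits with 0, which follows the "comment only" description in the config comment.

```python
def evaluate_gate(counts: dict, gate: dict) -> tuple[int, list[str]]:
    """Return (exit_code, violations) for the given finding counts."""
    limits = {
        "high": gate.get("max_high"),
        "medium": gate.get("max_medium"),
        "security": gate.get("max_security"),
    }
    violations = [
        f"{name}: {counts.get(name, 0)} > {limit}"
        for name, limit in limits.items()
        if limit is not None and counts.get(name, 0) > limit
    ]
    # Only block mode turns violations into a failing exit code (assumption).
    blocking = gate.get("mode") == "block"
    return (1 if violations and blocking else 0), violations


GATE = {"mode": "block", "max_high": 0, "max_medium": 5, "max_security": 0}
exit_code, violations = evaluate_gate({"high": 1, "medium": 2, "security": 0}, GATE)
print(exit_code, violations)
```

In a CI script the exit code would typically be passed to `sys.exit`, so that the pipeline step fails when the gate blocks.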
Per-Repo Policy (.acrqa.yml)
rules:
disabled_rules: [IMPORT-001, STYLE-007]
severity_overrides: {COMPLEXITY-001: low}
analysis:
ignore_paths: [.venv, migrations/, node_modules]
ai:
max_explanations: 50
model: llama-3.1-8b-instant
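The effect of the `rules` and `analysis` sections can be sketched as a filter over findings. The policy is written here as an already-parsed dictionary to avoid a YAML dependency; `apply_policy`, the finding dictionaries, and the rule that an ignore entry matches any path component are assumptions for illustration.

```python
from pathlib import PurePosixPath

# The policy from above, as it might look after parsing (ai section omitted).
POLICY = {
    "rules": {
        "disabled_rules": ["IMPORT-001", "STYLE-007"],
        "severity_overrides": {"COMPLEXITY-001": "low"},
    },
    "analysis": {"ignore_paths": [".venv", "migrations/", "node_modules"]},
}


def apply_policy(findings: list[dict], policy: dict) -> list[dict]:
    """Drop disabled rules and ignored paths, then apply severity overrides."""
    disabled = set(policy["rules"]["disabled_rules"])
    overrides = policy["rules"]["severity_overrides"]
    # Assumed semantics: an entry matches when it equals any path component.
    ignored = {entry.strip("/") for entry in policy["analysis"]["ignore_paths"]}
    kept = []
    for finding in findings:
        if finding["rule_id"] in disabled:
            continue
        if ignored & set(PurePosixPath(finding["file"]).parts):
            continue
        severity = overrides.get(finding["rule_id"], finding["severity"])
        kept.append({**finding, "severity": severity})
    return kept


FINDINGS = [
    {"rule_id": "IMPORT-001", "file": "app/main.py", "severity": "low"},
    {"rule_id": "COMPLEXITY-001", "file": "app/views.py", "severity": "medium"},
    {"rule_id": "SECURITY-005", "file": "app/migrations/0001_init.py", "severity": "high"},
    {"rule_id": "SECURITY-005", "file": "app/settings.py", "severity": "high"},
]
result = apply_policy(FINDINGS, POLICY)
for finding in result:
    print(finding)
```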
Inline Suppression
result = eval(user_input) # acr-qa:ignore
password = "secret123" # acrqa:disable SECURITY-005
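One way to recognise the two comment forms above is a single regular expression. This is a sketch under assumptions: that a bare `acr-qa:ignore` suppresses every rule on the line, that rule IDs look like `SECURITY-005`, and that the marker is the last thing on the line. The `suppression` helper is hypothetical.

```python
import re

# Matches "# acr-qa:ignore" or "# acrqa:disable RULE-123" at the end of a line.
SUPPRESS_RE = re.compile(
    r"#\s*(?:acr-qa:ignore|acrqa:disable\s+(?P<rule>[A-Z]+-\d+))\s*$"
)


def suppression(line: str):
    """Return None (no marker), "*" (all rules), or the suppressed rule ID."""
    match = SUPPRESS_RE.search(line)
    if match is None:
        return None
    return match.group("rule") or "*"


print(suppression("result = eval(user_input)  # acr-qa:ignore"))
print(suppression('password = "secret123"  # acrqa:disable SECURITY-005'))
print(suppression("x = 1"))
```

A regex on raw lines would also match a marker that appears inside a string literal; a stricter version could use the standard `tokenize` module to look only at real comments.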
FastAPI Dashboard (http://localhost:8000/docs)
- Severity counters with live counts
- Finding cards with collapsible AI explanations + 🎯 confidence badge
- Cost-benefit widget: analysis cost, hours saved, ROI ratio
- Trend charts across last 30 runs
- False-positive feedback (👍/👎) — feeds triage memory for future suppression
- Filters by severity, category, rule ID, full-text search
- Export: SARIF, provenance trace, Markdown reports
CLI Reference
python3 CORE/main.py [options]
--target-dir DIR Directory to analyse (default: samples/realistic-issues)
--repo-name NAME Repository name for provenance tracking
--pr-number N PR number (enables GitHub PR comment posting)
--limit N Max findings to AI-explain (default: 50)
--diff-only Analyse only files changed in git diff
--diff-base BRANCH Base branch for diff (default: main)
--auto-fix Generate auto-fix suggestions for fixable rules
--rich Rich terminal output with colour-coded tables
--lang LANG auto (default) | python | javascript | typescript | go
--no-ai Skip AI explanation step (faster, no API key needed)
--json Output findings as JSON to stdout (pipe-friendly)
--version Print version and exit
Language Support
Python (v1.0+)
Ruff · Bandit · Semgrep · Vulture · Radon · Secrets · SCA · CBoM
JavaScript / TypeScript (v3.0.1+)
python3 CORE/main.py --target-dir ./my-react-app # auto-detect
python3 CORE/main.py --target-dir ./my-express-api --lang javascript
ESLint (security plugin) · Semgrep JS rules · npm audit
56 rule mappings · 15 HIGH-severity security rules covered
Go (v3.2.0+)
python3 CORE/main.py --target-dir ./my-go-api --lang go
Prerequisites: go install github.com/securego/gosec/v2/cmd/gosec@latest && go install honnef.co/go/tools/cmd/staticcheck@latest
gosec · staticcheck · Semgrep Go rules · 45+ rule mappings
CI/CD Integration
GitHub Actions
# .github/workflows/acr-qa.yml (already in repo)
# Triggers on every PR:
# 1. Runs all 10 tools on changed files (--diff-only)
# 2. Normalises, scores, generates AI explanations
# 3. Posts severity-sorted PR comment with code suggestions
# 4. Uploads findings.sarif to GitHub Security Tab
# 5. Fails merge if quality gate violated (exit code 1)
Required secrets: GROQ_API_KEY_1..4 · GITHUB_TOKEN (auto-provided)
Manual trigger on PR
Comment on any PR:
acr-qa review
GitLab CI
.gitlab-ci.yml included. Set GROQ_API_KEY_1 and GITLAB_TOKEN in CI/CD Variables.
Monitoring
Prometheus scrapes /metrics every 15 s. Grafana dashboard at http://localhost:3005 (admin/admin):
| Panel | Metric |
|---|---|
| Request Rate | rate(acrqa_http_requests_total[5m]) |
| P95 Latency | histogram_quantile(0.95, ...) |
| HTTP Success Rate | 2xx / total * 100 |
| LLM Latency | avg(explain endpoint duration) |
| Error Rate | rate(5xx[1m]) |
Testing
make test-all # 2,339 tests (full suite)
make test # acceptance tests only
make run # pipeline on sample files
| Test File | Tests | Coverage |
|---|---|---|
| test_acceptance.py | 4 | Pipeline E2E with mocked LLM |
| test_api.py | 9 | FastAPI endpoints |
| test_normalizer.py | 7 | Ruff / Bandit / Semgrep normalisation |
| test_new_engines.py | 117 | Secrets, SCA, CBoM, autofix, quality gate, KeyPool |
| test_deep_coverage.py | 100 | 12-module deep coverage |
| test_god_mode.py | 96 | All features + regression + edge cases |
| test_js_adapter.py | 63 | JS/TS adapter, E2E pipeline, CLI routing |
| test_reachability.py | 74 | Call-graph engine, fixtures, enrich_findings |
| test_taint_analyzer.py | 65+ | Inter-procedural taint, sanitizer recognition |
| test_supply_chain.py | 62 | Lockfile parsers, OSV CVE, risk scoring, SBOM |
| test_attestation.py | 60 | AttestationEngine, ECDSA-P256, Dilithium3 PQ |
| test_exploit_verifier.py | 59 | Docker sandbox, 3-tier verdict |
| test_fastapi_app.py | 32 | FastAPI TestClient, all v1 endpoints |
| test_chaos.py | 13 | Postgres/Redis failure injection |
| test_week1_completion.py | 42 | Fuzz/snapshot/perf-gate/mutation-killing |
| (+ 31 more files) | 1,536 | Additional coverage |
| TypeScript (Vitest) | 65 | Button, Badge, Card, ScanCard, FindingsTable… |
Thesis Evaluation
Research Questions
| RQ | Implementation | Metric |
|---|---|---|
| RQ1 Can RAG reduce LLM hallucination? | 66-rule KB + evidence-grounded prompts + entropy | consistency_score (0–1) |
| RQ2 How to ensure provenance? | Full PostgreSQL audit trail per LLM call | llm_explanations table |
| RQ3 What confidence scoring works? | score = severity × category × tool × rule × fix_validated | confidence_score (0–100) |
| RQ4 Does it match industry tools? | 10-tool pipeline vs CodeRabbit / SonarQube | Feature parity table above |
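The RQ3 formula multiplies five factors and reports a 0 to 100 score. A sketch of that shape is below; every weight value, the category and tool names in the lookup tables, and the penalty for an unvalidated fix are invented for illustration and are not the project's calibrated numbers.

```python
# Hypothetical weights in (0, 1]; the real values are not given in this README.
SEVERITY_W = {"high": 1.0, "medium": 0.8, "low": 0.6}
CATEGORY_W = {"security": 1.0, "style": 0.7}
TOOL_W = {"bandit": 1.0, "ruff": 0.9}


def confidence_score(
    severity: str,
    category: str,
    tool: str,
    rule_weight: float = 1.0,
    fix_validated: bool = False,
) -> int:
    """score = severity x category x tool x rule x fix_validated, scaled to 0-100."""
    fix_w = 1.0 if fix_validated else 0.9  # assumed penalty for unvalidated fixes
    product = (
        SEVERITY_W[severity] * CATEGORY_W[category] * TOOL_W[tool] * rule_weight * fix_w
    )
    # Clamp before scaling so out-of-range rule weights cannot escape 0-100.
    return round(100 * max(0.0, min(1.0, product)))


print(confidence_score("high", "security", "bandit", fix_validated=True))
print(confidence_score("low", "style", "ruff"))
```

Because the factors are multiplied, one weak signal (for example a noisy rule) lowers the score regardless of how strong the others are.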
Benchmark Results (4 repositories)
| Repository | Findings | Precision | Security Precision | Recall |
|---|---|---|---|---|
| DVPWA | 44 | 81.8% | 100% | 50% |
| Pygoat | 440 | 96.4% | 100% | 100% |
| VulPy | 293 | 100% | 100% | 100% |
| DSVW | 59 | 100% | 100% | 100% |
| Overall | 836 | 97.1% | 100% | — |
DVPWA recall of 50%: 3 of 6 known vulnerabilities were detected. The other three (CSRF, which is observable only at runtime; hardcoded debug mode; and one abstracted credential) require DAST. This is a documented limitation, not a bug.
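The overall precision in the table is consistent with a findings-weighted average of the per-repository figures. The check below back-calculates true-positive counts from the rounded percentages (36, 424, 293, 59); those counts are an inference from the table, not numbers stated in the source.

```python
# (findings, true positives). True-positive counts are back-calculated from
# the rounded precision column and are therefore an assumption.
REPOS = {
    "DVPWA": (44, 36),
    "Pygoat": (440, 424),
    "VulPy": (293, 293),
    "DSVW": (59, 59),
}

total_findings = sum(found for found, _ in REPOS.values())
total_true_positives = sum(tp for _, tp in REPOS.values())
overall_precision = 100 * total_true_positives / total_findings

for name, (found, tp) in REPOS.items():
    print(f"{name}: {100 * tp / found:.1f}%")
print(f"Overall: {overall_precision:.1f}% of {total_findings} findings")

# DVPWA recall: 3 of 6 known vulnerabilities detected.
dvpwa_recall = 100 * 3 / 6
```

Weighting by findings matters here: a plain mean of the four precision values would give a lower figure, because DVPWA has the lowest precision but also by far the fewest findings.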
Architecture Decision Records
Key design decisions are documented in docs/adr/:
| ADR | Decision |
|---|---|
| 0001 | Scope: self-hosted thesis tool, not SaaS |
| 0002 | LanguageAdapter pattern for multi-language support |
| 0003 | RAG + semantic entropy over generic LLM prompts |
| 0004 | Groq free tier with 4-key rotation pool |
| 0005 | PostgreSQL for provenance storage |
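ADR 0004 describes a 4-key rotation pool. A minimal sketch of one way to build such a pool is round-robin selection with a cooldown for keys that hit a rate limit. The class name `RotatingKeyPool`, its methods, and the cooldown behaviour are assumptions and do not describe the project's own KeyPool.

```python
import itertools
import time


class RotatingKeyPool:
    """Round-robin over API keys, skipping keys that are cooling down."""

    def __init__(self, keys, cooldown=60.0, clock=time.monotonic):
        if not keys:
            raise ValueError("at least one key is required")
        self._keys = list(keys)
        self._cycle = itertools.cycle(self._keys)
        self._blocked_until = {}
        self._cooldown = cooldown
        self._clock = clock  # injectable so the pool can be tested without sleeping

    def acquire(self) -> str:
        """Return the next usable key, or raise if all keys are cooling down."""
        now = self._clock()
        for _ in range(len(self._keys)):
            key = next(self._cycle)
            if self._blocked_until.get(key, 0.0) <= now:
                return key
        raise RuntimeError("all keys are cooling down")

    def mark_rate_limited(self, key: str) -> None:
        """Take a key out of rotation for the cooldown period."""
        self._blocked_until[key] = self._clock() + self._cooldown


pool = RotatingKeyPool(["key-1", "key-2", "key-3", "key-4"])
first = pool.acquire()
pool.mark_rate_limited(first)
second = pool.acquire()
print(first, second)
```

A caller would typically invoke `mark_rate_limited` when the provider returns HTTP 429 and then retry with the next key from `acquire`.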
Technology Stack
| Layer | Technology |
|---|---|
| Language | Python 3.11+ |
| Web Framework | FastAPI 0.115 (async) |
| Frontend | React 18 + TypeScript + Vite 5 + shadcn/ui |
| Database | PostgreSQL 15 |
| Cache / Rate Limiting | Redis 7 |
| AI Model | Groq Llama 3.3-70b (explanations) · Llama 3.1-8b (feasibility) |
| Static Analysis | Ruff, Semgrep, Bandit, Vulture, Radon, gosec, staticcheck, ESLint |
| Terminal UI | Rich |
| Schema Validation | Pydantic v2 |
| Containerisation | Docker + Docker Compose |
| Monitoring | Prometheus + Grafana |
| CI/CD | GitHub Actions + GitLab CI |
| Export Formats | SARIF v2.1.0, Markdown, JSON |
Interactive Demo Notebooks
Run these Marimo notebooks locally or view the exported HTML:
| Notebook | Description | HTML Preview |
|---|---|---|
| notebooks/walkthrough.py | 12-cell full pipeline walkthrough | walkthrough.html |
| notebooks/engine_demos/taint.py | Taint analyzer: source→sink data-flow | demo_taint.html |
| notebooks/engine_demos/exploit.py | Exploit verifier: DAST-in-Docker | demo_exploit.html |
| notebooks/engine_demos/attestation.py | Attestation + post-quantum hybrid signatures | demo_attestation.html |
| notebooks/engine_demos/offline.py | Zero-egress / Ollama mode proof | demo_offline.html |
# Run as a read-only app
pip install marimo
marimo run notebooks/walkthrough.py
# Or open in edit mode (edit cells live) for the defense demo
marimo edit notebooks/walkthrough.py
Documentation
| Document | Description |
|---|---|
| C1–C4 Architecture | Full C4 model: context, containers, components, code flow |
| ADRs | Architecture decision records — why each major choice was made |
| API Reference | All 36 REST endpoints |
| Policy Engine | .acrqa.yml full reference |
| Evaluation Report | Precision/recall, OWASP coverage, competitive analysis |
| Per-Tool Evaluation | Per-engine accuracy across benchmark repos |
| User Study Protocol | 20-minute structured study protocol |
| Cloud Deployment | AWS / GCP / Railway deployment guide |
| Changelog | Full version history |
| Contributing | Development setup and contribution guide |
Academic Context
| Field | Details |
|---|---|
| Student | Ahmed Mahmoud Abbas |
| Supervisor | Dr. Samy AbdelNabi |
| Institution | King Salman International University (KSIU) |
| Timeline | October 2025 – June 2026 |
| Status | v4.6.0 · Distribution release · Evaluation complete · Cloud-deployed · 39/39 tasks done |
Remaining Thesis Work
- Cloud deployment — live at acrqa-api-production.up.railway.app (Railway + PostgreSQL + Redis)
- User study protocol: docs/evaluation/USER_STUDY_PROTOCOL.md
- 5-minute demo video (script) ← human task
- YouTube upload (follows demo video) ← human task
License
MIT — see LICENSE
Built with ❤️ at King Salman International University · ⭐ Star this repo