
Empirical Safety Harness for agentic AI coding systems. Scores AI-generated code on 5 metrics across 5 vendor conditions against one fixed spec.


AI Code Quality Auditor — the Referee Tool


An empirical Safety Harness for agentic AI coding systems. Quantifies where AI-assisted development fails at governance, security, and ethical alignment — before the code reaches production.

🟢 Try it in 30 seconds:

pipx install ai-code-quality-auditor
auditor --help

🚀 Or wire it into your CI with a short workflow file (.github/workflows/auditor.yml):

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dominicrume/NEW-enterprise-ai-code-quality-auditor@main
        with:
          run-id: ${{ github.run_id }}
          conditions: claude_code,cursor_agent

📊 Live dashboard: https://auditor-dashboard.fly.dev (pending deploy)

This is the experimental instrument for the MSc dissertation "AI-Assisted Coding Assessment Tool: Evaluating LLM Performance, Governance, and Security in an Agent Education System" (Aston University, MSc AI & Business Strategy). The same instrument is the working prototype for the PhD extension at the Aston-Capgemini Centre of Excellence for Enterprise AI.


What it does

Given a fixed specification (the "spec box"), the Auditor:

  1. Runs five experimental conditions against the same task (including a human control, a visualisation→Claude→Replit pipeline, Cursor IDE, and an autonomous agent).
  2. Captures every output and every interaction event.
  3. Scores each result on five empirical metrics: security vulnerability density, cyclomatic complexity, code duplication, hallucination frequency (features outside spec), and keystroke dynamics (correction frequency).
  4. Emits CSV/JSON reports for statistical comparison.
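
To make two of the metrics in step 3 concrete, here is a minimal sketch. It is not the Auditor's own analyzer code: the function names are hypothetical, the complexity count is a simplified McCabe-style approximation built on the standard-library ast module, and "hallucination frequency" is assumed here to mean the share of implemented features that the spec does not list. See docs/METRICS.md for the actual definitions.

```python
import ast

# Node types that each add one decision point (a simplified McCabe-style count).
_BRANCH_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While,
    ast.ExceptHandler, ast.IfExp, ast.comprehension,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths, one per additional operand.
            complexity += len(node.values) - 1
    return complexity


def hallucination_frequency(spec_features: set[str], implemented: set[str]) -> float:
    """Assumed definition: fraction of implemented features absent from the spec."""
    if not implemented:
        return 0.0
    return len(implemented - spec_features) / len(implemented)


# Illustrative input only, not taken from any experiment.
SAMPLE = '''
def grade(score):
    if score >= 70 and score <= 100:
        return "pass"
    for _ in range(3):
        pass
    return "fail"
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(SAMPLE))
    print(hallucination_frequency({"login", "quiz"},
                                  {"login", "quiz", "leaderboard", "chat"}))
```

Both functions take plain strings and sets, so the same scoring can be applied unchanged to the output of every condition.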

Quick start

cp .env.example .env
pip install -e .
auditor run --spec specs/agent_education_system.yaml --workflow human_control
auditor report --out data/reports/
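
Once reports exist, they can be compared per condition. The sketch below assumes a long-format CSV with the columns condition, metric and value; that schema is a guess for illustration, not the documented report format, and the rows are placeholder values rather than experimental results.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_by_condition(csv_text: str, metric: str) -> dict[str, float]:
    """Average one metric per condition from a long-format CSV (assumed schema)."""
    values = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["metric"] == metric:
            values[row["condition"]].append(float(row["value"]))
    return {condition: mean(vals) for condition, vals in values.items()}


# Placeholder rows for illustration only; these are not measured results.
PLACEHOLDER_REPORT = """condition,metric,value
human_control,cyclomatic_complexity,4
human_control,cyclomatic_complexity,6
cursor_agent,cyclomatic_complexity,9
cursor_agent,code_duplication,0.2
"""

if __name__ == "__main__":
    print(mean_by_condition(PLACEHOLDER_REPORT, "cyclomatic_complexity"))
```

To use it on a real file, pass the file's contents (for example via pathlib.Path.read_text) and adjust the column names to whatever the emitted report actually contains.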

Read in this order

  1. docs/ARCHITECTURE.md — how the pieces fit
  2. docs/METHODOLOGY.md — how an experiment is run
  3. docs/METRICS.md — what each metric means and how it's computed
  4. docs/ETHICS.md — GDPR, synthetic data, academic integrity
  5. docs/DISSERTATION_LINKAGE.md — which folder serves which proposal section
  6. docs/ROADMAP.md — the PhD extension (API security + enterprise risk)

Principles

  • One analyzer per metric. One adapter per AI workflow. Single responsibility.
  • The spec is data, not code — externalised in specs/ for reproducibility.
  • Synthetic data only. No PII, no proprietary corporate records, ever.
  • Every analyzer has a test. Green tests = trustable experiment.
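
The "one analyzer per metric" and "every analyzer has a test" principles could look roughly like this. The Analyzer protocol, the toy NonBlankLineAnalyzer and the run_analyzers helper are hypothetical names invented for this sketch; the project's real interfaces are described in docs/ARCHITECTURE.md.

```python
from typing import Iterable, Protocol


class Analyzer(Protocol):
    """Hypothetical interface: one analyzer computes exactly one metric."""

    name: str

    def score(self, source: str) -> float: ...


class NonBlankLineAnalyzer:
    """Toy stand-in metric, used only to show the shape of an analyzer."""

    name = "non_blank_lines"

    def score(self, source: str) -> float:
        return float(sum(1 for line in source.splitlines() if line.strip()))


def run_analyzers(source: str, analyzers: Iterable[Analyzer]) -> dict[str, float]:
    """Apply every analyzer to the same source and key the results by metric name."""
    return {analyzer.name: analyzer.score(source) for analyzer in analyzers}


def test_non_blank_line_analyzer() -> None:
    """The matching test: a known input must give a known score."""
    assert NonBlankLineAnalyzer().score("a = 1\n\nb = 2\n") == 2.0


if __name__ == "__main__":
    test_non_blank_line_analyzer()
    print(run_analyzers("a = 1\n\nb = 2\n", [NonBlankLineAnalyzer()]))
```

Keeping each analyzer behind the same small interface means a new metric is added by writing one class and one test, without touching the runner.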
