Empirical Safety Harness for agentic AI coding systems. Scores AI-generated code on 5 metrics across 5 vendor conditions against one fixed spec.

Project description

AI Code Quality Auditor — the Referee Tool

License: MIT

An empirical Safety Harness for agentic AI coding systems. Quantifies where AI-assisted development fails at governance, security, and ethical alignment — before the code reaches production.

🟢 Try it in 30 seconds:

pipx install ai-code-quality-auditor
auditor --help

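🚀 Or wire it into your CI by adding this job to a workflow file such as .github/workflows/auditor.yml (the file also needs an on: trigger for the events you want to audit):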

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dominicrume/NEW-enterprise-ai-code-quality-auditor@main
        with:
          run-id: ${{ github.run_id }}
          conditions: claude_code,cursor_agent

📊 Live dashboard: https://auditor-dashboard.fly.dev (pending deploy — see below)

This is the experimental instrument for the MSc dissertation "AI-Assisted Coding Assessment Tool: Evaluating LLM Performance, Governance, and Security in an Agent Education System" (Aston University, MSc AI & Business Strategy). The same instrument is the working prototype for the PhD extension at the Aston-Capgemini Centre of Excellence for Enterprise AI.


What it does

Given a fixed specification (the "spec box"), the Auditor:

  1. Runs five experimental conditions against the same task, including a human control, a visualisation→Claude→Replit pipeline, Cursor IDE, and an autonomous agent.
  2. Captures every output and every interaction event.
  3. Scores each result on five empirical metrics: security vulnerability density, cyclomatic complexity, code duplication, hallucination frequency (features outside spec), and keystroke dynamics (correction frequency).
  4. Emits CSV/JSON reports for statistical comparison.
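As an illustration of how one of these metrics might be computed, the sketch below approximates cyclomatic complexity with Python's standard ast module by counting decision points. It is a minimal sketch under stated assumptions, not the Auditor's actual analyzer: the function name cyclomatic_complexity, the set of counted node types, and the SAMPLE snippet are all hypothetical.

```python
import ast

# Node types treated as decision points. This selection is an assumption of
# the sketch, not the project's actual rule set.
_BRANCH_NODES = (
    ast.If,
    ast.For,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
    ast.comprehension,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity as 1 + the number of decision points."""
    tree = ast.parse(source)
    score = 1  # a straight-line program has exactly one path
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has three operands and adds two extra paths
            score += len(node.values) - 1
    return score


# Hypothetical snippet standing in for AI-generated code under audit.
SAMPLE = '''
def grade(score, bonus):
    if score > 90 and bonus:
        return "A"
    for cutoff, letter in ((80, "B"), (70, "C")):
        if score >= cutoff:
            return letter
    return "F"
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(SAMPLE))
```

Scoring every condition's output with the same function is what would make the resulting numbers comparable across workflows.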

Quick start

cp .env.example .env
pip install -e .
auditor run --spec specs/agent_education_system.yaml --workflow human_control
auditor report --out data/reports/

Read in this order

  1. docs/ARCHITECTURE.md — how the pieces fit
  2. docs/METHODOLOGY.md — how an experiment is run
  3. docs/METRICS.md — what each metric means and how it's computed
  4. docs/ETHICS.md — GDPR, synthetic data, academic integrity
  5. docs/DISSERTATION_LINKAGE.md — which folder serves which proposal section
  6. docs/ROADMAP.md — the PhD extension (API security + enterprise risk)

Principles

  • One analyzer per metric. One adapter per AI workflow. Single responsibility.
  • The spec is data, not code — externalised in specs/ for reproducibility.
  • Synthetic data only. No PII, no proprietary corporate records, ever.
  • Every analyzer has a test. Green tests = trustable experiment.
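The principles above can be sketched as a small interface: each analyzer computes exactly one metric, the spec is passed in as plain data, and each analyzer ships with a check. Everything named here (Submission, Analyzer, HallucinationAnalyzer, run_all, and the feature names) is hypothetical and chosen for illustration; the package's real classes may look different.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Submission:
    """Hypothetical container for the output of one experimental condition."""
    condition: str
    features: frozenset[str]  # features detected in the generated code


class Analyzer(Protocol):
    """Hypothetical interface: one analyzer computes exactly one metric."""
    name: str

    def score(self, submission: Submission) -> float: ...


@dataclass(frozen=True)
class HallucinationAnalyzer:
    """Counts features present in the output that the spec never asked for."""
    spec_features: frozenset[str]
    name: str = "hallucination_frequency"

    def score(self, submission: Submission) -> float:
        return float(len(submission.features - self.spec_features))


def run_all(analyzers: list[Analyzer], submission: Submission) -> dict[str, float]:
    """Apply every analyzer to one submission and collect scores by metric name."""
    return {a.name: a.score(submission) for a in analyzers}


# The spec is plain data. Here it is inlined; in a real run it would be
# loaded from a file under specs/ (the feature names are made up).
SPEC = frozenset({"login", "quiz", "progress_report"})

submission = Submission(
    condition="cursor_agent",
    features=frozenset({"login", "quiz", "chatbot", "leaderboard"}),
)
result = run_all([HallucinationAnalyzer(SPEC)], submission)

if __name__ == "__main__":
    print(result)
    # A minimal per-analyzer check: output that matches the spec scores zero.
    assert HallucinationAnalyzer(SPEC).score(Submission("human_control", SPEC)) == 0.0
```

Keeping the spec outside the analyzer means the same analyzer can be rerun against a different spec file without code changes, which is the reproducibility argument made above.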

Download files

Source Distribution

ai_code_quality_auditor-0.2.0.tar.gz (42.8 kB)

Uploaded Source

Built Distribution

ai_code_quality_auditor-0.2.0-py3-none-any.whl (48.0 kB)

Uploaded Python 3

File details

Details for the file ai_code_quality_auditor-0.2.0.tar.gz.

File metadata

  • Download URL: ai_code_quality_auditor-0.2.0.tar.gz
  • Size: 42.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for ai_code_quality_auditor-0.2.0.tar.gz
Algorithm Hash digest
SHA256 93aaf0a1970b2e8db7568f402af1e25c813179b54f1308064f72cc4ee104e640
MD5 dda23fb06c035ba2a12971edfed304c9
BLAKE2b-256 e2f3da0c497824acbfddcc1282008ac4ef5638fcb1b76fef03cdbf45d6bee4bd

File details

Details for the file ai_code_quality_auditor-0.2.0-py3-none-any.whl.

File hashes

Hashes for ai_code_quality_auditor-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 7e97aef0e6b5fed2843a34a6166f501a047dd040f3c4b4d514d300ccde0656aa
MD5 e6f45189605d941666165dd247a506e9
BLAKE2b-256 221a7ddc4db9f4b680b2e3f21ce8d20a2238b7b311ee7f496c21c491141e67fc
