Automated GitHub portfolio auditor with 12 analysis dimensions

GitHub Repo Auditor

Know the truth about every project you've ever started — because git log across 100 repos doesn't tell you which ones are worth finishing.

GitHub Repo Auditor is a portfolio audit and operator tool for developers with a lot of repositories. It clones every repo on your GitHub account, runs 12 analyzers across completeness and interest dimensions, assigns letter grades and achievement badges, preserves historical state, and generates actionable dashboards you can actually use to decide what to work on next. Built for developers who ship fast, start often, and need a system to manage the sprawl.

Today the project is best understood as a GitHub portfolio operating system:

  • it tells you which repos are healthy, drifting, blocked, or safe to ignore for now
  • it gives you one workbook-first weekly review flow instead of a pile of disconnected reports
  • it tracks whether recommended follow-through is actually happening and whether that improvement is holding up over time
  • it keeps JSON, Markdown, HTML, workbook, and control-center outputs aligned so you do not have to switch mental models between surfaces

What This Project Is Today

This project started as a repo auditing tool and has grown into a workbook-first GitHub portfolio operating system.

Today it:

  • audits repositories across documentation, testing, CI, dependencies, activity, security, structure, community profile, completeness, and interest signals
  • scores repos on dual axes, classifies them into useful tiers, and surfaces quick wins
  • generates aligned JSON, Markdown, HTML, workbook, review-pack, and control-center outputs from the same audit facts
  • writes a report-only weekly command-center digest beside the control-center artifact so paused automation can consume one bounded summary instead of stale notes
  • generates a canonical workspace-level portfolio truth snapshot for a local projects folder and derives the shared registry/report compatibility artifacts from it
  • preserves historical state in SQLite so the operator loop can show change, regression, recovery, and follow-through
  • keeps the workbook and --control-center as the main day-to-day operating surfaces

If you are new here, the simplest way to think about it is: this project tells you which repos are healthy, which ones are drifting, and what to look at next.

Mode Map

The product now works best when you use one of four explicit modes:

  • First Run for setup, baseline creation, the first workbook, and the first control-center read
  • Weekly Review for the normal ongoing operator loop
  • Deep Dive for repo-level investigation and implementation hotspots
  • Action Sync for campaigns, writeback, GitHub Projects, and Notion mirroring

The flags stay the same underneath. The modes are the easiest way to understand when to use which workflow.

See docs/modes.md for the canonical mode guide.

Recommended Default Path

If you are starting fresh, use this sequence:

audit run <github-username> --doctor
audit run <github-username> --html
audit triage <github-username> --control-center

Then open the workbook and read it in this order:

  • Dashboard
  • Run Changes
  • Review Queue
  • Portfolio Explorer
  • Repo Detail
  • Executive Summary

That is the default path the product is optimized around.

Commands By Mode

First Run

audit run <github-username> --doctor
audit run <github-username> --html
audit triage <github-username> --control-center

Weekly Review

audit run <github-username> --html
audit triage <github-username> --control-center
audit report <github-username> --portfolio-truth

Deep Dive

audit run <github-username> --repos <repo-name> --html
audit triage <github-username> --control-center

Action Sync

audit report <github-username> --campaign security-review --writeback-target github
audit report <github-username> --campaign security-review --writeback-target all --github-projects
audit triage <github-username> --approval-center

Treat campaign/writeback, GitHub Projects, Notion sync, catalog overrides, scorecard overrides, and --excel-mode template as advanced paths.

Features

  • 12 Analyzers — README quality, test coverage, CI/CD, dependency freshness, commit patterns, bus factor, code complexity, security controls, license, build readiness, GraphQL signals, and more
  • Dual-Axis Scoring — Completeness (does this project have what shipped software should?) and Interest (is this worth anyone's time?) scored independently on 0.0–1.0 scales
  • Letter Grades + Tier Classification — A–F grades with Shipped / Functional / WIP / Skeleton / Abandoned tiers; 15 achievement badges ("Fully Tested", "CI Champion", "Zero Debt", etc.)
  • Quick Wins Engine — For each repo, shows exactly which single action moves it to the next tier and how far it is from getting there
  • Multiple Dashboard Outputs — Flagship Excel workbook with a stable standard mode and optional template mode, interactive HTML dashboard with scatter chart and tech radar, portfolio README, shields.io badges
  • Workbook-First Operator Review — Clear reading order through Dashboard, Run Changes, Review Queue, Portfolio Explorer, Repo Detail, and Executive Summary
  • Control Center Queue — Read-only daily triage that groups work into Blocked, Needs Attention Now, Ready for Manual Action, and Safe to Defer
  • Follow-Through Story — Tracks whether recommendations were untouched, attempted, waiting on evidence, stale, recovering, rebuilding, re-acquired, softening, or retired so the weekly review loop stays honest
  • Repo Drilldowns + Weekly Review Packs — One-repo briefings and weekly summaries that mirror the same action story across Markdown, HTML, and workbook
  • Notion Integration — Pushes audit signals into your Notion operating system: completeness cards, managed campaign records, and lifecycle-aware review sync
  • History & Regression Detection — Archives every run to SQLite, auto-diffs between runs, detects score regressions, and flags archive candidates
  • AI Narrative — Optional Claude-powered portfolio analysis that reads the audit data and writes a human-readable summary
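
As a rough illustration of how dual-axis scoring and letter grades could fit together, the sketch below keeps completeness and interest as separate 0.0-1.0 values and derives a grade from their average. The `RepoScore` class, the `gap_to_next_grade` helper, the cut-off values, and the plain-average combination are all assumptions made for this example. The project's real thresholds and scoring profiles may differ.

```python
from dataclasses import dataclass

# Hypothetical cut-offs for illustration only, not the project's real thresholds.
GRADE_CUTOFFS = [(0.85, "A"), (0.70, "B"), (0.55, "C"), (0.40, "D")]


@dataclass(frozen=True)
class RepoScore:
    """Two independent axes, each on a 0.0-1.0 scale."""

    name: str
    completeness: float
    interest: float

    def combined(self) -> float:
        # Assumption: a plain average stands in for the real scoring profile.
        return (self.completeness + self.interest) / 2

    def grade(self) -> str:
        score = self.combined()
        for cutoff, letter in GRADE_CUTOFFS:
            if score >= cutoff:
                return letter
        return "F"


def gap_to_next_grade(score: RepoScore) -> float:
    """Distance from the combined score to the next higher cut-off (0.0 at the top)."""
    higher = [cutoff for cutoff, _ in GRADE_CUTOFFS if cutoff > score.combined()]
    return round(min(higher) - score.combined(), 2) if higher else 0.0
```

Keeping the two axes as separate fields means a repo can be complete but uninteresting, or the reverse, and any combined view is only derived at the last step.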

Quick Start

Prerequisites

  • Python 3.11+
  • A GitHub account (public repos work without a token)
  • A GITHUB_TOKEN environment variable or an authenticated gh CLI session (for private repos and higher rate limits)

Installation

The package is published as GitHub release artifacts today. PyPI/package-index publishing is not active yet, so registry commands like pip install github-repo-auditor are not the recommended public path. See docs/distribution.md for the current distribution policy.

Fastest no-clone path:

curl -LO https://github.com/saagpatel/GithubRepoAuditor/releases/latest/download/audit.pyz
chmod +x audit.pyz
./audit.pyz --help

Install from the public GitHub source:

# uv (recommended)
uv tool install 'git+https://github.com/saagpatel/GithubRepoAuditor.git'

# pipx
pipx install 'git+https://github.com/saagpatel/GithubRepoAuditor.git'

# local editable clone
git clone https://github.com/saagpatel/GithubRepoAuditor.git
cd GithubRepoAuditor
pip install -e ".[config]"

The same self-contained audit.pyz archive is also listed on the GitHub Releases page.

For the local web UI, install the [serve] extra from source:

uv tool install 'github-repo-auditor[serve] @ git+https://github.com/saagpatel/GithubRepoAuditor.git'
# or from a clone: pip install -e ".[serve]"

Try the safe demo

The demo uses committed fixture data and writes only to output/demo/.

git clone https://github.com/saagpatel/GithubRepoAuditor.git
cd GithubRepoAuditor
pip install -e ".[config]"
make demo

Expected outputs include output/demo/demo-report.json, output/demo/demo-workbook.xlsx, output/demo/dashboard-*.html, output/demo/operator-control-center-demo.json, and output/demo/operator-control-center-demo.md.

Quick start (subcommand form)

audit run <user>                       # fetch, clone, analyze, score
audit triage <user> --control-center   # read-only operator queue
audit report <user> --portfolio-truth  # regenerate workspace truth layer
audit serve                            # open browser dashboard

The flat form (audit <user> --html) still works and prints a one-time deprecation warning. It will not be removed until a future major version bump. See docs/audit-cli-migration.md for the flag-family mapping.

Daily flow

  1. audit serve — start the local web UI at http://127.0.0.1:8080/
  2. Browse to / for the portfolio dashboard; /runs/new to trigger a fresh audit
  3. After the run completes, check /repos/{name} for per-repo drill-downs
  4. Run audit triage <user> --control-center for the full operator queue in the terminal

Common invocations

# Doctor mode — recommended first step
audit run <github-username> --doctor

# Weekly Review — generate the native workbook + HTML dashboard
audit run <github-username> --html

# Weekly Review — daily read-only triage from the latest state
audit triage <github-username> --control-center

# Portfolio Truth — regenerate the canonical workspace truth layer
audit report <github-username> --portfolio-truth

# Semantic search across the portfolio index
audit triage <github-username> --ask "Python projects with no tests"

# Weekly operator briefing (requires Anthropic API key)
audit run <github-username> --briefing

# Deep Dive — targeted repo rerun merged into the latest baseline
audit run <github-username> --repos <repo-name> --html

# Action Sync — managed campaign preview / writeback
audit report <github-username> --campaign security-review --writeback-target github

Normal runs perform a lightweight automatic preflight before fetching repos. By default the run stops on blocking errors and continues on warnings. Use --preflight-mode strict to fail on warnings too, or --preflight-mode off to skip the automatic preflight.
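
The decision rule described above can be sketched as a small function. This is only an illustration: the `PreflightMode` enum, the `should_continue` helper, and the name `auto` for the default behaviour are hypothetical, since the text names only the `strict` and `off` values.

```python
from enum import Enum


class PreflightMode(str, Enum):
    AUTO = "auto"      # assumed name for the default behaviour
    STRICT = "strict"  # fail on warnings too
    OFF = "off"        # skip the automatic preflight


def should_continue(mode: PreflightMode, errors: list[str], warnings: list[str]) -> bool:
    """Return True when the run may proceed past preflight."""
    if mode is PreflightMode.OFF:
        return True  # preflight skipped, nothing can block
    if errors:
        return False  # blocking errors stop the run in every checking mode
    if mode is PreflightMode.STRICT and warnings:
        return False  # strict mode treats warnings as failures
    return True
```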

audit triage --control-center is read-only. It loads the latest report + warehouse state, groups open work into Blocked, Needs Attention Now, Ready for Manual Action, and Safe to Defer, and writes operator-control-center-<username>-<date>.json plus .md.
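
A minimal sketch of the grouping step, assuming each queue item already carries a lane label. The item shape (`repo` and `lane` keys) and the `group_by_lane` helper are hypothetical. How the tool actually assigns a lane is not described here.

```python
# Lane names come from the text above, listed in the order they are named.
LANES = ["Blocked", "Needs Attention Now", "Ready for Manual Action", "Safe to Defer"]


def group_by_lane(items: list[dict]) -> dict[str, list[str]]:
    """Group queue items by their 'lane' field, keeping the fixed lane order."""
    grouped: dict[str, list[str]] = {lane: [] for lane in LANES}
    for item in items:
        lane = item["lane"]
        if lane not in grouped:
            raise ValueError(f"unknown lane: {lane!r}")
        grouped[lane].append(item["repo"])
    return grouped
```

Pre-seeding every lane keeps empty lanes visible in the output, so a quiet queue still renders all four headings.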

audit triage --approval-center is also read-only. It loads the latest approval history, groups work into Needs Re-Approval, Ready For Review, Approved But Manual, and Blocked, and writes approval-center-<username>-<date>.json plus .md. Local approval capture stays separate from writeback apply.

Watch mode supports --watch-strategy adaptive|incremental|full. adaptive is the default and uses the stored baseline contract plus the scheduled full-refresh interval to decide whether each watch cycle should run full or incremental.
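
One way to picture the adaptive decision is the sketch below. It is an assumption-laden illustration, not the tool's implementation: the `choose_watch_mode` function, its parameters, and the choice to raise an error when a forced incremental run meets a mismatched contract are all hypothetical.

```python
from datetime import datetime, timedelta


def choose_watch_mode(
    strategy: str,
    contract_matches: bool,
    last_full_run: datetime,
    now: datetime,
    full_refresh_interval: timedelta,
) -> tuple[str, str]:
    """Return (mode, reason) for one watch cycle."""
    if strategy == "full":
        return "full", "strategy forces a full run"
    if strategy == "incremental":
        if not contract_matches:
            # Assumption: a forced incremental run fails closed on a mismatch.
            raise RuntimeError("baseline contract no longer matches")
        return "incremental", "strategy forces an incremental run"
    if strategy != "adaptive":
        raise ValueError(f"unknown watch strategy: {strategy!r}")
    if not contract_matches:
        return "full", "baseline contract no longer matches"
    if now - last_full_run >= full_refresh_interval:
        return "full", "scheduled full refresh is due"
    return "incremental", "baseline is compatible and still fresh"
```

Returning the reason alongside the mode mirrors the idea that each cycle records why it ran full or incremental.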

For a full description of all flags grouped by workflow, see docs/modes.md.

Run tests

pytest

Development

For local development, clone the repo and install it with the dev, serve, semantic, and config extras:

git clone https://github.com/saagpatel/GithubRepoAuditor.git
cd GithubRepoAuditor
pip install -e ".[dev,serve,semantic,config]"

Common dev commands:

python3 -m pytest -q -p no:cacheprovider   # full test suite
python3 -m ruff check src/ tests/          # lint
python3 -m ruff format src/ tests/         # format
make workbook-gate                         # workbook invariant check
make release-gate                          # mutation testing gate

See docs/release-gates.md for the full gate checklist.

Tech Stack

| Layer | Technology |
| --- | --- |
| Language | Python 3.11+ |
| GitHub API | REST v3 + GraphQL (raw requests) |
| Excel output | openpyxl + committed workbook template |
| PDF output | fpdf2 |
| AI narrative | Anthropic Claude API |
| Complexity analysis | Radon |
| CLI output | Rich |
| Storage | SQLite (history warehouse) |

Architecture

The auditor follows a pipeline architecture: fetch repo list via GitHub API → shallow-clone each repo → run all 12 analyzers in sequence → aggregate scores → generate outputs. Analyzers are pluggable via --analyzers-dir for custom extensions. The scoring engine computes completeness and interest independently, applies configurable scoring profiles, and derives letter grades from the combined result. All output writers (Excel, HTML, JSON, Markdown, Notion) are isolated from the analysis layer and consume the same scored result object. Workbook ranking and trend views always use the full filtered portfolio baseline, even for targeted or incremental reruns.

Partial reruns now require a compatible full-baseline report, not just any previous report. The stored baseline contract tracks the audit-affecting portfolio context used to produce the last trustworthy baseline, and targeted or incremental reruns will fail closed if that contract no longer matches the current request.

Before normal runs start, the CLI now performs a shared preflight that checks config validity, token/config readiness for requested integrations, template/workbook availability, output writability, and whether targeted or incremental paths have a usable baseline. --doctor runs the broader diagnostics set without auditing repos and writes a machine-readable JSON artifact to output/diagnostics-<username>-<date>.json.

For day-to-day operations, --control-center is now the clean read-only entrypoint. It reuses the latest report, review state, campaign history, governance drift, and setup health to build one shared operator queue without running a new audit or mutating any external system.

The portfolio truth layer now has its own dedicated generation path. --portfolio-truth scans the configured local projects workspace, produces output/portfolio-truth-latest.json plus dated historical truth snapshots, and regenerates the configured project-registry and portfolio-audit Markdown compatibility outputs from that same truth contract instead of treating either markdown file as canonical.

Phase 104 added a second standalone workspace mode: --portfolio-context-recovery. That mode freezes the active/recent weak-context cohort from the live truth snapshot, writes dry-run recovery plan artifacts into output/, skips dirty or temporary repos automatically, and can apply bounded minimum-context upgrades plus repo-level catalog seeds before regenerating the truth snapshot and compatibility outputs.

Watch mode now uses that same baseline contract in live execution. Each cycle records the requested watch strategy, the chosen mode, and the reason a full refresh was required or an incremental rerun remained safe.

pyproject.toml is the canonical dependency definition, and requirements.txt is kept as a synchronized compatibility mirror for environments that still prefer a flat requirements file.

Excel Workbook

The workbook now supports two modes:

  • --excel-mode standard — stable operational workbook path, the CLI default, and the recommended mode for automation and Mac Excel compatibility
  • --excel-mode template — template-backed workbook path using assets/excel/analyst-template.xlsx for controlled template work

Both modes read from the same report + warehouse facts. Python owns the hidden Data_* sheets, stable table names, and workbook facts. The template-backed workbook still owns the template shell, named-range bindings, native sparkline placement, and print layout, but the standard workbook path is now the safest default for automated generation and Excel compatibility.

Template mode is also validated during preflight: the committed workbook asset must exist and pass a lightweight shell check before the run will continue.

This workbook boundary is unchanged in the current phase: the project still emits one workbook artifact, visible sheets remain filter-based, and hidden Data_* sheets remain the contract surface for workbook facts and downstream bindings.

The workbook’s main visible flow is now:

  • Index for orientation
  • Dashboard for the big-picture read
  • Run Changes for what moved this run
  • Review Queue for action
  • Portfolio Explorer for comparison
  • Repo Detail for one-repo drilldown
  • Executive Summary for a one-page shareable readout

For workbook-facing changes, use the canonical release gate:

make workbook-gate

That command generates stable sample standard and template workbooks, validates the visible-sheet and hidden Data_* invariants, writes an authoritative workbook-gate-result.json, adds a human-readable gate summary, and produces a manual desktop Excel checklist. The final release step is still opening the generated standard workbook in desktop Excel and recording the local signoff outcome with make workbook-signoff.

After that manual desktop Excel check, record the outcome back into the gate artifacts:

make workbook-signoff ARGS="--reviewer <name> --outcome passed --check excel-open-no-repair=passed --check visible-tabs-present=passed --check normal-zoom-readable=passed --check chart-placement-clean=passed --check filters-work=passed"

Managed Campaigns and Governance

Campaign writeback is now lifecycle-aware rather than one-shot:

  • --campaign-sync-mode reconcile updates active managed records and closes stale ones
  • --campaign-sync-mode append-only leaves stale managed records open and marks them stale
  • --campaign-sync-mode close-missing aggressively closes previously managed records that no longer belong in the campaign
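
The difference between the modes is easiest to see as a plan over record ids. In this hypothetical sketch, `plan_sync`, the action names, and the `create` action for new records are assumptions. It also treats reconcile and close-missing identically for records that dropped out of the campaign, because the descriptions above do not spell out how the two differ beyond close-missing being more aggressive.

```python
SYNC_MODES = {"reconcile", "append-only", "close-missing"}


def plan_sync(mode: str, managed: set[str], desired: set[str]) -> dict[str, str]:
    """Map each record id to an action for one sync pass.

    managed: ids the campaign wrote on earlier runs
    desired: ids that belong in the campaign now
    """
    if mode not in SYNC_MODES:
        raise ValueError(f"unknown sync mode: {mode!r}")
    plan = {record_id: "update" for record_id in desired & managed}
    plan.update({record_id: "create" for record_id in desired - managed})
    # append-only never closes anything; it only flags stale records.
    stale_action = "mark-stale" if mode == "append-only" else "close"
    plan.update({record_id: stale_action for record_id in managed - desired})
    return plan
```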

Managed state drift, rollback coverage, and campaign history are written into JSON, Markdown, HTML, Excel, and the warehouse snapshot. Governed security controls still remain manual and opt-in, but operator surfaces now distinguish ready, approved, applied, drifted, and rollback coverage states when governance data is present.

When writeback or governance-related actions are requested, preflight checks now validate the required GitHub and Notion prerequisites before any external mutation path starts.

Operator Loop

The daily operator loop is now:

  • Run audit run <github-username> --doctor
  • Run audit run <github-username> or audit run <github-username> --watch --watch-strategy adaptive
  • Run audit triage <github-username> --control-center
  • Review the handoff fields: what changed, why it matters, what to do next, whether the queue is improving or worsening, what was tried for the top target, whether it is only quieting down or now counts as confirmed resolved, and whether recent confidence has actually been validating
  • Open the workbook and review it in this order: Dashboard, Run Changes, Review Queue, Portfolio Explorer, Repo Detail, Executive Summary
  • Clear anything in Blocked first
  • Use the reported primary target as the single next thing to close before taking on newly ready work
  • Review Needs Attention Now for drift and high-severity changes
  • Work through Ready for Manual Action
  • Leave Safe to Defer items alone unless priorities change
  • Run make workbook-gate only when workbook-facing changes are in scope
  • Run make workbook-signoff ... after the manual Excel-open check for workbook-facing changes
  • Browse http://127.0.0.1:8080/ after audit serve to review the dashboard

Scheduled automation stays artifact-first. The weekly workflow runs the audit, generates a control-center artifact plus a scheduled handoff summary, and uploads output/. It opens or updates one canonical GitHub issue only when blocked or urgent operator findings cross a meaningful threshold, and closes that same issue cleanly when later runs return to a quiet state.

The handoff also calls out:

  • whether the queue is getting better, worse, or staying stuck
  • what was tried most recently, and whether that intervention actually helped
  • whether recovery is only quiet for now or confirmed resolved
  • whether recent high-confidence guidance has been validating or turning noisy
  • what trust policy now applies to the live recommendation (act-now, act-with-review, verify-first, or monitor)
  • whether a soft exception or recent policy-flip drift should make the operator treat that recommendation more cautiously
  • whether recent soft caution is still earning trust or has become cautious enough to recover toward a stronger policy

In newer follow-through phases, that same weekly story also carries whether a recommendation is escalating, recovering, rebuilding, re-acquiring confidence, or aging back down. The important product principle is still the same: workbook, HTML, Markdown, and review-pack surfaces should tell the same story in different formats.

Troubleshooting

The fastest path for setup issues is:

audit run <github-username> --doctor

Common fixes:

  • Missing GitHub token: set GITHUB_TOKEN or pass --token for private-repo access, GitHub writeback, metadata apply flows, and other authenticated actions.
  • Missing or broken Notion config: create or fix config/notion-config.json before using --notion-sync, --notion-registry, or Notion writeback.
  • Starting from scratch: copy config/examples/audit-config.example.yaml to audit-config.yaml and config/examples/notion-config.example.json to config/notion-config.json.
  • Missing Excel template: restore assets/excel/analyst-template.xlsx or use --excel-mode standard.
  • Missing baseline report: run a full audit before using --repos, --incremental, or other baseline-dependent workflows.
  • Config/profile errors: fix audit-config.yaml syntax or choose an existing scoring profile under config/scoring-profiles/.

There is also a longer operator guide in docs/operator-troubleshooting.md.

License

MIT

