Audit existing Excel spreadsheets and financial models for correctness defects: formula errors, broken references, off-by-one ranges, reconciliation failures, circular references, hidden structure, and data-quality risks. Audit-only Agent Skill plus CLI.

Spreadsheet Auditor

Audit existing Excel spreadsheets and financial models for correctness defects — formula errors, broken references, bad ranges, totals that don't reconcile, and data-quality risks.

License: MIT. Requires Python 3.11+. Packaged as an Agent Skill for Claude and Codex.


spreadsheet-auditor is a portable Agent Skill for Claude and Codex (and a standalone command-line toolkit) that answers one question: "Can I trust this spreadsheet?"

Most spreadsheet tooling helps you build workbooks. This one audits a workbook you already have — often one you inherited — and returns a severity-ranked report of where it is likely wrong, fragile, or inconsistent. It is deterministic-first: bundled Python checks produce candidate findings with exact cell locations and evidence, so you are not eyeballing cells and guessing.

It is intentionally audit-only: it reports defects and can write annotations to a separate copy, but never silently rewrites, reformats, or "fixes" your source workbook.

Tip: See the auditor in action without installing anything: open examples/demo_audit_report.md or examples/demo_audit_report.html.

Demo (60 seconds)

pip install "spreadsheet-auditor[all]"
spreadsheet-auditor --demo --summary

Or against a workbook of your own:

spreadsheet-auditor path/to/workbook.xlsx \
    --out audit_report.md \
    --json findings.json \
    --annotated audit_annotated.xlsx

A complete demo lives in examples/ with a generator (examples/make_demo.py), the seeded workbook, and the resulting report/JSON/HTML/annotated outputs ready to inspect.

Why use it

  • Pre-send model review — de-risk an LBO/DCF/budget model before it reaches an investment committee or counterparty.
  • "Find the error" debugging — a number looks wrong and you need the cell, not a guess.
  • Inherited-model trust assessment — you didn't build it; check whether it is trustworthy.
  • CI / batch gating — fail a pipeline when a committed workbook has Critical defects (see examples/github-actions/).
  • FP&A and month-end close — catch reconciliation and range mistakes in recurring workbooks.

What it detects

| Category | Checks |
| --- | --- |
| Formula integrity | live errors (#REF!, #DIV/0!, #VALUE!, #N/A, ...), broken/deleted references, references to blank precedents, circular references, formula drift across a row/column, IFERROR/IFNA error masking |
| Hardcodes & inputs | numeric literals embedded in formulas, hardcoded plug values inside a formula block |
| Ranges | aggregate ranges that exclude adjacent data (off-by-one), ranges that include subtotal/total rows, inconsistent aggregate range lengths across peers, hidden rows/columns/sheets inside totals |
| Reconciliation | stated totals that differ from their components, row totals vs column totals that don't cross-foot |
| Logic & structure | volatile/fragile functions (OFFSET, INDIRECT, NOW, RAND, ...), whole-column references |
| Data hygiene | numbers stored as text, leading/trailing whitespace in keys/labels, duplicate lookup keys, merged cells inside data ranges |
| Finance (opt-in HEUR) | balance-sheet balance, sign convention on revenue/expense rows, quarterly period sequencing |
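
To make the Reconciliation row concrete, here is a minimal Python sketch of a cross-foot check on a small in-memory block. It only illustrates the idea and is not the tool's implementation. The function name `cross_foot_gaps` and the `tolerance` parameter are assumptions made for this example.

```python
from typing import Sequence


def cross_foot_gaps(
    grid: Sequence[Sequence[float]],
    stated_row_totals: Sequence[float],
    stated_col_totals: Sequence[float],
    tolerance: float = 0.005,
) -> list[str]:
    """Return readable mismatches; an empty list means the block cross-foots."""
    problems: list[str] = []

    # Each stated row total should equal the sum of its row.
    for i, (row, stated) in enumerate(zip(grid, stated_row_totals)):
        actual = sum(row)
        if abs(actual - stated) > tolerance:
            problems.append(f"row {i}: stated {stated}, components sum to {actual}")

    # Each stated column total should equal the sum of its column.
    for j, stated in enumerate(stated_col_totals):
        actual = sum(row[j] for row in grid)
        if abs(actual - stated) > tolerance:
            problems.append(f"column {j}: stated {stated}, components sum to {actual}")

    # The grand total reached via rows must match the one reached via columns.
    if abs(sum(stated_row_totals) - sum(stated_col_totals)) > tolerance:
        problems.append("grand total differs between row totals and column totals")

    return problems


# Hypothetical 2x2 block whose second row total was typed as 8 instead of 7.
print(cross_foot_gaps([[1, 2], [3, 4]], [3, 8], [4, 6]))
```

A clean block returns an empty list. The example above reports both the bad row and the resulting grand-total disagreement.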

Each finding carries a detection mode (DET deterministic / HEUR heuristic), an error-confidence level (Defect / Likely defect / Review / Info), a severity, evidence, and a suggested fix. Full rule list: references/check_catalog.md.
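
As a rough illustration of how the error-confidence levels can drive triage, the sketch below sorts findings so the most certain defects come first. The dictionary keys `confidence` and `cell`, and the sample cell addresses, are hypothetical. The real field names are defined by the tool's schema.

```python
# Confidence levels in the order this README lists them, most certain first.
CONFIDENCE_ORDER = ["Defect", "Likely defect", "Review", "Info"]


def triage(findings: list[dict]) -> list[dict]:
    """Sort findings so the most certain defects come first.

    ASSUMPTION: each finding is a dict with a "confidence" key holding one of
    the labels above.
    """
    rank = {label: position for position, label in enumerate(CONFIDENCE_ORDER)}
    # Unknown or missing labels sort last instead of raising.
    return sorted(findings, key=lambda f: rank.get(f.get("confidence"), len(rank)))


# Hypothetical sample data, not real tool output.
sample = [
    {"cell": "Summary!C2", "confidence": "Review"},
    {"cell": "Budget!F14", "confidence": "Defect"},
    {"cell": "Budget!B3", "confidence": "Likely defect"},
]
for finding in triage(sample):
    print(finding["confidence"], finding["cell"])
```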

The seeded-defect benchmark is published at benchmarks/seeded_defects_matrix.md; methodology at references/benchmark_methodology.md.

Install

pip install spreadsheet-auditor             # core + .xlsx/.xlsm/.csv audit
pip install "spreadsheet-auditor[all]"      # adds defusedxml, networkx, PyYAML

For local development:

git clone https://github.com/petehottelet/spreadsheet-auditor.git
cd spreadsheet-auditor
pip install -e ".[dev]"
spreadsheet-auditor --healthcheck
python -m pytest tests -q

Releases (signed source archive + Claude/Codex skill zips + SHA-256 checksums) are published on the Releases page.

Run

# Quick look in the terminal
spreadsheet-auditor model.xlsx

# Write artifacts in every supported format
spreadsheet-auditor model.xlsx \
    --out report.md \
    --json findings.json \
    --annotated annotated.xlsx
spreadsheet-auditor model.xlsx --format html --out report.html
spreadsheet-auditor model.xlsx --format sarif --out report.sarif

# CI-friendly: one-screen summary + non-zero exit on High-or-worse
spreadsheet-auditor model.xlsx --summary --fail-on High

# Bundled demo for a 60-second tour
spreadsheet-auditor --demo --summary

See spreadsheet-auditor --help for the full flag reference, including --strict, --show-suppressed, --quiet, --config, --ignore, --recalc-timeout, and --healthcheck --json.

Outputs

  • Markdown report for human review, grouped by Confirmed/Likely/Review buckets so the worst items are easy to triage.
  • HTML report for browser review or sharing with non-technical reviewers (self-contained, no network).
  • JSON findings for reruns, CI, and downstream tooling. Validates against schemas/findings.schema.json. Each finding carries a stable fingerprint for diffing across runs and for fingerprint-based suppression.
  • SARIF 2.1.0 for GitHub code scanning. See examples/github-actions/code-scanning.yml.
  • Annotated workbook copy with comments at finding cells (--annotated). The source workbook is never modified.
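
Because each finding carries a stable fingerprint, two JSON outputs can be compared with plain set arithmetic. The sketch below assumes a top-level `findings` list whose items have a `fingerprint` string. That layout is a guess for illustration, so check schemas/findings.schema.json for the real structure.

```python
import json


def fingerprints(payload: dict) -> set[str]:
    """Collect fingerprints from a parsed findings payload.

    ASSUMPTION: findings sit in a top-level "findings" list and each one has
    a "fingerprint" string.
    """
    return {finding["fingerprint"] for finding in payload.get("findings", [])}


def diff_runs(previous: dict, current: dict) -> dict[str, list[str]]:
    """Classify fingerprints as new, resolved, or unchanged between two runs."""
    before, after = fingerprints(previous), fingerprints(current)
    return {
        "new": sorted(after - before),
        "resolved": sorted(before - after),
        "unchanged": sorted(before & after),
    }


# Hypothetical payloads standing in for two saved findings.json files.
prev = json.loads('{"findings": [{"fingerprint": "a1"}, {"fingerprint": "b2"}]}')
curr = json.loads('{"findings": [{"fingerprint": "b2"}, {"fingerprint": "c3"}]}')
print(diff_runs(prev, curr))
```

In practice the two payloads would come from `json.load` on the files written by `--json` in consecutive runs.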

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | Completed; no findings at or above --fail-on |
| 1 | Completed; findings at or above --fail-on |
| 2 | Completed with coverage limitations (only with --strict or --fail-on None) |
| 3 | Healthcheck failed: required dependency missing |
| 4 | Preflight/security failure |
| 5 | Internal error |

Benign limitations (no recalculation engine, missing optional packages) do not fail a normal run. Use --strict to surface them as exit code 2 in CI.
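
A wrapper script can turn these exit codes into readable pipeline messages. The following is a minimal sketch: the helper names (`describe_exit`, `should_block_merge`, `run_audit`) are made up for this example, and the "block on anything non-zero" policy is an assumption rather than something the tool prescribes. The CLI flags used are the ones shown in the Run section.

```python
import subprocess

# Meanings copied from the exit-code table in this README.
EXIT_MEANINGS = {
    0: "completed; no findings at or above --fail-on",
    1: "completed; findings at or above --fail-on",
    2: "completed with coverage limitations",
    3: "healthcheck failed: required dependency missing",
    4: "preflight/security failure",
    5: "internal error",
}


def describe_exit(code: int) -> str:
    """Translate an exit code into a short human-readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def should_block_merge(code: int) -> bool:
    """Policy choice (an assumption): only a clean 0 lets the pipeline pass."""
    return code != 0


def run_audit(workbook: str, fail_on: str = "High") -> int:
    """Invoke the CLI and return its exit code without raising on failure."""
    result = subprocess.run(
        ["spreadsheet-auditor", workbook, "--summary", "--fail-on", fail_on],
        check=False,
    )
    return result.returncode


if __name__ == "__main__":
    code = run_audit("model.xlsx")
    print(describe_exit(code))
    raise SystemExit(1 if should_block_merge(code) else 0)
```

A team that wants to tolerate coverage limitations could relax `should_block_merge` to let code 2 through.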

Agent Skill usage

Drop the Claude/Codex zip from the release into your Skills folder, or load it directly with Cursor Skills. Ask the agent something like:

"Audit Q3_Forecast.xlsx and tell me what's wrong with it. Write the report to audit_report.md and produce an annotated copy."

The skill maps the request onto the bundled CLI invocation and surfaces the findings inline.

Configuration

Optional config file (JSON or YAML) controls scope, materiality, suppression, performance limits, and which checks fire. Schema: schemas/config.schema.json.

Example:

scope:
  include_sheets: [Budget, Summary]
  headline_outputs: [Summary!C1, Summary!C2]
materiality:
  absolute: 1000.0
  relative: 0.001
finance:
  enabled: true        # turn on the finance HEUR pack
limits:
  max_formulas: 50000
  timeout_seconds: 120
  max_reported_findings: 200
checks:
  IFERROR_MASK: review   # warn instead of error
  LITERAL_CONSTANT: off
suppressions:
  - rule_id: BROKEN_REFERENCE
    range: Imports!A1:A100
    reason: External feed populated at runtime; cells start empty.
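
Since the config may also be JSON, the same settings can be generated from Python. This sketch mirrors the YAML example above key for key and serializes it with the standard library. One caveat: PyYAML reads a bare `off` as boolean false, so whether the JSON value for `LITERAL_CONSTANT` should be the string "off" or `false` depends on schemas/config.schema.json. The string used here is an assumption.

```python
import json

# Mirrors the YAML example; values are copied from it unchanged.
config = {
    "scope": {
        "include_sheets": ["Budget", "Summary"],
        "headline_outputs": ["Summary!C1", "Summary!C2"],
    },
    "materiality": {"absolute": 1000.0, "relative": 0.001},
    "finance": {"enabled": True},  # turn on the finance HEUR pack
    "limits": {
        "max_formulas": 50000,
        "timeout_seconds": 120,
        "max_reported_findings": 200,
    },
    # ASSUMPTION: "off" is accepted as a string; verify against the schema.
    "checks": {"IFERROR_MASK": "review", "LITERAL_CONSTANT": "off"},
    "suppressions": [
        {
            "rule_id": "BROKEN_REFERENCE",
            "range": "Imports!A1:A100",
            "reason": "External feed populated at runtime; cells start empty.",
        }
    ],
}

config_json = json.dumps(config, indent=2)
print(config_json)
```

Writing `config_json` to a file and passing that path to `--config` should then be equivalent to using the YAML form, assuming the tool treats both formats identically.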

Safety

Spreadsheet files are treated as untrusted input. Macros are inventoried but never executed, external links are inventoried but never followed, recalculation runs headless in an isolated profile, and the original workbook is never overwritten. Details in SECURITY.md and references/limitations.md.

Limitations

Excel-specific behavior is approximated via LibreOffice. Value-dependent checks (TOTAL_MISMATCH, CROSS_FOOT_FAILURE) require either recalculation (LibreOffice/Calc) or cached values written by Excel. Dynamic arrays, data tables, Power Query/Data Model, and macros are inventoried but not executed. Full list: references/limitations.md.

Google Sheets: export to .xlsx and audit that. There is no native Sheets API integration. See references/google_sheets.md for a recipe.

FAQ

Does it fix my spreadsheet? No. It reports defect candidates with suggested fixes; applying changes is a separate, explicit step.

Is this an accounting or audit certification? No. It flags likely defects and review items; it does not certify business, accounting, tax, or legal correctness.

Does it work without Excel installed? Yes. It reads workbooks with openpyxl. LibreOffice is optional and only used to refresh cached values for value-dependent checks.

Does it support Google Sheets? Not directly. Export to .xlsx and audit that.

How are false positives handled? Suppress them by (rule_id, range, reason) or by fingerprint. A reason is required; suppressions missing a reason are ignored and called out in the report's coverage limitations. Suppressed findings stay in the JSON payload (auditable) but are hidden from the report unless --show-suppressed is passed.

Contributing

We welcome new checks, additional test workbooks, and benchmark improvements. See CONTRIBUTING.md and references/custom_checks.md.

License

MIT. See LICENSE.


Topics: spreadsheet audit, excel auditor, xlsx, xlsm, financial model review, formula error checker, reconciliation, cross-foot, range check, circular reference detection, data quality, openpyxl, python, CLI, agent skill, static analysis, claude, codex, FP&A, financial modeling, fpna.


