How Am I Doing — local-only self-audit & coaching for Claude Code sessions
How Am I Doing (HAID)
A self-audit and coaching layer for Claude Code sessions.
HAID reads your own Claude Code session transcripts, builds a graph of what happened, and produces annotated, coaching-oriented reports. The aim is less "here is your bill" and more "here is where you and the agent diverged, why, and what to change."
Nothing leaves your machine unless you explicitly choose to submit aggregate metrics.
Status: the full coaching pipeline runs end to end on real sessions — stdlib-only, with no model in the loop inside the CLI (model judgment is delegated to the host agent via job manifests; see Activating in Claude Code). The chain is
metrics → tag → episodes → score → why → report:
- Session parsing (src/haid/session/): forest-aware JSONL parsing, covering dedup, branch/rewind classification, subagent stitching, overflow resolution, SQLite cache.
- Session graph (src/haid/graph/): L0 spine + L1 action/IO graph (reads/produces/edits from structuredPatch), signatures, per-timeline scoping.
- Waste metrics (src/haid/metrics/): rereads, retries, retouched, unused_context. One rule each, run at session and window scope, as benchmarkable token-rates placed against a per-scope baseline (haid metrics).
- Analysis window (src/haid/window.py): the multi-session unit metrics run over (a project over a timeframe, default 30 days).
- Bridge (src/haid/bridge/): reconstructs a window's net code diff from the transcript alone (replay, no git) plus its normalized-token cost (haid bridge).
- Scoring (src/haid/scoring/): the relative achievement/cost value scorer (difficulty + cleanliness placement, volume, normalized-token cost, value combiner), calibration-validated (haid volume/cost/place/value).
- Intent tagging (src/haid/intent/): move × work-type + purpose-snapshot labels for every user message (haid tag).
- Episodes (src/haid/episodes/): group whole sessions into the git-free PR proxy and score each as a per-episode value distribution (haid episodes, haid score).
- Why-pass (src/haid/why/): per-anchor investigation agents over the top waste instances, with cited evidence and hedged remedies (haid why).
- Report (src/haid/report/): the compositor, producing a deterministic what/why digest plus a composed coaching report, with validated recommendations (haid report).
- Visualization (src/haid/viz/): a self-contained HTML render of the window (the time-layered bus diagram) from the same substrate (haid viz).
- Community benchmark (src/haid/report/benchmark.py): a summary-only, opt-in payload (haid benchmark), a read-only local comparison vs the board (haid rank), and the PR-based opt-in submission (haid submit).
All validated on real transcripts (python -m pytest, stdlib-only). The user-facing report and visualization are the final product. See plans/roadmap.md.
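The bridge entry above describes rebuilding a window's net code diff by replaying the transcript instead of asking git. The sketch below illustrates that idea only. It assumes each edit is available as a (path, old text, new text) triple, and the names replay_edits and net_diff are hypothetical, not HAID's actual interface.

```python
import difflib


def replay_edits(initial: dict[str, str], edits: list[tuple[str, str, str]]) -> dict[str, str]:
    """Apply (path, old, new) replacements in order and return the final contents.

    Assumption: an empty `old` means "append `new`"; otherwise the first
    occurrence of `old` in the file is replaced.
    """
    files = dict(initial)
    for path, old, new in edits:
        text = files.get(path, "")
        if not old:
            files[path] = text + new
            continue
        if old not in text:
            raise ValueError(f"edit does not apply to {path!r}")
        files[path] = text.replace(old, new, 1)
    return files


def net_diff(initial: dict[str, str], final: dict[str, str]) -> str:
    """Unified diff between the starting and ending states, file by file."""
    chunks: list[str] = []
    for path in sorted(set(initial) | set(final)):
        before = initial.get(path, "").splitlines(keepends=True)
        after = final.get(path, "").splitlines(keepends=True)
        chunks.extend(
            difflib.unified_diff(before, after, fromfile=f"a/{path}", tofile=f"b/{path}")
        )
    return "".join(chunks)


start = {"m.py": "x = 1\n"}
# Two edits that cancel out: the replay sees both, the net diff sees nothing.
churn = replay_edits(start, [("m.py", "x = 1", "x = 2"), ("m.py", "x = 2", "x = 1")])
changed = replay_edits(start, [("m.py", "x = 1", "x = 3")])
```

Diffing only the start and end states is what makes the result "net": intermediate churn costs tokens but leaves no trace in the diff, which is why the two are reported side by side.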
Installation
HAID is on PyPI (stdlib-only, no dependencies, Python ≥ 3.10):
pip install haid
On Ubuntu/WSL without a venv set up, a user install works fine:
python3 -m pip install --user haid
# CLI lands at ~/.local/bin/haid — make sure that's on your PATH
Verify it works:
haid --help
haid metrics --project ~/path/to/some/project --days 30
haid metrics is fully deterministic (no model calls) and runs against the
Claude Code transcripts already on your machine — it's the quickest smoke test.
Where to install: HAID reads transcripts from ~/.claude/projects/ on the
machine where the sessions ran. If you use Claude Code inside WSL, install HAID
inside WSL too. (A Windows-side install can still reach WSL transcripts via UNC
--session paths like //wsl.localhost/Ubuntu/home/<user>/.claude/projects/<slug>/*.jsonl,
but --project discovery won't cross the boundary.)
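To check what HAID would have to work with before running it, a short stdlib script can list the transcript files under that directory. This is a sketch that assumes only the ~/.claude/projects/<slug>/*.jsonl layout described above; find_transcripts and count_records are hypothetical helper names, not part of the haid package.

```python
import json
from pathlib import Path


def find_transcripts(root: Path) -> dict[str, list[Path]]:
    """Group *.jsonl files by the project-slug directory that contains them."""
    found: dict[str, list[Path]] = {}
    if not root.is_dir():
        return found
    for path in sorted(root.glob("*/*.jsonl")):
        found.setdefault(path.parent.name, []).append(path)
    return found


def count_records(path: Path) -> int:
    """Count lines that parse as JSON objects, skipping blank or malformed lines."""
    count = 0
    with path.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue
            if isinstance(record, dict):
                count += 1
    return count
```

Calling find_transcripts(Path.home() / ".claude" / "projects") returns an empty mapping when the directory is missing, which is the usual sign that the script is running on the wrong side of the Windows/WSL boundary.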
Activating in Claude Code
The haid CLI never calls a model itself. The full coaching pipeline
(tag → episodes → score → why → report) is driven by Claude Code through the
haid-report skill: the CLI writes job
manifests at each model boundary, and the skill tells Claude how to fulfill
them with subagents and resume.
- Install the CLI (above) so haid is on the PATH of the machine/shell where Claude Code runs.
- Copy the skill from this repo into Claude Code's skills directory:
  # available in every project:
  mkdir -p ~/.claude/skills/haid-report
  cp .claude/skills/haid-report/SKILL.md ~/.claude/skills/haid-report/
  # …or for a single project only:
  mkdir -p <project>/.claude/skills/haid-report
  cp .claude/skills/haid-report/SKILL.md <project>/.claude/skills/haid-report/
  (If you're working inside this repo, the skill is already active, since it's a project skill here.)
- Start a new Claude Code session and ask "how am I doing?", or invoke /haid-report directly. Claude will run the chain and present the coaching report. For a zero-cost, fully deterministic answer, ask for the --digest-only report or just the waste metrics.
What this is not
Not another token counter. Raw usage accounting is already well covered (ccusage and similar). The entire value lives one layer up, in diagnosis and coaching — telling you not what you spent but how to get better. A tool that confidently misdiagnoses is worse than nothing, because people act on it, so trustworthiness of the diagnosis is the central design constraint throughout. See docs/trust-discipline.md.
The one big idea: the session graph
Underneath everything is one data structure: a graph of the session(s). Turns and tool-calls are nodes; edges capture responds-to, reads, and produces relationships. The two headline features are just two operations on this one graph:
- "Why did you do X?" → a backwards traversal from X to its trigger.
- "Where did the tokens go?" → a weighting over the same nodes.
Build the graph once; get both views from it. Design in docs/session-graph-design.md.
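The "one graph, two operations" idea can be shown in miniature. The sketch below is not HAID's implementation: SessionGraph, its field names, and the node kinds are hypothetical, and the real taxonomy is the one in docs/session-graph-design.md. It assumes each edge points from a node to the thing it depends on.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SessionGraph:
    """Hypothetical miniature of a session graph."""

    kinds: dict[str, str] = field(default_factory=dict)   # node id -> "turn" or "tool_call"
    tokens: dict[str, int] = field(default_factory=dict)  # node id -> token weight
    # edges[src] holds (relation, dst) pairs; src depends on dst.
    edges: dict[str, list[tuple[str, str]]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def add_node(self, node_id: str, kind: str, tokens: int) -> None:
        self.kinds[node_id] = kind
        self.tokens[node_id] = tokens

    def add_edge(self, src: str, relation: str, dst: str) -> None:
        self.edges[src].append((relation, dst))

    def trace_back(self, start: str, relation: str = "responds-to") -> list[str]:
        """'Why did you do X?': walk one relation backwards until it runs out."""
        path, seen, current = [start], {start}, start
        while True:
            parents = [
                dst for rel, dst in self.edges.get(current, [])
                if rel == relation and dst not in seen
            ]
            if not parents:
                return path
            current = parents[0]
            seen.add(current)
            path.append(current)

    def tokens_by_kind(self) -> dict[str, int]:
        """'Where did the tokens go?': sum weights over the same nodes."""
        totals: dict[str, int] = defaultdict(int)
        for node_id, weight in self.tokens.items():
            totals[self.kinds[node_id]] += weight
        return dict(totals)


graph = SessionGraph()
graph.add_node("u1", "turn", 40)         # user request
graph.add_node("a1", "turn", 120)        # assistant reply
graph.add_node("t1", "tool_call", 900)   # a large file read
graph.add_edge("a1", "responds-to", "u1")
graph.add_edge("t1", "responds-to", "a1")

print(graph.trace_back("t1"))    # ['t1', 'a1', 'u1']
print(graph.tokens_by_kind())    # {'turn': 160, 'tool_call': 900}
```

Both answers come from the same nodes and edges, so adding a relation such as reads or produces extends the traversal without touching the weighting.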
Two orthogonal analysis passes
- User-anchored pass — catches misalignment. Works backwards from user messages; corrections are ground truth ("no, I meant…", "that's wrong").
- Signature-scanning pass — catches silent inefficiency. Scans for objective, reasoning-free waste signatures (redundant re-reads, retry loops, re-touched lines, unused context).
The two are orthogonal: one finds where the agent did the wrong thing, the other where it did the right thing wastefully. See docs/architecture.md.
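A waste signature is "reasoning-free" in the sense that it can be checked from the ordered tool calls alone. As an illustration, here is one plausible reading of a redundant re-read: a file is read again with no write to it in between. The actual rereads rule is defined in docs/detectors.md and may differ; ToolEvent, find_rereads, and the tool names used here are assumptions for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolEvent:
    index: int   # position in the session timeline
    tool: str    # assumed tool names: "Read", "Edit", "Write"
    path: str


def find_rereads(events: list[ToolEvent]) -> list[ToolEvent]:
    """Return reads of a file that was already read and not modified since."""
    fresh: set[str] = set()          # paths whose last-read content is still current
    rereads: list[ToolEvent] = []
    for event in events:
        if event.tool == "Read":
            if event.path in fresh:
                rereads.append(event)
            fresh.add(event.path)
        elif event.tool in {"Edit", "Write"}:
            # A write invalidates the earlier read, so the next read is justified.
            fresh.discard(event.path)
    return rereads


timeline = [
    ToolEvent(0, "Read", "a.py"),
    ToolEvent(1, "Read", "a.py"),   # redundant: nothing changed since index 0
    ToolEvent(2, "Edit", "a.py"),
    ToolEvent(3, "Read", "a.py"),   # justified: the file was edited at index 2
    ToolEvent(4, "Read", "b.py"),
    ToolEvent(5, "Read", "a.py"),   # redundant again
]
flagged = [event.index for event in find_rereads(timeline)]
```

No judgment about intent is involved, which is what separates this pass from the user-anchored one: the scan says a read was repeated, not whether the agent was right to be in that file.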
Documentation map
| Doc | What's in it |
|---|---|
| docs/vision.md | The full concept, goals, and the canonical test case |
| docs/architecture.md | The two-pass method and how the pieces fit |
| docs/session-graph-design.md | Node/edge taxonomy, episodes, the two core operations |
| docs/detectors.md | Detector catalog + waste metrics as graph queries |
| docs/intent-taxonomy.md | Two-axis message classification + purpose timeline + drift |
| docs/scoring-rubric.md | Achievement vs. cost — the relative value verdict (revised; see ladder/playbook) |
| docs/difficulty-ladder.md | The validated difficulty scorer (reference ladder + placement) |
| docs/cleanliness-ladder.md | The cleanliness/parsimony scorer (reference ladder + placement) |
| docs/metrics-output-schema.md | The haid metrics --json contract consumed by the later passes |
| docs/treatments.md | The remedy catalog matched mechanically in haid report |
| docs/axis-calibration-playbook.md | Self-contained recipe to calibrate a new scoring axis (worked example: cleanliness; originality calibrated then dropped) |
| docs/visualization.md | The time-layered bus diagram (left-in/right-out, bundled) |
| docs/claude-code-data-format.md | Verified Claude Code on-disk data reference |
| docs/data-inventory.md | Field catalog from 38 real sessions: what's auto-taggable vs. inferred |
| docs/data-structure-report.md | Real annotated records → the graph they produce (Tier 1 & Tier 2 walkthrough) |
| docs/trust-discipline.md | Cite-or-unknown, hedging, no-traceable-origin |
| docs/tooling-landscape.md | Existing tools and what to build on |
| docs/decisions/ | Architecture Decision Records (ADRs) |
| plans/roadmap.md | Phased delivery plan |
| plans/agent-analysis.md | The model-in-the-loop "why" pass design (episodes, anchors, two-stage) |
| plans/community-benchmark.md | The opt-in self-reported benchmark design (ADR-0005) |
| plans/open-questions.md | Decisions to make / behaviors to verify early |
The shipped Phase-1 build logs (mvp.md, phase1-build.md, step4-build.md) are kept for
history under plans/archive/.
Repository layout
HAID/
├── README.md # you are here
├── docs/ # design & reference documentation
│ └── decisions/ # ADRs
├── plans/ # roadmap + active design notes (shipped build-logs in plans/archive/)
├── src/haid/ # implementation
│ ├── session/ # parse: forest model, subagents, overflow, cache
│ ├── graph/ # L0 spine + L1 IO graph (incl. Bash read/write parsing)
│ ├── metrics/ # the four waste metrics + baseline + `haid metrics` emitter
│ ├── window.py # the multi-session analysis window
│ ├── bridge/ # transcript→(diff, usage) reconstruction (the bridge)
│ ├── scoring/ # relative value scorer (difficulty/cleanliness/volume/cost/value)
│ ├── intent/ # move × work-type message tagging (`haid tag`)
│ ├── episodes/ # session→episode grouping + per-episode scoring
│ ├── why/ # per-anchor investigation agents (`haid why`)
│ └── report/ # digest + composed coaching report (`haid report`)
├── tests/ # session/ graph/ metrics/ scoring/ bridge/ intent/ episodes/ why/ report/
└── scripts/ # build_metric_baselines.py (regenerates shipped data)
The one-time scoring-axis calibration harness and the raw research probes that seeded the docs live on the
archive/experiments branch. Their validated output already ships in src/haid/data/, so they're kept for provenance rather than on main.