CLI for tracking AI agent task metrics: token cost, retry pressure, and outcome quality.

ai-agents-metrics — track AI agent token cost and retry pressure

Measure the real cost and effectiveness of AI-assisted engineering work.

ai-agents-metrics is a local CLI tool that records goals, attempts, token spend, and retry patterns for every AI coding session — so you can see which workflows are productive and which are burning tokens on rework.

Why

AI coding agents (Claude Code, Codex, and similar) generate real costs and vary widely in effectiveness. Without a tool like this, common questions are hard to answer:

  • "How much did my Claude Code session cost?"
  • "How do I track AI agent retries across tasks?"
  • "What is my token spend per task?"
  • "Did this workflow change actually improve anything?"
  • "Which model is more cost-effective for my work?"

ai-agents-metrics gives you a lightweight, local ledger to answer all of these from real data.

When to Use This

  • You use Claude Code, Codex, or another AI coding agent and want to know what each task actually cost
  • You suspect certain types of tasks require too many correction passes and want the numbers to confirm it
  • You changed a prompt strategy or workflow and want to verify it improved outcome quality or reduced cost
  • You run AI agents as part of a paid engineering workflow and need to track whether AI cost is eating into project margins
  • You want an AI agent to analyze your workflow history and recommend what to change next

What It Tracks

  • Goals and attempts — what you asked the agent to do, how many tries it took
  • Token cost — input, output, and cached-input tokens per session, mapped to USD
  • Retry pressure — how often attempts fail or require correction
  • Model usage — which model ran each session and what it cost
  • History analysis — parse conversation transcripts to reconstruct past sessions
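
To make "mapped to USD" concrete, here is a minimal sketch of how per-session token counts could be turned into a dollar figure. The price table, the SessionUsage class, and the session_cost_usd function are all hypothetical names for illustration; they are not part of ai-agents-metrics, and real prices depend on the model and provider.

```python
from dataclasses import dataclass

# Hypothetical per-million-token prices in USD. These are placeholders,
# not values taken from ai-agents-metrics or from any provider's price list.
PRICES_PER_MILLION = {"input": 3.00, "cached": 0.30, "output": 15.00}


@dataclass(frozen=True)
class SessionUsage:
    """Token counts for one session, split the way the summary reports them."""
    input_tokens: int
    cached_tokens: int
    output_tokens: int


def session_cost_usd(usage: SessionUsage, prices: dict = PRICES_PER_MILLION) -> float:
    """Return the USD cost of one session under the given price table."""
    total = (
        usage.input_tokens * prices["input"]
        + usage.cached_tokens * prices["cached"]
        + usage.output_tokens * prices["output"]
    )
    return total / 1_000_000


usage = SessionUsage(input_tokens=1_000, cached_tokens=2_000_000, output_tokens=10_000)
print(f"Session cost (USD): {session_cost_usd(usage):.4f}")
```

Cached-input tokens are priced separately here because they usually dominate long agent sessions while costing far less per token than fresh input.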

Example Output

$ ai-agents-metrics show

Codex Metrics Summary

Operational summary:
Closed goals:                    8
Successes:                       8
Fails:                           0
Total attempts:                  8
Success Rate:                    100.00%
Attempts per Closed Goal:        1.00

Known total cost (USD):          9.27
Known total tokens:              26,337,605
  input:                         260
  cached:                        26,088,225
  output:                        44,883

Known Cost per Success (USD):    1.32
Known Cost per Success (Tokens): 3,762,515

Model coverage: 7/8 closed goals with an unambiguous model
By model:
  claude-sonnet-4-6: 7 closed, 7 successes, 0 fails

Closed entries:     8
Entry successes:    8
Entry fails:        0
Entry Success Rate: 100.00%
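
The derived figures in the sample can be reproduced by hand. The sketch below is an inference from the numbers, not documented behavior: the per-success values only match when the totals are divided by 7 rather than 8, which suggests (as an assumption) that one closed goal has no known cost and is left out of the denominator.

```python
# Values copied from the sample summary above.
closed_goals = 8
successes = 8
total_attempts = 8
known_cost_usd = 9.27
known_tokens = 26_337_605

# Assumption: 7 of the 8 successes have known cost. This is inferred from
# the sample figures, not taken from the tool's documentation.
successes_with_known_cost = 7

success_rate = successes / closed_goals * 100
attempts_per_closed_goal = total_attempts / closed_goals
cost_per_success_usd = known_cost_usd / successes_with_known_cost
tokens_per_success = known_tokens / successes_with_known_cost

print(f"Success Rate:             {success_rate:.2f}%")
print(f"Attempts per Closed Goal: {attempts_per_closed_goal:.2f}")
print(f"Cost per Success (USD):   {cost_per_success_usd:.2f}")
print(f"Cost per Success (Tokens): {tokens_per_success:,.0f}")
```

An Attempts per Closed Goal value above 1.00 is the simplest signal of retry pressure: it means at least one goal needed a correction pass.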

Install

From a local checkout of the repository:

python -m pip install -e .

Or install the standalone binary:

make package-standalone
./dist/standalone/codex-metrics install-self

Quick Start

Bootstrap a project:

ai-agents-metrics bootstrap

Start tracking a goal:

ai-agents-metrics start-task --title "implement login endpoint" --task-type product

Record another attempt if the agent needed a correction:

ai-agents-metrics continue-task --task-id 2026-04-08-001 --failure-reason wrong_scope

Close it when done:

ai-agents-metrics finish-task --task-id 2026-04-08-001 --outcome success --result-fit exact_fit

Show current metrics:

ai-agents-metrics show
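
If you want to drive these steps from a script, the same commands can be assembled in Python. This is a sketch under stated assumptions: the helper names (start_task, continue_task, finish_task, run) are hypothetical, only the subcommands and flags shown above are used, and the task ID is assumed to be known to the caller. By default nothing is executed; each command is only printed.

```python
import shlex
import shutil
import subprocess

CLI = "ai-agents-metrics"


def start_task(title: str, task_type: str) -> list:
    """Build the argv for starting a goal."""
    return [CLI, "start-task", "--title", title, "--task-type", task_type]


def continue_task(task_id: str, failure_reason: str) -> list:
    """Build the argv for recording another attempt on an existing task."""
    return [CLI, "continue-task", "--task-id", task_id, "--failure-reason", failure_reason]


def finish_task(task_id: str, outcome: str, result_fit: str) -> list:
    """Build the argv for closing a task."""
    return [CLI, "finish-task", "--task-id", task_id, "--outcome", outcome, "--result-fit", result_fit]


def run(argv: list, dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False and the CLI is on PATH."""
    print(shlex.join(argv))
    if not dry_run and shutil.which(argv[0]):
        subprocess.run(argv, check=True)


# The task ID below is the example value from the Quick Start, not a real task.
task_id = "2026-04-08-001"
for argv in (
    start_task("implement login endpoint", "product"),
    continue_task(task_id, "wrong_scope"),
    finish_task(task_id, "success", "exact_fit"),
):
    run(argv)
```

Passing arguments as a list rather than a single shell string avoids quoting problems with titles that contain spaces.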

Verify Your Install

make verify

This runs lint, a security scan, type checks, tests, and the public boundary check.

Public Boundary

This repository contains the public-safe core only. Private retrospectives, internal audits, and local metrics history are kept in a separate private overlay. The boundary is enforced automatically:

make verify-public-boundary

Repository

github.com/sg4tech/codex-metrics-public

Contributing

Read CONTRIBUTING.md. In short: keep changes public-safe, run make verify, and include tests for behavior changes.

Security

See SECURITY.md for how to report potential private-data leaks or security issues.

Changelog

Notable public changes are tracked in CHANGELOG.md.
