
AgentFluent

Local-first agent analytics with behavior-to-improvement diagnostics. The tools that exist tell you what your agent did — AgentFluent tells you how to make it better.


AI agents are in production at 57% of organizations, and quality is the single top barrier to deployment. When an agent misbehaves — wrong tool choice, retry loops, hallucinated outputs — developers iterate on prompts blind. Existing observability platforms show what happened: traces, latency, token counts. They don't tell you why the agent misbehaved or what in its configuration to change.

AgentFluent reads your local Claude Code and Claude Agent SDK session JSONL, extracts agent invocations and tool patterns, scores each agent's configuration against a best-practice rubric, and correlates observed behavior back to specific fixes — a prompt gap, a missing tool constraint, or a stale model selection. No cloud services, no API keys, no data leaves your machine.

Born from CodeFluent research that identified the agent-quality gap in 2026. See docs/AGENT_ANALYTICS_RESEARCH.md for additional market analysis.

How It Compares

The agent observability space is crowded — several tools capture what agents do. None diagnose why they misbehave or what to change from locally-persisted session data. In the table below, "What's missing" is what the tool does not do (not what it provides):

| Tool | What it measures | What's missing |
|---|---|---|
| Langfuse / LangSmith / Arize Phoenix | Production traces, latency, token counts, errors | Behavior-to-prompt diagnosis; local agent config audit |
| Braintrust / Galileo / DeepEval | LLM-as-judge scoring against rubrics | Local, instrumentation-free analysis (requires cloud instrumentation and author-provided test sets); no local agent config audit |
| ccusage / claude-code-analytics / agents-observe | Usage stats, token counts, subagent trees | Quality scoring; actionable config recommendations |
| claude-code-otel | OpenTelemetry export of Claude Code sessions | Analysis itself — it's a bridge to other tools |
| Anthropic Console | Per-request cost, rate-limit tracking | Session-level diagnostics; agent config recommendations |

Where AgentFluent fits. AgentFluent reads the session JSONL your agent already produced, scores each agent's configuration against a best-practice rubric, and correlates observed behavior back to the specific config line that most likely explains it. It complements the tools above rather than replacing them — use Langfuse/Phoenix for production traces, Braintrust for test-set evals, ccusage for usage dashboards, and AgentFluent to decide what to change in the agent's config. The question "my Agent SDK agent ran 500 sessions last week — were any of them actually good, and how do I update my agent's configuration to make it better?" has no answer among the tools above. AgentFluent is built to answer it.

Why This Is Different

  • Research-grounded. Every diagnostic maps to a specific gap in the agent's prompt, tool list, or model selection — not vibes. See the research doc for the feasibility and positioning analysis.
  • Behavior-to-improvement, not just traces. When the agent retries Bash 40% of the time, AgentFluent tells you which prompt clause is missing — not just that the retry happened.
  • The config is the agent. In interactive sessions, the human course-corrects. In programmatic agents, the prompt and tool setup are the agent — a flaw compounds at scale. AgentFluent scores description, tools (allowed_tools / disallowedTools), model, and prompt on every agent definition, and audits MCP server configuration (configured-but-unused, observed-but-missing) against real tool usage. Hook coverage and cross-agent pattern detection are on the roadmap.
  • Local-first and private. All analysis runs on your machine. Zero outbound network calls. No API key required.
  • CLI-native. agentfluent analyze --format json | jq ... — fits agent developer workflows (terminal, CI/CD, PR checks) without a web dashboard dependency.
  • JSON output envelope is a contract. A stable {version, command, data} schema lets you build PR gates, trend dashboards, and regression detectors on top without tracking AgentFluent's internal refactors.
  • Correct cost accounting. Distinguishes pay-per-token API rate from subscription plan flat cost, with per-model pricing that AgentFluent actively maintains (#80 will add per-session historical pricing).
  • CodeFluent sibling. Shares the JSONL parsing heritage but asks a different question. CodeFluent scores human AI fluency in interactive sessions; AgentFluent scores agent quality and tells you what configuration to change. Not forked — two products with a common data source.

AgentFluent vs CodeFluent

Both read ~/.claude/projects/ session JSONL. They answer different questions:

| | CodeFluent | AgentFluent |
|---|---|---|
| Unit of analysis | Conversations in interactive sessions, plus the supporting .claude/ config (CLAUDE.md, rules, hooks, commands) | Agent definitions + their observed behavior |
| Scoring target | Developer's AI collaboration fluency and project-config maturity | Agent's prompt, tools, model, hooks |
| Feedback loop | Coaches the human to interact with Claude Code better | Tells the developer what config to change |
| Delivery | VS Code extension + web app | CLI-first (dashboard deferred) |
| API calls | Anthropic API for LLM-as-judge scoring | None — fully local |

If you write your own prompts each session, use CodeFluent. If your prompts live in ClaudeAgentOptions, AgentDefinition, or .claude/agents/*.md files, use AgentFluent.

Screenshots

Execution Analytics — agentfluent analyze --project <name>

Execution Analytics: token usage, per-model cost, tool frequency, and Agent Invocations tables

Behavior Diagnostics — agentfluent analyze --project <name> --diagnostics

Behavior Diagnostics: aggregated Recommendations table with Count column and built-in-aware action text

Suggested Subagents with copy-paste-ready YAML draft — agentfluent analyze --project <name> --diagnostics --verbose

Suggested Subagents: medium-confidence cluster + YAML subagent definition ready to save as ~/.claude/agents/<name>.md

Config Assessment — agentfluent config-check

Config Assessment: per-agent 0-100 scoring across description, tools, model, prompt dimensions with recommendations

Screenshots are regenerated from real session data via scripts/generate_readme_screenshots.py.

Getting Started

Prerequisites

  • Python 3.12 or newer. Check with python --version.
  • Claude Code or Agent SDK session data. Generated automatically at ~/.claude/projects/ whenever you use Claude Code or run an Agent SDK script — nothing to configure.
  • Platforms: Linux, macOS, Windows. Pure-Python package; the path handling resolves ~/.claude/ on every platform.

Install

# Preferred — isolated tool install via uv (https://docs.astral.sh/uv/)
uv tool install agentfluent

# Fallback — pip into a venv of your choice
pip install agentfluent

# Zero-install one-shot
uvx agentfluent list

First run

# Discover which projects have session data
agentfluent list

# Analyze agent behavior + cost in a specific project
agentfluent analyze --project myproject

# Score your agent definitions against the config rubric
agentfluent config-check

Commands

agentfluent list — discover projects and sessions

agentfluent list                                     # All projects
agentfluent list --project codefluent                # Sessions in one project
agentfluent list --format json | jq '.data.projects[].name'

Lists every Claude Code / Agent SDK project found under ~/.claude/projects/, with session counts, total size, and last-modified timestamp. Pass --project to drill into one project and list its individual session files.

agentfluent analyze — token, cost, and behavior metrics

agentfluent analyze --project codefluent                    # Full project analysis
agentfluent analyze --project codefluent --agent pm         # Filter to one subagent
agentfluent analyze --project codefluent --latest 5         # Last 5 sessions only
agentfluent analyze --project codefluent --diagnostics      # Show behavior diagnostics
agentfluent analyze --project codefluent --diagnostics -v   # + YAML subagent drafts
agentfluent analyze --project codefluent --format json | jq '.data.token_metrics.total_cost'

# Save the top-confidence cluster as a real subagent definition:
agentfluent analyze --project codefluent --diagnostics --format json \
  | jq -r '.data.diagnostics.delegation_suggestions[0].yaml_draft' \
  > ~/.claude/agents/new-agent.md

Produces a token-usage table, per-model cost breakdown (labeled as API rate — subscription plans differ), tool usage concentration, and an Agent Invocations table summarizing each subagent's tokens, duration, and tool-use count. --diagnostics adds the full v0.3 signal set:

  • Metadata-level (from invocation summaries): tool-error keywords, token-per-tool-use outliers, duration outliers.
  • Trace-level (from ~/.claude/projects/<session>/subagents/): retry loops, stuck patterns, permission failures, consecutive tool-error sequences — each with per-tool-call evidence.
  • Aggregate: model mismatch (complexity class wrong for declared/observed model), delegation clustering (recurring general-purpose patterns → proposed specialized subagents), MCP server audit (configured-but-unused, observed-but-missing).

Near-duplicate recommendations are aggregated per (agent, target, signal) shape into one row with an occurrence Count and metric range (e.g. "4 invocations (4.9x–8.0x above 5,064 mean). Consider adding more specific instructions..."). Each recommendation carries a specific config surface to change (prompt, tools, model, mcp) and a pointer to the file to edit. Recommendations for built-in agents (Explore, general-purpose, Plan, etc.) use concern-specific action text — wrapper subagent for scope issues, retry bounds on the delegating agent for recovery issues, reroute for tools/model — since built-in agents have no user-editable prompt or tool config.
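
To script against the diagnostics output, go through the JSON envelope. A hedged sketch: the delegation_suggestions path shown earlier is confirmed, but the field names for the aggregated recommendation rows below are assumptions, so inspect your own --format json output before building on them:

# Field paths under .data.diagnostics (other than delegation_suggestions) are
# assumptions here; check the actual `--format json` output for the schema.
agentfluent analyze --project codefluent --diagnostics --format json \
  | jq '.data.diagnostics.recommendations[]? | select(.severity == "WARNING")'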

Cost numbers reflect current per-token pricing; historical sessions are priced at today's rates until #80 (time-series pricing) lands.

agentfluent config-check — score agent definitions

agentfluent config-check                          # All user + project agents
agentfluent config-check --scope user             # Only ~/.claude/agents/
agentfluent config-check --agent pm --verbose     # One agent with detailed recs
agentfluent config-check --format json | jq '.data.scores[] | select(.overall_score < 60)'

Walks ~/.claude/agents/*.md and ./.claude/agents/*.md, parses each agent's YAML frontmatter and body, and scores against a 4-dimension rubric (description trigger quality, tool access appropriateness, model selection, prompt completeness). Outputs a score per agent plus ranked recommendations — e.g. "Prompt body doesn't mention error handling."
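
To watch the rubric react, score a deliberately thin agent. A sketch with a hypothetical agent; the frontmatter keys (name, description, tools, model) follow Claude Code's standard subagent format:

mkdir -p ~/.claude/agents
cat > ~/.claude/agents/changelog-writer.md <<'EOF'
---
name: changelog-writer
description: Drafts changelog entries from merged PRs. Use after a release branch is cut.
tools: Read, Grep, Glob
model: haiku
---
You draft changelog entries from merged PR titles and bodies.
Group entries by type (feat/fix/chore), one line each.
If a title is ambiguous, read the PR body before classifying it.
EOF

agentfluent config-check --agent changelog-writer --verbose

The prompt body above deliberately omits error handling, so expect a recommendation along the lines of the example quoted above.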

Configuration

AgentFluent's "configuration" is CLI flags — no config file, and the only environment variable it reads is $CLAUDE_CONFIG_DIR. Sensible defaults keep most invocations flagless.

| Flag | Default | What it controls |
|---|---|---|
| --project | (required on analyze) | Filter to a specific project slug or display name |
| --scope | all | config-check scope: user, project, or all |
| --agent | (none) | Filter analyze or config-check to one subagent type |
| --latest N | (all sessions) | analyze only the N most recent sessions |
| --session | (all) | analyze a specific session filename within the project |
| --diagnostics | off | analyze: show behavior-correlation signals |
| --min-cluster-size | 5 | Delegation clustering: minimum invocations per cluster (requires agentfluent[clustering]) |
| --min-similarity | 0.7 | Delegation dedup: cosine-similarity threshold against existing agents |
| --claude-config-dir | ~/.claude/ | Override the Claude config root (also honors $CLAUDE_CONFIG_DIR) |
| --format | table | Output format: table (Rich) or json (envelope) |
| --verbose | off | Extra detail: per-session breakdown, per-invocation detail, raw (un-aggregated) recommendations, and YAML subagent drafts for suggested clusters |
| --quiet | off | Suppress non-essential output (useful in CI) |
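
Flags compose. For example:

# Diagnostics over the 10 most recent sessions, with a smaller cluster floor,
# machine-readable and CI-quiet:
agentfluent analyze --project codefluent --latest 10 --diagnostics \
  --min-cluster-size 3 --format json --quiet

# The clustering flags need the optional extra:
uv tool install 'agentfluent[clustering]'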

Output formats

Default (table): Rich-rendered tables in the terminal, designed to be readable at a glance. Colors auto-adapt to terminal theme.

JSON envelope (--format json): Stable schema {version, command, data} intended as a contract — pipe to jq, integrate with CI, build regression gates on top. Example:

{
  "version": 1,
  "command": "analyze",
  "data": {
    "token_metrics": { "total_cost": 15.42, "total_tokens": 82940115, ... },
    "by_model": { "claude-opus-4-7": {...}, "claude-sonnet-4-6": {...} },
    "tool_usage": [...],
    "agent_invocations": [...]
  }
}

No ANSI escapes in JSON output, guaranteed. The key total_cost is the pay-per-token equivalent; subscribers on Pro/Max/Team/Enterprise plans see a flat monthly charge regardless.
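
Because the schema is a contract, a PR gate needs only a few lines of shell. A minimal sketch: the .data.scores[].overall_score path mirrors the config-check example above, while the exit-code convention is this script's own, not AgentFluent's:

# Fail CI if any agent definition scores below 60 on the config rubric.
low=$(agentfluent config-check --format json \
  | jq '[.data.scores[] | select(.overall_score < 60)] | length')
if [ "$low" -gt 0 ]; then
  echo "config-check: $low agent(s) scored below 60" >&2
  exit 1
fi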

How It Works

flowchart LR
    subgraph Local["Local filesystem — nothing leaves this boundary"]
        S["Session JSONL<br/>~/.claude/projects/"]
        ST["Subagent traces<br/>&lt;session&gt;/subagents/"]
        A["Agent definitions<br/>~/.claude/agents/"]
        M["MCP config<br/>~/.claude.json<br/>.mcp.json"]
    end

    S --> P[Parser]
    ST --> TP[Trace Parser<br/>+ Linker]
    P --> X[Agent Extractor]
    P --> TM[Token &amp; Cost<br/>Metrics]
    P --> TU[Tool Usage<br/>Patterns]
    TP --> X
    A --> CS[Config Scanner]
    CS --> SC[Config Scorer]
    M --> MD[MCP Discovery]

    X --> DX[Delegation<br/>Clustering]
    X --> MR[Model-Routing<br/>Analysis]
    X --> SIG[Signal Extraction<br/>metadata + trace]
    SIG --> COR[Correlator]
    MR --> COR
    DX --> COR
    MD --> COR
    SC --> COR

    COR --> OUT["Rich tables<br/>or JSON envelope"]
    TM --> OUT
    TU --> OUT
    SC --> OUT

Step by step:

  1. Parse JSONL — core/parser.py reads each session file into typed SessionMessage objects. Handles streaming snapshot deduplication, plain-string vs. array content shapes, and Claude Code's real toolUseResult format (see CLAUDE.md for the format spec).
  2. Parse subagent traces — traces/parser.py reads per-session subagent files under <session>/subagents/agent-<agentId>.jsonl and reconstructs the internal tool-call sequence with is_error flags. traces/linker.py attaches each trace back to its parent invocation via agentId. traces/retry.py detects retry sequences within a trace.
  3. Discover projects and sessions — core/discovery.py enumerates ~/.claude/projects/ and surfaces friendly display names.
  4. Extract agent invocations — agents/extractor.py walks messages, pairs Agent tool_use blocks with their tool_result content blocks, and pulls per-invocation metadata (tokens, duration, tool-use count) from the containing user message's toolUseResult sibling.
  5. Compute token and cost metrics — analytics/tokens.py aggregates usage per model with <synthetic> sentinel filtering; analytics/pricing.py applies per-token rates labeled as API rate.
  6. Score agent configurations — config/scanner.py parses YAML frontmatter from each .md in .claude/agents/ and ~/.claude/agents/; config/scoring.py scores description, tools, model, and prompt on a 4-dimension rubric.
  7. Discover MCP servers — config/mcp_discovery.py reads mcpServers from ~/.claude.json (user + project-local scopes) and .mcp.json (project-shared), honoring the enabledMcpjsonServers / disabledMcpjsonServers gating arrays. Used by the audit phase to compare against observed mcp__* tool usage.
  8. Diagnose behavior — diagnostics/ extracts metadata signals (signals.py), trace-level signals (trace_signals.py — retry loops, stuck patterns, permission failures, error sequences), model-routing mismatches (model_routing.py), and MCP audit signals (mcp_assessment.py). correlator.py routes each signal to a config target (prompt/tools/model/mcp) and emits an actionable recommendation.
  9. Propose new subagents — diagnostics/delegation.py clusters recurring general-purpose invocations via TF-IDF + KMeans and drafts candidate subagent definitions with name, model, tool list, and prompt scaffold. Under --verbose, each draft is emitted as a copy-paste-ready YAML frontmatter block. Deduped against existing agents by cosine similarity.
  10. Render — cli/formatters/table.py emits Rich tables; cli/formatters/json.py emits the stable JSON envelope. Format is selected by --format.

Everything runs locally. No outbound network calls, ever. No API key needed.
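
To see the raw material steps 1 and 4 work from, you can pry into a session file directly. A hedged sketch: the exact message envelope varies across Claude Code versions, so treat the field paths as illustrative rather than a format spec (CLAUDE.md holds the authoritative one):

# List subagent delegations in one session: tool_use blocks named "Agent"
# carry a subagent_type input (step 4). Field paths are illustrative only.
jq -c '.message.content[]? | select(.type? == "tool_use" and .name? == "Agent")
       | {agent: .input.subagent_type}' \
  ~/.claude/projects/<slug>/<session>.jsonl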

Features

  • Project and Session Discovery — Enumerates ~/.claude/projects/, groups sessions by project, shows per-project session count, total size, and last-modified timestamp. Handles Claude Code subagent sidechain files and Agent SDK sessions uniformly.
  • Execution Analytics — Token usage, API-rate cost, cache efficiency, per-model breakdown, tool-call concentration, and per-agent invocation metrics (tokens, duration, tool-use count). Cache creation and cache read tokens are tracked separately so you can see where your prompt caching is working.
  • Agent Config Assessment — 4-dimension rubric (description, tools, model, prompt) applied to every .md file in ~/.claude/agents/ and ./.claude/agents/. Produces a 0–100 score plus ranked, specific recommendations ("Prompt body doesn't mention error handling"). Catches agents that are technically valid but miss well-known best practices.
  • Subagent Trace Parsing — Parses the internal tool-call sequences Claude Code emits under ~/.claude/projects/<session>/subagents/agent-<agentId>.jsonl, links them back to the delegating invocation, and detects retry sequences. Gives diagnostics per-call evidence (which tool, which attempt, which error) instead of just an invocation-level summary.
  • Behavior Diagnostics--diagnostics emits signals across three layers. Metadata: tool-error keywords, token-per-tool-use outliers, duration outliers. Trace-level: retry loops, stuck patterns (same call repeated with no progress), permission failures, consecutive tool-error sequences. Aggregate: model mismatch (declared/observed model wrong for the workload's complexity), MCP server audit (configured-but-unused, observed-but-missing). Near-duplicate recommendations collapse into one row per (agent, target, signal) shape with an occurrence Count and metric range, sorted severity-desc then count-desc so the highest-impact findings surface first. Recommendations for built-in agents (Explore, general-purpose, Plan, code-reviewer, etc.) use concern-specific action text since built-ins have no user-editable config. Each signal routes to a target config surface — prompt, tools, model, or mcp — and the recommendation names the file to edit and the specific change to make.
  • Delegation Clustering — TF-IDF + KMeans on recurring general-purpose invocations surfaces patterns that would benefit from their own specialized subagent. Proposes a complete draft: name, description, recommended model (with cost reasoning), tool list derived from the cluster's trace data, and a prompt-body scaffold. Under --verbose, each cluster emits a copy-paste-ready YAML subagent definition block (frontmatter + prompt body) that can be saved directly as ~/.claude/agents/<name>.md. Low-confidence clusters are kept but prefixed with a REVIEW BEFORE USE comment so loose groupings don't land in production blindly. Confidence tiers (high/medium/low) are calibrated against real-world cohesion distributions from multi-contributor datasets. Suppresses drafts that overlap existing agents and annotates the overlap. Requires the optional agentfluent[clustering] extra.
  • Model-Routing Diagnostics — Per-agent-type classification of observed complexity (tool-call counts, token footprint, error rate, write-tool presence) compared against the agent's declared model tier. Flags overspec (complex model on simple workload — cost savings estimate included) and underspec (simple model struggling). Consumes trace-based model inference when frontmatter is absent.
  • MCP Server Assessment — Reads configured MCP servers from ~/.claude.json (user + project-local) and .mcp.json (project-shared), honoring per-user enable/disable gating. Compares against observed mcp__<server>__* tool usage from both parent sessions and subagent traces. Emits MCP_UNUSED_SERVER (INFO, configured but zero calls) and MCP_MISSING_SERVER (WARNING, failing calls to an unconfigured server) signals with actionable recommendations.
  • JSON Output Envelope — Stable {version, command, data} schema. No ANSI escapes. Intended as a programmatic contract for CI integration, PR gates, and regression tracking.
  • Quiet and Verbose Modes--quiet for CI-friendly one-line summaries; --verbose for per-session breakdown and per-invocation detail tables. Defaults target interactive humans.

Privacy and Security

AgentFluent is designed so data stays on your machine. The attack surface is small by construction — no web server, no HTML rendering, no webview, no outbound network calls — but this table summarizes the layers that protect it anyway:

| Layer | Mechanism | Protects against |
|---|---|---|
| Zero network calls | No outbound connections — all analysis is local | Data exfiltration |
| Path handling | All paths resolved within ~/.claude/ | Path traversal |
| Input validation | Pydantic models with strict type constraints | Malformed JSONL crashing the parser |
| Safe YAML loading | yaml.safe_load only | Arbitrary code execution via frontmatter |
| CI security review | Claude-powered review on every PR | New vulnerabilities |
| Automated testing | 730+ unit tests incl. security-focused cases | Regressions |

Secrets handling

Claude Code persists every tool output to ~/.claude/projects/<slug>/*.jsonl — including any .env, credentials.json, or shell rc file that Claude ever read. .gitignore does not protect against this. AgentFluent itself emits only aggregate metrics, so it cannot leak secrets that weren't already on disk — but because the tool reads that data, contributors working on AgentFluent risk re-leaking those secrets while they work.

This repo ships two Claude Code hooks in .claude/settings.json to reduce that risk:

  • PreToolUse block (.claude/hooks/block_secret_reads.py) — denies reads of .env*, .envrc, credentials.json, secrets.{yaml,yml,json}, *.pem, SSH private keys, and shell rc files. Blocks before execution, so the file's contents never enter the session transcript.
  • PostToolUse detect (.claude/hooks/detect_secrets_in_output.py) — scans tool output for sk-ant-*, sk-proj-*, ghp_*, github_pat_*, AKIA*, or AIza* patterns. If a match is found, blocks Claude from echoing or summarizing it. The raw value is already on disk at this point, so treat any caught value as compromised and rotate.

Any future AgentFluent feature that surfaces raw session content (diff viewers, prompt excerpts, recommendation snippets that quote session text) must re-apply secret-pattern redaction at the display layer — historical JSONL on users' machines may still contain pre-hook leaks.

See docs/SECURITY.md for the full policy: leak vector, defense architecture, discipline rules, historical-leak audit one-liner, user-scope deployment, and the bypass surface the hooks do not cover.
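
As a quick audit of transcripts that predate the hooks, a minimal sketch using the same token prefixes the PostToolUse hook scans for (the maintained one-liner lives in docs/SECURITY.md):

# List session files that already contain a known secret-token prefix.
grep -rlE 'sk-ant-|sk-proj-|ghp_|github_pat_|AKIA|AIza' ~/.claude/projects/ \
  || echo "no matches"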

Tech Stack

  • Python 3.12+
  • Typer + Rich — CLI framework and terminal formatting
  • Pydantic v2 — data models across module boundaries
  • PyYAML — agent definition frontmatter parsing (safe_load only)
  • pytest + pytest-cov — 730+ tests
  • mypy strict mode — full type coverage
  • ruff — linting and formatting
  • uv — package and dependency management

Project Structure

src/agentfluent/
├── cli/                 # Typer app, commands, formatters (table + JSON envelope)
├── core/                # JSONL parser, session models, project/session discovery
├── agents/              # Agent invocation extraction and AgentInvocation model
├── analytics/           # Token/cost metrics, tool patterns, model pricing
├── config/              # Agent definition scanner + scoring + MCP server discovery
├── traces/              # Subagent trace parsing, linking, and retry detection
└── diagnostics/         # Behavior signals (metadata + trace), correlation,
                         # model routing, delegation clustering, MCP audit

Full architecture and conventions are documented in CLAUDE.md.

Development

git clone https://github.com/frederick-douglas-pearce/agentfluent.git
cd agentfluent
uv sync
uv run agentfluent --help

Testing

uv run pytest -m "not integration"            # 730+ unit tests (CI default)
uv run pytest                                 # Full suite incl. integration tests against your real ~/.claude/projects/
uv run pytest --cov=agentfluent               # With coverage

Integration tests (tests/integration/) are skipped in CI because they require real session data — they pass on contributor machines with populated ~/.claude/projects/.

Lint and type check

uv run ruff check src/ tests/
uv run mypy src/agentfluent/

Both must pass cleanly before a PR merges.

CI/CD

Five GitHub Actions workflows run automatically:

  • CI (ci.yml) — Every PR: ruff, mypy strict, full unit-test suite. Must pass to merge.
  • Security Review (security-review.yml) — Claude-powered security review of code-changing PRs (markdown and image changes skip it).
  • Claude Code Review (claude-review.yml) — AI-powered PR review, triggered by the needs-review label or @claude mentions.
  • Release Please (release-please.yml) — Auto-generates release PRs with changelog and version bumps from Conventional Commits.
  • Dependabot Auto-Merge (dependabot-auto-merge.yml) — Auto-merges dependabot PRs once CI passes.

Roadmap

v0.2 (shipped):

  • Parser fix for real Claude Code toolUseResult shape (#84)
  • Cost label clarity for subscription-plan users (#76)
  • Pricing data correction + opus-4-7 + synthetic filter (#75)

v0.3 (shipped):

  • Subagent trace parser (E2) — reconstructs the full internal tool-call sequence per subagent with is_error flags and retry detection, linked back to the delegating invocation.
  • Deep diagnostics engine (E3) — trace-level signals: retry loops, stuck patterns, permission failures, consecutive tool-error sequences, each carrying per-tool-call evidence.
  • Delegation clustering (#92) — TF-IDF + KMeans over recurring general-purpose invocations; proposes complete draft subagent definitions deduped against existing agents.
  • Model-routing diagnostics (#95) — per-agent-type complexity classification vs. declared model; overspec/underspec flags with cost-savings estimates. Trace-based model inference when frontmatter is absent.
  • MCP server assessment (#100) — configured-vs-observed audit with MCP_UNUSED_SERVER and MCP_MISSING_SERVER signals.
  • Recommendation aggregation (#165) — near-duplicate rows collapse per (agent, target, signal) shape with occurrence count and metric range; raw list preserved for --verbose and JSON drill-down.
  • Built-in vs custom agent differentiation (#166) — concern-specific action text (scope / recovery / tools / model) for built-in agents that have no user-editable config; nine of ten correlation rules updated.
  • YAML subagent draft in --verbose (#168) — copy-paste-ready ~/.claude/agents/<name>.md block for each cluster; exposed as yaml_draft field in --format json for jq-pipe workflows.
  • Cluster confidence re-calibration (#167) — thresholds validated against two real datasets; MEDIUM now surfaces actionable candidates instead of everything landing in LOW.
  • Aggregated row signal-type clarity (#181) — same-(agent, target) rows that fire on different signals now name the trigger in the prefix (e.g., tool_error_sequence: vs retry_loop:) instead of looking interchangeable.
  • Unknown-agent attribution fix (#169) — invocations missing subagent_type (older skills, certain Claude Code versions) now correctly default to general-purpose instead of falling out of clustering as "unknown".
  • --claude-config-dir flag and $CLAUDE_CONFIG_DIR env var for non-default session paths (#90).
  • Empirical threshold calibration via a committed Jupyter notebook (#140).

v0.4+:

  • Parent-thread offload analysis (#189) — detect repeating tool-use patterns in the parent Claude Code thread and recommend subagent / skill candidates that move that work onto cheaper-tier models. The dominant cost lever for users who deploy agents at scale.
  • In-product glossary (#190, #191) — definitions for token types, tool names, agent types, signal types, severity / confidence levels surfaced via static markdown (Phase 1) and agentfluent explain <term> CLI subcommand (Phase 2).
  • Outlier-detection distribution recalibration (#186) — replace ratio-to-mean outlier signals with distribution-aware detection (z-score / IQR) backed by per-agent distribution analysis.
  • Time-series pricing data structure (#80) + session-timestamp-aware cost calculation (#81) + automated pricing updates (#82).
  • Agent SDK main-session MCP + tool extraction (#112).
  • Per-invocation token input/output split for more accurate cost estimates (#143).
  • Hosted documentation site (#97).
  • Prompt regression detection (agentfluent diff) across agent config versions.
  • Hook coverage in the config rubric.

Future:

  • Webapp dashboard for trend visualization
  • agentfluent diff — side-by-side comparison of behavior before/after a prompt change
  • Closed-loop self-improvement — use AgentFluent's diagnostic output as a feedback signal the agent itself consumes to propose config edits against its own past sessions
  • Agent ROI reporting — roll up cost, usage, and task-completion signals over time so a business can evaluate whether an optimized agent is worth continuing to run

Browse open issues for the full backlog.

Troubleshooting

| Problem | Solution |
|---|---|
| No projects found | Verify ~/.claude/projects/ exists and contains per-project subdirectories with .jsonl session files. Claude Code creates these automatically the first time you use it. |
| No agent invocations | Agent invocation rows require the session to actually call a subagent (Agent tool_use with a subagent_type). A session that never delegated has no agent data to analyze — this is not an error. |
| Zero tokens / dashes in Agent Invocations | On AgentFluent ≤ 0.1.0 this is the #84 parser bug — upgrade with uv tool upgrade agentfluent. |
| Python version error | AgentFluent requires Python 3.12+. Check with python --version and upgrade if needed. |
| Non-default session path | Pass --claude-config-dir /path/to/.claude or set $CLAUDE_CONFIG_DIR before invoking any command. The override applies to project discovery, agent configs, and MCP server discovery together. |
| Malformed JSON at <file>:<line> warning | A session file has a corrupted line — usually null bytes left behind when Claude Code was killed mid-write. The parser skips the line and continues; analytics are unaffected. Safe to ignore, or delete the line with sed -i '<line>d' <file> to silence the warning. |
| Stale tool install after local build | If uv tool install --from <path> agentfluent seems to reuse cached code, run uv tool uninstall agentfluent && uv cache clean agentfluent before reinstalling. |
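
For the non-default-path case, the flag and the environment variable are interchangeable (the backup path here is hypothetical):

agentfluent list --claude-config-dir /mnt/backup/.claude
CLAUDE_CONFIG_DIR=/mnt/backup/.claude agentfluent list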

Research Foundations

AgentFluent's behavior-to-improvement approach is grounded in research on agent quality, observability gaps, and production failure modes; docs/AGENT_ANALYTICS_RESEARCH.md collects the underlying analysis.

Contributing

Contributions welcome. Start by reading CONTRIBUTING.md for dev setup, conventions, and the PR checklist. The architecture overview in CLAUDE.md is the canonical reference for package layout, naming, and the JSONL format.

Branching: feature/<issue>-description for features, fix/<issue>-description for bugs. Commit messages follow Conventional Commits — release-please uses them to cut versions and write the changelog automatically.

License

MIT
