Framework-agnostic profiler for LLM agent context windows

context-profiler

The evidence layer for context engineering. Profile before you prune.

context-profiler turns raw provider requests, observability exports, and agent trajectories into evidence about how context grows, repeats, and concentrates — so you know what to compact and where it's safe to cut.

Figure: Icicle view showing token distribution across a 31-turn SWE-agent session.

Why

Compression tools (LLMLingua, /compact, Mem0) execute blindly. They don't tell you what's redundant, what's safe to remove, or what downstream references will break. context-profiler fills the missing step:

trace → profile → human review → prune/compact decision

Built for both humans and agents:

  • HTML reports — interactive timeline, icicle, persistence heatmap, tools, diff, findings
  • JSON contracts — stable issue codes and evidence for automated agent workflows
  • Trace-source agnostic — same analysis across OpenAI, Anthropic, Langfuse, and public trajectory datasets

Report Views

Icicle
Token distribution per request, with semantic color by role and a diff mode for additions and removals.

Persistence
Which content blocks survive across turns. Blue = token cost. Red = compact candidate.

Tools
Which tools dominate the context budget, with a token table and invocation details.

Findings
Issue codes grouped by diagnosis, with severity, evidence, and actionable recommendations.
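
The persistence idea can be illustrated outside the tool. The following is a minimal sketch, not context-profiler's actual algorithm: it assumes each request is a plain list of content-block strings, matches blocks by exact text, and approximates token cost with whitespace-separated words. The names block_persistence and compact_candidates are hypothetical.

```python
from collections import defaultdict


def block_persistence(requests):
    """Map each distinct content block to the request indices that contain it."""
    seen = defaultdict(list)
    for index, blocks in enumerate(requests):
        for block in set(blocks):  # count a block at most once per request
            seen[block].append(index)
    return dict(seen)


def compact_candidates(requests, min_requests=2):
    """Return (block, request_count, approx_cost) tuples, costliest first.

    Assumption: cost is the word count of the block multiplied by the
    number of requests that carry it. A real tokenizer would differ.
    """
    persistence = block_persistence(requests)
    scored = [
        (block, len(indices), len(block.split()) * len(indices))
        for block, indices in persistence.items()
        if len(indices) >= min_requests
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Three toy requests: the system prompt persists everywhere,
# one tool output is carried across two turns.
requests = [
    ["sys prompt", "ls output a b c"],
    ["sys prompt", "ls output a b c", "new"],
    ["sys prompt", "other"],
]
for block, count, cost in compact_candidates(requests):
    print(f"{block!r}: in {count} requests, approx cost {cost}")
```

Blocks that appear in many requests and carry a high cost are the ones a reviewer would look at first when deciding what to compact.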

Findings Across Public Datasets

Profiled on real multi-turn agent trajectories from public benchmarks:

Dataset    Domain            Turns  Total Tokens  Redundancy  Top Issue               Carryover
SWE-agent  Coding agent      31     27.1K         26.9%       REPEATED_CONTENT_BLOCK  231K across 20 blocks
lmcache    KV-cache traces   35     36.5K         1.4%        REPEATED_CONTENT_BLOCK  403K across 20 blocks
OpenHands  Tool-heavy agent  34     23.9K         0.2%        REPEATED_CONTENT_BLOCK  383K across 20 blocks

All examples are included in examples/ with conversion scripts and pre-converted session files.

Install

For agent/CLI use, prefer an isolated executable install:

pipx install context-profiler
# or
uv tool install context-profiler
context-profiler --version
which -a context-profiler  # ensure a stale executable is not shadowing pipx/uv

Or install from source:

git clone https://github.com/Turdot/context-profiler.git
cd context-profiler
uv tool install -e .
# for local development in this repo:
PYTHONPATH=src uv run context-profiler --version

Quick Start

Analyze a multi-turn agent session (SWE-agent trajectory included):

context-profiler analyze examples/swe_agent/session.jsonl --format openai --html report.html

Analyze a raw provider request:

context-profiler analyze request.json --format auto
context-profiler diagnose request.json --format auto --json

Analyze a Langfuse export:

context-profiler validate trace.json --format langfuse --json
context-profiler diagnose trace.json --format langfuse --json
context-profiler analyze trace.json --format langfuse --html report.html

Fetch a Langfuse trace through the public API, then analyze it:

TRACE_ID="<trace-id>"
HOST="${LANGFUSE_HOST%/}"
OUT="/tmp/langfuse-trace-${TRACE_ID}"
mkdir -p "$OUT"

curl -fsS \
  -u "$LANGFUSE_PUBLIC_KEY:$LANGFUSE_SECRET_KEY" \
  "$HOST/api/public/traces/$TRACE_ID" \
  -o "$OUT/trace.json"

curl -fsS \
  -u "$LANGFUSE_PUBLIC_KEY:$LANGFUSE_SECRET_KEY" \
  "$HOST/api/public/observations?traceId=$TRACE_ID&limit=100&page=1" \
  -o "$OUT/observations-page-1.json"

context-profiler diagnose "$OUT/trace.json" --format langfuse --json
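
The same two requests can be prepared from Python. This is a minimal sketch using only the standard library; it mirrors the endpoints and query parameters of the curl calls above, and the helper names (langfuse_urls, basic_auth_header, fetch_json) are hypothetical. Note that the curl example requests only page 1 with a limit of 100, so a trace with more observations would presumably need further pages.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def langfuse_urls(host, trace_id, page=1, limit=100):
    """Build the trace URL and one observations-page URL for a trace id."""
    base = host.rstrip("/")  # like ${LANGFUSE_HOST%/}, but strips every trailing slash
    trace_url = f"{base}/api/public/traces/{trace_id}"
    query = urlencode({"traceId": trace_id, "limit": limit, "page": page})
    return trace_url, f"{base}/api/public/observations?{query}"


def basic_auth_header(public_key, secret_key):
    """HTTP Basic auth header, equivalent to curl -u public:secret."""
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def fetch_json(url, public_key, secret_key):
    """Fetch one URL and decode the JSON body (performs a network call)."""
    request = Request(url, headers=basic_auth_header(public_key, secret_key))
    with urlopen(request) as response:
        return json.load(response)


# Placeholder host and trace id, for illustration only.
trace_url, observations_url = langfuse_urls("https://langfuse.example.com/", "abc")
print(trace_url)
print(observations_url)
```

Saving the result of fetch_json(trace_url, ...) to a file gives the same trace.json that the diagnose command above consumes.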

Analyze a public academic agent trajectory:

context-profiler diagnose examples/agent-trace/sample.json --format agent-trace --json
context-profiler analyze examples/agent-trace/sample.json --format agent-trace --html report.html

Generate an interactive report:

context-profiler analyze session.jsonl --html report.html
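
For scripted use, the same commands can be driven from Python. The following is a minimal sketch, assuming the context-profiler executable is on PATH and that --json output arrives on stdout as a single JSON document (the README does not state this explicitly). The helper names diagnose_command and run_diagnose are hypothetical; only flags shown in the commands above are used.

```python
import json
import subprocess


def diagnose_command(path, fmt="auto"):
    """Build the argv for a JSON diagnosis, using only documented flags."""
    return ["context-profiler", "diagnose", path, "--format", fmt, "--json"]


def run_diagnose(path, fmt="auto"):
    """Run the CLI and parse its output.

    Assumption: the JSON document is written to stdout. The exit code is
    not checked here because a failed run may still emit a JSON payload.
    """
    proc = subprocess.run(
        diagnose_command(path, fmt),
        capture_output=True,
        text=True,
        check=False,
    )
    return json.loads(proc.stdout)


print(" ".join(diagnose_command("trace.json", "langfuse")))
```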

CLI Output

The terminal report gives a quick read on context budget, repeated content, and tool hotspots before you open the HTML report.

╭──────────────────────────────────────────────────────────────────────────────╮
│ context-profiler  |  mode: snapshot  |  source:                              │
│ tests/fixtures/repeated_tool_calls.json                                      │
╰──────────────────────────────────────────────────────────────────────────────╯

⚠ Warnings
  • Content duplication: 476 redundant tokens (60.2% of total)

Token Distribution
  Category                  Tokens    % of Total
  Total Input                  791          100%
    System Prompt               13          1.6%
    Tool Definitions            83         10.5%
    Messages (assistant)       609         77.0%
    Messages (tool)             70          8.8%
    Messages (user)             16          2.0%

  Top Tools by Token Usage
      generate_canvas_component    595    75.2%
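
The percentages in this report are plain shares of Total Input. The short sketch below recomputes them from the token counts shown above; share_of_total is a hypothetical helper, and the one-decimal rounding is an assumption that happens to match the printed values.

```python
def share_of_total(tokens, total):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * tokens / total, 1)


# Token counts copied from the sample terminal report.
categories = {
    "System Prompt": 13,
    "Tool Definitions": 83,
    "Messages (assistant)": 609,
    "Messages (tool)": 70,
    "Messages (user)": 16,
}
total = sum(categories.values())  # the five categories add up to Total Input

shares = {name: share_of_total(tokens, total) for name, tokens in categories.items()}
duplication = share_of_total(476, total)  # redundant tokens from the warning line
top_tool = share_of_total(595, total)     # generate_canvas_component

print(total, shares, duplication, top_tool)
```

The top-tool figure is also measured against Total Input rather than added to the category rows, which is why it can sit at 75.2% alongside them.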

Agent-Friendly CLI Harness

context-profiler is strict about supported formats but helpful when input does not match. Agents can discover contracts and adapt unsupported traces without asking users to reshape data manually.

# Discover supported formats
context-profiler formats list --json
context-profiler formats describe cursor-jsonl --json

# Discover canonical contracts
context-profiler schema trace --json
context-profiler schema diagnosis --json

# Validate and normalize
context-profiler validate trace.json --format auto --json
context-profiler normalize trace.json --from auto --json

# Diagnose for agent consumption; '-' reads JSON/JSONL from stdin
context-profiler diagnose trace.json --format auto --json

If validation fails, the JSON response includes errors[].agent_action and next_steps so the agent can convert the trace into ContextTrace.
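
An agent might consume such a failure response as follows. This is a minimal sketch: the README names only the errors[].agent_action and next_steps fields, so the surrounding payload shape and every value in SAMPLE_RESPONSE are illustrative assumptions, and plan_recovery is a hypothetical helper.

```python
import json

# Illustrative payload only: the field names come from the README,
# the values are placeholders invented for this example.
SAMPLE_RESPONSE = """
{
  "errors": [
    {"agent_action": "placeholder: convert the input to ContextTrace"}
  ],
  "next_steps": ["placeholder: inspect the trace schema"]
}
"""


def plan_recovery(response_text):
    """Collect the agent-facing hints from a validation failure response."""
    response = json.loads(response_text)
    actions = [
        error["agent_action"]
        for error in response.get("errors", [])
        if "agent_action" in error
    ]
    return {"actions": actions, "next_steps": response.get("next_steps", [])}


plan = plan_recovery(SAMPLE_RESPONSE)
print(plan["actions"])
print(plan["next_steps"])
```

Using .get with defaults keeps the helper tolerant of responses that omit either field.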

Agent Skill Distribution

This repository ships an analyze-agent-context skill for Cursor, Claude Code, and other Agent Skills / Open Plugins compatible tools.

The skill does not make context-profiler fetch traces itself. It teaches agents to fetch Langfuse traces by ID through the Langfuse public API via curl, then route the fetched JSON into context-profiler for diagnosis whenever the user asks to analyze a trace, loop, transcript, agent run, context growth, stale context, or tool bloat. It intentionally avoids langfuse-cli for trace fetching because the CLI may omit fields needed for complete analysis.

Canonical skill:

skills/analyze-agent-context/SKILL.md

Plugin manifests:

.plugin/plugin.json
.claude-plugin/marketplace.json

Supported Inputs

Use context-profiler formats list --json for the current machine-readable registry.

Kind                  Formats                                             Confidence
Provider request      OpenAI, Anthropic                                   exact
Observability trace   Langfuse, planned OTel/OpenInference                high
Agent transcript      Cursor JSONL, Claude Code JSONL                     partial
Benchmark trajectory  planned agent-trace, agent_trajectories, SWE-agent  dataset-dependent

For agent transcripts, analysis is intentionally marked partial: hidden system prompts, rules, tool definitions, MCP schemas, and provider-side compaction may be missing from the transcript, so the profile covers only the visible context.

Example Diagnosis

{
  "issues": [
    {
      "code": "TOOL_USE_DOMINATES_CONTEXT",
      "severity": "critical",
      "message": "Tool inputs dominate the visible context."
    },
    {
      "code": "TOP_TOOL_CONTEXT_HOTSPOT",
      "message": "ApplyPatch is the largest visible tool context hotspot."
    }
  ],
  "diff_hints": [
    {
      "type": "large_addition",
      "request_index": 76,
      "evidence": {
        "added_tokens": 7473,
        "top_added_tool": "ApplyPatch"
      }
    }
  ]
}
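
A downstream agent could triage this payload with a few lines of Python. The sketch below embeds the example above verbatim and relies only on the fields it shows. Since the second issue carries no severity, missing values are grouped under "unspecified", which is a choice made for this example rather than documented behavior. The helper names are hypothetical.

```python
import json

DIAGNOSIS = json.loads("""
{
  "issues": [
    {
      "code": "TOOL_USE_DOMINATES_CONTEXT",
      "severity": "critical",
      "message": "Tool inputs dominate the visible context."
    },
    {
      "code": "TOP_TOOL_CONTEXT_HOTSPOT",
      "message": "ApplyPatch is the largest visible tool context hotspot."
    }
  ],
  "diff_hints": [
    {
      "type": "large_addition",
      "request_index": 76,
      "evidence": {"added_tokens": 7473, "top_added_tool": "ApplyPatch"}
    }
  ]
}
""")


def issues_by_severity(diagnosis):
    """Group issue codes by severity, tolerating issues without one."""
    grouped = {}
    for issue in diagnosis.get("issues", []):
        severity = issue.get("severity", "unspecified")
        grouped.setdefault(severity, []).append(issue["code"])
    return grouped


def largest_addition(diagnosis):
    """Return the large_addition hint with the most added tokens, or None."""
    hints = [
        hint for hint in diagnosis.get("diff_hints", [])
        if hint.get("type") == "large_addition"
    ]
    if not hints:
        return None
    return max(hints, key=lambda hint: hint["evidence"]["added_tokens"])


print(issues_by_severity(DIAGNOSIS))
hint = largest_addition(DIAGNOSIS)
print(hint["request_index"], hint["evidence"]["top_added_tool"])
```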

Academic trajectory sample:

context-profiler analyze examples/agent-trace/sample.json --format agent-trace --html report.html

Total Input: 11.7K
Messages (assistant): 10.7K
Tool: python_interpreter 2.9K
Warnings: Content duplication 2.3K redundant tokens

Examples

See examples/README.md for runnable fixtures and conversion patterns.

Recommended demo order:

  1. Raw OpenAI/Anthropic request.
  2. Cursor or Claude Code transcript.
  3. Langfuse trace export.
  4. Multi-turn academic trajectories such as pagarsky/agent-trace, cx-cmu/agent_trajectories, or SWE-agent traces.

Research Context

context-profiler is motivated by recent work showing that long-horizon agents are constrained not only by model quality, but also by how their working context is retained, compressed, and reused across turns.

Related work:

  • ByteDance Seed — Scaling Long-Horizon LLM Agent via Context-Folding
    Studies context management for long-horizon agents through folding and summarizing intermediate sub-trajectories. This motivates context-profiler's focus on turn-to-turn context diffs, retained observations, and compression/pruning evidence.
  • SWE-agent — Agent-Computer Interfaces Enable Automated Software Engineering
    Shows the importance of the agent-computer interface for software-engineering agents, motivating analysis of tool calls, terminal output, and artifact churn.
  • WebArena — A Realistic Web Environment for Building Autonomous Agents
    Demonstrates the value of realistic multi-step agent trajectories, motivating support for loop/transcript analysis rather than only single prompt snapshots.

What It Does Not Do

  • It does not fetch traces from Langfuse, Hugging Face, Cursor, or Claude Code.
  • It does not replay agent loops.
  • It does not execute tools.
  • It does not replace observability platforms.
  • It does not pretend agent transcripts are exact raw provider requests.

Development

PYTHONPATH=src uv run --with pytest pytest tests/test_smoke.py -v

Acknowledgements

This project is inspired by and learned from:

  • context-lens — local proxy for capturing and visualizing LLM API calls
  • ContextFlame — flamegraph-based token profiling for Claude Code
  • speedscope — the icicle / flamegraph UI design is inspired by speedscope's interactive visualization

License

MIT
