Framework-agnostic profiler for LLM agent context windows
context-profiler
The evidence layer for context engineering. Profile before you prune.
context-profiler turns raw provider requests, observability exports, and agent trajectories into evidence about how context grows, repeats, and concentrates — so you know what to compact and where it's safe to cut.
Why
Compression tools (LLMLingua, /compact, Mem0) execute blindly. They don't tell you what's redundant, what's safe to remove, or what downstream references will break. context-profiler fills the missing step:
trace → profile → human review → prune/compact decision
Built for both humans and agents:
- HTML reports — interactive timeline, icicle, persistence heatmap, tools, diff, findings
- JSON contracts — stable issue codes and evidence for automated agent workflows
- Trace-source agnostic — same analysis across OpenAI, Anthropic, Langfuse, and public trajectory datasets
Report Views
- Icicle: Token distribution per request. Semantic color by role, with a diff mode for additions and removals.
- Persistence: Which content blocks survive across turns. Blue = token cost. Red = compact candidate.
- Tools: Which tools dominate the context budget, with their invocation details.
- Findings: Issue codes with severity, evidence, and actionable recommendations.
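To make the idea of persistence concrete, here is a minimal sketch of how one might track which content blocks survive across turns. It is an illustration only, not context-profiler's implementation: the function name `block_persistence`, the exact-match hashing, and the toy `turns` data are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict


def block_persistence(turns):
    """Return {block_text: [turn indices where the block is present]}.

    `turns` is a list of requests, each a list of text blocks (hypothetical shape).
    """
    turns_by_digest = defaultdict(list)
    text_by_digest = {}
    for turn_index, blocks in enumerate(turns):
        # De-duplicate within a turn so each turn is counted once per block.
        for text in dict.fromkeys(blocks):
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            text_by_digest[digest] = text
            turns_by_digest[digest].append(turn_index)
    return {text_by_digest[d]: idx for d, idx in turns_by_digest.items()}


# Toy three-turn session (made-up data).
turns = [
    ["system: be concise", "user: list files"],
    ["system: be concise", "user: list files", "tool: a.py b.py"],
    ["system: be concise", "tool: a.py b.py", "user: open a.py"],
]
persistence = block_persistence(turns)

# Blocks present in two or more turns are the ones worth reviewing.
long_lived = {text: idx for text, idx in persistence.items() if len(idx) >= 2}
```

Exact-match hashing only catches verbatim repeats; near-duplicates would need a fuzzier comparison.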
Findings Across Public Datasets
Profiled on real multi-turn agent trajectories from public benchmarks:
| Dataset | Domain | Turns | Total Tokens | Redundancy | Top Issue | Carryover |
|---|---|---|---|---|---|---|
| SWE-agent | Coding agent | 31 | 27.1K | 26.9% | REPEATED_CONTENT_BLOCK | 231K across 20 blocks |
| lmcache | KV-cache traces | 35 | 36.5K | 1.4% | REPEATED_CONTENT_BLOCK | 403K across 20 blocks |
| OpenHands | Tool-heavy agent | 34 | 23.9K | 0.2% | REPEATED_CONTENT_BLOCK | 383K across 20 blocks |
All examples are included in examples/ with conversion scripts and pre-converted session files.
Install
For agent/CLI use, prefer an isolated executable install:
pipx install context-profiler
# or
uv tool install context-profiler
context-profiler --version
which -a context-profiler # ensure a stale executable is not shadowing pipx/uv
Or install from source:
git clone https://github.com/Turdot/context-profiler.git
cd context-profiler
uv tool install -e .
# for local development in this repo:
PYTHONPATH=src uv run context-profiler --version
Quick Start
Analyze a multi-turn agent session (SWE-agent trajectory included):
context-profiler analyze examples/swe_agent/session.jsonl --format openai --html report.html
Analyze a raw provider request:
context-profiler analyze request.json --format auto
context-profiler diagnose request.json --format auto --json
Analyze a Langfuse export:
context-profiler validate trace.json --format langfuse --json
context-profiler diagnose trace.json --format langfuse --json
context-profiler analyze trace.json --format langfuse --html report.html
Fetch a Langfuse trace through the public API, then analyze it:
TRACE_ID="<trace-id>"
HOST="${LANGFUSE_HOST%/}"
OUT="/tmp/langfuse-trace-${TRACE_ID}"
mkdir -p "$OUT"
curl -fsS \
-u "$LANGFUSE_PUBLIC_KEY:$LANGFUSE_SECRET_KEY" \
"$HOST/api/public/traces/$TRACE_ID" \
-o "$OUT/trace.json"
curl -fsS \
-u "$LANGFUSE_PUBLIC_KEY:$LANGFUSE_SECRET_KEY" \
"$HOST/api/public/observations?traceId=$TRACE_ID&limit=100&page=1" \
-o "$OUT/observations-page-1.json"
context-profiler diagnose "$OUT/trace.json" --format langfuse --json
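The curl commands above request only the first page of observations. For traces with more than 100 observations, a loop over pages may be needed. The sketch below shows one way to do that in Python. It assumes the list response carries a `data` array and a `meta.totalPages` field; check this against the Langfuse API reference before relying on it. The helper names and the fake pages are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def observations_url(host, trace_id, page, limit=100):
    """Build the same observations URL the curl example uses."""
    query = urllib.parse.urlencode({"traceId": trace_id, "limit": limit, "page": page})
    return f"{host.rstrip('/')}/api/public/observations?{query}"


def http_get_json(url, public_key, secret_key):
    """GET a URL with HTTP Basic auth (the equivalent of `curl -u public:secret`)."""
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_observations(host, trace_id, get_json, limit=100):
    """Collect observations page by page; `get_json` maps a URL to parsed JSON."""
    observations = []
    page = 1
    while True:
        body = get_json(observations_url(host, trace_id, page, limit))
        observations.extend(body.get("data", []))
        # Assumption: the response reports the page count under meta.totalPages.
        total_pages = body.get("meta", {}).get("totalPages", page)
        if page >= total_pages:
            return observations
        page += 1


# Offline demonstration with made-up pages instead of a real Langfuse host.
_fake_pages = {
    1: {"data": [{"id": "obs-1"}], "meta": {"totalPages": 2}},
    2: {"data": [{"id": "obs-2"}], "meta": {"totalPages": 2}},
}


def fake_get_json(url):
    query = urllib.parse.parse_qs(urllib.parse.urlsplit(url).query)
    return _fake_pages[int(query["page"][0])]


all_observations = fetch_all_observations(
    "https://langfuse.example/", "trace-123", fake_get_json
)
```

Against a real host, `get_json` could be `lambda url: http_get_json(url, public_key, secret_key)` with the keys read from `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY`.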
Analyze a public academic agent trajectory:
context-profiler diagnose examples/agent-trace/sample.json --format agent-trace --json
context-profiler analyze examples/agent-trace/sample.json --format agent-trace --html report.html
Generate an interactive report:
context-profiler analyze session.jsonl --html report.html
CLI Output
The terminal report gives a quick read on context budget, repeated content, and tool hotspots before you open the HTML report.
╭──────────────────────────────────────────────────────────────────────────────╮
│ context-profiler | mode: snapshot | source: │
│ tests/fixtures/repeated_tool_calls.json │
╰──────────────────────────────────────────────────────────────────────────────╯
⚠ Warnings
• Content duplication: 476 redundant tokens (60.2% of total)
Token Distribution
Category Tokens % of Total
Total Input 791 100%
System Prompt 13 1.6%
Tool Definitions 83 10.5%
Messages (assistant) 609 77.0%
Messages (tool) 70 8.8%
Messages (user) 16 2.0%
Top Tools by Token Usage
generate_canvas_component 595 75.2%
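The percentages in this report are each category's share of total input tokens. As a quick sanity check, the short sketch below recomputes them from the token counts shown above; the variable and function names are illustrative only.

```python
# Token counts copied from the terminal report above.
token_counts = {
    "System Prompt": 13,
    "Tool Definitions": 83,
    "Messages (assistant)": 609,
    "Messages (tool)": 70,
    "Messages (user)": 16,
}
total_input = sum(token_counts.values())  # 791, matching "Total Input"


def share(tokens, total):
    """Format a token count as a percentage of the total, to one decimal place."""
    return f"{100 * tokens / total:.1f}%"


shares = {name: share(count, total_input) for name, count in token_counts.items()}
redundant_share = share(476, total_input)  # duplication warning
top_tool_share = share(595, total_input)   # generate_canvas_component

for name, value in shares.items():
    print(f"{name:<22} {value:>6}")
```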
Agent-Friendly CLI Harness
context-profiler is strict about supported formats but helpful when input does not match. Agents can discover contracts and adapt unsupported traces without asking users to reshape data manually.
# Discover supported formats
context-profiler formats list --json
context-profiler formats describe cursor-jsonl --json
# Discover canonical contracts
context-profiler schema trace --json
context-profiler schema diagnosis --json
# Validate and normalize
context-profiler validate trace.json --format auto --json
context-profiler normalize trace.json --from auto --json
# Diagnose for agent consumption; '-' reads JSON/JSONL from stdin
context-profiler diagnose trace.json --format auto --json
If validation fails, the JSON response includes errors[].agent_action and next_steps so the agent can convert the trace into ContextTrace.
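An agent could turn such a failed validation response into a to-do list along these lines. This is a hedged sketch: only the field names `errors[].agent_action` and `next_steps` come from the description above. The sample payload, the `code` field inside it, and the assumption that `next_steps` is a list of strings are invented for illustration.

```python
import json

# Hypothetical payload; the real response shape may differ.
SAMPLE_RESPONSE = json.loads("""
{
  "errors": [
    {"code": "EXAMPLE_ERROR", "agent_action": "Convert messages into the ContextTrace shape."},
    {"code": "EXAMPLE_ERROR_WITHOUT_ACTION"}
  ],
  "next_steps": ["Run: context-profiler schema trace --json"]
}
""")


def repair_plan(validation_response):
    """Collect agent actions from errors, followed by the top-level next steps."""
    actions = [
        error["agent_action"]
        for error in validation_response.get("errors", [])
        if error.get("agent_action")
    ]
    next_steps = validation_response.get("next_steps", [])
    if isinstance(next_steps, str):
        # Tolerate a single string in case the field is not a list.
        next_steps = [next_steps]
    return actions + list(next_steps)


plan = repair_plan(SAMPLE_RESPONSE)
for step in plan:
    print("-", step)
```

In practice the input would be the parsed stdout of `context-profiler validate trace.json --format auto --json`.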
Agent Skill Distribution
This repository ships an analyze-agent-context skill for Cursor, Claude Code, and other Agent Skills / Open Plugins compatible tools.
The skill does not make context-profiler fetch traces itself. It teaches agents to fetch Langfuse trace ids with the Langfuse public API via curl, then route the fetched JSON into context-profiler for diagnosis whenever the user asks to analyze a trace, loop, transcript, agent run, context growth, stale context, or tool bloat. It intentionally avoids langfuse-cli for trace fetching because the CLI may omit fields needed for complete analysis.
Canonical skill:
skills/analyze-agent-context/SKILL.md
Plugin manifests:
.plugin/plugin.json
.claude-plugin/marketplace.json
Supported Inputs
Use context-profiler formats list --json for the current machine-readable registry.
| Kind | Formats | Confidence |
|---|---|---|
| Provider request | OpenAI, Anthropic | exact |
| Observability trace | Langfuse, planned OTel/OpenInference | high |
| Agent transcript | Cursor JSONL, Claude Code JSONL | partial |
| Benchmark trajectory | planned agent-trace, agent_trajectories, SWE-agent | dataset-dependent |
For agent-transcript, analysis is intentionally marked partial: hidden system prompts, rules, tool definitions, MCP schemas, and provider compaction may not be present.
Example Diagnosis
{
"issues": [
{
"code": "TOOL_USE_DOMINATES_CONTEXT",
"severity": "critical",
"message": "Tool inputs dominate the visible context."
},
{
"code": "TOP_TOOL_CONTEXT_HOTSPOT",
"message": "ApplyPatch is the largest visible tool context hotspot."
}
],
"diff_hints": [
{
"type": "large_addition",
"request_index": 76,
"evidence": {
"added_tokens": 7473,
"top_added_tool": "ApplyPatch"
}
}
]
}
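A downstream script or agent might triage this output as sketched below. The code reads only the fields visible in the example above and treats everything else as optional, since the second issue carries no `severity`. The function names are illustrative and not part of context-profiler.

```python
import json

# The example diagnosis from above, embedded so the sketch is self-contained.
DIAGNOSIS = json.loads("""
{
  "issues": [
    {"code": "TOOL_USE_DOMINATES_CONTEXT", "severity": "critical",
     "message": "Tool inputs dominate the visible context."},
    {"code": "TOP_TOOL_CONTEXT_HOTSPOT",
     "message": "ApplyPatch is the largest visible tool context hotspot."}
  ],
  "diff_hints": [
    {"type": "large_addition", "request_index": 76,
     "evidence": {"added_tokens": 7473, "top_added_tool": "ApplyPatch"}}
  ]
}
""")


def issue_codes_with_severity(diagnosis, severity):
    """Return issue codes whose severity matches; issues without one are skipped."""
    return [
        issue["code"]
        for issue in diagnosis.get("issues", [])
        if issue.get("severity") == severity
    ]


def largest_addition(diagnosis):
    """Return the large_addition hint with the most added tokens, or None."""
    additions = [
        hint for hint in diagnosis.get("diff_hints", [])
        if hint.get("type") == "large_addition"
    ]
    return max(additions, key=lambda h: h["evidence"]["added_tokens"], default=None)


critical_codes = issue_codes_with_severity(DIAGNOSIS, "critical")
hotspot = largest_addition(DIAGNOSIS)
```

Here `hotspot` points at request 76, where ApplyPatch added 7473 tokens, which is the turn to inspect first in the HTML report.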
Academic trajectory sample:
context-profiler analyze examples/agent-trace/sample.json --format agent-trace --html report.html
Total Input: 11.7K
Messages (assistant): 10.7K
Tool: python_interpreter 2.9K
Warnings: Content duplication 2.3K redundant tokens
Examples
See examples/README.md for runnable fixtures and conversion patterns.
Recommended demo order:
- Raw OpenAI/Anthropic request.
- Cursor or Claude Code transcript.
- Langfuse trace export.
- Multi-turn academic trajectories such as pagarsky/agent-trace, cx-cmu/agent_trajectories, or SWE-agent traces.
Research Context
context-profiler is motivated by recent work showing that long-horizon agents are constrained not only by model quality, but also by how their working context is retained, compressed, and reused across turns.
Related work:
- ByteDance Seed: Scaling Long-Horizon LLM Agent via Context-Folding
  Studies context management for long-horizon agents through folding and summarizing intermediate sub-trajectories. This motivates context-profiler's focus on turn-to-turn context diffs, retained observations, and compression/pruning evidence.
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
  Shows the importance of the agent-computer interface for software-engineering agents, motivating analysis of tool calls, terminal output, and artifact churn.
- WebArena: A Realistic Web Environment for Building Autonomous Agents
  Demonstrates the value of realistic multi-step agent trajectories, motivating support for loop/transcript analysis rather than only single prompt snapshots.
What It Does Not Do
- It does not fetch traces from Langfuse, Hugging Face, Cursor, or Claude Code.
- It does not replay agent loops.
- It does not execute tools.
- It does not replace observability platforms.
- It does not pretend agent transcripts are exact raw provider requests.
Development
PYTHONPATH=src uv run --with pytest pytest tests/test_smoke.py -v
Acknowledgements
This project is inspired by and learned from:
- context-lens — local proxy for capturing and visualizing LLM API calls
- ContextFlame — flamegraph-based token profiling for Claude Code
- speedscope — the icicle / flamegraph UI design is inspired by speedscope's interactive visualization