Framework-agnostic profiler for LLM agent context windows

These details have been verified by PyPI

Maintainers

YanpengWang

Project description

context-profiler

The evidence layer for context engineering. Profile before you prune.

context-profiler turns raw provider requests, observability exports, and agent trajectories into evidence about how context grows, repeats, and concentrates — so you know what to compact and where it's safe to cut.

context-profiler: Icicle view showing token distribution across a 31-turn SWE-agent session

Why

Compression tools (LLMLingua, /compact, Mem0) execute blindly. They don't tell you what's redundant, what's safe to remove, or what downstream references will break. context-profiler fills the missing step:

trace → profile → human review → prune/compact decision

Built for both humans and agents:

HTML reports — interactive timeline, icicle, persistence heatmap, tools, diff, findings
JSON contracts — stable issue codes and evidence for automated agent workflows
Trace-source agnostic — same analysis across OpenAI, Anthropic, Langfuse, and public trajectory datasets

Report Views

Icicle: Token distribution per request. Semantic color by role, diff mode for additions/removals.
Persistence: Which content blocks survive across turns. Blue = token cost. Red = compact candidate.
Tools: Which tools dominate the context budget and their invocation details.
Findings: Issue codes with severity, evidence, and actionable recommendations.
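
To make the idea behind the Persistence view concrete, here is a minimal sketch of tracking which content blocks survive across turns. It is an illustration only, not context-profiler's actual algorithm: the function names, the use of SHA-256 digests, and the list-of-strings input shape are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict


def block_persistence(turns):
    """Map each distinct content block to the turn indices in which it appears.

    `turns` is a list of requests; each request is a list of text blocks.
    (Hypothetical input shape, chosen for illustration.)
    """
    seen = defaultdict(list)
    for index, blocks in enumerate(turns):
        # Use a set so a block repeated inside one turn is counted once.
        for text in set(blocks):
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
            seen[digest].append(index)
    return dict(seen)


def repeated_blocks(turns, min_turns=2):
    """Return digests of blocks present in at least `min_turns` turns."""
    persistence = block_persistence(turns)
    return sorted(d for d, idx in persistence.items() if len(idx) >= min_turns)


turns = [
    ["system: be concise", "user: fix the bug"],
    ["system: be concise", "user: fix the bug", "tool: ls output"],
    ["system: be concise", "tool: ls output", "assistant: done"],
]
persistence = block_persistence(turns)
candidates = repeated_blocks(turns)
```

Blocks that appear in many turns are the ones a reviewer would inspect first as compaction candidates; whether they are safe to remove still needs human review.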

Findings Across Public Datasets

Profiled on real multi-turn agent trajectories from public benchmarks:

Dataset	Domain	Turns	Total Tokens	Redundancy	Top Issue	Carryover
SWE-agent	Coding agent	31	27.1K	26.9%	`REPEATED_CONTENT_BLOCK`	231K across 20 blocks
lmcache	KV-cache traces	35	36.5K	1.4%	`REPEATED_CONTENT_BLOCK`	403K across 20 blocks
OpenHands	Tool-heavy agent	34	23.9K	0.2%	`REPEATED_CONTENT_BLOCK`	383K across 20 blocks

All examples are included in examples/ with conversion scripts and pre-converted session files.

Install

For agent/CLI use, prefer an isolated executable install:

pipx install context-profiler
# or
uv tool install context-profiler
context-profiler --version
which -a context-profiler  # ensure a stale executable is not shadowing pipx/uv

Or install from source:

git clone https://github.com/Turdot/context-profiler.git
cd context-profiler
uv tool install -e .
# for local development in this repo:
PYTHONPATH=src uv run context-profiler --version

Quick Start

Analyze a multi-turn agent session (SWE-agent trajectory included):

context-profiler analyze examples/swe_agent/session.jsonl --format openai --html report.html

Analyze a raw provider request:

context-profiler analyze request.json --format auto
context-profiler diagnose request.json --format auto --json

Analyze a Langfuse export:

context-profiler validate trace.json --format langfuse --json
context-profiler diagnose trace.json --format langfuse --json
context-profiler analyze trace.json --format langfuse --html report.html

Fetch a Langfuse trace through the public API, then analyze it:

TRACE_ID="<trace-id>"
HOST="${LANGFUSE_HOST%/}"
OUT="/tmp/langfuse-trace-${TRACE_ID}"
mkdir -p "$OUT"

curl -fsS \
  -u "$LANGFUSE_PUBLIC_KEY:$LANGFUSE_SECRET_KEY" \
  "$HOST/api/public/traces/$TRACE_ID" \
  -o "$OUT/trace.json"

curl -fsS \
  -u "$LANGFUSE_PUBLIC_KEY:$LANGFUSE_SECRET_KEY" \
  "$HOST/api/public/observations?traceId=$TRACE_ID&limit=100&page=1" \
  -o "$OUT/observations-page-1.json"

context-profiler diagnose "$OUT/trace.json" --format langfuse --json
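
The curl commands above fetch only the first page of observations. If a trace has more than 100 observations, a loop along the following lines could collect the remaining pages. This is a sketch under assumptions: the response is assumed to carry its items under a `data` key, the loop stops on the first short page, and `fetch_all_observations`, `http_get_json`, and `basic_auth_header` are hypothetical helper names. How context-profiler consumes observation pages is not covered here.

```python
import base64
import json
import urllib.parse
import urllib.request


def basic_auth_header(public_key, secret_key):
    """Build the HTTP Basic auth header that `curl -u public:secret` sends."""
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def http_get_json(url, headers):
    """Fetch a URL and decode the JSON body."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_observations(host, trace_id, headers, get_json=http_get_json, limit=100):
    """Collect observations page by page until a short page is returned."""
    observations = []
    page = 1
    while True:
        query = urllib.parse.urlencode(
            {"traceId": trace_id, "limit": limit, "page": page}
        )
        url = f"{host.rstrip('/')}/api/public/observations?{query}"
        payload = get_json(url, headers)
        # Assumption: items live under "data"; a short page means the end.
        batch = payload.get("data", [])
        observations.extend(batch)
        if len(batch) < limit:
            return observations
        page += 1
```

Passing `get_json` as a parameter keeps the pagination logic testable without network access.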

Analyze a public academic agent trajectory:

context-profiler diagnose examples/agent-trace/sample.json --format agent-trace --json
context-profiler analyze examples/agent-trace/sample.json --format agent-trace --html report.html

Generate an interactive report:

context-profiler analyze session.jsonl --html report.html
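
For automated workflows, the same commands can be driven from a script. The sketch below wraps the `validate` and `diagnose` invocations shown above using only flags that appear in this README. It assumes the JSON document is written to stdout; `build_command` and `run_json` are hypothetical helper names, not part of context-profiler.

```python
import json
import shutil
import subprocess


def build_command(subcommand, path, fmt="auto"):
    """Build argv for a JSON-emitting subcommand such as validate or diagnose."""
    return ["context-profiler", subcommand, path, "--format", fmt, "--json"]


def run_json(subcommand, path, fmt="auto"):
    """Run the CLI and parse its JSON output (assumed to be on stdout)."""
    if shutil.which("context-profiler") is None:
        raise FileNotFoundError("context-profiler is not on PATH")
    completed = subprocess.run(
        build_command(subcommand, path, fmt),
        capture_output=True,
        text=True,
    )
    # The exit code is returned alongside the payload so callers can decide
    # how to treat failures; the CLI's exit-code contract is not assumed here.
    return completed.returncode, json.loads(completed.stdout)
```

A typical call would be `run_json("diagnose", "request.json")`, which runs the same command line as `context-profiler diagnose request.json --format auto --json`.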

CLI Output

The terminal report gives a quick read on context budget, repeated content, and tool hotspots before you open the HTML report.

╭──────────────────────────────────────────────────────────────────────────────╮
│ context-profiler  |  mode: snapshot  |  source:                              │
│ tests/fixtures/repeated_tool_calls.json                                      │
╰──────────────────────────────────────────────────────────────────────────────╯

⚠ Warnings
  • Content duplication: 476 redundant tokens (60.2% of total)

Token Distribution
  Category                  Tokens    % of Total
  Total Input                  791          100%
    System Prompt               13          1.6%
    Tool Definitions            83         10.5%
    Messages (assistant)       609         77.0%
    Messages (tool)             70          8.8%
    Messages (user)             16          2.0%

  Top Tools by Token Usage
      generate_canvas_component    595    75.2%
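
The percentages in the report above follow directly from the token counts. The short sketch below recomputes them from the numbers shown; the `share` helper is a hypothetical name used only for this illustration.

```python
def share(tokens, total):
    """Return tokens as a percentage of total, rounded to one decimal place."""
    return round(100 * tokens / total, 1)


total_input = 791
categories = {
    "System Prompt": 13,
    "Tool Definitions": 83,
    "Messages (assistant)": 609,
    "Messages (tool)": 70,
    "Messages (user)": 16,
}

# The category counts add up to the reported total input.
category_sum = sum(categories.values())

shares = {name: share(count, total_input) for name, count in categories.items()}
redundancy = share(476, total_input)      # redundant tokens from the warning
top_tool_share = share(595, total_input)  # generate_canvas_component
```

Here `redundancy` evaluates to 60.2 and `top_tool_share` to 75.2, matching the warning line and the Top Tools row.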

Agent-Friendly CLI Harness

context-profiler is strict about supported formats but helpful when input does not match. Agents can discover contracts and adapt unsupported traces without asking users to reshape data manually.

# Discover supported formats
context-profiler formats list --json
context-profiler formats describe cursor-jsonl --json

# Discover canonical contracts
context-profiler schema trace --json
context-profiler schema diagnosis --json

# Validate and normalize
context-profiler validate trace.json --format auto --json
context-profiler normalize trace.json --from auto --json

# Diagnose for agent consumption; '-' reads JSON/JSONL from stdin
context-profiler diagnose trace.json --format auto --json

If validation fails, the JSON response includes errors[].agent_action and next_steps so the agent can convert the trace into ContextTrace.
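
An agent could turn such a failure response into a remediation plan roughly as follows. This is a sketch: only the `errors[].agent_action` and `next_steps` field names come from the description above. The placement of `next_steps` at the top level, the `message` field, and all sample values are assumptions made for illustration.

```python
import json


def remediation_plan(response_text):
    """Extract agent actions and next steps from a validation response."""
    response = json.loads(response_text)
    actions = [
        error["agent_action"]
        for error in response.get("errors", [])
        if "agent_action" in error
    ]
    next_steps = response.get("next_steps", [])
    # Tolerate a single string as well as a list of steps.
    if isinstance(next_steps, str):
        next_steps = [next_steps]
    return {"actions": actions, "next_steps": list(next_steps)}


# Hypothetical response body; real output may contain different fields.
sample = json.dumps(
    {
        "errors": [
            {"message": "unrecognized shape", "agent_action": "convert_to_context_trace"},
            {"message": "missing messages"},
        ],
        "next_steps": ["run: context-profiler schema trace --json"],
    }
)
plan = remediation_plan(sample)
```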

Agent Skill Distribution

This repository ships an analyze-agent-context skill for Cursor, Claude Code, and other Agent Skills / Open Plugins compatible tools.

The skill does not make context-profiler fetch traces itself. Instead, it teaches agents to fetch a Langfuse trace by id through the Langfuse public API with curl, then route the fetched JSON into context-profiler for diagnosis. It applies whenever the user asks to analyze a trace, loop, transcript, agent run, context growth, stale context, or tool bloat. The skill intentionally avoids langfuse-cli for trace fetching because the CLI may omit fields needed for complete analysis.

Canonical skill:

skills/analyze-agent-context/SKILL.md

Plugin manifests:

.plugin/plugin.json
.claude-plugin/marketplace.json

Supported Inputs

Use context-profiler formats list --json for the current machine-readable registry.

Kind	Formats	Confidence
Provider request	OpenAI, Anthropic	exact
Observability trace	Langfuse, planned OTel/OpenInference	high
Agent transcript	Cursor JSONL, Claude Code JSONL	partial
Benchmark trajectory	planned agent-trace, agent_trajectories, SWE-agent	dataset-dependent

For agent transcripts, analysis is intentionally marked partial, because hidden system prompts, rules, tool definitions, MCP schemas, and provider compaction may not be present in the transcript.

Example Diagnosis

{
  "issues": [
    {
      "code": "TOOL_USE_DOMINATES_CONTEXT",
      "severity": "critical",
      "message": "Tool inputs dominate the visible context."
    },
    {
      "code": "TOP_TOOL_CONTEXT_HOTSPOT",
      "message": "ApplyPatch is the largest visible tool context hotspot."
    }
  ],
  "diff_hints": [
    {
      "type": "large_addition",
      "request_index": 76,
      "evidence": {
        "added_tokens": 7473,
        "top_added_tool": "ApplyPatch"
      }
    }
  ]
}
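
A consumer of this output might triage it as sketched below. The example reuses the diagnosis shown above; note that the second issue carries no `severity`, so the code treats that field as optional. The ranking table and the helper names `triage` and `largest_addition` are assumptions for illustration, since only the `critical` level appears in this README.

```python
import json

DIAGNOSIS = json.loads(
    """
    {
      "issues": [
        {"code": "TOOL_USE_DOMINATES_CONTEXT", "severity": "critical",
         "message": "Tool inputs dominate the visible context."},
        {"code": "TOP_TOOL_CONTEXT_HOTSPOT",
         "message": "ApplyPatch is the largest visible tool context hotspot."}
      ],
      "diff_hints": [
        {"type": "large_addition", "request_index": 76,
         "evidence": {"added_tokens": 7473, "top_added_tool": "ApplyPatch"}}
      ]
    }
    """
)

# Assumed ordering: unknown or missing severities sort after known ones.
SEVERITY_RANK = {"critical": 0}


def triage(diagnosis):
    """Return (code, severity) pairs, most severe first."""
    issues = sorted(
        diagnosis.get("issues", []),
        key=lambda issue: SEVERITY_RANK.get(issue.get("severity"), len(SEVERITY_RANK)),
    )
    return [(issue["code"], issue.get("severity", "unspecified")) for issue in issues]


def largest_addition(diagnosis):
    """Return the large_addition hint with the most added tokens, if any."""
    hints = [h for h in diagnosis.get("diff_hints", []) if h.get("type") == "large_addition"]
    if not hints:
        return None
    return max(hints, key=lambda h: h["evidence"]["added_tokens"])


ranked = triage(DIAGNOSIS)
hotspot = largest_addition(DIAGNOSIS)
```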

Academic trajectory sample:

context-profiler analyze examples/agent-trace/sample.json --format agent-trace --html report.html

Total Input: 11.7K
Messages (assistant): 10.7K
Tool: python_interpreter 2.9K
Warnings: Content duplication 2.3K redundant tokens

Examples

See examples/README.md for runnable fixtures and conversion patterns.

Recommended demo order:

1. Raw OpenAI/Anthropic request.
2. Cursor or Claude Code transcript.
3. Langfuse trace export.
4. Multi-turn academic trajectories such as pagarsky/agent-trace, cx-cmu/agent_trajectories, or SWE-agent traces.

Research Context

context-profiler is motivated by recent work showing that long-horizon agents are constrained not only by model quality, but also by how their working context is retained, compressed, and reused across turns.

Related work:

ByteDance Seed — Scaling Long-Horizon LLM Agent via Context-Folding
Studies context management for long-horizon agents through folding and summarizing intermediate sub-trajectories. This motivates context-profiler's focus on turn-to-turn context diffs, retained observations, and compression/pruning evidence.
SWE-agent — Agent-Computer Interfaces Enable Automated Software Engineering
Shows the importance of the agent-computer interface for software-engineering agents, motivating analysis of tool calls, terminal output, and artifact churn.
WebArena — A Realistic Web Environment for Building Autonomous Agents
Demonstrates the value of realistic multi-step agent trajectories, motivating support for loop/transcript analysis rather than only single prompt snapshots.

What It Does Not Do

It does not fetch traces from Langfuse, Hugging Face, Cursor, or Claude Code.
It does not replay agent loops.
It does not execute tools.
It does not replace observability platforms.
It does not pretend agent transcripts are exact raw provider requests.

Development

PYTHONPATH=src uv run --with pytest pytest tests/test_smoke.py -v

Acknowledgements

This project is inspired by and learned from:

context-lens — local proxy for capturing and visualizing LLM API calls
ContextFlame — flamegraph-based token profiling for Claude Code
speedscope — the icicle / flamegraph UI design is inspired by speedscope's interactive visualization

License

MIT

Project details

Release history

0.2.0 (this version): May 19, 2026
0.1.0: Mar 19, 2026

Download files

Source Distribution

context_profiler-0.2.0.tar.gz (84.5 kB), uploaded May 19, 2026, Source

Built Distribution

context_profiler-0.2.0-py3-none-any.whl (85.3 kB), uploaded May 19, 2026, Python 3

File details

Details for the file context_profiler-0.2.0.tar.gz.

File metadata

Download URL: context_profiler-0.2.0.tar.gz
Upload date: May 19, 2026
Size: 84.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for context_profiler-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`7bb728d1577235b16edda9c15837d0f54f6c481379713838cefd65cec9eee052`
MD5	`4e45140d24b9f12f420f2c1f348b2f2a`
BLAKE2b-256	`e974c3abc803f55ed9e97422faadecb5e770710ebc4b7c1d23703d1aee087744`

Provenance

The following attestation bundles were made for context_profiler-0.2.0.tar.gz:

Publisher: publish.yml on yanpgwang/context-profiler

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: context_profiler-0.2.0.tar.gz
- Subject digest: 7bb728d1577235b16edda9c15837d0f54f6c481379713838cefd65cec9eee052
- Sigstore transparency entry: 1572585988
- Sigstore integration time: May 19, 2026
Source repository:
- Permalink: yanpgwang/context-profiler@da2e3489ba238f8153a99201d7e677bf5b917a72
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/yanpgwang
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@da2e3489ba238f8153a99201d7e677bf5b917a72
- Trigger Event: release

File details

Details for the file context_profiler-0.2.0-py3-none-any.whl.

File metadata

Download URL: context_profiler-0.2.0-py3-none-any.whl
Upload date: May 19, 2026
Size: 85.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for context_profiler-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cb14617051bbd9ff361e3def06ca1cdfbed32e011c4a6bdeb44b26f521f59e88`
MD5	`6815db0c684d64e6333dbde0a7ce9960`
BLAKE2b-256	`eda5bbfe49b214d8883fd58f036b0ff45dafaf88aeca01983bfea80ff5cd7ee9`

Provenance

The following attestation bundles were made for context_profiler-0.2.0-py3-none-any.whl:

Publisher: publish.yml on yanpgwang/context-profiler

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: context_profiler-0.2.0-py3-none-any.whl
- Subject digest: cb14617051bbd9ff361e3def06ca1cdfbed32e011c4a6bdeb44b26f521f59e88
- Sigstore transparency entry: 1572586074
- Sigstore integration time: May 19, 2026
Source repository:
- Permalink: yanpgwang/context-profiler@da2e3489ba238f8153a99201d7e677bf5b917a72
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/yanpgwang
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@da2e3489ba238f8153a99201d7e677bf5b917a72
- Trigger Event: release
