Open-source runtime ROI diagnosis toolkit for AI Agent applications.

TokenSaver

Python 3.10+ | Apache-2.0 | Local first

Find wasted context, tool calls, model calls, and workflow routes in AI agents, locally. Then generate a repair brief for Codex or Claude Code.

TokenSaver records real Agent runs, diagnoses low-ROI patterns with deterministic local rules, and produces an offline report showing what to repair next.

Agent run -> Local trace -> ROI diagnosis -> Repair brief -> Before/after comparison

No hosted account. No required LLM call. No prompt or trace upload by default.

[Image: TokenSaver demo changing a high-cost Agent run into a healthy run]

See It In 30 Seconds

uvx tokensaver-agent demo

The offline demo writes a before/after benchmark and local HTML panel to .tokensaver-demo/.

Or install it:

python3 -m pip install tokensaver-agent
tokensaver demo
tokensaver open

Input tokens: 32540 -> 2460 (-92.4%)
Output tokens: 7400 -> 580 (-92.2%)
Latency: 31000 -> 1700 (-94.5%)
ROI score: 35 -> 100 (+65)
Result: ACCEPTED

These numbers come from the bundled deterministic demo fixture. They demonstrate the workflow and are not a claim about every Agent application.
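
The percentages in the demo summary are relative changes against the "before" value, while the ROI score line is an absolute difference (100 - 35 = +65). A minimal sketch that reproduces the percentages from the printed figures; the helper name `percent_change` is illustrative and not part of TokenSaver:

```python
def percent_change(before: float, after: float) -> float:
    """Return the relative change from before to after, in percent (one decimal)."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return round((after - before) / before * 100, 1)


# (before, after) pairs copied from the bundled demo output.
demo = {
    "input_tokens": (32540, 2460),
    "output_tokens": (7400, 580),
    "latency": (31000, 1700),
}

changes = {name: percent_change(b, a) for name, (b, a) in demo.items()}
print(changes)
```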

The generated share-card.svg can be attached to a PR, issue, release, or post without exposing prompts or tool payloads.

What It Finds

TokenSaver currently detects patterns such as:

  • short requests routed through deep research workflows
  • oversized or repeated context
  • raw tool payloads and repeated uncached tool calls
  • excessive model input and ReAct loop amplification
  • slow tools, latency budget violations, and missing fallbacks
  • answers that are too long for the delivery channel
  • quality guardrail regressions during optimization
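
To give a feel for what a deterministic local rule can look like, here is a minimal sketch of a check for repeated uncached tool calls. It is not TokenSaver's implementation: the record shape (`name`, `args`), the function name, and the `min_repeats` threshold are all assumptions made for illustration.

```python
import json
from collections import Counter


def repeated_tool_calls(tool_calls: list[dict], min_repeats: int = 2) -> list[dict]:
    """Flag tool calls issued at least min_repeats times with identical arguments."""
    # Serialize arguments with sorted keys so equal dicts compare equal.
    counts = Counter(
        (call["name"], json.dumps(call.get("args", {}), sort_keys=True))
        for call in tool_calls
    )
    return [
        {"tool": name, "args": json.loads(args), "count": count}
        for (name, args), count in counts.items()
        if count >= min_repeats
    ]


# Hypothetical tool calls from a single run.
calls = [
    {"name": "search_crm", "args": {"customer": "acme"}},
    {"name": "search_crm", "args": {"customer": "acme"}},
    {"name": "get_weather", "args": {"city": "Berlin"}},
]

findings = repeated_tool_calls(calls)
print(findings)
```

Because the rule only counts and compares, the same input always yields the same finding, which is the property that makes before/after comparisons meaningful.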

It writes:

.tokensaver/
  runs.jsonl
  reports/latest.md
  briefs/latest.md
  panel/index.html
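
After a traced run, a quick way to confirm the integration worked is to check that these files exist. A minimal sketch using only the paths listed above; the helper name `missing_outputs` is illustrative:

```python
from pathlib import Path

# Relative paths taken from the listing above.
EXPECTED_OUTPUTS = (
    "runs.jsonl",
    "reports/latest.md",
    "briefs/latest.md",
    "panel/index.html",
)


def missing_outputs(store_dir: str = ".tokensaver") -> list[str]:
    """Return the expected output files that do not exist under store_dir."""
    root = Path(store_dir)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    print("all outputs present" if not missing else f"missing: {missing}")
```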

Integrate With A Coding Agent

Paste this into Codex or Claude Code inside your Agent repository:

Integrate TokenSaver into this Agent application:
https://github.com/zhangtao-jayce/TokenSaver

Find the user-message entrypoint, trace route/context/tool/model/final-answer
data for each run, keep all data local, run one test request, and show:
- .tokensaver/reports/latest.md
- .tokensaver/briefs/latest.md
- .tokensaver/panel/index.html

The detailed integration prompt and verification checklist are in docs/集成指南.md (the integration guide).

Minimal Python Integration

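from openai import OpenAI

from tokensaver import TokenSaver
from tokensaver.integrations import trace_openai_chat_completion

# Assumes the official openai package; the client reads OPENAI_API_KEY from the environment.
openai_client = OpenAI()
tokensaver = TokenSaver(app="my-agent", channel="chat")


def load_ticket(message: str) -> str:
    """Placeholder: replace with your application's own context lookup."""
    return ""


def handle_message(message: str) -> str:
    # One traced run per user message.
    with tokensaver.run(user_message=message) as run:
        # Label the run with its task type and route.
        run.set_task(task_type="quick_question", route="default")
        # Record the context attached to this request.
        run.add_context("ticket", load_ticket(message), kind="crm")

        # Make the model call through the tracing helper.
        response = trace_openai_chat_completion(
            run,
            client=openai_client,
            model="gpt-4.1-mini",
            messages=[{"role": "user", "content": message}],
        )
        answer = response.choices[0].message.content
        # Record the final answer before returning it to the user.
        run.record_final_answer(answer)
        return answer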

Dependency-free adapters are included for:

  • OpenAI Chat Completions and Responses
  • Anthropic Messages
  • LiteLLM
  • LangChain and LangGraph callbacks
  • framework-agnostic callbacks
  • TypeScript and Vercel AI SDK JSON imports

Compare A Repair

After changing the Agent workflow, record an equivalent run and compare it:

tokensaver compare \
  --before BEFORE_RUN_ID \
  --after AFTER_RUN_ID

TokenSaver reports changes in tokens, latency, and ROI score, along with resolved findings, new findings, and quality blockers. An optimization is rejected when it introduces tracked quality regressions.
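
The accept/reject decision can be pictured as set arithmetic over finding identifiers. The sketch below is an assumption-laden illustration, not TokenSaver's code: the finding names, the `quality_rules` set, and the `REJECTED` label are hypothetical (only `ACCEPTED` appears in the demo output).

```python
def compare_findings(before: set[str], after: set[str], quality_rules: set[str]) -> dict:
    """Classify findings across two runs and decide whether the change is acceptable."""
    resolved = before - after          # present before, gone after
    new = after - before               # introduced by the change
    blockers = new & quality_rules     # new findings that are quality regressions
    return {
        "resolved": sorted(resolved),
        "new": sorted(new),
        "quality_blockers": sorted(blockers),
        "result": "REJECTED" if blockers else "ACCEPTED",
    }


# Hypothetical finding identifiers for two equivalent runs.
verdict = compare_findings(
    before={"oversized_context", "repeated_tool_call"},
    after={"missing_required_field"},
    quality_rules={"missing_required_field"},
)
print(verdict)
```

In this example the repair removes two cost findings but drops a required answer field, so the sketch rejects it even though tokens went down.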

Generate a public Markdown report and anonymous SVG card directly from two run files:

tokensaver benchmark \
  --before-file before.json \
  --after-file after.json \
  --output-dir .tokensaver-benchmark

Three deterministic cases are included; see examples/case-studies/README.md for exact commands.

CLI

# Product demo
tokensaver demo
tokensaver open

# Installation and environment checks
tokensaver version --verbose
tokensaver doctor
tokensaver init-profile --template coding-agent

# Record and inspect a run
tokensaver record-run --file examples/run.json
tokensaver latest --kind summary
tokensaver latest --kind brief
tokensaver latest --kind panel

# Analyze multiple runs
tokensaver list --limit 20
tokensaver top-tools --last 50
tokensaver compare --before RUN_ID --after RUN_ID
tokensaver benchmark --before-file before.json --after-file after.json

If the console script is not on PATH, use python3 -m tokensaver.cli in place of tokensaver.
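
When calling the CLI from a script, the same fallback can be automated. A minimal sketch: the helper name `tokensaver_command` is illustrative, and the `which` parameter exists only to make the function easy to test.

```python
import shutil
import sys
from typing import Callable, Optional


def tokensaver_command(which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the argv prefix for invoking the TokenSaver CLI."""
    if which("tokensaver"):
        return ["tokensaver"]
    # Console script not on PATH: use the module form with the current interpreter.
    return [sys.executable, "-m", "tokensaver.cli"]


# Build the argument list for `tokensaver doctor`.
print(tokensaver_command() + ["doctor"])
```

Pass the resulting list to `subprocess.run` to execute the command.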

Profiles

Profiles keep project-specific budgets and quality requirements outside application code:

app: my_agent
channel: chat
budgets:
  quick_question:
    input_tokens: 3000
    output_tokens: 500
    latency_ms: 20000
required_fields:
  quick_question:
    - conclusion
    - next_action
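
To illustrate how such a profile might be applied, the sketch below mirrors the YAML as a Python dict and checks one run against it. How TokenSaver actually evaluates budgets and required_fields is not described here, so the function `check_run`, the metric values, and the answer fields are assumptions for illustration only.

```python
# The profile above, expressed as a plain dict.
PROFILE = {
    "app": "my_agent",
    "channel": "chat",
    "budgets": {
        "quick_question": {"input_tokens": 3000, "output_tokens": 500, "latency_ms": 20000},
    },
    "required_fields": {"quick_question": ["conclusion", "next_action"]},
}


def check_run(profile: dict, task_type: str, metrics: dict, answer_fields: set[str]) -> list[str]:
    """Return human-readable problems for one run measured against the profile."""
    problems = []
    for metric, limit in profile["budgets"].get(task_type, {}).items():
        value = metrics.get(metric, 0)
        if value > limit:
            problems.append(f"{metric} {value} exceeds budget {limit}")
    for field in profile["required_fields"].get(task_type, []):
        if field not in answer_fields:
            problems.append(f"missing required field: {field}")
    return problems


# Hypothetical measurements for a single run.
problems = check_run(
    PROFILE,
    "quick_question",
    metrics={"input_tokens": 3400, "output_tokens": 420, "latency_ms": 1800},
    answer_fields={"conclusion"},
)
print(problems)
```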

Built-in templates:

chatbot, coding-agent, crm-agent, finance-assistant,
legal-assistant, research-agent, support-bot

MCP

Start the dependency-free stdio server:

tokensaver-mcp

Main tools include:

  • tokensaver.plan_task
  • tokensaver.record_agent_run
  • tokensaver.diagnose_roi
  • tokensaver.generate_repair_brief
  • tokensaver.eval_fixtures
  • tokensaver.doctor

Privacy By Default

TokenSaver is local-first:

  • prompts, context, traces, and tool results are not uploaded by default
  • the core diagnosis loop does not call an LLM
  • stored traces omit raw context and tool text after estimating their size
  • the HTML panel is a static offline file

See OPEN_SOURCE_SCOPE.md and SECURITY.md for the current boundary.

Project Status

TokenSaver is beta software. The local trace, diagnosis, repair brief, comparison, GUI panel, integration helpers, CLI, and MCP server are implemented. It is not currently a hosted observability platform or automatic LLM gateway.

Development

git clone https://github.com/zhangtao-jayce/TokenSaver.git
cd TokenSaver
python3 -m unittest discover -s tests
python3 -m py_compile tokensaver/*.py
python3 -m tokensaver.cli demo --store-dir /private/tmp/tokensaver-demo

Contributions that improve real Agent integrations, diagnosis rules, benchmark fixtures, and before/after case studies are especially useful.
