Open-source runtime ROI diagnosis toolkit for AI Agent applications.
TokenSaver
Find wasted context, tool calls, model calls, and workflow routes in AI agents, locally. Then generate a repair brief for Codex or Claude Code.
TokenSaver records real Agent runs, diagnoses low-ROI patterns with deterministic local rules, and produces an offline report showing what to repair next.
Agent run -> Local trace -> ROI diagnosis -> Repair brief -> Before/after comparison
No hosted account. No required LLM call. No prompt or trace upload by default.
See It In 30 Seconds
uvx tokensaver-agent demo
The offline demo writes a before/after benchmark and local HTML panel to .tokensaver-demo/.
Or install it:
python3 -m pip install tokensaver-agent
tokensaver demo
tokensaver open
Input tokens: 32540 -> 2460 (-92.4%)
Output tokens: 7400 -> 580 (-92.2%)
Latency: 31000 -> 1700 (-94.5%)
ROI score: 35 -> 100 (+65)
Result: ACCEPTED
These numbers come from the bundled deterministic demo fixture. They demonstrate the workflow and are not a claim about every Agent application.
The generated share-card.svg can be attached to a PR, issue, release, or post without exposing prompts or tool payloads.
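The percentages in the demo output can be reproduced by hand. The sketch below is only a worked check of that arithmetic using the figures printed above; the helper name `percent_change` is illustrative and is not part of TokenSaver.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent, rounded to one decimal."""
    return round((after - before) / before * 100, 1)


# Figures copied from the demo output above.
demo = {
    "input_tokens": (32540, 2460),
    "output_tokens": (7400, 580),
    "latency": (31000, 1700),
}

changes = {name: percent_change(b, a) for name, (b, a) in demo.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```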
What It Finds
TokenSaver currently detects patterns such as:
- short requests routed through deep research workflows
- oversized or repeated context
- raw tool payloads and repeated uncached tool calls
- excessive model input and ReAct loop amplification
- slow tools, latency budget violations, and missing fallbacks
- answers that are too long for the delivery channel
- quality guardrail regressions during optimization
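To make "deterministic local rule" concrete, here is a minimal sketch of how a check for repeated uncached tool calls might look. It is not TokenSaver's implementation: the function name `find_repeated_tool_calls` and the record fields `name`, `args`, and `cached` are assumptions chosen for illustration.

```python
import json
from collections import Counter


def find_repeated_tool_calls(tool_calls, min_repeats=2):
    """Report identical uncached calls that occur at least min_repeats times."""
    counts = Counter()
    for call in tool_calls:
        if call.get("cached"):
            continue  # cached calls are not counted as waste
        # Serialize arguments with sorted keys so equal calls compare equal.
        key = (call["name"], json.dumps(call["args"], sort_keys=True))
        counts[key] += 1
    return [
        {"tool": name, "args": json.loads(args), "count": n}
        for (name, args), n in sorted(counts.items())
        if n >= min_repeats
    ]


# Hypothetical trace data for one run.
calls = [
    {"name": "crm_lookup", "args": {"id": 7}},
    {"name": "crm_lookup", "args": {"id": 7}},
    {"name": "crm_lookup", "args": {"id": 7}, "cached": True},
    {"name": "search", "args": {"q": "refund policy"}},
]
findings = find_repeated_tool_calls(calls)
print(findings)
```

Because the rule only counts and compares local records, the same input always yields the same finding and no model call is involved.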
It writes:
.tokensaver/
  runs.jsonl
  reports/latest.md
  briefs/latest.md
  panel/index.html
Integrate With A Coding Agent
Paste this into Codex or Claude Code inside your Agent repository:
Integrate TokenSaver into this Agent application:
https://github.com/zhangtao-jayce/TokenSaver
Find the user-message entrypoint, trace route/context/tool/model/final-answer
data for each run, keep all data local, run one test request, and show:
- .tokensaver/reports/latest.md
- .tokensaver/briefs/latest.md
- .tokensaver/panel/index.html
The detailed integration prompt and verification checklist are in docs/集成指南.md.
Minimal Python Integration
from openai import OpenAI

from tokensaver import TokenSaver
from tokensaver.integrations import trace_openai_chat_completion

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
tokensaver = TokenSaver(app="my-agent", channel="chat")


def handle_message(message: str) -> str:
    # Everything recorded inside the run context is traced locally.
    with tokensaver.run(user_message=message) as run:
        run.set_task(task_type="quick_question", route="default")
        # load_ticket is your application's own context loader.
        run.add_context("ticket", load_ticket(message), kind="crm")
        response = trace_openai_chat_completion(
            run,
            client=openai_client,
            model="gpt-4.1-mini",
            messages=[{"role": "user", "content": message}],
        )
        answer = response.choices[0].message.content
        run.record_final_answer(answer)
        return answer
Dependency-free adapters are included for:
- OpenAI Chat Completions and Responses
- Anthropic Messages
- LiteLLM
- LangChain and LangGraph callbacks
- framework-agnostic callbacks
- TypeScript and Vercel AI SDK JSON imports
Compare A Repair
After changing the Agent workflow, record an equivalent run and compare it:
tokensaver compare \
  --before BEFORE_RUN_ID \
  --after AFTER_RUN_ID
TokenSaver reports token and latency changes, the ROI score, resolved findings, new findings, and quality blockers. An optimization is rejected when it introduces tracked quality regressions.
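The accept/reject decision can be pictured with a small sketch. This is an illustration of the stated rule only, not the actual comparison code: the `RunSummary` type, its fields, and the `REJECTED` label are assumptions (the demo output above shows only `ACCEPTED`).

```python
from dataclasses import dataclass, field


@dataclass
class RunSummary:
    """Hypothetical per-run summary used only for this example."""
    input_tokens: int
    latency_ms: int
    quality_blockers: set = field(default_factory=set)


def judge_repair(before: RunSummary, after: RunSummary) -> dict:
    """Reject the repair if the after-run has quality blockers the before-run lacked."""
    new_blockers = after.quality_blockers - before.quality_blockers
    return {
        "input_tokens_delta": after.input_tokens - before.input_tokens,
        "latency_ms_delta": after.latency_ms - before.latency_ms,
        "new_quality_blockers": sorted(new_blockers),
        "result": "REJECTED" if new_blockers else "ACCEPTED",
    }


before = RunSummary(input_tokens=32540, latency_ms=31000)
cheap_but_broken = RunSummary(2460, 1700, {"missing_next_action"})
cheap_and_clean = RunSummary(2460, 1700)

print(judge_repair(before, cheap_but_broken)["result"])
print(judge_repair(before, cheap_and_clean)["result"])
```

The point of the design is that savings alone never justify a change: a run that is cheaper but drops a required part of the answer is still treated as a regression.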
Generate a public Markdown report and anonymous SVG card directly from two run files:
tokensaver benchmark \
  --before-file before.json \
  --after-file after.json \
  --output-dir .tokensaver-benchmark
Three deterministic cases are included; see examples/case-studies/README.md for exact commands.
CLI
# Product demo
tokensaver demo
tokensaver open
# Installation and environment checks
tokensaver version --verbose
tokensaver doctor
tokensaver init-profile --template coding-agent
# Record and inspect a run
tokensaver record-run --file examples/run.json
tokensaver latest --kind summary
tokensaver latest --kind brief
tokensaver latest --kind panel
# Analyze multiple runs
tokensaver list --limit 20
tokensaver top-tools --last 50
tokensaver compare --before RUN_ID --after RUN_ID
tokensaver benchmark --before-file before.json --after-file after.json
If the console script is not on PATH, use python3 -m tokensaver.cli in place of tokensaver.
Profiles
Profiles keep project-specific budgets and quality requirements outside application code:
app: my_agent
channel: chat
budgets:
  quick_question:
    input_tokens: 3000
    output_tokens: 500
    latency_ms: 20000
required_fields:
  quick_question:
    - conclusion
    - next_action
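As a rough illustration of how such a profile could be applied, the sketch below checks one run against the budgets and required fields shown above. The profile is written as a Python dict to avoid a YAML dependency, and `check_run` with its return shape is hypothetical rather than TokenSaver's API.

```python
# Mirrors the example profile above.
PROFILE = {
    "budgets": {
        "quick_question": {
            "input_tokens": 3000,
            "output_tokens": 500,
            "latency_ms": 20000,
        }
    },
    "required_fields": {"quick_question": ["conclusion", "next_action"]},
}


def check_run(profile, task_type, usage, answer_fields):
    """Return budget overruns (amount over the limit) and missing required fields."""
    budget = profile["budgets"].get(task_type, {})
    over = {
        metric: usage[metric] - limit
        for metric, limit in budget.items()
        if usage.get(metric, 0) > limit
    }
    required = profile["required_fields"].get(task_type, [])
    missing = [name for name in required if name not in answer_fields]
    return {"over_budget": over, "missing_fields": missing}


# Hypothetical measurements for one run.
usage = {"input_tokens": 32540, "output_tokens": 480, "latency_ms": 31000}
result = check_run(PROFILE, "quick_question", usage, answer_fields={"conclusion"})
print(result)
```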
Built-in templates:
chatbot, coding-agent, crm-agent, finance-assistant,
legal-assistant, research-agent, support-bot
MCP
Start the dependency-free stdio server:
tokensaver-mcp
Main tools include:
- tokensaver.plan_task
- tokensaver.record_agent_run
- tokensaver.diagnose_roi
- tokensaver.generate_repair_brief
- tokensaver.eval_fixtures
- tokensaver.doctor
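For readers unfamiliar with MCP over stdio, the following sketch builds the newline-delimited JSON-RPC messages a generic client would send to discover a server's tools. It follows the public MCP specification rather than anything specific to TokenSaver; the client name, version, and protocol version string are placeholder assumptions, and the code only constructs the messages without launching `tokensaver-mcp`.

```python
import json


def frame(message: dict) -> str:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    return json.dumps(message, separators=(",", ":")) + "\n"


# A real client waits for the initialize response before sending the rest.
handshake = [
    {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed protocol revision
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    },
    {"jsonrpc": "2.0", "method": "notifications/initialized"},
    {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
]

wire = "".join(frame(m) for m in handshake)
print(wire, end="")
```

In practice an MCP-aware host such as a coding agent performs this exchange for you; the sketch is only meant to show what "stdio server" implies on the wire.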
Privacy By Default
TokenSaver is local-first:
- prompts, context, traces, and tool results are not uploaded by default
- the core diagnosis loop does not call an LLM
- stored traces omit raw context and tool text after estimating their size
- the HTML panel is a static offline file
See OPEN_SOURCE_SCOPE.md and SECURITY.md for the current boundary.
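The idea of storing a size estimate instead of raw text can be sketched as follows. This is a simplified illustration, not the library's storage code: the four-characters-per-token heuristic, the function names, and the record layout are all assumptions.

```python
def estimate_tokens(text: str) -> int:
    """Rough size estimate, assuming about four characters per token."""
    return len(text) // 4


def redact_context(name: str, text: str, kind: str) -> dict:
    """Keep metadata and size figures; drop the raw text itself."""
    return {
        "name": name,
        "kind": kind,
        "chars": len(text),
        "estimated_tokens": estimate_tokens(text),
    }


raw = "Customer asks about refund for order 1234."
record = redact_context("ticket", raw, kind="crm")
print(record)
```

A record like this is enough for size-based diagnosis (oversized or repeated context) while keeping the sensitive payload out of the stored trace.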
Project Status
TokenSaver is beta software. The local trace, diagnosis, repair brief, comparison, GUI panel, integration helpers, CLI, and MCP server are implemented. It is not currently a hosted observability platform or automatic LLM gateway.
Useful project documents:
Development
git clone https://github.com/zhangtao-jayce/TokenSaver.git
cd TokenSaver
python3 -m unittest discover -s tests
python3 -m py_compile tokensaver/*.py
python3 -m tokensaver.cli demo --store-dir /private/tmp/tokensaver-demo
Contributions that improve real Agent integrations, diagnosis rules, benchmark fixtures, and before/after case studies are especially useful.
File details
Details for the file tokensaver_agent-0.6.0.tar.gz.
File metadata
- Download URL: tokensaver_agent-0.6.0.tar.gz
- Upload date:
- Size: 63.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 48053932c9246252e520aa9674728d86801c8d3f4c54dc3f063b805333e3ea89 |
| MD5 | b505a40c491795feb4c9fec9ffb65cfe |
| BLAKE2b-256 | 0e8a78998b27d7c3e3f6820867b584dd5274d5a423fc59599e161228dc7777d9 |
Provenance
The following attestation bundles were made for tokensaver_agent-0.6.0.tar.gz:
Publisher: release.yml on zhangtao-jayce/TokenSaver
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokensaver_agent-0.6.0.tar.gz
- Subject digest: 48053932c9246252e520aa9674728d86801c8d3f4c54dc3f063b805333e3ea89
- Sigstore transparency entry: 1789342268
- Sigstore integration time:
- Permalink: zhangtao-jayce/TokenSaver@f0b2a333d7b701df06bdfac378dc9d15c2035589
- Branch / Tag: refs/heads/main
- Owner: https://github.com/zhangtao-jayce
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f0b2a333d7b701df06bdfac378dc9d15c2035589
- Trigger Event: workflow_dispatch
File details
Details for the file tokensaver_agent-0.6.0-py3-none-any.whl.
File metadata
- Download URL: tokensaver_agent-0.6.0-py3-none-any.whl
- Upload date:
- Size: 59.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 11538a895717b6ec5df9f1b42a3b5c8102293199a7842b046e54cbad2bc7be57 |
| MD5 | 941238057def3f29c78d964d1954e95c |
| BLAKE2b-256 | a0eac52bb4dad4d1e4de2f75fbeeeb36408ba76e0906e8019f5abbfc0da3010d |
Provenance
The following attestation bundles were made for tokensaver_agent-0.6.0-py3-none-any.whl:
Publisher: release.yml on zhangtao-jayce/TokenSaver
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokensaver_agent-0.6.0-py3-none-any.whl
- Subject digest: 11538a895717b6ec5df9f1b42a3b5c8102293199a7842b046e54cbad2bc7be57
- Sigstore transparency entry: 1789342289
- Sigstore integration time:
- Permalink: zhangtao-jayce/TokenSaver@f0b2a333d7b701df06bdfac378dc9d15c2035589
- Branch / Tag: refs/heads/main
- Owner: https://github.com/zhangtao-jayce
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f0b2a333d7b701df06bdfac378dc9d15c2035589
- Trigger Event: workflow_dispatch