
Optimized MCP server for AI coding agents. Structural codebase navigation + persistent memory engine — 97.9% accuracy at −80% tokens on tsbench.


⚡ Token Savior — v4.0

One MCP server. One profile. 97.9% on tsbench at −80% tokens. Structural code navigation + persistent memory engine for AI coding agents.

Requires Python 3.11+.

📖 mibayy.github.io/token-savior: project site + benchmark landing
🧪 github.com/Mibayy/tsbench: benchmark source + fixtures


Benchmark — 96 real coding tasks (Claude Opus 4.7, May 2026)

| Metric | Plain Claude Code | With Token Savior v4.0 |
| --- | --- | --- |
| Score | 141 / 180 (78.3%) | 188 / 192 (97.9%) |
| Active tokens / task | 17 221 | 3 395 (−80%) |
| Wall time / task | 110.6 s | 18.9 s (−83%) |

These results reproduce with the optimized profile, which is enabled by a single environment variable (TOKEN_SAVIOR_PROFILE=optimized). See BENCHMARK-SUMMARY.
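
The percentages in the table follow directly from the raw figures. A minimal sketch of the arithmetic, using only the numbers reported above (the helper names are illustrative, not part of Token Savior):

```python
def percent_change(baseline: float, measured: float) -> float:
    """Relative change from baseline to measured, in percent (negative = reduction)."""
    return (measured - baseline) / baseline * 100.0


def score_percent(points: int, maximum: int) -> float:
    """Score expressed as a percentage of the maximum."""
    return points / maximum * 100.0


# Figures copied from the benchmark table; maxima are as reported there.
tokens = percent_change(17221, 3395)     # active tokens per task
wall_time = percent_change(110.6, 18.9)  # seconds per task
plain_score = score_percent(141, 180)
ts_score = score_percent(188, 192)

print(f"tokens: {tokens:.1f}%  wall time: {wall_time:.1f}%")
print(f"plain: {plain_score:.1f}%  token savior: {ts_score:.1f}%")
```

The table rounds the two reductions to whole percentages.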


Quick start

pip install "token-savior-recall[mcp]"

Add to your MCP config (e.g. Claude Code):

{
  "mcpServers": {
    "token-savior-recall": {
      "command": "/path/to/venv/bin/token-savior",
      "env": {
        "WORKSPACE_ROOTS": "/path/to/project1,/path/to/project2",
        "TOKEN_SAVIOR_CLIENT": "claude-code",
        "TOKEN_SAVIOR_PROFILE": "optimized"
      }
    }
  }
}

That's it. TOKEN_SAVIOR_PROFILE=optimized ships the Pareto-optimal config that scored best on tsbench. It bundles:

  • tiny_plus (a manifest of 15 hot tools)
  • thin inputSchema (manifest reduced by 44%)
  • capture sandbox disabled
  • memory hooks gated for cross-project safety

No other tuning needed.
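
If you would rather script the config than edit it by hand, the entry above can be generated and merged into an existing file. This is a sketch only: the helper names are hypothetical, and the config file location depends on your MCP client, so the caller supplies the path.

```python
import json
from pathlib import Path


def token_savior_entry(command: str, workspace_roots: list[str]) -> dict:
    """Build the server entry shown in the Quick start config."""
    return {
        "command": command,
        "env": {
            # Comma-separated list, as in the example config.
            "WORKSPACE_ROOTS": ",".join(workspace_roots),
            "TOKEN_SAVIOR_CLIENT": "claude-code",
            "TOKEN_SAVIOR_PROFILE": "optimized",
        },
    }


def merge_into_config(config_path: Path, entry: dict) -> dict:
    """Add or replace the token-savior-recall entry, keeping other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["token-savior-recall"] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


entry = token_savior_entry(
    "/path/to/venv/bin/token-savior",
    ["/path/to/project1", "/path/to/project2"],
)
print(json.dumps({"mcpServers": {"token-savior-recall": entry}}, indent=2))
```

Calling merge_into_config(Path(...), entry) with your client's config path writes the file; any other servers already listed there are left untouched.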


What it does

Claude Code reads whole files to answer questions about three lines, and forgets everything the moment a session ends. Token Savior fixes both.

It indexes your codebase by symbol — functions, classes, imports, call graph — so the model navigates by pointer instead of by cat. Measured reduction: 97% fewer chars injected across 170+ real sessions.
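
To make "navigate by pointer" concrete, here is a toy symbol index built with Python's standard ast module. It is not Token Savior's implementation (which also tracks imports and a call graph); SymbolRef, index_source and get_source are hypothetical names used only for this illustration.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolRef:
    kind: str   # "function" or "class"
    path: str   # file the symbol lives in
    start: int  # first line of the definition (1-based)
    end: int    # last line of the definition (inclusive)


def index_source(path: str, source: str) -> dict[str, SymbolRef]:
    """Map each function/class name to its location. Later duplicates win."""
    index: dict[str, SymbolRef] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            index[node.name] = SymbolRef("function", path, node.lineno, node.end_lineno)
        elif isinstance(node, ast.ClassDef):
            index[node.name] = SymbolRef("class", path, node.lineno, node.end_lineno)
    return index


def get_source(source: str, ref: SymbolRef) -> str:
    """Return only the lines the pointer covers instead of the whole file."""
    lines = source.splitlines()
    return "\n".join(lines[ref.start - 1 : ref.end])


SAMPLE = '''import os

class Mailer:
    def send_message(self, to, body):
        return f"{to}: {body}"

def compile(src):
    return src.strip()
'''

index = index_source("mailer.py", SAMPLE)
print(index["send_message"])
print(get_source(SAMPLE, index["compile"]))
```

The lookup returns a small record (kind, path, line range), and the model fetches just those lines when it needs the body.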

On top of that sits a persistent memory engine. Every decision, bugfix, convention, guardrail and session rollup is stored in SQLite WAL + FTS5 + vector embeddings, ranked by Bayesian validity and ROI, and re-injected as a compact delta at the start of the next session.
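
The storage side of this can be sketched with the standard sqlite3 module. The table name and columns below are assumptions made for illustration, not Token Savior's actual schema; the sketch leaves out vector embeddings and the Bayesian validity/ROI ranking, and falls back to FTS5's built-in relevance rank. It assumes your SQLite build includes FTS5.

```python
import os
import sqlite3
import tempfile


def open_memory_store(path: str) -> sqlite3.Connection:
    """Open a file-backed store in WAL mode with a full-text index."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    # Hypothetical schema: one FTS5 table holding the observation kind and text.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS observations USING fts5(kind, body)"
    )
    return conn


def remember(conn: sqlite3.Connection, kind: str, body: str) -> None:
    """Persist one observation (decision, bugfix, convention, ...)."""
    conn.execute("INSERT INTO observations (kind, body) VALUES (?, ?)", (kind, body))
    conn.commit()


def recall(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Full-text search, best matches first."""
    rows = conn.execute(
        "SELECT kind, body FROM observations "
        "WHERE observations MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    store = open_memory_store(os.path.join(tempfile.mkdtemp(), "memory.db"))
    remember(store, "bugfix", "retry the upload when the socket resets")
    remember(store, "convention", "use short module names")
    print(recall(store, "retry"))
    store.close()
```

Because the database is a file, observations written in one session are still there when the next session opens it, which is what makes re-injection possible.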


Profile comparison

| Profile | Tools exposed | Manifest tokens | When to use |
| --- | --- | --- | --- |
| optimized | 15 | ~1.5 KT | Recommended default: Pareto win on tsbench |
| auto | adaptive | ~1-2 KT | Per-client telemetry-based (experimental) |
| tiny | 6 | ~0.6 KT | Minimal hot loop |
| lean | 51 | ~4 KT | Legacy: broader surface |
| full | 68 | ~6 KT | Everything exposed |

You probably want optimized.


Token savings

| Operation | Plain Claude | Token Savior | Reduction |
| --- | --- | --- | --- |
| find_symbol("send_message") | 41M chars (full read) | 67 chars | −99.9% |
| get_function_source("compile") | grep + cat chain | 4.5K chars | direct |
| get_change_impact("LLMClient") | impossible | 16K chars | new capability |
| 96-task tsbench (Opus, plain vs ts) | 17 221 active/task | 3 395 active/task | −80% |

Install

pip (MCP server)

pip install "token-savior-recall[mcp]"
# Optional: hybrid vector search
pip install "token-savior-recall[mcp,memory-vector]"

uvx (no venv, no clone)

uvx token-savior-recall

Claude Code one-liner

claude mcp add token-savior -- /path/to/venv/bin/token-savior

Development

git clone https://github.com/Mibayy/token-savior
cd token-savior
python3 -m venv .venv
.venv/bin/pip install -e ".[mcp,dev]"
pytest tests/ -q

Bonus: ts CLI for non-MCP agents

If you use an agent without MCP support (Cursor, Aider, Continue, scripts, CI), the ts command exposes a subset of the tools through the shell:

ts use /path/to/project
ts get my_function          # JSON output
ts search 'pattern'
ts daemon start             # ~145ms per call vs 1.5s cold fork

On Claude Code, prefer the MCP server: it measured cheaper than the CLI on Opus 4.7. The CLI exists for the portability case.
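
From a script or CI job, the CLI can be wrapped in a few lines. This sketch relies only on the fact that ts get prints JSON, as noted above; the shape of that JSON is not documented here, so the wrapper returns whatever was parsed. The ts_get name and the injectable run parameter are illustrative choices.

```python
import json
import subprocess
from typing import Any, Callable


def ts_get(symbol: str, run: Callable[..., Any] = subprocess.run) -> Any:
    """Call `ts get <symbol>` and return its parsed JSON output.

    `run` defaults to subprocess.run and can be swapped for a stub in tests.
    """
    proc = run(
        ["ts", "get", symbol],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return json.loads(proc.stdout)
```

After ts use /path/to/project has selected a project, ts_get("my_function") returns the decoded result. Starting ts daemon first keeps each call near the ~145ms figure quoted above rather than paying the cold-start cost every time.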


Optional env vars

| Var | Purpose |
| --- | --- |
| TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID | Critical-observation feed |
| TS_VIEWER_PORT | Web viewer dashboard |
| TS_AUTO_EXTRACT=1 + TS_API_KEY | LLM auto-extraction of memory observations |
| TS_CAPTURE_DISABLED=1 | Skip read-side capture sandboxing (default in optimized) |
| TS_MEMORY_DISABLE=1 | Silence memory hooks (clean-context workloads) |

License

MIT
