Cuts AI agent web-fetching costs by up to 99.9% by stripping junk from web pages before they reach your LLM

Reason this release was yanked: Deprecated. Use xelektron-token-enhancer instead.

Project description

Token Enhancer

A local proxy that strips web pages down to clean text before they enter your AI agent's context window.

One fetch of Yahoo Finance: 704,760 tokens → 2,625 tokens. 99.6% reduction.

No API key. No LLM. No GPU. Just Python.

The Problem

AI agents waste most of their token budget loading raw HTML pages into context. A single Yahoo Finance page is 704K tokens of navigation bars, ads, scripts, and junk. Your agent pays for all of it before any reasoning happens.

The Solution

Token Enhancer sits between your agent and the web. It fetches the page, strips the noise, caches the result, and returns only clean data.

Source                 Raw Tokens   After Proxy   Reduction
Yahoo Finance (AAPL)      704,760         2,625       99.6%
Wikipedia article         154,440        19,479       87.4%
Hacker News                 8,662           859       90.1%
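The Reduction column follows directly from the two token counts. A small sketch that recomputes it from the figures in the table (the helper name reduction_percent is illustrative, not part of the project):

```python
# Token counts copied from the table above: (raw tokens, tokens after proxy).
RESULTS = {
    "Yahoo Finance (AAPL)": (704_760, 2_625),
    "Wikipedia article": (154_440, 19_479),
    "Hacker News": (8_662, 859),
}


def reduction_percent(raw_tokens: int, clean_tokens: int) -> float:
    """Percentage of tokens removed, rounded to one decimal place."""
    return round(100 * (1 - clean_tokens / raw_tokens), 1)


for source, (raw, clean) in RESULTS.items():
    print(f"{source}: {reduction_percent(raw, clean)}% fewer tokens")
```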

Quick Start

git clone https://github.com/Boof-Pack/token-enhancer.git
cd token-enhancer
chmod +x install.sh
./install.sh
source .venv/bin/activate
python3 test_all.py --live

Usage

As a standalone proxy

source .venv/bin/activate
python3 proxy.py

Then in another terminal:

curl -s http://localhost:8080/fetch \
  -H "content-type: application/json" \
  -d '{"url": "https://finance.yahoo.com/quote/AAPL/"}' \
  | python3 -m json.tool
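The same request can be sent from Python with only the standard library. This is a minimal sketch that mirrors the curl call; the function names are illustrative, and the shape of the JSON reply is whatever the proxy returns.

```python
import json
import urllib.request

PROXY_URL = "http://localhost:8080/fetch"  # same endpoint as the curl example


def build_fetch_request(url: str, proxy_url: str = PROXY_URL) -> urllib.request.Request:
    """Build the POST request that the curl command sends."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        proxy_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_via_proxy(url: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the proxy's JSON reply."""
    with urllib.request.urlopen(build_fetch_request(url), timeout=timeout) as resp:
        return json.load(resp)


# With `python3 proxy.py` running in another terminal:
#   print(json.dumps(fetch_via_proxy("https://finance.yahoo.com/quote/AAPL/"), indent=2))
```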

As an MCP Server (Claude Desktop, Cursor, OpenClaw)

This is the plug-and-play option. Your AI agent discovers the tools automatically and uses them on its own.

Install the MCP dependency:

source .venv/bin/activate
pip install mcp

Claude Desktop: Add to your config file

Mac: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "token-enhancer": {
      "command": "python3",
      "args": ["/FULL/PATH/TO/token-enhancer/mcp_server.py"]
    }
  }
}

Replace /FULL/PATH/TO/ with the actual path to your clone.

Cursor: Add to .cursor/mcp.json in your project:

{
  "mcpServers": {
    "token-enhancer": {
      "command": "python3",
      "args": ["/FULL/PATH/TO/token-enhancer/mcp_server.py"]
    }
  }
}
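Both clients take the same JSON shape, so the entry can be generated rather than edited by hand. A small sketch that fills in the absolute path for you (mcp_config is an illustrative helper, not something shipped with the project):

```python
import json
from pathlib import Path


def mcp_config(clone_dir: str) -> dict:
    """Return the mcpServers entry with an absolute path to mcp_server.py."""
    server = Path(clone_dir).expanduser().resolve() / "mcp_server.py"
    return {
        "mcpServers": {
            "token-enhancer": {
                "command": "python3",
                "args": [str(server)],
            }
        }
    }


# Run from inside your clone and paste the output into the config file.
print(json.dumps(mcp_config("."), indent=2))
```

If the config file already lists other servers, merge the "token-enhancer" entry into the existing "mcpServers" object instead of overwriting the file.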

Once connected, your agent gets three tools:

fetch_clean: fetches any URL and returns clean text (86 to 99% smaller)

fetch_clean_batch: fetches multiple URLs at once

refine_prompt: optional prompt cleanup, shows both versions so you decide

As a LangChain Tool

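from langchain.tools import tool
import requests

@tool
def fetch_clean(url: str) -> str:
    """Fetch a URL and return clean text with HTML noise removed."""
    # The proxy started with `python3 proxy.py` must be listening on port 8080.
    r = requests.post("http://localhost:8080/fetch", json={"url": url}, timeout=30)
    r.raise_for_status()  # surface proxy errors instead of failing on a missing key
    return r.json()["content"]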

Add fetch_clean to your agent's tool list. Start python3 proxy.py first.

Features

Data Proxy (Layer 2): Fetches any URL, strips HTML/JSON noise, returns clean text. Caches results so repeat fetches are instant. Handles HTML, JSON, and plain text.
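To make the strip-and-cache idea concrete, here is a minimal sketch using only the standard library. It is not the project's actual implementation: the list of skipped tags, the in-memory dict cache, and the fetch callable are all assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Tags whose contents are dropped entirely in this sketch (an assumption,
# not the project's real rule set).
SKIP_TAGS = {"script", "style", "noscript", "nav", "header", "footer", "svg"}


class TextExtractor(HTMLParser):
    """Collect visible text while ignoring everything inside SKIP_TAGS."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(" ".join(data.split()))

    def text(self) -> str:
        return "\n".join(self._chunks)


def strip_html(html: str) -> str:
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


_cache: dict[str, str] = {}


def clean_cached(url: str, fetch) -> str:
    """Return cleaned text for url, calling fetch(url) only on a cache miss."""
    if url not in _cache:
        _cache[url] = strip_html(fetch(url))
    return _cache[url]
```

Here fetch is any callable that takes a URL and returns the raw HTML as a string, so the second request for the same URL never touches the network.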

Prompt Refiner (Layer 1, opt-in): Strips filler words and hedging while protecting tickers, dates, money values, negations, and conversation references. You see both versions and choose.
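A rough sketch of how protected tokens can survive filler removal. The filler list and the regular expressions below are hypothetical, since the project's real rules are not shown in this README, and conversation references are not handled here.

```python
import re

# Hypothetical filler list; the project's real list is not shown in this README.
FILLERS = {"basically", "just", "really", "actually", "so", "maybe", "perhaps", "kindly"}

# A token matching any of these patterns is never removed (illustrative patterns).
PROTECTED = [
    re.compile(r"^\$?[A-Z]{1,5}$"),                          # tickers such as AAPL or SO
    re.compile(r"^\d{4}-\d{2}-\d{2}$"),                      # ISO dates
    re.compile(r"^\$\d[\d,]*(\.\d+)?[kKmMbB]?$"),            # money values such as $1,200.50
    re.compile(r"^(no|not|never|\w+n't)$", re.IGNORECASE),   # negations
]


def refine(prompt: str) -> dict:
    """Drop filler tokens unless protected; return both versions."""
    kept = []
    for token in prompt.split():
        core = token.strip(".,;:!?")  # compare without surrounding punctuation
        is_protected = any(p.match(core) for p in PROTECTED)
        if core.lower() in FILLERS and not is_protected:
            continue
        kept.append(token)
    return {"original": prompt, "refined": " ".join(kept)}
```

For example, refine("So basically, is SO really not above $45.10 since 2024-01-05?") keeps the ticker SO, the negation, the money value, and the date, while dropping "So", "basically," and "really". Returning both strings matches the idea that you see both versions and choose.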

MCP Server: Plug into Claude Desktop, Cursor, OpenClaw, or any MCP client. Agent discovers the tools and uses them automatically.

API Endpoints (proxy mode)

Endpoint       Method   Description
/fetch         POST     Fetch URL, strip noise, return clean data
/fetch/batch   POST     Fetch multiple URLs at once
/refine        POST     Opt-in prompt refinement
/stats         GET      Session statistics

Run Tests

python3 test_all.py           # Layer 1 only (offline)
python3 test_all.py --live    # Layer 1 + Layer 2 (needs internet)

Roadmap

  • Layer 1: Prompt refiner
  • Layer 2: Data proxy with caching
  • MCP server integration
  • LangChain tool example
  • Browser fallback (Playwright) for bot-blocked sites
  • Authenticated session management
  • Layer 3: Output/history compression
  • CLI tool
  • Dashboard UI

Requirements

Python 3.10+. No API keys. No GPU.

License

MIT

