LLM cost-optimization SDK wrapper: caching, routing, and cost tracking with one added line

TokenShield

Cut your LLM API costs by up to 93% on routed requests. One line of code. Minimal quality loss (8.3 → 8.0 out of 10 in the mixed-task benchmark below).

TokenShield is a Python SDK wrapper that sits between your code and LLM providers (Anthropic, OpenAI, etc.), automatically optimizing every API call through intelligent caching and routing.


Quick Start

pip install tokenshield

import anthropic
from tokenshield import shield

# Wrap your client — that's it
client = shield(anthropic.Anthropic())

# Use it exactly as before
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, world!"}],
)

Your existing code stays the same. TokenShield optimizes behind the scenes.


How It Works

Your Code
  │
  ▼
┌─────────────────────────┐
│      TokenShield        │
│                         │
│  1. Cache Check         │  ← Same query? Return instantly ($0)
│  2. Smart Routing       │  ← Simple task? Route to DeepSeek (10x cheaper)
│  3. Cost Tracking       │  ← Dashboard shows exactly how much you saved
│                         │
└─────────────────────────┘
  │
  ▼
Claude / GPT / DeepSeek

What gets optimized:

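| Optimization  | How                                                    | Savings      |
|---------------|--------------------------------------------------------|--------------|
| Cache         | Identical queries return cached responses              | 100% per hit |
| Smart Routing | Simple tasks (translation, summarization) → DeepSeek V3 | ~93%         |
| Cost Tracking | See exactly where your money goes                      | Awareness    |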

What stays the same:

  • Your code (1 line change only)
  • API response format (100% compatible)
  • Quality for complex tasks (routed to original model)
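
The three steps in the diagram above can be sketched as a thin wrapper around two callables. This is an illustrative sketch only, not TokenShield's actual implementation: `ShieldSketch`, `fake_original`, `fake_cheap`, and the `is_simple` predicate are hypothetical names, and the real library may structure this differently.

```python
import json
from typing import Any, Callable, Dict


class ShieldSketch:
    """Hypothetical stand-in: cache check, then routing, then tracking."""

    def __init__(
        self,
        call_original: Callable[..., Any],
        call_cheap: Callable[..., Any],
        is_simple: Callable[[Dict[str, Any]], bool],
    ) -> None:
        self._call_original = call_original
        self._call_cheap = call_cheap
        self._is_simple = is_simple
        self._cache: Dict[str, Any] = {}
        self.stats = {"cache_hits": 0, "routed": 0, "original": 0}

    def create(self, **request: Any) -> Any:
        # 1. Cache check: identical requests serialize to the same key.
        key = json.dumps(request, sort_keys=True, default=str)
        if key in self._cache:
            self.stats["cache_hits"] += 1
            return self._cache[key]

        # 2. Routing: simple requests go to the cheaper backend.
        if self._is_simple(request):
            self.stats["routed"] += 1
            response = self._call_cheap(**request)
        else:
            self.stats["original"] += 1
            response = self._call_original(**request)

        # 3. Tracking is the stats dict above; store the response for next time.
        self._cache[key] = response
        return response


# Fake backends so the sketch runs without any API key.
def fake_original(**request: Any) -> Dict[str, str]:
    return {"served_by": "original"}


def fake_cheap(**request: Any) -> Dict[str, str]:
    return {"served_by": "cheap"}


sketch = ShieldSketch(fake_original, fake_cheap, is_simple=lambda req: "tools" not in req)
request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, world!"}],
}
first = sketch.create(**request)                              # routed to the cheap backend
second = sketch.create(**request)                             # served from the cache
third = sketch.create(tools=[{"name": "lookup"}], **request)  # stays on the original
print(sketch.stats)
```

Because the wrapper returns whatever the backend returned, the caller sees the same response shape whether the answer came from the cache, the cheaper model, or the original one.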

Works With Every Provider

import anthropic
import openai
from tokenshield import shield

# Anthropic (Claude)
client = shield(anthropic.Anthropic())

# OpenAI (GPT-4o)
client = shield(openai.OpenAI())

# Groq
client = shield(openai.OpenAI(base_url="https://api.groq.com/openai/v1", api_key="..."))

# Any OpenAI-compatible API
client = shield(openai.OpenAI(base_url="https://...", api_key="..."))

Configuration

import anthropic
from tokenshield import shield, ShieldConfig

client = shield(anthropic.Anthropic(), config=ShieldConfig(
    # Routing
    enable_routing=True,
    deepseek_api_key="sk-xxx",      # Enable DeepSeek routing (optional)
    routing_threshold=0.5,           # 0 = route everything, 1 = route nothing

    # Cache
    enable_cache=True,
    cache_ttl=300,                   # Cache lifetime in seconds

    # Tracking
    enable_tracking=True,
    tracking_dir="~/.tokenshield",   # Where to store usage data
))

Without DeepSeek key

TokenShield works without a DeepSeek API key — you still get caching benefits. Add a DeepSeek key to unlock routing savings.
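
To make the `cache_ttl` setting concrete, here is a minimal sketch of an in-memory cache whose entries expire after a fixed number of seconds. `TTLCache` and `request_key` are hypothetical names for illustration; how TokenShield actually builds cache keys or stores entries is not documented here.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


def request_key(request: Dict[str, Any]) -> str:
    """Hash a request so identical queries map to the same key."""
    canonical = json.dumps(request, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float = 300,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, request: Dict[str, Any]) -> Optional[Any]:
        key = request_key(request)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return response

    def put(self, request: Dict[str, Any], response: Any) -> None:
        self._entries[request_key(request)] = (self._clock(), response)


# Demo with a fake clock so the example is deterministic.
now = [0.0]
cache = TTLCache(ttl_seconds=300, clock=lambda: now[0])
req = {"model": "claude-sonnet-4-6", "messages": [{"role": "user", "content": "Hello"}]}

cache.put(req, "cached response")
hit = cache.get(req)        # within the TTL
now[0] = 301.0
expired = cache.get(req)    # after the TTL has passed
print(hit, expired)
```

A shorter TTL trades hit rate for freshness; the 300 seconds used here simply mirrors the value in the configuration example.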


Dashboard

# CLI dashboard
python -m tokenshield.dashboard

# Web dashboard (Streamlit)
python -m tokenshield.dashboard --web

Example report:

╔══════════════════════════════════════════════╗
║         TokenShield  Savings Report          ║
╚══════════════════════════════════════════════╝

  Total Savings:    $127.50 (85.6%)
  Requests:         12,340
  Cache Hit Rate:   23.4%
  DeepSeek Routed:  52.1%
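
The report fields can be derived from per-request records. The sketch below assumes a hypothetical `UsageRecord` format; the layout of the files TokenShield actually writes to `~/.tokenshield/` is not specified in this README, and the numbers in the demo are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class UsageRecord:
    baseline_cost: float      # what the call would have cost on the original model
    actual_cost: float        # what was actually paid (0.0 on a cache hit)
    cache_hit: bool = False
    routed: bool = False


def summarize(records: Iterable[UsageRecord]) -> Dict[str, float]:
    """Aggregate records into the figures shown in a savings report."""
    items = list(records)
    total = len(items)
    baseline = sum(r.baseline_cost for r in items)
    actual = sum(r.actual_cost for r in items)
    saved = baseline - actual
    return {
        "requests": total,
        "savings_usd": saved,
        "savings_pct": 100 * saved / baseline if baseline else 0.0,
        "cache_hit_rate_pct": 100 * sum(r.cache_hit for r in items) / total if total else 0.0,
        "routed_pct": 100 * sum(r.routed for r in items) / total if total else 0.0,
    }


# Made-up records: one untouched call, one cache hit, two routed calls.
demo = [
    UsageRecord(baseline_cost=0.010, actual_cost=0.010),
    UsageRecord(baseline_cost=0.010, actual_cost=0.0, cache_hit=True),
    UsageRecord(baseline_cost=0.010, actual_cost=0.001, routed=True),
    UsageRecord(baseline_cost=0.010, actual_cost=0.001, routed=True),
]
report = summarize(demo)
print(report)
```

Note that savings are measured against cost while the hit and routing rates are measured against request counts, so the percentages need not line up with each other.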

Benchmarks

Tested on 10 real business tasks (translation, code generation, analysis, strategy):

| Metric  | Without Shield | With Shield |
|---------|----------------|-------------|
| Cost    | $0.098         | $0.054      |
| Quality | 8.3/10         | 8.0/10      |
| Savings | -              | 45%         |
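
As a quick check, the savings figure follows directly from the two cost values in the table. The helper name `savings_pct` is just for illustration.

```python
def savings_pct(cost_without: float, cost_with: float) -> float:
    """Percentage saved relative to the unoptimized cost."""
    return 100 * (1 - cost_with / cost_without)


def relative_drop_pct(before: float, after: float) -> float:
    """Relative decrease from before to after, as a percentage."""
    return 100 * (before - after) / before


print(f"{savings_pct(0.098, 0.054):.1f}%")      # 44.9%, reported as 45%
print(f"{relative_drop_pct(8.3, 8.0):.1f}%")    # 3.6% lower quality score
```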

DeepSeek V3 vs Claude Sonnet quality test (9 tasks):

| Metric         | Result     |
|----------------|------------|
| Success Rate   | 9/9 (100%) |
| Cost Reduction | 93%        |
| Quality        | Equivalent |

Routing Logic

TokenShield routes based on task complexity:

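| Complexity | Signals                                  | Routed to      |
|------------|------------------------------------------|----------------|
| Simple     | Translation, summarization, templates    | DeepSeek V3    |
| Medium     | Analysis, code generation                | DeepSeek V3    |
| Complex    | Strategy, multi-step reasoning, tool use | Original model |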

Tool use (tools= parameter) always stays on the original model for reliability.


How Routing Decides

The router uses lightweight heuristics (no API call needed):

  • Keywords: "translate" / "summarize" → simple
  • Tool use: Any tools defined → keep original
  • System prompt length: Long system prompts → keep original
  • Message complexity: Short, single-turn → simple

No LLM is called to make routing decisions, so routing adds no extra API calls or cost.
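
A rough sketch of what such heuristics could look like is shown below. The function name, the character thresholds, and the order in which the checks run are all assumptions made for illustration; the README does not state the real values or how `routing_threshold` feeds into the decision.

```python
from typing import Any, Dict

SIMPLE_KEYWORDS = ("translate", "summarize")  # keywords named in the list above
MAX_SYSTEM_PROMPT_CHARS = 2000                # hypothetical cutoff for a "long" system prompt
MAX_SIMPLE_MESSAGE_CHARS = 500                # hypothetical cutoff for a "short" message


def should_route_to_cheaper_model(request: Dict[str, Any]) -> bool:
    """Return True if the request looks simple enough to route away."""
    # Tool use: any tools defined means the original model is kept.
    if request.get("tools"):
        return False

    # System prompt length: long system prompts keep the original model.
    system = str(request.get("system") or "")
    if len(system) > MAX_SYSTEM_PROMPT_CHARS:
        return False

    messages = request.get("messages", [])
    user_text = " ".join(
        m["content"]
        for m in messages
        if m.get("role") == "user" and isinstance(m.get("content"), str)
    ).lower()

    # Keywords: translation and summarization requests count as simple.
    if any(keyword in user_text for keyword in SIMPLE_KEYWORDS):
        return True

    # Message complexity: short, single-turn requests count as simple.
    return len(messages) == 1 and len(user_text) <= MAX_SIMPLE_MESSAGE_CHARS


translate_req = {"messages": [{"role": "user", "content": "Translate this to French: hello"}]}
tool_req = dict(translate_req, tools=[{"name": "lookup"}])
long_system_req = dict(translate_req, system="x" * 5000)
multi_turn_req = {
    "messages": [
        {"role": "user", "content": "Draft a market entry plan."},
        {"role": "assistant", "content": "Here is a first draft."},
        {"role": "user", "content": "Now stress-test the assumptions."},
    ]
}

for name, req in [
    ("translate", translate_req),
    ("tools", tool_req),
    ("long system", long_system_req),
    ("multi-turn", multi_turn_req),
]:
    print(name, should_route_to_cheaper_model(req))
```

All checks are plain string and length operations on the request, which is why a decision like this needs no network call.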


Privacy & Security

  • Runs locally: TokenShield runs inside your Python process, and requests go only to the LLM providers you configure (including DeepSeek when routing is enabled)
  • No external servers: the cache is in-memory or a local Redis
  • No telemetry: usage data stays in ~/.tokenshield/
  • DeepSeek routing is optional: disable it with enable_routing=False

Requirements

  • Python 3.9+
  • anthropic and/or openai SDK
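  • No other required dependencies (the Streamlit web dashboard and the Redis cache presumably need those packages installed separately)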

License

MIT


Built by TSUNAGU Inc. — Battle-tested on 80 AI agents running 24/7.
