LLM cost-optimization SDK wrapper: add one line for caching, routing, and cost tracking

TokenShield

Cut your LLM API costs by up to 93% with one line of code and little to no quality loss (8.0 vs 8.3 out of 10 in the benchmark below).

TokenShield sits between your code and LLM providers, automatically optimizing every API call through intelligent routing, caching, and PII protection.


Quick Start

pip install tokenshield

import anthropic
from tokenshield import shield

client = shield(anthropic.Anthropic())

# Use it exactly as before — TokenShield optimizes behind the scenes
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, world!"}],
)

Two Ways to Use

Option A: SDK Wrapper (1 line change)

import anthropic
import openai
from tokenshield import shield

client = shield(anthropic.Anthropic())   # Anthropic
client = shield(openai.OpenAI())         # OpenAI

Option B: API Gateway (URL change only)

# 1. Start the local gateway (shell)
python -m tokenshield.gateway

# 2. Point your existing client at it (Python)
import openai

client = openai.OpenAI(
    base_url="http://localhost:8800/v1",
    api_key="your-key",
)

How It Works

Your Code
  |
  v
+---------------------------+
|       TokenShield         |
|                           |
|  1. PII Protection        |  <- Mask personal data before external routing
|  2. Cache Check           |  <- Same query? Return instantly ($0)
|  3. Smart Routing         |  <- Simple task? Route to cheaper model
|  4. Audit Log             |  <- Full compliance trail
|  5. Cost Tracking         |  <- See exactly what you saved
|                           |
+---------------------------+
  |
  v
Claude / GPT / DeepSeek / Gemini

Features

Smart Multi-Model Routing

Automatically routes requests to the cheapest model that can handle the task:

| Task Difficulty | Routed To | Cost (per 1M tokens) |
|---|---|---|
| Simple (translation, formatting) | DeepSeek V3 / Gemini Flash | $0.08-0.28 |
| Medium (analysis, code) | DeepSeek V3 | $0.28 |
| Complex (strategy, multi-step reasoning) | Original model | Full price |

Routing decisions use keyword analysis, message structure, code complexity, and confidence scoring, so no LLM call is needed to make them.
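As a rough illustration of routing without an LLM call, the sketch below scores a prompt from keywords, code presence, and length, then maps the score to a tier using the 0.3 / 0.7 thresholds shown in the Configuration section. The keyword lists, weights, tier names, and function names are hypothetical and are not TokenShield's actual heuristics.

```python
import re

# Hypothetical keyword lists; the real heuristics are not documented here.
EASY_HINTS = ("translate", "format", "summarize", "rewrite")
HARD_HINTS = ("strategy", "architecture", "step by step", "trade-off", "prove")


def difficulty_score(prompt: str) -> float:
    """Return a heuristic difficulty score in [0, 1] without calling an LLM."""
    text = prompt.lower()
    score = 0.5  # neutral starting point
    if any(hint in text for hint in EASY_HINTS):
        score -= 0.3
    if any(hint in text for hint in HARD_HINTS):
        score += 0.3
    # Function or class definitions suggest a code-related task.
    if re.search(r"\b(def|class)\s+\w+", text):
        score += 0.1
    # Long prompts nudge the score up slightly (capped at +0.2).
    score += min(len(text) / 4000, 0.2)
    return max(0.0, min(1.0, score))


def route(prompt: str, easy: float = 0.3, hard: float = 0.7) -> str:
    """Map the score to a tier: below `easy` is cheap, above `hard` keeps the original model."""
    score = difficulty_score(prompt)
    if score < easy:
        return "cheap"
    if score > hard:
        return "original"
    return "medium"
```

Because the score depends only on string operations, a decision like `route("Translate hello to Chinese")` costs no tokens and adds negligible latency.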

PII Auto-Removal

Personal data is detected and masked before routing to external models:

  • Email addresses
  • Phone numbers (Japanese / international)
  • Credit card numbers
  • API keys and tokens
  • Japanese addresses
  • IP addresses

Masking is reversible — PII is restored in the response before returning to your code.
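A minimal sketch of reversible masking, assuming placeholder substitution: each detected value is replaced by a token such as `<EMAIL_0>`, the mapping is kept locally, and the placeholders are swapped back in the model's response. The patterns, placeholder format, and function names (`mask`, `unmask`) are illustrative assumptions rather than TokenShield's implementation, and they cover only two of the PII types listed above.

```python
import re

# Deliberately simplified patterns; production detectors need far more care.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def mask(text: str):
    """Replace detected PII with placeholders; return (masked_text, mapping)."""
    mapping = {}
    for label, pattern in PATTERNS.items():
        def replace(match, label=label):
            placeholder = f"<{label}_{len(mapping)}>"
            mapping[placeholder] = match.group(0)  # remember the original value
            return placeholder
        text = pattern.sub(replace, text)
    return text, mapping


def unmask(text: str, mapping: dict) -> str:
    """Restore original values wherever a placeholder appears in the text."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text
```

Only the masked text would be sent to an external model; the mapping never leaves the local process, which is what makes restoring the values in the response possible.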

Semantic Cache

Goes beyond exact-match caching:

"Translate hello to Chinese"    -> cache hit
"translate hello to Chinese"    -> cache hit (case normalization)
"Translate hello to Chinese please" -> cache hit (polite form normalization)
"Write a Python server"         -> cache miss (different query)

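The hits above can be reproduced by normalizing the prompt before it is used as a cache key. The sketch below lowercases the text, strips a few politeness words and punctuation, collapses whitespace, and stores entries with a time-to-live. The word list, the `cache_key` function, and the `TTLCache` class are hypothetical; TokenShield's actual matching logic may differ (for example, it could use embeddings).

```python
import hashlib
import re
import time

# Hypothetical list of politeness words to ignore when building the key.
POLITE = re.compile(r"\b(please|kindly|thanks|thank you)\b")


def cache_key(prompt: str) -> str:
    """Normalize a prompt and hash it into a stable cache key."""
    text = POLITE.sub(" ", prompt.lower())
    text = re.sub(r"[^\w\s]", " ", text)  # drop punctuation
    text = " ".join(text.split())         # collapse whitespace
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class TTLCache:
    """In-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float = 300.0):
        self.ttl = ttl
        self._store = {}

    def get(self, prompt: str):
        entry = self._store.get(cache_key(prompt))
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            return None  # expired entries count as a miss
        return value

    def put(self, prompt: str, value) -> None:
        self._store[cache_key(prompt)] = (time.monotonic() + self.ttl, value)
```

The default `ttl` of 300 seconds mirrors the `cache_ttl=300` value in the Configuration section.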
Audit Log & Compliance

Every request is logged with full detail:

  • Timestamp, model, routing decision
  • PII detection results
  • Cost (original vs actual)
  • Latency
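For a sense of what one log entry might contain, here is a sketch of a record with the fields listed above, serialized as one JSON object per line. The `AuditRecord` class, its field names, and the JSON Lines format are assumptions made for illustration; the on-disk format TokenShield uses is not described here.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class AuditRecord:
    """Hypothetical shape of a single audit entry."""
    model_requested: str
    model_used: str
    routing_decision: str
    pii_types_found: list
    cost_original_usd: float
    cost_actual_usd: float
    latency_ms: float
    timestamp: float = field(default_factory=time.time)  # seconds since epoch


def to_json_line(record: AuditRecord) -> str:
    """Serialize a record as a single JSON line, suitable for an append-only log."""
    return json.dumps(asdict(record), ensure_ascii=False)
```

An append-only, line-oriented log is easy to filter by time window when building a report.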

Generate compliance reports:

from tokenshield.audit import AuditLogger

# 720 hours = 30 days
report = AuditLogger().generate_compliance_report(hours=720)

Dashboard

python -m tokenshield.dashboard          # CLI
python -m tokenshield.dashboard --web    # Streamlit Web UI

Works With Every Provider

import anthropic
import openai
from tokenshield import shield

# Anthropic
client = shield(anthropic.Anthropic())

# OpenAI
client = shield(openai.OpenAI())

# Groq, Mistral, Together, etc.
client = shield(openai.OpenAI(base_url="https://api.groq.com/openai/v1", api_key="..."))

Configuration

import anthropic
from tokenshield import shield, ShieldConfig

client = shield(anthropic.Anthropic(), config=ShieldConfig(
    # Routing
    enable_routing=True,
    deepseek_api_key="sk-xxx",           # Optional: enables DeepSeek routing
    routing_threshold_easy=0.3,          # Below = easy (route to cheap model)
    routing_threshold_hard=0.7,          # Above = hard (keep original)

    # PII
    enable_pii_removal=True,             # Mask personal data before routing

    # Cache
    enable_cache=True,
    cache_ttl=300,                       # Seconds

    # Tracking
    enable_tracking=True,
))

Without any API keys for alternative models, you still get caching and PII protection.


Benchmarks

Tested on 10 real business tasks:

| Metric | Without Shield | With Shield |
|---|---|---|
| Cost | $0.098 | $0.054 |
| Quality | 8.3/10 | 8.0/10 |
| Savings | - | 45% |
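The savings figure follows directly from the two cost values. A small helper (the name `savings_percent` is ours, not part of TokenShield) makes the arithmetic explicit:

```python
def savings_percent(cost_without: float, cost_with: float) -> float:
    """Percentage saved relative to the unoptimized cost."""
    if cost_without <= 0:
        raise ValueError("cost_without must be positive")
    return (cost_without - cost_with) / cost_without * 100


# (0.098 - 0.054) / 0.098 is about 44.9%, reported as 45% in the table.
print(f"{savings_percent(0.098, 0.054):.1f}%")
```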

DeepSeek V3 quality test (9 tasks):

| Metric | Result |
|---|---|
| Success Rate | 9/9 |
| Cost Reduction | 93% |

Privacy & Security

  • Cache, logs, and cost data stay in your environment; requests go only to the LLM providers you configure
  • PII is masked before any external routing
  • No telemetry — all data stays in ~/.tokenshield/
  • Full audit trail for ISMS/SOC2 compliance
  • DeepSeek routing is optional

Testing

pip install pytest
python -m pytest tests/ -v
# 118 tests, all passing

Requirements

  • Python 3.9+
  • anthropic and/or openai SDK

License

MIT


Built by TSUNAGU Inc. — Battle-tested on 80 AI agents running 24/7.
