LLM cost-optimization SDK wrapper: add one line for caching, routing, and cost tracking
TokenShield
Cut your LLM API costs by up to 93%. One line of code. Minimal quality loss (8.3 to 8.0 out of 10 in the benchmark below).
TokenShield sits between your code and LLM providers, automatically optimizing every API call through intelligent routing, caching, and PII protection.
Quick Start
```bash
pip install tokenshield
```

```python
import anthropic
from tokenshield import shield

client = shield(anthropic.Anthropic())

# Use it exactly as before; TokenShield optimizes behind the scenes
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, world!"}],
)
```
Two Ways to Use
Option A: SDK Wrapper (1 line change)
```python
import anthropic
import openai
from tokenshield import shield

client = shield(anthropic.Anthropic())  # Anthropic
client = shield(openai.OpenAI())        # OpenAI
```
Option B: API Gateway (URL change only)
```bash
python -m tokenshield.gateway
```

```python
import openai

# Point the OpenAI client at the local gateway
client = openai.OpenAI(
    base_url="http://localhost:8800/v1",
    api_key="your-key",
)
```
How It Works
```
Your Code
    |
    v
+---------------------------+
|        TokenShield        |
|                           |
|  1. PII Protection        |  <- Mask personal data before external routing
|  2. Cache Check           |  <- Same query? Return instantly ($0)
|  3. Smart Routing         |  <- Simple task? Route to cheaper model
|  4. Audit Log             |  <- Full compliance trail
|  5. Cost Tracking         |  <- See exactly what you saved
|                           |
+---------------------------+
    |
    v
Claude / GPT / DeepSeek / Gemini
```
Features
Smart Multi-Model Routing
Automatically routes requests to the cheapest model that can handle the task:
| Task Difficulty | Routed To | Cost (per 1M tokens) |
|---|---|---|
| Simple (translation, formatting) | DeepSeek V3 / Gemini Flash | $0.08-0.28 |
| Medium (analysis, code) | DeepSeek V3 | $0.28 |
| Complex (strategy, multi-step reasoning) | Original model | Full price |
Routing decisions are based on keyword analysis, message structure, code complexity, and confidence scoring, so no LLM call is needed to make them.
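Signals like these can be combined into a difficulty score without calling a model. The following is a minimal sketch, not TokenShield's actual router: the keyword lists, the weights, and the `difficulty_score` function are all hypothetical and chosen only to illustrate the idea. It also assumes each message's `content` is a plain string.

```python
import re

# Hypothetical keyword lists; TokenShield's real signals may differ.
EASY_HINTS = ("translate", "format", "summarize", "rewrite")
HARD_HINTS = ("strategy", "step by step", "trade-off", "architecture", "prove")


def difficulty_score(messages):
    """Return a heuristic difficulty score in [0, 1] for a chat request."""
    text = " ".join(m["content"] for m in messages).lower()
    score = 0.3  # assumed neutral starting point

    # Keyword analysis: hard hints raise the score, easy hints lower it.
    score += 0.25 * sum(hint in text for hint in HARD_HINTS)
    score -= 0.2 * sum(hint in text for hint in EASY_HINTS)

    # Message structure: longer, multi-turn conversations count as harder.
    score += min(len(messages) - 1, 4) * 0.05
    score += min(len(text) / 4000, 1.0) * 0.2

    # Code complexity: count definitions as a rough proxy.
    code_markers = len(re.findall(r"\b(def|class|function)\b", text))
    score += min(code_markers, 5) * 0.05

    # Clamp to [0, 1] so the score can be compared against thresholds.
    return max(0.0, min(1.0, score))


easy = difficulty_score([{"role": "user", "content": "Translate hello to Chinese"}])
hard = difficulty_score([{
    "role": "user",
    "content": "Design a pricing strategy and explain each trade-off step by step",
}])
print(f"easy={easy:.2f} hard={hard:.2f}")
```

Because the score is pure string processing, it adds negligible latency compared with asking a model to classify the request.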
PII Auto-Removal
Personal data is detected and masked before routing to external models:
- Email addresses
- Phone numbers (Japanese / international)
- Credit card numbers
- API keys and tokens
- Japanese addresses
- IP addresses
Masking is reversible — PII is restored in the response before returning to your code.
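Reversible masking can be pictured as a placeholder substitution with a lookup table. This is a minimal sketch for email addresses only, assuming a simplified regex and hypothetical helper names (`mask_pii`, `unmask_pii`); it is not TokenShield's internal implementation.

```python
import re

# Simplified email pattern for illustration; real PII detection needs more care.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_pii(text):
    """Replace each email with a placeholder; return (masked_text, mapping)."""
    mapping = {}

    def _replace(match):
        value = match.group(0)
        # Reuse the placeholder if the same value appears more than once.
        for placeholder, original in mapping.items():
            if original == value:
                return placeholder
        placeholder = f"<EMAIL_{len(mapping) + 1}>"
        mapping[placeholder] = value
        return placeholder

    return EMAIL_RE.sub(_replace, text), mapping


def unmask_pii(text, mapping):
    """Restore the original values in a model response."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


masked, mapping = mask_pii("Send the invoice to alice@example.com and cc bob@example.com.")
print(masked)  # Send the invoice to <EMAIL_1> and cc <EMAIL_2>.

# Pretend the model echoed a placeholder back in its reply.
reply = "I will email <EMAIL_1> today."
print(unmask_pii(reply, mapping))  # I will email alice@example.com today.
```

The mapping never leaves the process, so the external model only ever sees the placeholders.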
Semantic Cache
Goes beyond exact-match caching:

```
"Translate hello to Chinese"         -> cache hit
"translate hello to Chinese"         -> cache hit  (case normalization)
"Translate hello to Chinese please"  -> cache hit  (polite form normalization)
"Write a Python server"              -> cache miss (different query)
```
Audit Log & Compliance
Every request is logged with full detail:
- Timestamp, model, routing decision
- PII detection results
- Cost (original vs actual)
- Latency
Generate compliance reports:
```python
from tokenshield.audit import AuditLogger

# 720 hours = 30 days
report = AuditLogger().generate_compliance_report(hours=720)
```
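The logged fields listed in this section map naturally onto one structured record per request. The following sketch shows one possible shape as a JSON Lines entry; the `AuditRecord` class, its field names, and the `deepseek-v3` identifier are assumptions for illustration, not the on-disk format TokenShield uses.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now():
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class AuditRecord:
    """One audit entry per request; all field names here are hypothetical."""
    requested_model: str
    routed_model: str         # routing decision
    pii_types_found: list     # PII detection results, e.g. ["email"]
    original_cost_usd: float  # cost had the requested model been used
    actual_cost_usd: float    # cost of the model actually called
    latency_ms: float
    timestamp: str = field(default_factory=_utc_now)

    def to_json_line(self):
        """Serialize as one JSON Lines entry, including the derived saving."""
        data = asdict(self)
        data["saved_usd"] = round(self.original_cost_usd - self.actual_cost_usd, 6)
        return json.dumps(data, ensure_ascii=False)


# Illustrative values only, not measured results.
record = AuditRecord(
    requested_model="claude-sonnet-4-6",
    routed_model="deepseek-v3",
    pii_types_found=["email"],
    original_cost_usd=0.012,
    actual_cost_usd=0.001,
    latency_ms=840.0,
)
line = record.to_json_line()
print(line)
```

An append-only file of such lines is easy to filter by time window when building a report.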
Dashboard
```bash
python -m tokenshield.dashboard        # CLI
python -m tokenshield.dashboard --web  # Streamlit Web UI
```
Works With Every Provider
```python
import anthropic
import openai
from tokenshield import shield

# Anthropic
client = shield(anthropic.Anthropic())

# OpenAI
client = shield(openai.OpenAI())

# Groq, Mistral, Together, etc. (via the OpenAI client with a custom base_url)
client = shield(openai.OpenAI(base_url="https://api.groq.com/openai/v1", api_key="..."))
```
Configuration
```python
import anthropic
from tokenshield import shield, ShieldConfig

client = shield(anthropic.Anthropic(), config=ShieldConfig(
    # Routing
    enable_routing=True,
    deepseek_api_key="sk-xxx",    # Optional: enables DeepSeek routing
    routing_threshold_easy=0.3,   # Below = easy (route to cheap model)
    routing_threshold_hard=0.7,   # Above = hard (keep original)

    # PII
    enable_pii_removal=True,      # Mask personal data before routing

    # Cache
    enable_cache=True,
    cache_ttl=300,                # Seconds

    # Tracking
    enable_tracking=True,
))
```
Without any API keys for alternative models, you still get caching and PII protection.
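To make the two routing thresholds concrete, here is a small sketch of how a difficulty score between 0 and 1 could be mapped to a tier. The `classify` function is hypothetical, and the handling of scores exactly on a threshold is an assumption; only the default values 0.3 and 0.7 come from the configuration shown here.

```python
def classify(score, easy_threshold=0.3, hard_threshold=0.7):
    """Map a difficulty score in [0, 1] to a routing tier."""
    if not 0.0 <= easy_threshold <= hard_threshold <= 1.0:
        raise ValueError("expected 0 <= easy_threshold <= hard_threshold <= 1")
    if score < easy_threshold:
        return "easy"    # below the easy threshold: route to a cheaper model
    if score > hard_threshold:
        return "hard"    # above the hard threshold: keep the original model
    return "medium"      # in between (boundary values land here by assumption)


for s in (0.1, 0.5, 0.9):
    print(s, classify(s))
# 0.1 easy
# 0.5 medium
# 0.9 hard
```

Raising `routing_threshold_easy` sends more traffic to cheaper models; lowering `routing_threshold_hard` keeps more requests on the original model.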
Benchmarks
Tested on 10 real business tasks:
| Metric | Without Shield | With Shield |
|---|---|---|
| Cost | $0.098 | $0.054 |
| Quality | 8.3/10 | 8.0/10 |
| Savings | — | 45% |
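The savings figure follows directly from the two cost rows. A quick check in Python, using only the numbers from the table above:

```python
cost_without = 0.098  # USD, "Without Shield" column
cost_with = 0.054     # USD, "With Shield" column

# Relative saving = (old - new) / old
savings = (cost_without - cost_with) / cost_without
print(f"Savings: {savings:.1%}")  # Savings: 44.9% (reported as 45%)

quality_drop = 8.3 - 8.0
print(f"Quality change: -{quality_drop:.1f} points out of 10")
```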
DeepSeek V3 quality test (9 tasks):
| Metric | Result |
|---|---|
| Success Rate | 9/9 |
| Cost Reduction | 93% |
Privacy & Security
- No data leaves your environment other than the requests sent to the LLM providers you configure
- PII is masked before any external routing
- No telemetry: all data stays in ~/.tokenshield/
- Full audit trail for ISMS/SOC2 compliance
- DeepSeek routing is optional
Testing
```bash
pip install pytest
python -m pytest tests/ -v
# 118 tests, all passing
```
Requirements
- Python 3.9+
- anthropic and/or openai SDK
License
MIT
Built by TSUNAGU Inc. — Battle-tested on 80 AI agents running 24/7.