🪙 Tokenette
The Ultimate All-in-One AI Coding Enhancement MCP
Zero-Loss Token Optimization · Intelligent Model Routing · Quality Amplification
Tokenette makes any AI model deliver GPT-4.5-level quality at GPT-4o cost. It achieves 90-99% token reduction without sacrificing code quality through intelligent caching, semantic compression, and dynamic tool discovery.
✨ Key Features
🧠 Intelligent Model Routing
- Complexity Detection: Automatically routes tasks to the cheapest model that can handle them
- Budget Tracking: Tracks premium request usage (300/month for Pro)
- Auto-Mode Discount: Exploits 10% discount when using auto model selection
- Adaptive Learning: Improves routing based on past interactions
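The routing idea can be sketched roughly as follows. The tier table, keyword scoring, and thresholds here are invented for illustration and are not Tokenette's actual heuristics:

```python
from dataclasses import dataclass

# Hypothetical cost tiers echoing the multipliers in the cost table below;
# model names and score thresholds are illustrative assumptions.
MODEL_TIERS = [
    ("gpt-4.1", 0.0, 3),            # free tier: simple tasks
    ("o4-mini", 0.33, 6),           # cheap reasoning: moderate tasks
    ("claude-sonnet-4", 1.0, 10),   # premium: complex tasks
]

@dataclass
class Decision:
    model: str
    multiplier: float

def route(task: str, affected_files: int = 1) -> Decision:
    """Score task complexity crudely, then pick the cheapest capable tier."""
    keywords = ("refactor", "architecture", "security", "concurrency")
    score = affected_files + 2 * sum(k in task.lower() for k in keywords)
    for model, multiplier, max_score in MODEL_TIERS:
        if score <= max_score:
            return Decision(model, multiplier)
    # Nothing cheap is capable enough: fall back to the top tier.
    return Decision(*MODEL_TIERS[-1][:2])

print(route("fix typo in README").model)                 # cheap tier
print(route("refactor authentication module", 5).model)  # premium tier
```

The key design point is monotonic escalation: a task only pays a premium multiplier when every cheaper tier's capability threshold is exceeded.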
📦 Token Optimization Stack
- Multi-Layer Cache (L1-L4): 99.8% savings on repeated data
- Minification Engine: 20-61% savings (JSON/Code/TOON formats)
- Semantic Compression: 30-50% additional savings
- Cross-File Deduplication: 40-60% savings on shared code
✨ Quality Amplification
- Expert Role Framing: Makes cheap models think like senior engineers
- Chain-of-Thought Injection: Adds reasoning steps automatically
- Few-Shot Examples: Injects category-specific examples
- Structured Output: Enforces consistent response formats
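The boosters above amount to prepending prompt scaffolding before the user's request. A toy version, with template text invented for this sketch:

```python
# Booster names echo the README; the template strings are assumptions.
BOOSTERS = {
    "expert_role_framing": (
        "You are a senior software engineer with 15 years of experience."
    ),
    "chain_of_thought_injection": (
        "Think through the problem step by step before writing code."
    ),
}

def amplify(prompt: str, boosters: list[str]) -> str:
    """Prepend each requested booster's scaffolding to the raw prompt."""
    parts = [BOOSTERS[b] for b in boosters if b in BOOSTERS]
    parts.append(prompt)
    return "\n\n".join(parts)

enhanced = amplify(
    "Write a user authentication service",
    ["expert_role_framing", "chain_of_thought_injection"],
)
print(enhanced)
```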
🔧 Smart File Operations
- AST-Based Reading: Extract structure without full content
- Diff-Based Writing: 97% savings vs. full file rewrites
- Batch Operations: Combine multiple reads/writes into one request
- Semantic Search: Find code by meaning, not just text
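Diff-based writing can be illustrated with the standard library's `difflib`. The exact savings depend on the edit size; this sketch only shows why a patch for a small change is far smaller than retransmitting the whole file:

```python
import difflib

def to_diff(old: str, new: str, path: str) -> str:
    """Emit a unified diff instead of the full rewritten file contents."""
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )

# A one-line edit to a 200-line file: the patch carries only the hunk.
old = "\n".join(f"line {i}" for i in range(200)) + "\n"
new = old.replace("line 100", "line 100  # patched")
patch = to_diff(old, new, "big_file.py")
print(len(patch), "bytes of patch vs", len(new), "bytes of full file")
```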
📚 Context7 Integration
- Up-to-Date Docs: Fetches current library documentation
- Intelligent Caching: Caches docs with appropriate TTLs
- Semantic Search: Find relevant docs across libraries
📊 Real Model Costs (GitHub Copilot Pro)
| Model | Multiplier | Effective Uses/Month | Best For |
|---|---|---|---|
| GPT-5 mini | 0× (FREE) | ∞ | Quick edits, prototyping |
| GPT-4.1 | 0× (FREE) | ∞ | General coding, boilerplate |
| GPT-4o | 0× (FREE) | ∞ | Multimodal, general tasks |
| Gemini 2.0 Flash | 0.25× | 1,200 | Speed-critical tasks |
| o4-mini | 0.33× | 900 | Cost-efficient reasoning |
| Claude Sonnet 4 | 1× (0.9× auto) | 300 | Complex logic, multi-file |
| Gemini 2.5 Pro | 1× | 300 | Large context, architecture |
| Claude Opus 4.5 | 3× | 100 | Critical reasoning |
| Claude Opus 4 | 10× | 30 | Expert-level tasks |
| GPT-4.5 | 50× | 6 | AVOID |
🚀 Quick Start
Installation
```bash
# Clone the repository
git clone https://github.com/yourusername/tokenette.git
cd tokenette

# Install with pip
pip install -e .

# Or with uv (recommended)
uv pip install -e .
```
Basic Usage
```bash
# Start the MCP server (stdio transport - default)
tokenette run

# Start with SSE transport
tokenette run --transport sse --port 8000

# View metrics
tokenette metrics

# Analyze code
tokenette analyze src/
```
VS Code / GitHub Copilot Integration
Add to your `mcp.json` or settings:
```json
{
  "mcpServers": {
    "tokenette": {
      "command": "tokenette",
      "args": ["run"]
    }
  }
}
```
Python API
```python
from tokenette import mcp, TaskRouter, QualityAmplifier

# Get optimal model for a task
router = TaskRouter()
decision = router.route("refactor authentication module", {"affected_files": 5})
print(f"Use: {decision.model} ({decision.multiplier}×)")

# Amplify a prompt for cheaper models
amplifier = QualityAmplifier()
result = amplifier.amplify(
    "Write a user authentication service",
    boosters=["expert_role_framing", "chain_of_thought_injection"],
    category="generation",
    context={},
)
print(result.enhanced_prompt)
```
🛠️ Available Tools
Meta Tools (Start Here!)
- `tokenette_discover_tools` - List available tools efficiently (96% token savings)
- `tokenette_get_tool_details` - Get full schema for a specific tool
File Operations
- `tokenette_read_file` - Smart file reading with multiple strategies
- `tokenette_write_file` - Diff-based file writing (97% savings)
- `tokenette_search_code` - Semantic code search
- `tokenette_get_structure` - AST-based file structure
- `tokenette_batch_read` - Read multiple files with deduplication
Code Analysis
- `tokenette_analyze` - Full code analysis (complexity, security, style)
- `tokenette_find_bugs` - Bug and security issue detection
- `tokenette_complexity` - Cyclomatic complexity metrics
Documentation (Context7)
- `tokenette_resolve_lib` - Resolve library names to Context7 IDs
- `tokenette_get_docs` - Fetch library documentation
- `tokenette_search_docs` - Search across library docs
Optimization
- `tokenette_optimize` - Apply full optimization pipeline
- `tokenette_route_task` - Get model routing recommendation
- `tokenette_amplify` - Enhance prompts for cheaper models
- `tokenette_metrics` - View session statistics
⚙️ Configuration
Create `.tokenette.json` in your project root:
```json
{
  "cache": {
    "l1_max_size": 104857600,
    "l1_ttl": 1800,
    "l2_max_size": 2147483648,
    "l2_ttl": 14400,
    "l2_path": ".tokenette/cache/l2"
  },
  "router": {
    "monthly_budget": 300,
    "prefer_free_models": true,
    "auto_mode_discount": 0.1
  },
  "compression": {
    "min_size": 1000,
    "quality_threshold": 0.95
  }
}
```
Or use environment variables:
```bash
export TOKENETTE_CACHE__L1_MAX_SIZE=104857600
export TOKENETTE_ROUTER__MONTHLY_BUDGET=300
```
📁 Project Structure
```text
tokenette/
├── src/tokenette/
│   ├── __init__.py        # Package exports
│   ├── config.py          # Pydantic configuration
│   ├── server.py          # FastMCP server
│   ├── cli.py             # Typer CLI
│   ├── core/
│   │   ├── cache.py       # Multi-layer cache (L1-L4)
│   │   ├── minifier.py    # JSON/Code/TOON minification
│   │   ├── compressor.py  # Semantic compression
│   │   ├── optimizer.py   # Full pipeline orchestrator
│   │   ├── router.py      # Task routing engine
│   │   └── amplifier.py   # Quality amplification
│   └── tools/
│       ├── meta.py        # Dynamic tool discovery
│       ├── file_ops.py    # File operations
│       ├── analysis.py    # Code analysis
│       └── context7.py    # Documentation integration
├── tests/
├── pyproject.toml
└── README.md
```
🔬 How It Works
The Three Pillars
- Route Right: Assign tasks to the cheapest model that can handle them
- Amplify Low: Make free/cheap models produce premium-quality output
- Shrink Everything: Minify, compress, cache, batch, deduplicate
Token Optimization Pipeline
```text
Input Data
    ↓
┌─────────────────────────────────────┐
│ Stage 1: Cache Check (L1→L4)        │ 99.8% savings on cache hit
└─────────────────────────────────────┘
    ↓ (cache miss)
┌─────────────────────────────────────┐
│ Stage 2: Minification               │ 20-61% savings
│ • JSON → compact                    │
│ • Code → remove comments/blanks     │
│ • Arrays → TOON format              │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│ Stage 3: Deduplication              │ 40-60% savings
│ • Remove repeated structures        │
│ • Cross-file shared code refs       │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│ Stage 4: Reference Extraction       │ 20-40% savings
│ • Replace repeated objects w/ refs  │
│ • Nested object flattening          │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│ Stage 5: Semantic Compression       │ 30-50% savings
│ • Large text summarization          │
│ • Quality threshold: 0.95           │
└─────────────────────────────────────┘
    ↓
Optimized Output (cache & transmit)
```
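Stage 2 in miniature: re-serializing JSON without whitespace already recovers part of the quoted 20-61%. This snippet is illustrative only; Tokenette's minifier additionally strips code comments and converts arrays to TOON:

```python
import json

raw = """{
    "file": "auth.js",
    "functions": ["validate", "refresh"],
    "line_count": 120
}"""

# Compact re-serialization: drop all structural whitespace, keep all data.
minified = json.dumps(json.loads(raw), separators=(",", ":"))
savings = 1 - len(minified) / len(raw)
print(f"{len(raw)} → {len(minified)} chars ({savings:.0%} smaller)")
```

Because the transformation is a pure re-serialization, it is lossless: the parsed value round-trips exactly.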
TOON Format (Token-Optimized Object Notation)
For homogeneous arrays, TOON achieves 61% token savings:
Before (JSON):
```json
[{"file":"auth.js","func":"validate","line":45},
 {"file":"auth.js","func":"refresh","line":67}]
```
After (TOON):
```text
items[2]{file,func,line}:
auth.js,validate,45
auth.js,refresh,67
```
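A toy encoder for this shape makes the mechanism concrete: hoist the shared keys into a one-line header, then emit each record as bare comma-separated values. This is a sketch only, assuming homogeneous dicts and no commas or newlines inside values:

```python
def to_toon(name: str, items: list[dict]) -> str:
    """Serialize a homogeneous list of dicts in the TOON shape shown above."""
    keys = list(items[0])  # assume every item has the same keys, same order
    lines = [f"{name}[{len(items)}]{{{','.join(keys)}}}:"]
    lines += [",".join(str(item[k]) for k in keys) for item in items]
    return "\n".join(lines)

rows = [
    {"file": "auth.js", "func": "validate", "line": 45},
    {"file": "auth.js", "func": "refresh", "line": 67},
]
print(to_toon("items", rows))
```

The savings grow with row count: JSON repeats every key per element, while TOON pays for the key names exactly once.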
🧪 Development
```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy src/tokenette

# Linting
ruff check src/tokenette
```
📜 License
MIT License - see LICENSE for details.
🤝 Contributing
Contributions are welcome! Please read the contributing guidelines first.
Made with ❤️ for the AI coding community
"Make any model perform like GPT-4.5 quality at GPT-4o cost."
File details
Details for the file tokenette-2.0.0.tar.gz.

File metadata
- Download URL: tokenette-2.0.0.tar.gz
- Size: 76.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b9b8319d6c652e54b5ec4be1b8a3dae5bff233c91ae69979c7d8b689350fd63f |
| MD5 | 149104fa1c8d85fd63d0654c72a3d796 |
| BLAKE2b-256 | 1d9b02101fdff22d7e1cce2d60dd5496228194478d05481dc8999d1998dc4607 |
Provenance
The following attestation bundles were made for tokenette-2.0.0.tar.gz:

Publisher: publish.yml on itsmeadarsh2008/tokenette
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokenette-2.0.0.tar.gz
- Subject digest: b9b8319d6c652e54b5ec4be1b8a3dae5bff233c91ae69979c7d8b689350fd63f
- Sigstore transparency entry: 893719239
- Permalink: itsmeadarsh2008/tokenette@81434dbeb72116a01b43b5eae20002a55cafd0c7
- Branch / Tag: refs/heads/main
- Owner: https://github.com/itsmeadarsh2008
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@81434dbeb72116a01b43b5eae20002a55cafd0c7
- Trigger Event: workflow_dispatch
File details
Details for the file tokenette-2.0.0-py3-none-any.whl.
File metadata
- Download URL: tokenette-2.0.0-py3-none-any.whl
- Upload date:
- Size: 84.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1b61ef41f6bd51b640dabea27c1391c88b5ee231b94d391787e205115f0aa35e
|
|
| MD5 |
81b190abb1bef748917513337bdc19a3
|
|
| BLAKE2b-256 |
d46257a4fe5e4b88870d361d26cf7852a010a9e74cf43022f084acffc9fb2423
|
Provenance
The following attestation bundles were made for tokenette-2.0.0-py3-none-any.whl:

Publisher: publish.yml on itsmeadarsh2008/tokenette
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokenette-2.0.0-py3-none-any.whl
- Subject digest: 1b61ef41f6bd51b640dabea27c1391c88b5ee231b94d391787e205115f0aa35e
- Sigstore transparency entry: 893719288
- Permalink: itsmeadarsh2008/tokenette@81434dbeb72116a01b43b5eae20002a55cafd0c7
- Branch / Tag: refs/heads/main
- Owner: https://github.com/itsmeadarsh2008
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@81434dbeb72116a01b43b5eae20002a55cafd0c7
- Trigger Event: workflow_dispatch