
Project description

🪙 Tokenette

The Ultimate All-in-One AI Coding Enhancement MCP

Python 3.11+ · FastMCP · License: MIT

Zero-Loss Token Optimization · Intelligent Model Routing · Quality Amplification

Tokenette makes any AI model deliver GPT-4.5-level quality at GPT-4o cost. It achieves 90-99% token reduction without sacrificing code quality through intelligent caching, semantic compression, and dynamic tool discovery.

✨ Key Features

🧠 Intelligent Model Routing

  • Complexity Detection: Automatically routes tasks to the cheapest model that can handle them
  • Budget Tracking: Tracks premium request usage (300/month for Pro)
  • Auto-Mode Discount: Takes advantage of the 10% discount applied when auto model selection is used
  • Adaptive Learning: Improves routing based on past interactions
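
A minimal sketch of how this kind of routing works (illustrative only; score_complexity, the model table, and the thresholds below are assumptions, not Tokenette's internals):

# Hypothetical sketch: pick the cheapest model whose capability ceiling
# covers the task's estimated complexity.
MODELS = [
    ("gpt-4.1", 0.0, 0.4),           # (name, cost multiplier, complexity ceiling)
    ("gemini-2.0-flash", 0.25, 0.6),
    ("o4-mini", 0.33, 0.75),
    ("claude-sonnet-4", 1.0, 0.9),
    ("claude-opus-4", 10.0, 1.0),
]

def score_complexity(task: str, affected_files: int = 1) -> float:
    """Toy heuristic: longer descriptions and more files mean a harder task."""
    base = min(len(task.split()) / 50, 0.5)
    return min(base + 0.1 * affected_files, 1.0)

def route(task: str, affected_files: int = 1) -> tuple[str, float]:
    complexity = score_complexity(task, affected_files)
    for name, multiplier, ceiling in MODELS:
        if complexity <= ceiling:
            return name, multiplier      # cheapest model that qualifies
    name, multiplier, _ = MODELS[-1]     # fall back to the strongest model
    return name, multiplier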

📦 Token Optimization Stack

  • Multi-Layer Cache (L1-L4): 99.8% savings on repeated data (sketched after this list)
  • Minification Engine: 20-61% savings (JSON/Code/TOON formats)
  • Semantic Compression: 30-50% additional savings
  • Cross-File Deduplication: 40-60% savings on shared code
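
A stripped-down sketch of the layered-cache idea, two layers only (the class, paths, and defaults are hypothetical, not Tokenette's implementation):

# Hypothetical two-layer cache: an in-memory L1 over an on-disk L2, each
# with its own TTL; L2 hits are promoted back into L1.
# Keys are assumed to be filename-safe.
import json
import time
from pathlib import Path

class LayeredCache:
    def __init__(self, l2_dir: str = ".cache/l2", l1_ttl: int = 1800, l2_ttl: int = 14400):
        self.l1: dict[str, tuple[float, str]] = {}
        self.l1_ttl, self.l2_ttl = l1_ttl, l2_ttl
        self.l2_dir = Path(l2_dir)
        self.l2_dir.mkdir(parents=True, exist_ok=True)

    def get(self, key: str) -> str | None:
        hit = self.l1.get(key)
        if hit and time.time() - hit[0] < self.l1_ttl:
            return hit[1]                       # L1 hit: nothing recomputed
        path = self.l2_dir / f"{key}.json"
        if path.exists():
            stamp, value = json.loads(path.read_text())
            if time.time() - stamp < self.l2_ttl:
                self.l1[key] = (stamp, value)   # promote L2 hit into L1
                return value
        return None

    def put(self, key: str, value: str) -> None:
        stamp = time.time()
        self.l1[key] = (stamp, value)
        (self.l2_dir / f"{key}.json").write_text(json.dumps([stamp, value]))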

✨ Quality Amplification

  • Expert Role Framing: Makes cheap models think like senior engineers
  • Chain-of-Thought Injection: Adds reasoning steps automatically
  • Few-Shot Examples: Injects category-specific examples
  • Structured Output: Enforces consistent response formats
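
The idea in miniature (the strings and amplify helper below are hypothetical illustrations, not Tokenette's QualityAmplifier):

# Wrap a raw prompt with an expert role, a chain-of-thought instruction,
# and a structured-output contract before sending it to a cheap model.
EXPERT_ROLE = ("You are a senior software engineer with deep experience "
               "in secure, production-grade systems.")
CHAIN_OF_THOUGHT = "Before answering, reason step by step about edge cases and trade-offs."
OUTPUT_CONTRACT = "Respond with: 1) a short plan, 2) the code, 3) known limitations."

def amplify(prompt: str) -> str:
    return "\n\n".join([EXPERT_ROLE, CHAIN_OF_THOUGHT, prompt, OUTPUT_CONTRACT])

print(amplify("Write a user authentication service"))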

🔧 Smart File Operations

  • AST-Based Reading: Extract structure without full content
  • Diff-Based Writing: 97% savings vs. full file rewrites (see the sketch after this list)
  • Batch Operations: Combine multiple reads/writes into one request
  • Semantic Search: Find code by meaning, not just text
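
Why diff-based writing saves tokens, illustrated with the standard library (a sketch of the principle, not the tokenette_write_file implementation):

# Changing one line of a file only needs a small patch, not a full rewrite.
import difflib

old = ["def validate(token):\n", "    return token == SECRET\n"]
new = ["def validate(token):\n", "    return hmac.compare_digest(token, SECRET)\n"]

patch = difflib.unified_diff(old, new, fromfile="auth.py", tofile="auth.py")
print("".join(patch))   # a few diff lines instead of the whole file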

📚 Context7 Integration

  • Up-to-Date Docs: Fetches current library documentation
  • Intelligent Caching: Caches docs with appropriate TTLs
  • Semantic Search: Find relevant docs across libraries

📊 Real Model Costs (GitHub Copilot Pro)

Model              Multiplier       Effective Uses/Month   Best For
GPT-5 mini         0× (FREE)        Unlimited              Quick edits, prototyping
GPT-4.1            0× (FREE)        Unlimited              General coding, boilerplate
GPT-4o             0× (FREE)        Unlimited              Multimodal, general tasks
Gemini 2.0 Flash   0.25×            1,200                  Speed-critical tasks
o4-mini            0.33×            900                    Cost-efficient reasoning
Claude Sonnet 4    1× (0.9× auto)   300                    Complex logic, multi-file
Gemini 2.5 Pro     1×               300                    Large context, architecture
Claude Opus 4.5    3×               100                    Critical reasoning
Claude Opus 4      10×              30                     Expert-level tasks
GPT-4.5            50×              6                      AVOID
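
Effective uses per month are just the 300-request Pro budget divided by the multiplier: 300 ÷ 0.25 = 1,200 for Gemini 2.0 Flash, 300 ÷ 10 = 30 for Claude Opus 4.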

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/itsmeadarsh2008/tokenette.git
cd tokenette

# Install with pip
pip install -e .

# Or with uv (recommended)
uv pip install -e .

Basic Usage

# Start the MCP server (stdio transport - default)
tokenette run

# Start with SSE transport
tokenette run --transport sse --port 8000

# View metrics
tokenette metrics

# Analyze code
tokenette analyze src/

VS Code / GitHub Copilot Integration

Add to your mcp.json or settings:

{
  "mcpServers": {
    "tokenette": {
      "command": "tokenette",
      "args": ["run"]
    }
  }
}

Python API

from tokenette import mcp, TaskRouter, QualityAmplifier

# Get optimal model for a task
router = TaskRouter()
decision = router.route("refactor authentication module", {"affected_files": 5})
print(f"Use: {decision.model} ({decision.multiplier}×)")

# Amplify a prompt for cheaper models
amplifier = QualityAmplifier()
result = amplifier.amplify(
    "Write a user authentication service",
    boosters=["expert_role_framing", "chain_of_thought_injection"],
    category="generation",
    context={}
)
print(result.enhanced_prompt)

🛠️ Available Tools

Meta Tools (Start Here!)

  • tokenette_discover_tools - List available tools efficiently (96% token savings)
  • tokenette_get_tool_details - Get full schema for a specific tool

File Operations

  • tokenette_read_file - Smart file reading with multiple strategies
  • tokenette_write_file - Diff-based file writing (97% savings)
  • tokenette_search_code - Semantic code search
  • tokenette_get_structure - AST-based file structure
  • tokenette_batch_read - Read multiple files with deduplication

Code Analysis

  • tokenette_analyze - Full code analysis (complexity, security, style)
  • tokenette_find_bugs - Bug and security issue detection
  • tokenette_complexity - Cyclomatic complexity metrics
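
For reference, the simplest form of the metric behind tokenette_complexity, computed with the standard library (a simplified sketch, not the tool's implementation):

# Simplified McCabe metric: 1 + the number of decision points in the code.
import ast

DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

def cyclomatic(source: str) -> int:
    tree = ast.parse(source)
    count = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            count += 1
        elif isinstance(node, ast.BoolOp):   # each and/or adds a branch
            count += len(node.values) - 1
    return count

print(cyclomatic("def f(x):\n    if x > 0 and x < 10:\n        return x\n    return -x"))  # 3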

Documentation (Context7)

  • tokenette_resolve_lib - Resolve library names to Context7 IDs
  • tokenette_get_docs - Fetch library documentation
  • tokenette_search_docs - Search across library docs

Optimization

  • tokenette_optimize - Apply full optimization pipeline
  • tokenette_route_task - Get model routing recommendation
  • tokenette_amplify - Enhance prompts for cheaper models
  • tokenette_metrics - View session statistics

⚙️ Configuration

Create .tokenette.json in your project root (sizes are in bytes and TTLs in seconds; the values below give a 100 MB L1 cache with a 30-minute TTL and a 2 GB L2 cache with a 4-hour TTL):

{
  "cache": {
    "l1_max_size": 104857600,
    "l1_ttl": 1800,
    "l2_max_size": 2147483648,
    "l2_ttl": 14400,
    "l2_path": ".tokenette/cache/l2"
  },
  "router": {
    "monthly_budget": 300,
    "prefer_free_models": true,
    "auto_mode_discount": 0.1
  },
  "compression": {
    "min_size": 1000,
    "quality_threshold": 0.95
  }
}

Or use environment variables (a double underscore separates nesting levels):

export TOKENETTE_CACHE__L1_MAX_SIZE=104857600
export TOKENETTE_ROUTER__MONTHLY_BUDGET=300

📁 Project Structure

tokenette/
├── src/tokenette/
│   ├── __init__.py           # Package exports
│   ├── config.py             # Pydantic configuration
│   ├── server.py             # FastMCP server
│   ├── cli.py                # Typer CLI
│   ├── core/
│   │   ├── cache.py          # Multi-layer cache (L1-L4)
│   │   ├── minifier.py       # JSON/Code/TOON minification
│   │   ├── compressor.py     # Semantic compression
│   │   ├── optimizer.py      # Full pipeline orchestrator
│   │   ├── router.py         # Task routing engine
│   │   └── amplifier.py      # Quality amplification
│   └── tools/
│       ├── meta.py           # Dynamic tool discovery
│       ├── file_ops.py       # File operations
│       ├── analysis.py       # Code analysis
│       └── context7.py       # Documentation integration
├── tests/
├── pyproject.toml
└── README.md

🔬 How It Works

The Three Pillars

  1. Route Right: Assign tasks to the cheapest model that can handle them
  2. Amplify Low: Make free/cheap models produce premium-quality output
  3. Shrink Everything: Minify, compress, cache, batch, deduplicate

Token Optimization Pipeline

Input Data
    ↓
┌─────────────────────────────────────┐
│  Stage 1: Cache Check (L1→L4)      │  99.8% savings on cache hit
└─────────────────────────────────────┘
    ↓ (cache miss)
┌─────────────────────────────────────┐
│  Stage 2: Minification             │  20-61% savings
│  • JSON → compact                   │
│  • Code → remove comments/blanks    │
│  • Arrays → TOON format             │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Stage 3: Deduplication            │  40-60% savings
│  • Remove repeated structures       │
│  • Cross-file shared code refs      │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Stage 4: Reference Extraction      │  20-40% savings
│  • Replace repeated objects w/ refs │
│  • Nested object flattening         │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Stage 5: Semantic Compression      │  30-50% savings
│  • Large text summarization         │
│  • Quality threshold: 0.95          │
└─────────────────────────────────────┘
    ↓
Optimized Output (cache & transmit)
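
The same flow as a minimal sketch; only minification is fleshed out, the other stages are no-op placeholders, and none of these names are Tokenette's actual API:

import json

def minify(s: str) -> str:
    """Stage 2: compact JSON if the input parses, else strip blank lines."""
    try:
        return json.dumps(json.loads(s), separators=(",", ":"))
    except ValueError:
        return "\n".join(line for line in s.splitlines() if line.strip())

# Stages 3-5 as placeholders so the flow stays runnable end to end.
deduplicate = extract_references = semantic_compress = lambda s: s

def optimize(data: str, cache: dict[str, str]) -> str:
    key = str(hash(data))
    if key in cache:                  # Stage 1: a hit skips every later stage
        return cache[key]
    out = semantic_compress(extract_references(deduplicate(minify(data))))
    cache[key] = out                  # store so the next call is a cache hit
    return out

cache: dict[str, str] = {}
print(optimize('{"a": 1,  "b": 2}', cache))   # {"a":1,"b":2}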

TOON Format (Token-Optimized Object Notation)

For homogeneous arrays, TOON achieves 61% token savings:

Before (JSON):

[{"file":"auth.js","func":"validate","line":45},
 {"file":"auth.js","func":"refresh","line":67}]

After (TOON):

items[2]{file,func,line}:
auth.js,validate,45
auth.js,refresh,67
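
A toy encoder that reproduces the example above (a sketch of the format only; Tokenette's actual TOON serializer may handle more cases, such as escaping and nesting):

# Homogeneous dicts collapse to one header line plus CSV-style rows.
def to_toon(items: list[dict]) -> str:
    keys = list(items[0])
    header = f"items[{len(items)}]{{{','.join(keys)}}}:"
    rows = (",".join(str(item[k]) for k in keys) for item in items)
    return "\n".join([header, *rows])

data = [{"file": "auth.js", "func": "validate", "line": 45},
        {"file": "auth.js", "func": "refresh", "line": 67}]
print(to_toon(data))
# items[2]{file,func,line}:
# auth.js,validate,45
# auth.js,refresh,67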

🧪 Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy src/tokenette

# Linting
ruff check src/tokenette

📜 License

MIT License - see LICENSE for details.

🤝 Contributing

Contributions are welcome! Please read the contributing guidelines first.


Made with ❤️ for the AI coding community

"Make any model perform like GPT-4.5 quality at GPT-4o cost."

Download files

Download the file for your platform.

Source Distribution

tokenette-2.0.0.tar.gz (76.0 kB)

Built Distribution

tokenette-2.0.0-py3-none-any.whl (84.5 kB)

File details

Details for the file tokenette-2.0.0.tar.gz.

File metadata

  • Download URL: tokenette-2.0.0.tar.gz
  • Size: 76.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tokenette-2.0.0.tar.gz

Algorithm     Hash digest
SHA256        b9b8319d6c652e54b5ec4be1b8a3dae5bff233c91ae69979c7d8b689350fd63f
MD5           149104fa1c8d85fd63d0654c72a3d796
BLAKE2b-256   1d9b02101fdff22d7e1cce2d60dd5496228194478d05481dc8999d1998dc4607


Provenance

The following attestation bundles were made for tokenette-2.0.0.tar.gz:

Publisher: publish.yml on itsmeadarsh2008/tokenette


File details

Details for the file tokenette-2.0.0-py3-none-any.whl.

File metadata

  • Download URL: tokenette-2.0.0-py3-none-any.whl
  • Size: 84.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tokenette-2.0.0-py3-none-any.whl

Algorithm     Hash digest
SHA256        1b61ef41f6bd51b640dabea27c1391c88b5ee231b94d391787e205115f0aa35e
MD5           81b190abb1bef748917513337bdc19a3
BLAKE2b-256   d46257a4fe5e4b88870d361d26cf7852a010a9e74cf43022f084acffc9fb2423


Provenance

The following attestation bundles were made for tokenette-2.0.0-py3-none-any.whl:

Publisher: publish.yml on itsmeadarsh2008/tokenette

