
Open-source local-first context compression and token reduction pipeline for Claude Code with hybrid retrieval (BM25 + vectors), reranking, and AST-aware chunking.


Token Reducer

Cut Claude API costs by 90%+ with intelligent context compression

Claude Code Plugin · MIT License · Python 3.11+ · SQLite

The open-source alternative to expensive context management tools.

Easy Install · Features · Documentation · Contributing


The Problem

Every time you use Claude with a large codebase, you're paying for thousands of tokens that aren't relevant to your query. Most context management tools either:

  • Send everything (expensive)
  • Truncate blindly (loses important context)
  • Require heavy Language Servers (slow, resource-intensive)

The Solution

Token Reducer is a local-first, intelligent context compression pipeline that:

  • Reduces tokens by 90-98% while preserving semantic relevance
  • Runs entirely locally — no API calls, no data leaving your machine
  • Works in milliseconds — faster than Language Server alternatives
  • Understands code semantically — AST parsing, not just text matching
┌─────────────────┐     ┌───────────────┐     ┌──────────────────┐
│  Your Codebase  │────▶│ Token Reducer │────▶│  Compressed      │
│  (50,000 tokens)│     │   Pipeline    │     │  Context (500t)  │
└─────────────────┘     └───────────────┘     └──────────────────┘
                              │
                    ┌─────────┴─────────┐
                    │  - AST Chunking   │
                    │  - BM25 + Vector  │
                    │  - TextRank       │
                    │  - Import Graph   │
                    │  - 2-Hop Symbols  │
                    └───────────────────┘

Easy Install

Option 1 — Claude Code /plugin Command (Recommended)

Step 1: Register the marketplace (one-time setup):

/plugin marketplace add Madhan230205/token-reducer

This registers the marketplace as Madhan230205-token-reducer.

Step 2: Install:

/plugin install token-reducer@Madhan230205-token-reducer

For project-scoped install:

/plugin install token-reducer@Madhan230205-token-reducer --scope project

Already ran Step 1 before? Just run /plugin install token-reducer@Madhan230205-token-reducer — no need to add the marketplace again.


Option 2 — Git Clone (Manual)

# 1. Clone into your Claude plugins folder
git clone https://github.com/Madhan230205/token-reducer.git ~/.claude/plugins/token-reducer

# 2. Install dependencies (optional but recommended for best results)
pip install -r ~/.claude/plugins/token-reducer/requirements-optional.txt

Windows users: Replace ~/.claude/plugins/ with %USERPROFILE%\.claude\plugins\

Then open ~/.claude/settings.json and add:

{
  "plugins": ["~/.claude/plugins/token-reducer"]
}

Restart Claude Code. Done.


What requirements-optional.txt installs:

Package                          Purpose
sentence-transformers            Neural embeddings for smarter retrieval
hnswlib / faiss-cpu              Fast approximate nearest-neighbor search
tree-sitter + language grammars  AST-based code chunking (Python, JS, TS, Go, Rust, Java, C/C++, Ruby)

If you skip this step, Token Reducer still works using hash embeddings and regex chunking — no ML libraries required.
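To give a feel for what a zero-dependency hash embedding can look like, here is a minimal feature-hashing sketch in pure Python. The plugin's actual implementation may differ; the function names and the 256-bucket dimension are illustrative assumptions, not the project's API.

```python
import hashlib
import math
import re

def hash_embed(text: str, dim: int = 256) -> list[float]:
    """Map text to a fixed-size vector by hashing tokens into buckets.

    No ML libraries needed: each token's MD5 digest picks a bucket
    (and a sign), and the result is L2-normalized. Illustrative
    sketch only -- not the plugin's actual embedding code.
    """
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode()).digest()
        bucket = int.from_bytes(digest[:4], "little") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))
```

Hash embeddings are deterministic and fast, at the cost of treating words as opaque tokens; neural embeddings from sentence-transformers recover real semantic similarity.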


Option 3 — Zero-Dependency Quick Start

No pip, no ML libs — runs immediately after cloning:

git clone https://github.com/Madhan230205/token-reducer.git
cd token-reducer
python scripts/context_pipeline.py run \
  --inputs ./src \
  --query "Find auth logic" \
  --embedding-backend hash \
  --db .cache/index.db

Features

Core Pipeline

  • Hybrid Retrieval — BM25 + semantic vector search with intelligent fallback
  • AST-Based Chunking — Tree-sitter parsing for Python, TypeScript, Go, Rust, Java, and more
  • TextRank Compression — Graph-based sentence scoring for intelligent summarization
  • Sub-100ms Queries — SQLite FTS5 + HNSW indexes for instant results
  • Local-First — Everything runs on your machine, no external APIs
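The BM25 half of the pipeline rides on SQLite's built-in FTS5 extension. A minimal demo of that layer (the plugin's actual schema and column names may differ):

```python
import sqlite3

# In-memory demo of FTS5 + BM25 ranking. FTS5 ships with most
# Python builds of SQLite, so no extra install is needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
conn.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("auth.py", "def login(user, password): verify credentials and issue a token"),
        ("db.py", "def connect(): open a pooled database connection"),
        ("ui.py", "def render(): draw the login button"),
    ],
)

# bm25() assigns lower scores to better matches, so ORDER BY ascending.
rows = conn.execute(
    "SELECT path, bm25(chunks) FROM chunks "
    "WHERE chunks MATCH ? ORDER BY bm25(chunks) LIMIT 5",
    ("login OR credentials",),
).fetchall()
# auth.py matches both terms, so it ranks first.
```

Because this is plain SQLite, the index lives in a single file (the --db path) and queries stay in-process, which is where the sub-100ms latency comes from.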

LSP-Killer Features

  • Import Graph — Automatically maps file dependencies without Language Server
  • 2-Hop Symbol Expansion — Auto "go-to-definition" for referenced functions
  • Diff Protocol — SEARCH/REPLACE edit format with automatic application
  • Semantic Clustering — Groups similar chunks to avoid redundancy
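The exact wire format of the diff protocol isn't shown here, but SEARCH/REPLACE edit blocks are commonly applied along these lines. The marker syntax below is a hypothetical example, not necessarily the plugin's:

```python
import re

# Hypothetical SEARCH/REPLACE markers; the plugin's actual protocol
# (see scripts/apply_diff.py) may use a different format.
EDIT_RE = re.compile(
    r"<<<<<<< SEARCH\n(.*?)\n=======\n(.*?)\n>>>>>>> REPLACE",
    re.DOTALL,
)

def apply_edits(source: str, response: str) -> str:
    """Apply each SEARCH/REPLACE block found in a model response.

    Each SEARCH body must match the file verbatim exactly once;
    otherwise we refuse to edit rather than guess.
    """
    for search, replace in EDIT_RE.findall(response):
        if source.count(search) != 1:
            raise ValueError(f"SEARCH block not unique: {search!r}")
        source = source.replace(search, replace)
    return source
```

Requiring a unique verbatim match is what makes this style of edit safe to apply automatically: an ambiguous or stale SEARCH block fails loudly instead of corrupting the file.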

Enterprise Ready

  • Fully Configurable — 40+ tunable parameters in settings.json
  • Embedding Flexibility — ML models or hash fallback (zero dependencies)
  • Query Caching — Intelligent TTL-based caching for repeated queries
  • Session Memory — Tracks context across conversation turns
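TTL-based query caching can be as simple as a dict of expiry timestamps. A minimal sketch (the plugin's actual cache may differ in keying and eviction policy):

```python
import time

class TTLCache:
    """Minimal TTL cache of the kind query caching might use.

    Sketch only: entries expire after ttl_seconds and are evicted
    lazily on the next lookup.
    """

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: drop and report a miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
```

Repeated queries over an unchanged index return the cached context packet instead of re-running retrieval and compression.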

Documentation

How It Works

Query → FTS(BM25) → (Vector fallback if needed) → Merge → Top 5 → Compress

Full pipeline:

PREPROCESS → INDEX → RETRIEVE → RE-RANK → COMPRESS → CONTEXT PACKET
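The Merge step has to reconcile two differently-scaled rankings (BM25 scores vs. cosine similarities). One common technique for this is Reciprocal Rank Fusion, sketched below; the plugin's actual merge step may use a different scheme:

```python
def rrf_merge(bm25_hits: list[str], vector_hits: list[str],
              k: int = 60) -> list[str]:
    """Merge two ranked lists with Reciprocal Rank Fusion.

    Each list contributes 1/(k + rank) per document, so a document
    ranked well by both retrievers floats to the top without any
    score normalization. Illustrative sketch, not the plugin's code.
    """
    scores: dict[str, float] = {}
    for hits in (bm25_hits, vector_hits):
        for rank, doc in enumerate(hits, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF only looks at ranks, never raw scores, which is exactly what makes it robust when fusing lexical and vector retrieval.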

Basic Usage

# Index your codebase
python scripts/context_pipeline.py index --inputs ./src --db .cache/index.db

# Query with compression
python scripts/context_pipeline.py query \
  --query "How does authentication work?" \
  --db .cache/index.db \
  --json

# One-shot: index + query
python scripts/context_pipeline.py run \
  --inputs ./src \
  --query "Find the database connection logic" \
  --db .cache/index.db

Configuration

All settings in settings.json:

{
  "tokenReducer": {
    "chunkSizeWords": 220,
    "embeddingModel": "jinaai/jina-embeddings-v2-base-code",
    "hybridMode": "fallback",
    "astChunkingEnabled": true,
    "textRankEnabled": true,
    "lspFeatures": {
      "importGraphEnabled": true,
      "twoHopExpansionEnabled": true
    }
  }
}

Full Configuration Reference

Setting                 Default        Description
chunkSizeWords          220            Target words per chunk
embeddingBackend        "ml"           "ml" for neural, "hash" for zero-dep
embeddingModel          jina-v2-code   Code-optimized embeddings
hybridMode              "fallback"     "fallback" or "always" for vector search
astChunkingEnabled      true           Use tree-sitter AST parsing
textRankEnabled         true           Graph-based sentence scoring
importGraphEnabled      true           Track file dependencies
twoHopExpansionEnabled  true           Auto-expand referenced symbols
compressionWordBudget   350            Max words in compressed output

Zero-Dependency Mode

Run without any ML libraries:

python scripts/context_pipeline.py run \
  --inputs ./src \
  --query "Find auth logic" \
  --embedding-backend hash \
  --db .cache/index.db

Apply Code Edits

python scripts/apply_diff.py --input claude_response.txt --dir ./src
python scripts/apply_diff.py --input response.txt --dry-run

Architecture

Technology Stack

  • Storage: SQLite with FTS5 + custom embeddings table
  • Chunking: Tree-sitter AST parsing with regex fallback
  • Embeddings: Jina Code v2 (or zero-dependency hash embeddings)
  • ANN Search: HNSW via hnswlib (with FAISS fallback)
  • Compression: TextRank + query-relevance scoring
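For intuition, classic TextRank scores sentences by running power iteration over a word-overlap similarity graph. A self-contained sketch follows; the plugin's scorer also folds in query relevance, so treat this as the textbook algorithm rather than the project's exact code:

```python
import math
import re

def textrank_scores(sentences: list[str], damping: float = 0.85,
                    iterations: int = 30) -> list[float]:
    """Score sentences with TextRank (PageRank over sentence similarity).

    Edge weights are word overlap normalized by sentence length, per
    the original TextRank formulation. Illustrative sketch only.
    """
    token_sets = [set(re.findall(r"\w+", s.lower())) for s in sentences]
    n = len(sentences)

    def sim(i: int, j: int) -> float:
        overlap = len(token_sets[i] & token_sets[j])
        denom = (math.log(len(token_sets[i]) + 1)
                 + math.log(len(token_sets[j]) + 1))
        return overlap / denom if denom > 0 else 0.0

    weights = [[sim(i, j) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                out_weight = sum(weights[j])
                if weights[j][i] > 0 and out_weight > 0:
                    rank += weights[j][i] / out_weight * scores[j]
            new.append((1 - damping) / n + damping * rank)
        scores = new
    return scores
```

Sentences that overlap heavily with many others score high and survive compression; off-topic sentences fall below the word budget and are dropped.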

Repository Structure

token-reducer/
├── .claude-plugin/plugin.json
├── .mcp.json
├── .env.example
├── settings.json
├── requirements-optional.txt
├── scripts/
├── hooks/
├── commands/
├── agents/
├── skills/
└── evals/

Contributing

Contributions are welcome. Please see contribute.md for contribution guidelines.

git clone https://github.com/Madhan230205/token-reducer.git
cd token-reducer
pip install -e ".[dev]"
python scripts/context_pipeline.py self-test

License

MIT License — see LICENSE for details.


Star this repo if Token Reducer saves you money!

Report Bug · Request Feature · Discussions
