Token counting and budget management for MiniMax API

TokenCUNT

 _______    _                    _____  * 
 |__   __|  | |                  / ____|    
    | | ___ | | _____ _ __      | |        
    | |/ _ \| |/ / _ \ '_ \    | |        
    | | (_) |   <  __/ | | |   | |____    
    |_|\___/|_|\_\___|_| |_|    \_____|

A smart, token-efficient AI layer for developers.
Stop burning credits. Start knowing exactly what you spend.

Why TokenCUNT?

Most AI tools are careless with your tokens — redundant calls, bloated context, zero visibility. TokenCUNT is the opposite:

Efficient by design — smart batching and context compression minimize wasted calls
Fully transparent — see exactly how many tokens every operation costs, before and after
You're in control — set budgets and limits per session or per task

Installation

# Install TokenCUNT
pip install tokencunt

# Install with CLI dependencies (recommended)
pip install "tokencunt[cli]"

# Or install in development mode
pip install -e ".[cli,dev]"

Requirements

Python 3.10+
MiniMax API key

VSCode Extension

Download and install the VSCode extension:

# Option 1: Install from VSIX file
code --install-extension tokencunt-vscode/tokencunt-0.1.0.vsix

# Option 2: Manual install
# 1. Open VSCode
# 2. Go to Extensions (Ctrl+Shift+X)
# 3. Click "..." menu → "Install from VSIX"
# 4. Select tokencunt-0.1.0.vsix

VSCode Extension Features:

Status bar with live token count
Inline hover hints for token estimation
Command palette integration
Budget alerts when approaching limit
Quick actions via Ctrl+Shift+P

Quick Start

1. Configure your API key

# Set environment variables
export MINIMAX_API_KEY="your-api-key"
export MINIMAX_GROUP_ID="your-group-id"

# Or create a config file
mkdir -p ~/.tokencunt
cat > ~/.tokencunt/config.yaml << EOF
api_key: "your-api-key"
group_id: "your-group-id"
model: "abab6.5-chat"
default_budget: 10000
EOF

2. Run the CLI

# Show logo and welcome
ts start

# Ask a question with full token tracking
ts ask "explain this function" --file main.py

# Dry run: see the cost before committing
ts ask "refactor this" --file app.py --dry-run

# Analyze a file and get suggestions
ts analyze --file main.py

# Run multiple tasks from a JSON file
ts batch --file tasks.json

# View a usage report for the current session
ts report

# Session management
ts session new
ts session list
ts session config --budget 5000

Commands

Command	Description
`ts start`	Show logo and welcome message
`ts ask "<prompt>" --file <file>`	Ask a question with token tracking
`ts ask ...` (no flags)	Ask a question directly
`ts ask ... --dry-run`	Preview token cost without API call
`ts analyze --file <file>`	Analyze a file for improvements
`ts analyze --file <file> --focus bugs`	Focus on specific area (bugs, performance, style, security)
`ts batch --file <json>`	Run multiple tasks from JSON file
`ts report`	Show session usage breakdown
`ts report --format json`	JSON output for scripting
`ts session new`	Create a new session
`ts session list`	List all sessions
`ts session config --budget 5000`	Set token budget for session
`ts session clear`	Clear session data
`ts version`	Show version information

Phase 4: Advanced Features

Command	Description
`ts scan <path>`	Scan project for token estimation
`ts scan --extensions py,js --verbose`	Scan with specific extensions
`ts scan --ignore .tokencuntignore`	Scan with custom ignore file
`ts simulate --requests 1000 --tokens 500`	Simulate API costs
`ts simulate --scenario startup --model gpt-4`	Use pre-defined scenario
`ts simulate --users 100 --messages 50 --tokens 300`	User-based scenario
`ts diff original.txt optimized.txt`	Git-style prompt diff
`ts diff --stats`	Show only statistics
`ts optimize prompt.txt`	Optimize with AI + rules
`ts optimize prompt.txt --rules-only`	Rules-only optimization
`ts optimize --show-diff`	Show changes made

Global Options

Option	Description
`-q, --quiet`	Minimal output
`-v, --verbose`	Verbose output
`--json`	JSON output
`--debug`	Debug mode with traceback
`-y, --yes`	Skip confirmations

Example Output

$ ts ask "what does this function do?" --file utils.py

  Estimated tokens: 312
  ─────────────────────────────────────
  Response: This function takes a list and...
  ─────────────────────────────────────
  Tokens used:  input: 312  output: 89  total: 401
  Session total: 1,204 / 5,000 tokens used (24%)
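
The session line above is plain arithmetic: 1,204 of 5,000 tokens is about 24%. Here is a minimal sketch of how such a line could be produced. `SessionUsage` and its fields are hypothetical names used for illustration, not TokenCUNT's actual API.

```python
from dataclasses import dataclass


@dataclass
class SessionUsage:
    """Hypothetical running totals for one session (not TokenCUNT's real class)."""

    budget: int
    used: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> int:
        """Add one call's tokens to the session and return that call's total."""
        total = input_tokens + output_tokens
        self.used += total
        return total

    def summary(self) -> str:
        """Format the running total in the style of the example output."""
        percent = round(100 * self.used / self.budget) if self.budget else 0
        return f"Session total: {self.used:,} / {self.budget:,} tokens used ({percent}%)"


usage = SessionUsage(budget=5000, used=803)  # 803 tokens spent earlier in the session
call_total = usage.record(input_tokens=312, output_tokens=89)
print(call_total)       # 401
print(usage.summary())  # Session total: 1,204 / 5,000 tokens used (24%)
```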

Architecture

┌─────────────────────────────────────┐
│         User Interface              │
│   CLI Tool  │  VSCode Extension     │
└─────────────┬───────────────────────┘
              │
┌─────────────▼───────────────────────┐
│           Core Engine               │
│  - Token counter & tracker          │
│  - Smart batcher                    │
│  - Budget enforcer                  │
│  - Prompt optimizer                 │
└─────────────┬───────────────────────┘
              │
┌─────────────▼───────────────────────┐
│         MiniMax M2.5 API            │
└─────────────────────────────────────┘

The core engine is shared — CLI and VSCode extension are just interfaces on top of it.
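
To make the "Smart batcher" box concrete, here is one way small requests could be grouped under a per-call token cap. This is a greedy sketch under stated assumptions, not the code in `batcher.py`: the function name, the cap, and the word-count stand-in for a tokenizer are all hypothetical.

```python
from typing import Callable, List


def batch_prompts(
    prompts: List[str],
    max_tokens_per_batch: int,
    count_tokens: Callable[[str], int],
) -> List[List[str]]:
    """Greedily pack prompts, in order, into batches that stay under the cap."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_tokens = 0
    for prompt in prompts:
        n = count_tokens(prompt)
        # Start a new batch when adding this prompt would exceed the cap.
        if current and current_tokens + n > max_tokens_per_batch:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(prompt)
        current_tokens += n
    if current:
        batches.append(current)
    return batches


def word_count(text: str) -> int:
    """Stand-in token counter for the demo: one token per whitespace-separated word."""
    return len(text.split())


print(batch_prompts(["a b c", "d e", "f g h i", "j"], 5, word_count))
# [['a b c', 'd e'], ['f g h i', 'j']]
```

A prompt that is larger than the cap on its own still gets a batch to itself, so nothing is dropped silently.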

Project Structure

TokenCUNT/
├── src/tokencunt/
│   ├── __init__.py
│   ├── pyproject.toml
│   ├── config.py                 # Configuration management
│   └── core/
│       ├── __init__.py           # Core exports
│       ├── api_client.py         # MiniMax API calls with retry
│       ├── exceptions.py         # Custom exception classes
│       ├── token_counter.py     # Token counting (tiktoken)
│       ├── budget.py            # Budget enforcement & alerts
│       ├── batcher.py           # Combine small requests
│       ├── optimizer.py         # Compress & strip redundant context
│       └── session.py           # Track usage across sessions
├── cli/
│   ├── __init__.py
│   ├── app.py                   # Typer app entry point
│   ├── logo.py                  # ASCII logo
│   ├── exit_codes.py            # Exit codes
│   ├── formatters.py            # Rich output formatting
│   └── commands/
│       ├── __init__.py
│       ├── ask.py               # Ask command
│       ├── analyze.py           # Analyze command
│       ├── batch.py             # Batch command
│       ├── report.py            # Report command
│       └── session.py           # Session management
├── tests/
├── .planning/                    # GSD planning docs
├── ascii-art.txt               # Logo source
├── pyproject.toml              # Project config
└── README.md

Tech Stack

Tool	Role
Typer	CLI framework
Rich	Terminal output formatting
httpx	Async HTTP client
tiktoken	Token counting
Tenacity	Retry logic
Pydantic	Data validation

Configuration

Environment Variables

MINIMAX_API_KEY=your-api-key
MINIMAX_GROUP_ID=your-group-id

Config File

Create ~/.tokencunt/config.yaml:

api_key: "your-api-key"
group_id: "your-group-id"
model: "abab6.5-chat"
default_budget: 10000

Priority Order

Environment variables (highest priority)
Config file
Hardcoded defaults (lowest priority)
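
The priority order can be sketched as a layered merge: start from defaults, overlay the config file, then overlay environment variables. This is an illustration only. `resolve_config` is a hypothetical helper, the default values are assumed from the sample config above, and the file is passed in as an already-parsed dict to keep the sketch free of a YAML dependency.

```python
import os
from typing import Mapping, Optional

# Assumed defaults, taken from the sample config file; the real hardcoded values may differ.
DEFAULTS = {"model": "abab6.5-chat", "default_budget": 10000}

# Config keys that can be overridden from the environment.
ENV_KEYS = {"api_key": "MINIMAX_API_KEY", "group_id": "MINIMAX_GROUP_ID"}


def resolve_config(file_config: Mapping, env: Optional[Mapping] = None) -> dict:
    """Merge settings: defaults < config file < environment variables."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    config.update({k: v for k, v in file_config.items() if v is not None})
    for key, env_name in ENV_KEYS.items():
        if env.get(env_name):
            config[key] = env[env_name]
    return config


resolved = resolve_config(
    {"api_key": "from-file", "group_id": "file-group", "default_budget": 5000},
    env={"MINIMAX_API_KEY": "from-env"},
)
print(resolved["api_key"])         # from-env   (environment wins)
print(resolved["group_id"])        # file-group (file beats defaults)
print(resolved["default_budget"])  # 5000
print(resolved["model"])           # abab6.5-chat (default)
```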

Roadmap

Phase	What	Status
1	Core engine (Python)	✅ Done
2	CLI Tool	✅ Done
3	VSCode Extension	✅ Done
4	Advanced Features (scan, simulate, diff, optimize)	✅ Done

Pro Tips for Maximum Leverage

1. Use `ts scan` Before Starting New Projects

# Get a baseline of your project size
ts scan ./src

# Know your context window limits
# Large projects = split into smaller prompts
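
For a feel of what a project scan involves, here is a rough Python sketch that walks a directory and estimates tokens per file. It is not how `ts scan` is implemented: the `scan` function is hypothetical, and the 4-characters-per-token rule is a coarse assumption standing in for a real tokenizer.

```python
import math
import tempfile
from pathlib import Path


def estimate_tokens(text: str) -> int:
    """Rough stand-in: about 4 characters per token (an assumption, not a tokenizer)."""
    return math.ceil(len(text) / 4)


def scan(root: str, extensions: tuple = (".py", ".js")) -> dict:
    """Return estimated tokens per matching file, keyed by path relative to root."""
    root_path = Path(root)
    totals = {}
    for path in sorted(root_path.rglob("*")):
        if path.is_file() and path.suffix in extensions:
            text = path.read_text(encoding="utf-8", errors="ignore")
            totals[path.relative_to(root_path).as_posix()] = estimate_tokens(text)
    return totals


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.py").write_text("x" * 40, encoding="utf-8")
    Path(tmp, "notes.txt").write_text("not scanned", encoding="utf-8")
    demo_totals = scan(tmp)

print(demo_totals)  # {'a.py': 10}
```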

2. Set Budget Alerts Early

# Set a session budget
ts session config --budget 50000

# The extension will warn you at 80%
# You can stop before hitting the limit
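
The 80% warning boils down to a threshold check. A minimal sketch, assuming a hard stop at 100% of the budget; `budget_status` and its return values are hypothetical, not part of TokenCUNT.

```python
def budget_status(used: int, budget: int, warn_percent: int = 80) -> str:
    """Classify usage as 'ok', 'warning' (at or above warn_percent), or 'exceeded'."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    if used >= budget:
        return "exceeded"
    # Integer arithmetic avoids floating-point surprises at the boundary.
    if used * 100 >= warn_percent * budget:
        return "warning"
    return "ok"


print(budget_status(39_999, 50_000))  # ok
print(budget_status(40_000, 50_000))  # warning
print(budget_status(50_000, 50_000))  # exceeded
```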

3. Use `ts diff` to Compare Prompt Strategies

# Compare verbose vs concise prompts
ts diff verbose_prompt.txt concise_prompt.txt

# See exactly how much you're saving
# Use the optimized version in production
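
The same comparison can be approximated in a few lines of Python with the standard library's `difflib`. This is a sketch of the idea, not the `ts diff` implementation: `compare_prompts` is a hypothetical helper and the token estimate uses an assumed 4 characters per token.

```python
import difflib
import math


def estimate_tokens(text: str) -> int:
    """Rough stand-in: about 4 characters per token (an assumption)."""
    return math.ceil(len(text) / 4)


def compare_prompts(original: str, optimized: str) -> dict:
    """Return token estimates, savings, and a unified diff for two prompt versions."""
    before = estimate_tokens(original)
    after = estimate_tokens(optimized)
    saved = before - after
    percent = round(100 * saved / before, 1) if before else 0.0
    diff = list(
        difflib.unified_diff(
            original.splitlines(),
            optimized.splitlines(),
            fromfile="original",
            tofile="optimized",
            lineterm="",
        )
    )
    return {"before": before, "after": after, "saved": saved, "percent": percent, "diff": diff}


result = compare_prompts(
    "Please could you kindly explain in detail what this function does?",
    "Explain what this function does.",
)
print(f"{result['before']} -> {result['after']} tokens ({result['percent']}% saved)")
print("\n".join(result["diff"]))
```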

4. Optimize with Rules-First (Free!)

# Rules-only is instant and free
ts optimize prompt.txt --rules-only

# Then enhance with AI if needed
ts optimize optimized.txt --ai-only --show-diff
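
Rules-only optimization is free because it never calls the API; it is plain text processing. The sketch below shows what such rules might look like. The rule set is invented for illustration and is not the one `ts optimize --rules-only` applies.

```python
import re

# Hypothetical filler words to drop; a real rule set would be more careful.
FILLER_WORDS = ("please", "kindly")


def optimize_rules_only(prompt: str) -> str:
    """Apply cheap, deterministic clean-up rules to a prompt."""
    text = prompt
    for word in FILLER_WORDS:
        text = re.sub(rf"\b{re.escape(word)}\b", "", text, flags=re.IGNORECASE)
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces and tabs
    text = re.sub(r" *\n *", "\n", text)    # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)  # keep at most one blank line
    return text.strip()


print(repr(optimize_rules_only("Please   kindly explain\n\n\n\nthis   function.")))
# 'explain\n\nthis function.'
```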

5. Simulate Before Scaling

# Before launching to 1000 users
ts simulate --users 1000 --messages 100 --tokens 500 --model gpt-4

# Know your monthly burn rate
# Adjust model to fit budget
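
The arithmetic behind a simulation like this is simple multiplication. In the sketch below, `--messages` is assumed to mean messages per user per month, and the price is a placeholder, not a real rate for any model.

```python
def simulate_monthly_cost(
    users: int,
    messages_per_user: int,
    tokens_per_message: int,
    price_per_1k_tokens: float,
) -> tuple:
    """Return (total tokens, estimated cost) for one month of usage."""
    total_tokens = users * messages_per_user * tokens_per_message
    cost = total_tokens / 1000 * price_per_1k_tokens
    return total_tokens, cost


# Placeholder price of 0.002 per 1,000 tokens; substitute your model's actual rate.
total, cost = simulate_monthly_cost(1000, 100, 500, price_per_1k_tokens=0.002)
print(f"{total:,} tokens, about {cost:.2f} per month")
# 50,000,000 tokens, about 100.00 per month
```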

6. Use the VSCode Extension for Quick Analysis

Analyze selected code — Select code → right-click → TokenCUNT: Analyze
Quick prompts — Use command palette for fast access
Status bar — Always know your current session usage

7. Batch Similar Tasks

# Create tasks.json
# {
#   "tasks": [
#     {"prompt": "Explain function 1", "file": "src/a.py"},
#     {"prompt": "Explain function 2", "file": "src/b.py"}
#   ]
# }

ts batch --file tasks.json --parallel
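
For more than a handful of files, the task list can be generated rather than written by hand. The sketch below emits the same JSON shape shown above; `build_tasks` and the `{name}` placeholder are hypothetical conveniences.

```python
import json
from pathlib import Path


def build_tasks(prompt_template: str, files: list) -> dict:
    """Build a tasks document with one entry per file."""
    return {
        "tasks": [
            {"prompt": prompt_template.format(name=Path(f).name), "file": f}
            for f in files
        ]
    }


tasks = build_tasks("Explain {name}", ["src/a.py", "src/b.py"])
print(json.dumps(tasks, indent=2))

# To save it for `ts batch`, write the JSON to disk:
# Path("tasks.json").write_text(json.dumps(tasks, indent=2), encoding="utf-8")
```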

8. Use `--dry-run` for Cost Previewing

# Always check cost first
ts ask "refactor this entire file" --file huge.py --dry-run

# If too expensive, break into smaller chunks
ts ask "refactor first 50 lines" --file huge.py
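
To decide where to cut a large file, it helps to see the estimated size of each slice first. A minimal sketch, assuming fixed 50-line windows and a rough 4-characters-per-token estimate; `line_chunks` is a hypothetical helper.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough stand-in: about 4 characters per token (an assumption)."""
    return math.ceil(len(text) / 4)


def line_chunks(text: str, lines_per_chunk: int = 50) -> list:
    """Return (first_line, last_line, estimated_tokens) for each window of lines."""
    lines = text.splitlines()
    chunks = []
    for start in range(0, len(lines), lines_per_chunk):
        part = lines[start:start + lines_per_chunk]
        chunks.append((start + 1, start + len(part), estimate_tokens("\n".join(part))))
    return chunks


source = "\n".join(["x = 1"] * 120)  # stand-in for the contents of huge.py
for first, last, tokens in line_chunks(source):
    print(f"lines {first}-{last}: ~{tokens} tokens")
# lines 1-50: ~75 tokens
# lines 51-100: ~75 tokens
# lines 101-120: ~30 tokens
```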

Example Workflows

Daily Development

# Morning: Check budget
ts report

# During: Analyze before asking
ts analyze --file problem.py --focus bugs

# Ask with tracking
ts ask "fix this bug" --file problem.py

# End: Review spending
ts report

Project Token Audit

# 1. Scan entire project
ts scan ./src --verbose

# 2. Simulate your usage pattern
ts simulate --scenario startup --model minimax

# 3. Optimize your most-used prompts
ts optimize common_prompts.txt --rules-only --output optimized/

# 4. Diff to compare
ts diff common_prompts.txt optimized/common.txt

Production Cost Control

# 1. Set strict budget
ts session config --budget 10000

# 2. Use cheaper models for simple tasks
ts analyze --file simple.py --model minimax

# 3. Reserve a stronger model for complex tasks (GPT-4 shown; the FAQ lists OpenAI support as planned)
ts ask "complex refactor" --file hard.py --model gpt-4

FAQ

What exact problem does TokenCUNT solve?

TokenCUNT solves the problem of uncontrolled API costs when using AI models. Most developers have no visibility into how many tokens their prompts consume, leading to surprise bills at the end of the month. TokenCUNT provides:

Pre-call estimation — Know token cost BEFORE making API calls
Budget enforcement — Hard limits prevent runaway spending
Usage transparency — Real-time tracking of all API usage
Session history — Know exactly what you spent each session

Who is the primary user?

AI developers building apps with LLMs
SaaS builders integrating AI into products
Students learning about LLMs on limited budgets
Freelancers managing client API budgets

What input does the user give?

Raw text — Direct prompts
Files — .txt, .py, .js, .md, or any text file
Multiple files — Via batch processing

What output does the tool return?

Token count — Before and after API calls
Cost estimation — Based on model pricing
Session reports — Detailed breakdown of usage

Which models are supported?

Currently: MiniMax models (abab6.5-chat family)

Future support planned:

OpenAI (GPT-4, GPT-3.5)
Anthropic (Claude)
Google Gemini

Does it support multiple tokenizers?

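Yes: it uses tiktoken, which supports multiple encodings:

cl100k_base (GPT-4, GPT-3.5 Turbo)
p50k_base (Codex and text-davinci-002/003)
r50k_base (original GPT-3 models such as davinci)

tiktoken implements OpenAI's tokenizers, so counts for non-OpenAI models such as MiniMax or Claude should be treated as estimates.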
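
A minimal sketch of counting tokens with a chosen tiktoken encoding. The fallback heuristic (about 4 characters per token) is an assumption added so the example still runs when tiktoken or its encoding data is unavailable; `estimate_tokens` is a hypothetical helper, not TokenCUNT's `token_counter.py`.

```python
import math


def estimate_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Count tokens with tiktoken when available, else fall back to a rough heuristic."""
    try:
        import tiktoken  # third-party: pip install tiktoken

        encoding = tiktoken.get_encoding(encoding_name)
        return len(encoding.encode(text))
    except Exception:
        # tiktoken is missing, the encoding data could not be loaded,
        # or the text contains special tokens that encode() rejects by default.
        # Assumption: roughly 4 characters per token for English text.
        return math.ceil(len(text) / 4)


for name in ("cl100k_base", "p50k_base", "r50k_base"):
    print(name, estimate_tokens("Explain what this function does.", name))
```

Different encodings can give different counts for the same text, which is one reason a pre-call estimate and the provider's billed total may not match exactly.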

What makes TokenCUNT better than existing token counters?

Pre-call estimation — Most counters only count AFTER the call
Budget enforcement — Most counters only track, don't prevent overspending
Integrated CLI — Ready to use, not just a library
IDE integration — VSCode extension with inline hints

Future features planned?

Cost alerts via webhooks
Multi-user team dashboards
Integration with more IDEs (JetBrains, Neovim)

Contributing

Built by a student, for developers who actually care about not burning credits. PRs and issues welcome.

License

MIT

Project details

Release history

1.0.0 (this version): Mar 16, 2026
0.1.0: Mar 16, 2026

Download files

Source Distribution

tokencunt-1.0.0.tar.gz (43.4 kB)

Uploaded Mar 16, 2026 Source

Built Distribution

tokencunt-1.0.0-py3-none-any.whl (50.5 kB)

Uploaded Mar 16, 2026 Python 3

File details

Details for the file tokencunt-1.0.0.tar.gz.

File metadata

Download URL: tokencunt-1.0.0.tar.gz
Upload date: Mar 16, 2026
Size: 43.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for tokencunt-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`dbc074cb332f49ed8e394f6a5bc25d31d6aea644c568b0a5160106aa75e00abe`
MD5	`eb5707b5956ba41aedc9d6ec7376ab27`
BLAKE2b-256	`aee8602dace81b04a2b0c782f1af54acfdced8f6d682e9844dcb368c1d739050`

File details

Details for the file tokencunt-1.0.0-py3-none-any.whl.

File metadata

Download URL: tokencunt-1.0.0-py3-none-any.whl
Upload date: Mar 16, 2026
Size: 50.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for tokencunt-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`41b25b2ba213544eb1b99f7c21168f8fddfb00817c1e789351ee699bb3370b9e`
MD5	`0c7f8f8bd7bb0659efd0f316e0377b18`
BLAKE2b-256	`057f098c28cc377d10422bb576aeb85d891ab0dd65fc8f98b2249ff14ed625ec`
