Skip to main content

Token counting and budget management for MiniMax API

Project description

TokenCUNT

Python Version Version License Status Download VSIX

 _______    _                    _____  * 
 |__   __|  | |                  / ____|    
    | | ___ | | _____ _ __      | |        
    | |/ _ \| |/ / _ \ '_ \    | |        
    | | (_) |   <  __/ | | |   | |____    
    |_|\___/|_|\_\___|_| |_|    \_____|

A smart, token-efficient AI layer for developers.
Stop burning credits. Start knowing exactly what you spend.


Why TokenCUNT?

Most AI tools are careless with your tokens — redundant calls, bloated context, zero visibility. TokenCUNT is the opposite:

  • Efficient by design — smart batching and context compression minimize wasted calls
  • Fully transparent — see exactly how many tokens every operation costs, before and after
  • You're in control — set budgets and limits per session or per task

Installation

# Install TokenCUNT
pip install tokencunt

# Install with CLI dependencies (recommended)
pip install tokencunt[cli]

# Or install in development mode
pip install -e ".[cli,dev]"

Requirements

  • Python 3.10+
  • MiniMax API key

VSCode Extension

Download and install the VSCode extension:

# Option 1: Install from VSIX file
code --install-extension tokencunt-vscode/tokencunt-0.1.0.vsix

# Option 2: Manual install
# 1. Open VSCode
# 2. Go to Extensions (Ctrl+Shift+X)
# 3. Click "..." menu → "Install from VSIX"
# 4. Select tokencunt-0.1.0.vsix

VSCode Extension Features:

  • Status bar with live token count
  • Inline hover hints for token estimation
  • Command palette integration
  • Budget alerts when approaching limit
  • Quick actions via Ctrl+Shift+P

Quick Start

1. Configure your API key

# Set environment variable
export MINIMAX_API_KEY="your-api-key"
export MINIMAX_GROUP_ID="your-group-id"

# Or create a config file
mkdir -p ~/.tokencunt
cat > ~/.tokencunt/config.yaml << EOF
api_key: "your-api-key"
group_id: "your-group-id"
model: "abab6.5-chat"
default_budget: 10000
EOF

2. Run the CLI

# Show logo and welcome
ts start

# Ask a question with full token tracking
ts ask "explain this function" --file main.py

# Dry run — see the cost before committing
ts ask "refactor this" --file app.py

# Analyze a file and get suggestions
ts analyze --file main.py

# Run multiple tasks from a JSON file
ts batch --file tasks.json

# View a usage report for the current session
ts report

# Session management
ts session new
ts session list
ts session config --budget 5000

Commands

Command Description
ts start Show logo and welcome message
ts ask "<prompt>" --file <file> Ask a question with token tracking
ts ask ... (no flags) Ask a question directly
ts ask ... --dry-run Preview token cost without API call
ts analyze --file <file> Analyze a file for improvements
ts analyze --file <file> --focus bugs Focus on specific area (bugs, performance, style, security)
ts batch --file <json> Run multiple tasks from JSON file
ts report Show session usage breakdown
ts report --format json JSON output for scripting
ts session new Create a new session
ts session list List all sessions
ts session config --budget 5000 Set token budget for session
ts session clear Clear session data
ts version Show version information

Phase 4: Advanced Features

Command Description
ts scan <path> Scan project for token estimation
ts scan --extensions py,js --verbose Scan with specific extensions
ts scan --ignore .tokencuntignore Scan with custom ignore file
ts simulate --requests 1000 --tokens 500 Simulate API costs
ts simulate --scenario startup --model gpt-4 Use pre-defined scenario
ts simulate --users 100 --messages 50 --tokens 300 User-based scenario
ts diff original.txt optimized.txt Git-style prompt diff
ts diff --stats Show only statistics
ts optimize prompt.txt Optimize with AI + rules
ts optimize prompt.txt --rules-only Rules-only optimization
ts optimize --show-diff Show changes made

Global Options

Option Description
-q, --quiet Minimal output
-v, --verbose Verbose output
--json JSON output
--debug Debug mode with traceback
-y, --yes Skip confirmations

Example Output

$ ts ask "what does this function do?" --file utils.py

  Estimated tokens: 312
  ─────────────────────────────────────
  Response: This function takes a list and...
  ─────────────────────────────────────
  Tokens used:  input: 312  output: 89  total: 401
  Session total: 1,204 / 5,000 tokens used (24%)

Architecture

┌─────────────────────────────────────┐
│         User Interface              │
│   CLI Tool  │  VSCode Extension     │
└─────────────┬───────────────────────┘
              │
┌─────────────▼───────────────────────┐
│           Core Engine               │
│  - Token counter & tracker          │
│  - Smart batcher                   │
│  - Budget enforcer                 │
│  - Prompt optimizer                │
└─────────────┬───────────────────────┘
              │
┌─────────────▼───────────────────────┐
│         MiniMax M2.5 API            │
└─────────────────────────────────────┘

The core engine is shared — CLI and VSCode extension are just interfaces on top of it.


Project Structure

TokenCUNT/
├── src/tokencunt/
│   ├── __init__.py
│   ├── pyproject.toml
│   ├── config.py                 # Configuration management
│   └── core/
│       ├── __init__.py           # Core exports
│       ├── api_client.py         # MiniMax API calls with retry
│       ├── exceptions.py         # Custom exception classes
│       ├── token_counter.py     # Token counting (tiktoken)
│       ├── budget.py            # Budget enforcement & alerts
│       ├── batcher.py           # Combine small requests
│       ├── optimizer.py         # Compress & strip redundant context
│       └── session.py           # Track usage across sessions
├── cli/
│   ├── __init__.py
│   ├── app.py                   # Typer app entry point
│   ├── logo.py                  # ASCII logo
│   ├── exit_codes.py            # Exit codes
│   ├── formatters.py            # Rich output formatting
│   └── commands/
│       ├── __init__.py
│       ├── ask.py               # Ask command
│       ├── analyze.py           # Analyze command
│       ├── batch.py             # Batch command
│       ├── report.py            # Report command
│       └── session.py           # Session management
├── tests/
├── .planning/                    # GSD planning docs
├── ascii-art.txt               # Logo source
├── pyproject.toml              # Project config
└── README.md

Tech Stack

Tool Role
Typer CLI framework
Rich Terminal output formatting
httpx Async HTTP client
tiktoken Token counting
Tenacity Retry logic
Pydantic Data validation

Configuration

Environment Variables

MINIMAX_API_KEY=your-api-key
MINIMAX_GROUP_ID=your-group-id

Config File

Create ~/.tokencunt/config.yaml:

api_key: "your-api-key"
group_id: "your-group-id"
model: "abab6.5-chat"
default_budget: 10000

Priority Order

  1. Environment variables (highest priority)
  2. Config file
  3. Hardcoded defaults (lowest priority)

Roadmap

Phase What Status
1 Core engine (Python) ✅ Done
2 CLI Tool ✅ Done
3 VSCode Extension ✅ Done
4 Advanced Features (scan, simulate, diff, optimize) ✅ Done

Pro Tips for Maximum Leverage

1. Use ts scan Before Starting New Projects

# Get a baseline of your project size
ts scan ./src

# Know your context window limits
# Large projects = split into smaller prompts

2. Set Budget Alerts Early

# Set a monthly budget
ts session config --budget 50000

# The extension will warn you at 80%
# You can stop before hitting the limit

3. Use ts diff to Compare Prompt Strategies

# Compare verbose vs concise prompts
ts diff verbose_prompt.txt concise_prompt.txt

# See exactly how much you're saving
# Use the optimized version in production

4. Optimize with Rules-First (Free!)

# Rules-only is instant and free
ts optimize prompt.txt --rules-only

# Then enhance with AI if needed
ts optimize optimized.txt --ai-only --show-diff

5. Simulate Before Scaling

# Before launching to 1000 users
ts simulate --users 1000 --messages 100 --tokens 500 --model gpt-4

# Know your monthly burn rate
# Adjust model to fit budget

6. Use the VSCode Extension for Quick Analysis

  • Analyze selected code — Select code → right-click → TokenCUNT: Analyze
  • Quick prompts — Use command palette for fast access
  • Status bar — Always know your current session usage

7. Batch Similar Tasks

# Create tasks.json
# {
#   "tasks": [
#     {"prompt": "Explain function 1", "file": "src/a.py"},
#     {"prompt": "Explain function 2", "file": "src/b.py"}
#   ]
# }

ts batch --file tasks.json --parallel

8. Use --dry-run for Cost Previewing

# Always check cost first
ts ask "refactor this entire file" --file huge.py --dry-run

# If too expensive, break into smaller chunks
ts ask "refactor first 50 lines" --file huge.py

Example Workflows

Daily Development

# Morning: Check budget
ts report

# During: Analyze before asking
ts analyze --file problem.py --focus bugs

# Ask with tracking
ts ask "fix this bug" --file problem.py

# End: Review spending
ts report

Project Token Audit

# 1. Scan entire project
ts scan ./src --verbose

# 2. Simulate your usage pattern
ts simulate --scenario startup --model minimax

# 3. Optimize your most-used prompts
ts optimize common_prompts.txt --rules-only --output optimized/

# 4. Diff to compare
ts diff common_prompts.txt optimized/common.txt

Production Cost Control

# 1. Set strict budget
ts session config --budget 10000

# 2. Use cheaper models for simple tasks
ts analyze --file simple.py --model minimax

# 3. Reserve GPT-4 for complex tasks
ts ask "complex refactor" --file hard.py --model gpt-4

FAQ

What exact problem does TokenCUNT solve?

TokenCUNT solves the problem of uncontrolled API costs when using AI models. Most developers have no visibility into how many tokens their prompts consume, leading to surprise bills at the end of the month. TokenCUNT provides:

  • Pre-call estimation — Know token cost BEFORE making API calls
  • Budget enforcement — Hard limits prevent runaway spending
  • Usage transparency — Real-time tracking of all API usage
  • Session history — Know exactly what you spent each session

Who is the primary user?

  • AI developers building apps with LLMs
  • SaaS builders integrating AI into products
  • Students learning about LLMs on limited budgets
  • Freelancers managing client API budgets

What input does the user give?

  • Raw text — Direct prompts
  • Files.txt, .py, .js, .md, or any text file
  • Multiple files — Via batch processing

What output does the tool return?

  • Token count — Before and after API calls
  • Cost estimation — Based on model pricing
  • Session reports — Detailed breakdown of usage

Which models are supported?

Currently: MiniMax models (abab6.5-chat family)

Future support planned:

  • OpenAI (GPT-4, GPT-3.5)
  • Anthropic (Claude)
  • Google Gemini

Does it support multiple tokenizers?

Yes — uses tiktoken which supports multiple encodings:

  • cl100k_base (GPT-4, Claude, etc.)
  • p50k_base (GPT-3)
  • r50k_base (GPT-2)

What makes TokenCUNT better than existing token counters?

  1. Pre-call estimation — Most counters only count AFTER the call
  2. Budget enforcement — Most counters only track, don't prevent overspending
  3. Integrated CLI — Ready to use, not just a library
  4. IDE integration — VSCode extension with inline hints

Future features planned?

  • Cost alerts via webhooks
  • Multi-user team dashboards
  • Integration with more IDEs (JetBrains, Neovim)

Contributing

Built by a student, for developers who actually care about not burning credits. PRs and issues welcome.


License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tokencunt-1.0.0.tar.gz (43.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tokencunt-1.0.0-py3-none-any.whl (50.5 kB view details)

Uploaded Python 3

File details

Details for the file tokencunt-1.0.0.tar.gz.

File metadata

  • Download URL: tokencunt-1.0.0.tar.gz
  • Upload date:
  • Size: 43.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for tokencunt-1.0.0.tar.gz
Algorithm Hash digest
SHA256 dbc074cb332f49ed8e394f6a5bc25d31d6aea644c568b0a5160106aa75e00abe
MD5 eb5707b5956ba41aedc9d6ec7376ab27
BLAKE2b-256 aee8602dace81b04a2b0c782f1af54acfdced8f6d682e9844dcb368c1d739050

See more details on using hashes here.

File details

Details for the file tokencunt-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: tokencunt-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 50.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for tokencunt-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 41b25b2ba213544eb1b99f7c21168f8fddfb00817c1e789351ee699bb3370b9e
MD5 0c7f8f8bd7bb0659efd0f316e0377b18
BLAKE2b-256 057f098c28cc377d10422bb576aeb85d891ab0dd65fc8f98b2249ff14ed625ec

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page