

Token Counter

License: MIT Python 3.10+ PyPI

A simple, fast, and user-friendly command-line tool to count the number of tokens in a text file. It provides a progress bar for large files and displays the output in a clean, readable format.

This tool uses the tiktoken library, which is the same tokenizer used by OpenAI for its large language models.

Features

  • Fast Tokenization: Leverages tiktoken for high-performance token counting.
  • Progress Bar: A rich-powered progress bar shows the status of large files.
  • Styled Output: Displays results in a clean, formatted table.
  • Flexible Encoding Selection: Choose specific tiktoken encodings via a flag or an interactive menu.
  • Multiple File/Directory Support: Count tokens across multiple specified files or all supported files within a directory.
  • Exclusion Patterns: Exclude files or directories using glob patterns.
  • File Extension Control: Override default file extensions or add new ones to customize which files are processed. Supports wildcard auto-discovery.
  • Recursive Directory Scanning: Optionally scan subdirectories recursively for comprehensive project analysis.
  • LLM Context Limit Comparison: Compare token counts against common Large Language Model context window limits. These limits are loaded from src/token_counter/config/llm_limits.json and can be customized.
  • Stdin Support: Process text piped directly to the tool.
  • Easy to Use: Simple command-line interface for quick use.

Installation & Usage

Option 1: Install from PyPI (Recommended)

```bash
pip install token-counter-cli
```

Option 2: Install from source

```bash
git clone https://github.com/puya/token-counter-cli.git
cd token-counter-cli
uv venv
uv pip install -e .
```

Usage

```bash
# Basic usage - count tokens using the default 'cl100k_base' encoding
token-counter my_document.txt

# Count tokens from stdin
echo "Your text here" | token-counter

# Count tokens in multiple files
token-counter file1.txt file2.md

# Count tokens in all supported files within a directory
token-counter my_project_folder/
```

Usage Options

Model Selection

```bash
# Use a specific encoding model
token-counter my_document.txt --model p50k_base
token-counter my_document.txt -m p50k_base

# Interactively select the encoding from a list
token-counter my_document.txt --select-encoding
token-counter my_document.txt -s
```

File Extension Control

```bash
# Override default extensions (only process specified extensions)
token-counter my_project_folder/ --extension .xml,.yaml,.toml
token-counter my_project_folder/ -e .xml,.yaml,.toml

# Add new extensions to the default list
token-counter my_project_folder/ --add-extensions .log,.temp
token-counter my_project_folder/ -a .log,.temp

# Auto-discover all extensions in target directory
token-counter my_project_folder/ --add-extensions "*"
token-counter my_project_folder/ -a "*"
```
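As an illustration of what `*` auto-discovery amounts to (a sketch, not the tool's actual implementation), collecting every extension under a directory is straightforward with `pathlib`:

```python
from pathlib import Path

def discover_extensions(directory: str, recursive: bool = False) -> set[str]:
    """Collect every file extension found in a directory (sketch of '-a *')."""
    pattern = "**/*" if recursive else "*"
    return {p.suffix for p in Path(directory).glob(pattern)
            if p.is_file() and p.suffix}
```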

File/Directory Exclusion

```bash
# Exclude specific files or directories using glob patterns
token-counter my_project_folder/ --exclude "*.log" --exclude "node_modules/"
token-counter my_project_folder/ -x "*.log" -x "node_modules/"
```
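Exclusion filtering along these lines can be sketched with Python's `fnmatch` (illustrative only; the function below is an assumption, not the tool's internals):

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

def is_excluded(path: str, patterns: list[str]) -> bool:
    """Return True if the path matches any exclusion glob pattern."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        clean = pattern.rstrip("/")
        # Match the full path or any single component, so "node_modules/"
        # excludes everything under that directory.
        if fnmatch(path, clean) or any(fnmatch(p, clean) for p in parts):
            return True
    return False

print(is_excluded("node_modules/lib/index.js", ["node_modules/"]))  # True
print(is_excluded("src/app.py", ["*.log", "node_modules/"]))        # False
```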

Recursive Directory Scanning

```bash
# Scan directories recursively (includes subdirectories)
token-counter my_project_folder/ --recursive
token-counter my_project_folder/ -r

# Combine recursive with wildcard extension discovery
token-counter my_project_folder/ -r -a "*"

# Recursive scan with exclusions
token-counter my_project_folder/ -r -x "node_modules/" -x ".git/"
```

LLM Context Limit Comparison

```bash
# Compare token count against common LLM context window limits
token-counter my_long_article.txt --check-limits
token-counter my_long_article.txt -c
```
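Conceptually, the comparison boils down to checking the count against each configured limit. A minimal sketch (the model names and limits here are illustrative; the tool reads its own `llm_limits.json`):

```python
# Illustrative limits only -- the real values come from the config file.
LIMITS = {
    "gpt-4o": 128_000,
    "claude-sonnet": 200_000,
    "gemini-1.5-pro": 2_000_000,
}

def check_limits(token_count: int, limits: dict[str, int]) -> dict[str, str]:
    """Report whether a token count fits each model's context window."""
    return {
        model: (f"fits ({token_count / limit:.1%} of window)"
                if token_count <= limit else "exceeds limit")
        for model, limit in limits.items()
    }

for model, status in check_limits(150_000, LIMITS).items():
    print(f"{model}: {status}")
```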

Combined Options

```bash
# Complex example combining multiple options
token-counter my_project_folder/ -s -x "*.test.py" -c -e .py,.js -r

# Process only Python files recursively, exclude tests, and check limits
token-counter . --extension .py --exclude "*test*" --check-limits --recursive

# Auto-discover extensions recursively with exclusions
token-counter my_project/ -r -a "*" -x "node_modules/" -x ".git/"

# Add log files to processing and use specific model
token-counter logs/ --add-extensions .log --model p50k_base --recursive
```

Complete Option Reference

| Option | Short | Description |
|--------|-------|-------------|
| `--model` | `-m` | Specify the encoding model (e.g., `cl100k_base`, `p50k_base`) |
| `--select-encoding` | `-s` | Interactively select the encoding model from a list |
| `--extension` | `-e` | Override default file extensions (comma-separated) |
| `--add-extensions` | `-a` | Add to default file extensions (comma-separated); use `*` for auto-discovery |
| `--exclude` | `-x` | Exclude files/directories using glob patterns (repeatable) |
| `--check-limits` | `-c` | Compare token count against LLM context window limits |
| `--recursive` | `-r` | Recursively scan subdirectories when processing directories |
| `--help` | | Show help message and exit |

Examples

```bash
token-counter test_article.txt
token-counter test_article.txt -m p50k_base
token-counter test_article.txt -s
echo "Hello world" | token-counter
token-counter README.md test_article.txt
token-counter src/
token-counter . --exclude "*.md" --exclude "src/"
token-counter test_article.txt -c
token-counter . --extension .py,.js
token-counter . -e .py,.js
token-counter . --add-extensions .log,.xml
token-counter . -a "*" 
token-counter . --recursive
token-counter . -r -a "*" -x ".git/"
```

Configuration Files

The token-counter tool uses two JSON configuration files to customize its behavior:

File Extensions Configuration (src/token_counter/config/allowed_extensions.json)

This file defines which file extensions are processed by default when scanning directories. The current default extensions include:

```json
{
  "default_extensions": [
    ".txt", ".md", ".py", ".js", ".ts",
    ".json", ".html", ".css"
  ]
}
```

Command-line overrides:

  • --extension or -e: Override the default extensions entirely. Only files with the specified extensions will be processed.
  • --add-extensions or -a: Add new extensions to the default list without removing the existing ones. Use * for auto-discovery.

If both flags are provided, --extension takes precedence and a warning will be displayed.

LLM Context Limits Configuration (src/token_counter/config/llm_limits.json)

This file contains context window limits for major LLM providers and models, used when the --check-limits flag is specified. The file includes the latest models from:

  • OpenAI: GPT-4.1 series (1M tokens), GPT-4.5 (1M tokens), GPT-4o series
  • Anthropic: Claude 4 Opus/Sonnet (200K tokens), Claude 3.7/3.5 Sonnet
  • Google: Gemini 2.5 Pro/Flash (1M tokens), Gemini 1.5 Pro (2M tokens)
  • Meta: Llama 4 Scout (10M tokens), Llama 4 Maverick (1M tokens), Llama 3.x series
  • xAI: Grok 3 (~131K tokens)
  • Mistral: Large 2, Medium 3, Small 3.1 (128K tokens)
  • Cohere: Command A (256K tokens), Command R/R+ (128K tokens)

You can edit this file to add, remove, or modify models and their corresponding token limits to suit your needs.
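The exact schema of `llm_limits.json` is not reproduced on this page; based on the models listed above, a customized file plausibly maps model names to token limits along these lines (the field layout is an assumption — check the shipped file before editing):

```json
{
  "gpt-4o": 128000,
  "claude-4-sonnet": 200000,
  "my-custom-model": 32000
}
```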

Example usage:

```bash
token-counter large_document.txt --check-limits
```

This will show how your token count compares against all configured model limits.

Adding to PATH

To use the token-counter command from anywhere in your system, you need to add the virtual environment's bin directory to your shell's PATH.

  1. Get the full path to the bin directory:

    pwd
    # Copy the output and append /.venv/bin to it.
    # For example: /Users/you/token-counter/.venv/bin
    
  2. Add the path to your shell's configuration file:

    • For Bash (usually ~/.bashrc or ~/.bash_profile):

      echo 'export PATH="/path/to/your/project/.venv/bin:$PATH"' >> ~/.bashrc
      source ~/.bashrc
      
    • For Zsh (usually ~/.zshrc):

      echo 'export PATH="/path/to/your/project/.venv/bin:$PATH"' >> ~/.zshrc
      source ~/.zshrc
      

    Now you can run token-counter from any directory.

Contributing

For information about the development process and how to contribute, see RELEASING.md.

