MCP server to count tokens in CSV and Excel files
csv-token-counter-mcp
A lightweight MCP (Model Context Protocol) server that counts tokens in CSV and Excel files using tiktoken. Works with VS Code, Claude Desktop, and any MCP-compatible application. No LLM or API key required — runs fully offline.
Why Use This?
Before sending CSV or Excel data to an LLM (ChatGPT, Claude, Gemini, etc.), you need to know how many tokens the data contains so you can estimate cost, check context window limits, or split large files. This tool does exactly that, directly inside your editor or AI assistant.
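As an illustration of the "split large files" use case, here is a minimal sketch of packing rows into chunks that stay under a token budget. The function name `split_by_token_budget` and the word-based stand-in tokenizer are assumptions for demonstration and are not part of this package; in practice you would pass a real counter, for example one built on tiktoken.

```python
from typing import Callable, Iterable, List


def split_by_token_budget(
    rows: Iterable[str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
) -> List[List[str]]:
    """Greedily pack rows into chunks whose token totals stay within max_tokens.

    Hypothetical helper. A single row that exceeds the budget on its own
    still gets a chunk to itself.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for row in rows:
        cost = count_tokens(row)
        # Start a new chunk when adding this row would exceed the budget.
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(row)
        used += cost
    if current:
        chunks.append(current)
    return chunks


def word_count(text: str) -> int:
    """Stand-in tokenizer for the demo: one token per whitespace-separated word."""
    return len(text.split())


rows = ["a b c", "d e", "f g h i", "j"]
print(split_by_token_budget(rows, word_count, max_tokens=5))
# [['a b c', 'd e'], ['f g h i', 'j']]
```

The greedy strategy keeps rows in their original order, which usually matters more for tabular data than achieving the tightest possible packing.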
Features
- Count total tokens across an entire CSV or Excel file
- Break down token counts column by column
- Analyze tokens in a single specific column (avg, min, max per row)
- Preview file schema — column names, types, row counts
- Supports .csv, .xlsx, and .xls formats
- Works fully offline, with no API key or internet needed
- Compatible with VS Code, Claude Desktop, and any MCP client
Installation
Option 1 — pip (traditional)
pip install csv-token-counter-mcp
Add to your .vscode/mcp.json:
{
  "servers": {
    "csv-token-counter": {
      "type": "stdio",
      "command": "csv-token-counter-mcp",
      "args": []
    }
  }
}
Option 2 — uvx (recommended, no install needed)
First install uv if you don't have it:
pip install uv
Add directly to your .vscode/mcp.json — no separate install step needed:
{
  "servers": {
    "csv-token-counter": {
      "type": "stdio",
      "command": "uvx",
      "args": ["csv-token-counter-mcp"]
    }
  }
}
Option 3 — Claude Desktop
Add to your Claude Desktop config file:
Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "csv-token-counter": {
      "command": "uvx",
      "args": ["csv-token-counter-mcp"]
    }
  }
}
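If the config file already lists other servers, the new entry should be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge; `add_server_entry` is a hypothetical helper, not something this package ships, and the demo writes to a temporary file rather than a real config.

```python
import json
import tempfile
from pathlib import Path


def add_server_entry(config_path, root_key, name, entry):
    """Merge one server entry into a JSON config file, keeping existing entries.

    Hypothetical helper, not part of csv-token-counter-mcp.
    """
    path = Path(config_path)
    # Start from the existing config if there is one, else from an empty object.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault(root_key, {})[name] = entry
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demo against a temporary file so no real config is touched.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "claude_desktop_config.json"
    cfg_path.write_text(
        json.dumps({"mcpServers": {"other": {"command": "x", "args": []}}}),
        encoding="utf-8",
    )
    merged = add_server_entry(
        cfg_path,
        "mcpServers",
        "csv-token-counter",
        {"command": "uvx", "args": ["csv-token-counter-mcp"]},
    )
    print(sorted(merged["mcpServers"]))
    # ['csv-token-counter', 'other']
```

For a VS Code `mcp.json`, the same helper would be called with `"servers"` as the root key and an entry that also carries `"type": "stdio"`, matching the examples above.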
Available Tools
count_file_tokens
Count total tokens across an entire CSV or Excel file, with optional column-by-column breakdown.
Arguments:
| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| file_path | string | ✅ Yes | none | Path to .csv, .xlsx, or .xls file |
| include_column_breakdown | boolean | ❌ No | false | Return token count per column |
Example request:
{
  "tool": "count_file_tokens",
  "arguments": {
    "file_path": "/data/sales.xlsx",
    "include_column_breakdown": true
  }
}
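The request above is a simplified view. On the wire, MCP clients wrap tool calls in a JSON-RPC 2.0 `tools/call` request, and the client normally builds this for you. As a rough sketch of what that envelope looks like (the helper name `tools_call_message` is an assumption for illustration):

```python
import json
from itertools import count

_request_ids = count(1)  # each JSON-RPC request in a session needs its own id


def tools_call_message(tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


message = tools_call_message(
    "count_file_tokens",
    {"file_path": "/data/sales.xlsx", "include_column_breakdown": True},
)
print(json.dumps(message, indent=2))
```

You only need this level of detail when writing your own client or debugging raw traffic; VS Code and Claude Desktop handle it internally.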
Example response:
{
  "file": "/data/sales.xlsx",
  "encoding": "cl100k_base",
  "rows": 500,
  "columns": 5,
  "total_tokens": 18453,
  "column_breakdown": {
    "id": 1500,
    "product": 3200,
    "description": 9800,
    "price": 2100,
    "stock": 1853
  }
}
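In the example response, the per-column counts add up to `total_tokens` (1500 + 3200 + 9800 + 2100 + 1853 = 18453). One way such a result could be produced is to tokenize each cell and sum per column, as in the sketch below. This is an assumption about the approach, not the server's actual code; whether headers, delimiters, or empty cells are counted is not stated in this README. The function takes any `encode` callable, so the demo uses `str.split` as a stand-in, and with tiktoken installed you could pass `tiktoken.get_encoding("cl100k_base").encode` instead.

```python
import csv
import io


def count_tokens_by_column(csv_text, encode):
    """Sum token counts per column by tokenizing each cell separately.

    Hypothetical sketch: header names and delimiters are not counted here.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    per_column = {name: 0 for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name in per_column:
            # Missing trailing cells come back as None; treat them as empty.
            per_column[name] += len(encode(row[name] or ""))
    return {
        "rows": rows,
        "columns": len(per_column),
        "total_tokens": sum(per_column.values()),
        "column_breakdown": per_column,
    }


sample = (
    "id,product,description\n"
    "1,Desk,Solid oak writing desk\n"
    "2,Lamp,Warm LED lamp\n"
)
# str.split is a stand-in tokenizer (one token per word), used only for the demo.
result = count_tokens_by_column(sample, str.split)
print(result)
```

With the word-based stand-in, the demo yields 11 total tokens: 2 for `id`, 2 for `product`, and 7 for `description`. Real BPE counts from tiktoken will differ.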
count_column_tokens
Deep token analysis for a single column — total, average, min, and max tokens per row.
Arguments:
| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| file_path | string | ✅ Yes | none | Path to .csv, .xlsx, or .xls file |
| column_name | string | ✅ Yes | none | Exact column name to analyze |
Example request:
{
  "tool": "count_column_tokens",
  "arguments": {
    "file_path": "/data/sales.csv",
    "column_name": "description"
  }
}
Example response:
{
  "file": "/data/sales.csv",
  "column": "description",
  "rows_analyzed": 500,
  "total_tokens": 9800,
  "avg_tokens_per_row": 19.6,
  "max_tokens_in_row": 48,
  "min_tokens_in_row": 6
}
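The statistics in this response follow from the per-row counts: here 9800 total tokens over 500 rows gives the average of 19.6. A minimal sketch of that arithmetic, assuming each cell is tokenized on its own (how the server treats empty cells is not documented here, and `column_token_stats` is a hypothetical name):

```python
from statistics import mean


def column_token_stats(values, encode):
    """Compute total, average, max, and min token counts across a column's cells."""
    counts = [len(encode(value)) for value in values]
    if not counts:
        # An empty column has no meaningful average, max, or min.
        return {"rows_analyzed": 0, "total_tokens": 0}
    return {
        "rows_analyzed": len(counts),
        "total_tokens": sum(counts),
        "avg_tokens_per_row": round(mean(counts), 2),
        "max_tokens_in_row": max(counts),
        "min_tokens_in_row": min(counts),
    }


# str.split is a stand-in tokenizer (one token per word), used only for the demo.
stats = column_token_stats(
    ["Solid oak writing desk", "Warm LED lamp", "Chair"], str.split
)
print(stats)
```

A large gap between the average and the maximum is a useful signal that a few long rows dominate the column's token cost.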
preview_file_schema
Preview the structure of a CSV or Excel file — column names, data types, and non-null counts — without counting tokens.
Arguments:
| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| file_path | string | ✅ Yes | none | Path to .csv, .xlsx, or .xls file |
Example request:
{
  "tool": "preview_file_schema",
  "arguments": {
    "file_path": "/data/sales.csv"
  }
}
Example response:
{
  "file": "/data/sales.csv",
  "rows": 500,
  "columns": 5,
  "column_details": [
    { "name": "id", "dtype": "int64", "non_null": 500 },
    { "name": "product", "dtype": "object", "non_null": 500 },
    { "name": "description", "dtype": "object", "non_null": 498 },
    { "name": "price", "dtype": "float64", "non_null": 500 },
    { "name": "stock", "dtype": "int64", "non_null": 500 }
  ]
}
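The `dtype` values in this response (`int64`, `object`, `float64`) look like pandas dtype names, though this README does not say how they are produced. To show the idea of a schema preview without extra dependencies, the sketch below uses only the standard library and a much simpler type inference, so its type labels (`int`, `float`, `str`) differ from the ones above. `preview_schema` is a hypothetical name.

```python
import csv
import io


def _infer_type(values):
    """Return 'int', 'float', or 'str' depending on what every value parses as."""
    if not values:
        return "empty"
    for caster, label in ((int, "int"), (float, "float")):
        try:
            for value in values:
                caster(value)
            return label
        except ValueError:
            continue
    return "str"


def preview_schema(csv_text):
    """List column names, a rough type, and non-null counts for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    names = reader.fieldnames or []
    present = {name: [] for name in names}
    rows = 0
    for row in reader:
        rows += 1
        for name in names:
            # Treat missing and empty cells as null.
            if row[name] not in (None, ""):
                present[name].append(row[name])
    return {
        "rows": rows,
        "columns": len(names),
        "column_details": [
            {"name": n, "dtype": _infer_type(present[n]), "non_null": len(present[n])}
            for n in names
        ],
    }


schema = preview_schema("id,product,price\n1,Desk,199.5\n2,,35\n")
print(schema)
```

Comparing `non_null` against `rows` shows at a glance which columns have gaps, as with `description` (498 of 500) in the example response.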
Using in VS Code (GitHub Copilot)
Once installed and configured in mcp.json, open GitHub Copilot Chat (Ctrl+Shift+I), switch to Agent mode, and ask naturally:
How many tokens are in my file at C:\data\customers.xlsx?
Show me a token breakdown by column for /data/sales.csv
What columns does my file /data/report.xlsx have?
Copilot will automatically call the right tool and return the result.
Using in Claude Desktop
After adding to the Claude Desktop config, just ask Claude:
Count the tokens in my CSV file at /Users/me/data/sales.csv
Which column in /data/customers.xlsx has the most tokens?
Supported File Formats
| Format | Extension | Notes |
|---|---|---|
| CSV | .csv | Any delimiter, auto-detected |
| Excel | .xlsx | Excel 2007 and newer |
| Excel Legacy | .xls | Excel 97-2003 format |
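The table says CSV delimiters are auto-detected but not how. One common way to do this in Python is `csv.Sniffer` from the standard library; the sketch below shows that technique as an illustration and is not necessarily what this server does. The helper name `detect_delimiter` and the candidate list are assumptions.

```python
import csv


def detect_delimiter(sample, candidates=",;\t|", default=","):
    """Guess the delimiter of CSV text, falling back to a default if unsure."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # The sniffer raises csv.Error when it cannot settle on a delimiter.
        return default


semicolon_sample = "a;b;c\n1;2;3\n4;5;6\n"
tab_sample = "a\tb\n1\t2\n3\t4\n"
print(repr(detect_delimiter(semicolon_sample)))
print(repr(detect_delimiter(tab_sample)))
```

Sniffing is heuristic, so passing only the first few kilobytes of a file and keeping a sensible default is a reasonable safeguard for unusual inputs.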
Requirements
- Python 3.10 or higher
- No API key required
- No internet connection required after install
Local Development
# Clone the repo
git clone https://github.com/yourusername/csv-token-counter-mcp.git
cd csv-token-counter-mcp
# Create virtual environment
python -m venv venv
# Activate it (Windows)
venv\Scripts\activate
# Activate it (Mac/Linux)
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Run the MCP inspector for testing
mcp dev src/server.py
Running Tests
# Create sample data
python create_sample_data.py
# Run test suite
python test_server.py
Publishing a New Version
# 1. Bump version in pyproject.toml
# 2. Clean old build files
# Windows:
rmdir /s /q dist build
# Mac/Linux:
rm -rf dist/ build/
# 3. Build
python -m build
# 4. Upload
twine upload dist/*
Changelog
v1.0.0 — 2026-03-28
- Initial release
- count_file_tokens: total token count with optional column breakdown
- count_column_tokens: per-row token stats for a single column
- preview_file_schema: file structure preview
- Supports .csv, .xlsx, .xls
- MCP stdio transport compatible with VS Code and Claude Desktop
License
MIT License — see LICENSE for full text.
Author
Built by Anup.
PyPI: https://pypi.org/project/csv-token-counter-mcp