MCP server to count tokens in CSV and Excel files

csv-token-counter-mcp

A lightweight MCP (Model Context Protocol) server that counts tokens in CSV and Excel files using tiktoken. Works with VS Code, Claude Desktop, and any MCP-compatible application. No LLM or API key required — runs fully offline.


Why Use This?

Before sending CSV or Excel data to an LLM (ChatGPT, Claude, Gemini, etc.), you need to know how many tokens the data contains so you can estimate cost, check context window limits, or split large files. This tool does exactly that, directly inside your editor or AI assistant.
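
As a rough illustration of the "split large files" use case, the sketch below groups rows into chunks that each stay under a token budget. It is not part of this package: `chunk_rows` and `whitespace_count` are hypothetical names, and the whitespace counter is only a stand-in. For real counts you could pass a tiktoken-based counter instead, for example `lambda s: len(enc.encode(s))` with `enc = tiktoken.get_encoding("cl100k_base")`.

```python
from typing import Callable, Iterable, List


def whitespace_count(text: str) -> int:
    """Stand-in counter (assumption): one token per whitespace-separated word."""
    return len(text.split())


def chunk_rows(
    rows: Iterable[str],
    budget: int,
    count: Callable[[str], int] = whitespace_count,
) -> List[List[str]]:
    """Greedily group rows so each chunk stays within `budget` tokens.

    A single row that exceeds the budget ends up in a chunk of its own.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for row in rows:
        cost = count(row)
        # Start a new chunk when adding this row would exceed the budget.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(row)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

The greedy approach keeps rows in their original order, which is usually what you want when each chunk is sent to a model separately.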


Features

  • Count total tokens across an entire CSV or Excel file
  • Break down token counts column by column
  • Analyze tokens in a single specific column (avg, min, max per row)
  • Preview file schema — column names, types, row counts
  • Supports .csv, .xlsx, and .xls formats
  • Works fully offline — no API key or internet needed
  • Compatible with VS Code, Claude Desktop, and any MCP client

Installation

Option 1 — pip (traditional)

pip install csv-token-counter-mcp

Add to your .vscode/mcp.json:

{
  "servers": {
    "csv-token-counter": {
      "type": "stdio",
      "command": "csv-token-counter-mcp",
      "args": []
    }
  }
}

Option 2 — uvx (recommended, no install needed)

First install uv if you don't have it:

pip install uv

Add directly to your .vscode/mcp.json — no separate install step needed:

{
  "servers": {
    "csv-token-counter": {
      "type": "stdio",
      "command": "uvx",
      "args": ["csv-token-counter-mcp"]
    }
  }
}

Option 3 — Claude Desktop

Add to your Claude Desktop config file:

Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "csv-token-counter": {
      "command": "uvx",
      "args": ["csv-token-counter-mcp"]
    }
  }
}
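
If the config file already lists other servers, editing it by hand risks overwriting them. The helper below is a minimal sketch of one way to add the entry while leaving the rest of the file untouched. `add_server` is a hypothetical name, not something this package provides, and the sketch assumes the file is plain JSON with a top-level "mcpServers" object as shown above.

```python
import json
from pathlib import Path
from typing import Dict, List


def add_server(config_path: Path, name: str, command: str, args: List[str]) -> Dict:
    """Add or update one entry under "mcpServers", keeping all other keys."""
    config: Dict = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # Create the "mcpServers" object if the file did not have one yet.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `add_server(Path("claude_desktop_config.json"), "csv-token-counter", "uvx", ["csv-token-counter-mcp"])` would write the entry shown above. Restart Claude Desktop afterwards so it picks up the change.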

Available Tools

count_file_tokens

Count total tokens across an entire CSV or Excel file, with optional column-by-column breakdown.

Arguments:

| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| file_path | string | ✅ Yes | - | Path to .csv, .xlsx, or .xls file |
| include_column_breakdown | boolean | ❌ No | false | Return token count per column |

Example request:

{
  "tool": "count_file_tokens",
  "arguments": {
    "file_path": "/data/sales.xlsx",
    "include_column_breakdown": true
  }
}

Example response:

{
  "file": "/data/sales.xlsx",
  "encoding": "cl100k_base",
  "rows": 500,
  "columns": 5,
  "total_tokens": 18453,
  "column_breakdown": {
    "id": 1500,
    "product": 3200,
    "description": 9800,
    "price": 2100,
    "stock": 1853
  }
}
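
To make the shape of this response concrete, here is a minimal sketch of how a total and a per-column breakdown could be computed for a CSV. It is an illustration only, not the server's actual code: `count_csv_tokens` and `whitespace_count` are hypothetical names, the whitespace counter stands in for a tiktoken encoder, and whether the real tool also counts headers or delimiters is not stated here.

```python
import csv
import io
from typing import Callable, Dict, TextIO


def whitespace_count(text: str) -> int:
    """Stand-in counter (assumption): one token per whitespace-separated word."""
    return len(text.split())


def count_csv_tokens(
    handle: TextIO, count: Callable[[str], int] = whitespace_count
) -> Dict:
    """Sum token counts per column over the data rows of a CSV (header excluded)."""
    reader = csv.DictReader(handle)
    breakdown = {name: 0 for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name in breakdown:
            # Missing cells come back as None; treat them as empty text.
            breakdown[name] += count(row.get(name) or "")
    return {
        "rows": rows,
        "columns": len(breakdown),
        "total_tokens": sum(breakdown.values()),
        "column_breakdown": breakdown,
    }


sample = io.StringIO("id,product\n1,red apple\n2,green pear tree\n")
result = count_csv_tokens(sample)
```

With a tiktoken encoder `enc`, passing `count=lambda s: len(enc.encode(s))` would swap in real token counts.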

count_column_tokens

Deep token analysis for a single column — total, average, min, and max tokens per row.

Arguments:

| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| file_path | string | ✅ Yes | - | Path to .csv, .xlsx, or .xls file |
| column_name | string | ✅ Yes | - | Exact column name to analyze |

Example request:

{
  "tool": "count_column_tokens",
  "arguments": {
    "file_path": "/data/sales.csv",
    "column_name": "description"
  }
}

Example response:

{
  "file": "/data/sales.csv",
  "column": "description",
  "rows_analyzed": 500,
  "total_tokens": 9800,
  "avg_tokens_per_row": 19.6,
  "max_tokens_in_row": 48,
  "min_tokens_in_row": 6
}
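
The per-row statistics in this response follow directly from a list of per-cell token counts. The sketch below shows that arithmetic under stated assumptions: `column_stats` is a hypothetical helper, the whitespace counter is a stand-in for tiktoken, and rounding the average to two decimals is a choice made for this example.

```python
from typing import Callable, Dict, Sequence


def whitespace_count(text: str) -> int:
    """Stand-in counter (assumption): one token per whitespace-separated word."""
    return len(text.split())


def column_stats(
    values: Sequence[str], count: Callable[[str], int] = whitespace_count
) -> Dict:
    """Return total, average, max, and min token counts over the given cells."""
    counts = [count(value) for value in values]
    if not counts:
        # Empty column: report zeros rather than dividing by zero.
        return {
            "rows_analyzed": 0,
            "total_tokens": 0,
            "avg_tokens_per_row": 0.0,
            "max_tokens_in_row": 0,
            "min_tokens_in_row": 0,
        }
    total = sum(counts)
    return {
        "rows_analyzed": len(counts),
        "total_tokens": total,
        "avg_tokens_per_row": round(total / len(counts), 2),
        "max_tokens_in_row": max(counts),
        "min_tokens_in_row": min(counts),
    }


stats = column_stats(["a b", "c d e f", "g"])
```

In the example response above, the same arithmetic gives 9800 / 500 = 19.6 tokens per row.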

preview_file_schema

Preview the structure of a CSV or Excel file — column names, data types, and non-null counts — without counting tokens.

Arguments:

| Argument | Type | Required | Default | Description |
|---|---|---|---|---|
| file_path | string | ✅ Yes | - | Path to .csv, .xlsx, or .xls file |

Example request:

{
  "tool": "preview_file_schema",
  "arguments": {
    "file_path": "/data/sales.csv"
  }
}

Example response:

{
  "file": "/data/sales.csv",
  "rows": 500,
  "columns": 5,
  "column_details": [
    { "name": "id",          "dtype": "int64",   "non_null": 500 },
    { "name": "product",     "dtype": "object",  "non_null": 500 },
    { "name": "description", "dtype": "object",  "non_null": 498 },
    { "name": "price",       "dtype": "float64", "non_null": 500 },
    { "name": "stock",       "dtype": "int64",   "non_null": 500 }
  ]
}
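
For readers curious how such a preview could be assembled, the following is a rough standard-library sketch for CSV input. It is not the server's implementation: `preview_schema` and `infer_dtype` are hypothetical names, and the type labels only approximate the pandas-style dtype names in the example response (a numeric column with missing cells, for instance, may be labeled differently by pandas).

```python
import csv
import io
from typing import Dict, List, TextIO


def infer_dtype(values: List[str]) -> str:
    """Label a column "int64", "float64", or "object" from its non-empty cells."""
    if not values:
        return "object"
    for label, cast in (("int64", int), ("float64", float)):
        try:
            for value in values:
                cast(value)
            return label
        except ValueError:
            continue
    return "object"


def preview_schema(handle: TextIO) -> Dict:
    """Collect column names, rough types, and non-empty cell counts."""
    reader = csv.DictReader(handle)
    names = list(reader.fieldnames or [])
    cells: Dict[str, List[str]] = {name: [] for name in names}
    rows = 0
    for row in reader:
        rows += 1
        for name in names:
            value = (row.get(name) or "").strip()
            if value:  # empty cells are treated as null
                cells[name].append(value)
    return {
        "rows": rows,
        "columns": len(names),
        "column_details": [
            {"name": n, "dtype": infer_dtype(cells[n]), "non_null": len(cells[n])}
            for n in names
        ],
    }


schema = preview_schema(io.StringIO("id,description,price\n1,red apple,0.5\n2,,1.25\n"))
```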

Using in VS Code (GitHub Copilot)

Once installed and configured in mcp.json, open GitHub Copilot Chat (Ctrl+Shift+I), switch to Agent mode, and ask naturally:

How many tokens are in my file at C:\data\customers.xlsx?
Show me a token breakdown by column for /data/sales.csv
What columns does my file /data/report.xlsx have?

Copilot will automatically call the right tool and return the result.


Using in Claude Desktop

After adding to the Claude Desktop config, just ask Claude:

Count the tokens in my CSV file at /Users/me/data/sales.csv
Which column in /data/customers.xlsx has the most tokens?

Supported File Formats

| Format | Extension | Notes |
|---|---|---|
| CSV | .csv | Any delimiter, auto-detected |
| Excel | .xlsx | Excel 2007 and newer |
| Excel Legacy | .xls | Excel 97-2003 format |
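
How the server detects delimiters is not documented here. One common approach, shown below as an assumption rather than a description of this package, is Python's built-in `csv.Sniffer`. The `detect_delimiter` helper and its candidate list are hypothetical.

```python
import csv


def detect_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the delimiter from a text sample, falling back to a comma."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer could not decide (for example, a single-column file).
        return ","


semicolon = detect_delimiter("id;name;price\n1;apple;0.5\n2;pear;0.75\n")
fallback = detect_delimiter("hello")
```

Passing only the first few kilobytes of a file as `sample` is usually enough and avoids reading large files twice.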

Requirements

  • Python 3.10 or higher
  • No API key required
  • No internet connection required after install

Local Development

# Clone the repo
git clone https://github.com/yourusername/csv-token-counter-mcp.git
cd csv-token-counter-mcp

# Create and activate a virtual environment
python -m venv venv
# Windows:
venv\Scripts\activate
# Mac/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the MCP inspector for testing
mcp dev src/server.py

Running Tests

# Create sample data
python create_sample_data.py

# Run test suite
python test_server.py

Publishing a New Version

# 1. Bump version in pyproject.toml
# 2. Clean old build files
# Windows (cmd):
rmdir /s /q dist build
# Mac/Linux:
rm -rf dist/ build/

# 3. Build
python -m build

# 4. Upload
twine upload dist/*

Changelog

v1.0.0 — 2026-03-28

  • Initial release
  • count_file_tokens — total token count with optional column breakdown
  • count_column_tokens — per-row token stats for a single column
  • preview_file_schema — file structure preview
  • Supports .csv, .xlsx, .xls
  • MCP stdio transport compatible with VS Code and Claude Desktop

License

MIT License — see LICENSE for full text.


Author

Built by Anup.
PyPI: https://pypi.org/project/csv-token-counter-mcp
