
MCP server for reading and editing HTML/Markdown tables in GitBook-synced documents


tablestakes


An MCP server that gives LLMs clean, surgical access to tables trapped in messy HTML.

The Problem

Tools like GitBook, Notion exports, and CMS platforms collapse tables into single-line HTML when syncing to Markdown files. The result looks like this in your editor:

<table><thead><tr><th width="520.11">Requirement</th><th width="122.07">Priority</th><th>Priority 1-2-3</th></tr></thead><tbody><tr><td><strong>1.1</strong> Agent sees only their Salesforce-assigned cases <strong>in the currently selected organization</strong> (case is "assigned" when SF <code>Case.OwnerId</code> matches the agent's linked SF user ID)...</td><td>Must</td><td>1</td></tr></tbody></table>

This is unreadable for humans and unreliable for LLMs. Models struggle to parse collapsed HTML tables, frequently hallucinate cell boundaries, and cannot edit them without corrupting the structure.

tablestakes fixes this. It sits between the LLM and the file, converting tables to clean pipe format on read and writing back in the original format on save — preserving GitBook compatibility, HTML attributes, and inline formatting.
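
The read direction can be illustrated with a minimal sketch that uses only the standard library. It is not the project's converter: the class and function names are hypothetical, it assumes a single non-nested table whose first row is the header, and it maps only `<strong>`/`<b>` and `<code>` to Markdown.

```python
from html.parser import HTMLParser

# Inline tags this sketch knows how to express in Markdown.
INLINE_MARKERS = {"strong": "**", "b": "**", "code": "`"}


class TableToPipe(HTMLParser):
    """Collect cell text from a single, non-nested HTML table."""

    def __init__(self):
        super().__init__()
        self.rows = []      # list of rows, each a list of cell strings
        self._cell = None   # text fragments of the open cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td"):
            self._cell = []  # attributes such as width are ignored on read
        elif self._cell is not None and tag in INLINE_MARKERS:
            self._cell.append(INLINE_MARKERS[tag])

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            text = "".join(self._cell).strip().replace("|", "\\|")
            self.rows[-1].append(text)
            self._cell = None
        elif self._cell is not None and tag in INLINE_MARKERS:
            self._cell.append(INLINE_MARKERS[tag])

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def html_table_to_pipe(html):
    """Render the parsed rows as a pipe table without column padding."""
    parser = TableToPipe()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)
```

Writing back in the original format is the harder half, because attributes such as width have to be kept alongside the cell text; this sketch discards them.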

What the LLM Sees

Discovery — scan a 26-table document in one call:

26 tables

T0 pipe 5r 3c v:485f65f7b470 [Cross-Domain Dependencies]
  A:Integration | B:Source | C:Requirements

T2 gitbook 18r 3c v:77a9495fd328 [Case List]
  A:Requirement | B:Priority | C:Priority 1-2-3

T7 gitbook 3r 4c v:d9a9a45a370f [Attachments]
  A:Requirement | B:Priority | C:Dependency | D:Priority 1-2-3

Read — collapsed HTML becomes a clean pipe table:

v:d9a9a45a370f gitbook 3r 4c [Attachments]
A:Requirement | B:Priority | C:Dependency | D:Priority 1-2-3
| Requirement | Priority | Dependency | Priority 1-2-3 |
| --- | --- | --- | --- |
| **5.1** View inbound attachments in-app... | Must | — | 1 |
| **5.2** Send outbound attachments... | Must | Blocked on SF API | 1 |
| **5.3** Attachment file size limits... | Should | — |  |

Write — surgical cell edit, version-checked:

v:5749c94ffb1f

The response is 14 characters: just the new version hash. The file is updated, the GitBook HTML format is preserved, and the width attributes are intact.

Token Efficiency

Baseline: Claude Code's built-in Read + Edit tools operating on the same file. Measured on a synthetic 18-row, 4-column table with realistic requirement-style content (bold IDs, inline emphasis, mixed-length cells).

| Operation | Read + Edit | tablestakes | Savings |
| --- | --- | --- | --- |
| list_tables (26 HTML tables) | ~28,400 tokens | ~2,500 tokens | 91% |
| read_table (18-row HTML) | ~1,100 tokens | ~690 tokens | 39% |
| read_table (18-row GFM) | ~780 tokens | ~690 tokens | 11% |
| Cell edit (18-row HTML) | ~35 tokens | ~27 tokens | 23% |
| Cell edit (18-row GFM) | ~99 tokens | ~27 tokens | 73% |
| 10-edit workflow (HTML) | ~1,470 tokens | ~960 tokens | 35% |

Where the savings come from:

  • Read (HTML): collapsed HTML tags (<td>, <tr>, <th>, <strong>, width="...") are pure overhead. Pipe tables carry the same information without markup. The Read tool also adds cat -n line-number prefixes.

  • Read (GFM): modest savings from stripping line-number prefixes and surrounding document context. The table content itself is already clean.

  • Write: the Edit tool requires old_string (enough context to be unique in the file) + new_string (the modified version), both generated as output tokens. For GFM, old_string is the entire row line (~190 chars). tablestakes needs only {"row": 0, "column": "B", "value": "Should"} (~18 tokens).

  • Discovery: without tablestakes, the LLM reads the entire file to find tables. list_tables returns a compact index — metadata + 1 preview row per table.

  • Format: compact pipe tables with no column padding. Per the ImprovingAgents benchmark, GFM pipe tables achieve the best token-to-accuracy ratio: 1.24x the token cost of CSV at 51.9% QA accuracy. JSON (2.08x, 52.3%) and YAML (1.88x, 54.7%) cost considerably more for slightly higher accuracy.
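
To make the Write point concrete, the sketch below compares the two payload shapes by character count. The row content and row index are hypothetical, and character length is only a rough proxy for the token counts reported above.

```python
import json

# Hypothetical GFM row before and after changing column B from "Must" to "Should".
row_before = "| **5.2** Send outbound attachments... | Must | Blocked on SF API | 1 |"
row_after = "| **5.2** Send outbound attachments... | Should | Blocked on SF API | 1 |"

# Edit-style payload: the model has to emit the whole row twice.
edit_payload = json.dumps({"old_string": row_before, "new_string": row_after})

# Patch-style payload in the {row, column, value} shape described above.
patch_payload = json.dumps({"row": 1, "column": "B", "value": "Should"})

print(len(edit_payload), len(patch_payload))
```

The gap widens with longer cells, because the Edit payload grows with the row while the patch grows only with the new value.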

Experiment details

Tokenizer: tiktoken cl100k_base (GPT-4). Claude uses a different tokenizer, but relative comparisons hold. The benchmark script (script.py) constructs tables programmatically and generates tablestakes output using the actual converter code — no hardcoded strings.

Read baseline: simulate_read_tool() wraps file content in cat -n format (line-number prefix per line), matching what Claude Code's Read tool returns. The full file (document text + table) enters the LLM context.
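
A minimal sketch of what such a wrapper might look like, assuming the GNU cat -n layout (line number right-aligned in six columns, then a tab). The function name comes from the text above; the body is an assumption, not the benchmark's actual code.

```python
def simulate_read_tool(content: str) -> str:
    """Prefix every line with a cat -n style line number (assumed layout)."""
    lines = content.splitlines()
    return "\n".join(
        f"{number:6d}\t{line}" for number, line in enumerate(lines, start=1)
    )
```

A collapsed HTML table is a single line, so the prefix adds little per table; the read cost comes mostly from the markup and the surrounding document.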

Write baseline: for each cell edit, the script computes the minimum unique old_string by expanding leftward from the target <td> until the substring is unique in the file. new_string is the same context with the cell value replaced. This is a best-case scenario for the Edit tool — a human might include more context than the minimum.
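
The leftward expansion can be sketched as follows. This is an illustration under stated assumptions rather than the benchmark script itself: the function names are hypothetical, and the caller is assumed to know the character offsets of the target cell.

```python
def occurs_once(text: str, fragment: str) -> bool:
    """True if fragment appears exactly once in text (overlaps included)."""
    first = text.find(fragment)
    return first != -1 and text.find(fragment, first + 1) == -1


def minimal_edit(text: str, start: int, end: int, replacement: str):
    """Grow text[start:end] leftward until unique, then build the Edit pair.

    Returns (old_string, new_string); new_string keeps the added left
    context and swaps the original span for replacement.
    """
    left = start
    while left > 0 and not occurs_once(text, text[left:end]):
        left -= 1
    old_string = text[left:end]
    if not occurs_once(text, old_string):
        raise ValueError("span is not unique even with all left context")
    return old_string, text[left:start] + replacement
```

For two rows that end in the same `<td>Must</td>` cell, the expansion stops as soon as it reaches a character that differs between the rows.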

list_tables baseline: 26 copies of an 18-row GitBook HTML table in a markdown document. Naive = Read the full file (~28k tokens). tablestakes = list_tables output with preview_rows=0..3:

| preview_rows | Tokens | Savings |
| --- | --- | --- |
| 0 (metadata only) | ~1,230 | 96% |
| 1 (default) | ~2,530 | 91% |
| 2 | ~3,510 | 88% |
| 3 | ~4,420 | 84% |
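
The savings column follows from the naive baseline. As a quick check, this sketch recomputes it from the rounded figures above, taking ~28,400 tokens as the baseline; the function name is illustrative.

```python
def savings_percent(baseline_tokens: int, tool_tokens: int) -> int:
    """Share of the baseline avoided, rounded to a whole percent."""
    return round(100 * (1 - tool_tokens / baseline_tokens))


BASELINE = 28_400  # naive Read of the full 26-table file, rounded

for preview_rows, tokens in [(0, 1_230), (1, 2_530), (2, 3_510), (3, 4_420)]:
    print(preview_rows, tokens, f"{savings_percent(BASELINE, tokens)}%")
# prints 96%, 91%, 88%, 84% for preview_rows 0 through 3
```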

Reproduce: uv run --with tiktoken python scripts/script.py

Quick Start

Claude Code:

claude mcp add tablestakes -- uvx tablestakes

Codex CLI:

codex mcp add tablestakes -- uvx tablestakes

Gemini CLI:

gemini mcp add tablestakes -- uvx tablestakes

Or install from PyPI directly: pip install tablestakes

Other clients (Cursor, Windsurf, Claude Desktop)

Add the following JSON to your client's MCP config file:

{
  "mcpServers": {
    "tablestakes": {
      "command": "uvx",
      "args": ["tablestakes"]
    }
  }
}

| Client | Config file |
| --- | --- |
| Cursor | .cursor/mcp.json |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |
| Claude Desktop | claude_desktop_config.json |

Tools

Discovery & Read

| Tool | Purpose |
| --- | --- |
| list_tables(file_path, preview_rows=1) | Scan file, return all tables with metadata + preview |
| read_table(file_path, table_index) | Full table normalized to pipe format + version hash |

Cell, Row & Column Operations

| Tool | Purpose |
| --- | --- |
| update_cells(file_path, table_index, version, updates) | Batch {row, column, value} patches |
| insert_row(file_path, table_index, version, position, values) | Insert row at position (-1 to append) |
| delete_row(file_path, table_index, version, row_index) | Remove row by index |
| insert_column(file_path, table_index, version, name, ...) | Insert column with default value |
| delete_column(file_path, table_index, version, column) | Remove column |
| rename_column(file_path, table_index, version, old_name, new_name) | Rename header |
| replace_table(file_path, table_index, version, new_content) | Full table replacement from pipe input |
| create_table(file_path, content, position, format) | Create new table from pipe input (default: HTML) |

Every write tool that modifies an existing table requires a version hash from read_table (create_table, which adds a new table, takes none). This is optimistic concurrency: it prevents stale overwrites without locks.
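
A minimal sketch of the version check, under stated assumptions: the hash algorithm (SHA-256 truncated to 12 hex characters) is a guess that matches only the length of the hashes shown in this README, and `checked_write` and `StaleVersionError` are hypothetical names, not part of the tablestakes API.

```python
import hashlib


class StaleVersionError(Exception):
    """Raised when the caller's version no longer matches the stored table."""


def table_version(table_text: str) -> str:
    # Assumption: SHA-256 truncated to 12 hex characters. Only the length
    # matches the hashes shown in this README; the algorithm is a guess.
    return hashlib.sha256(table_text.encode("utf-8")).hexdigest()[:12]


def checked_write(current_text: str, expected_version: str, new_text: str):
    """Return (new_text, new_version) if expected_version is still current."""
    actual = table_version(current_text)
    if actual != expected_version:
        raise StaleVersionError(
            f"expected v:{expected_version}, found v:{actual}; re-read the table"
        )
    return new_text, table_version(new_text)
```

A caller that hits the error re-reads the table, picks up the new version, and retries, so no lock is held between the read and the write.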

Supported Table Formats

| Format | Read | Write | Round-trip |
| --- | --- | --- | --- |
| GFM pipe tables | Pass-through | In-place edit | Lossless |
| GitBook collapsed HTML | HTML → pipe | Pipe → collapsed HTML | Preserves width, data-*, inline formatting |
| General HTML tables | HTML → pipe or pretty HTML | Reconstructs HTML | Preserves structure |

While GitBook is the primary motivation, tablestakes works with any Markdown document containing HTML tables — CMS exports, Notion dumps, wiki migrations, or hand-written HTML in .md files.

Column Addressing

Columns can be referenced by:

  • Letter: "A", "B", "AA" (bijective base-26, like Excel)
  • Name: "Priority" (must be unique)
  • Composite: "B:Priority" (for disambiguation)
  • Index: "0", "1" (0-based)
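
The four addressing modes can be sketched as a small resolver. The function names and the precedence between modes (composite, then index, then unique name, then letter) are assumptions for illustration; the README does not specify how tablestakes resolves a reference that fits more than one mode.

```python
def letter_to_index(letters: str) -> int:
    """Bijective base-26: A -> 0, Z -> 25, AA -> 26."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def index_to_letter(index: int) -> str:
    """Inverse of letter_to_index: 0 -> A, 26 -> AA."""
    letters, n = "", index + 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def is_letters(ref: str) -> bool:
    return ref.isascii() and ref.isalpha()


def resolve_column(ref: str, headers: list) -> int:
    """Map a column reference to a 0-based index (assumed precedence)."""
    prefix, sep, name = ref.partition(":")
    if sep and is_letters(prefix):            # composite, e.g. "B:Priority"
        index = letter_to_index(prefix)
        if index >= len(headers) or headers[index] != name:
            raise ValueError(f"{ref!r} does not match the table header")
        return index
    if ref.isdigit():                         # 0-based index, e.g. "1"
        index = int(ref)
    elif headers.count(ref) == 1:             # unique header name
        return headers.index(ref)
    elif headers.count(ref) > 1:
        raise ValueError(f"{ref!r} is ambiguous; use a composite reference")
    elif is_letters(ref):                     # spreadsheet-style letter
        index = letter_to_index(ref)
    else:
        raise ValueError(f"cannot resolve column {ref!r}")
    if index >= len(headers):
        raise ValueError(f"column {ref!r} is out of range")
    return index
```

With the headers from the Attachments example, "B", "Priority", "B:Priority", and "1" all resolve to the same column.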

Development

make init      # First-time setup: venv + deps + pre-commit hooks
make check     # All checks: format + lint + typecheck + test
make test      # Run tests only
make test-cov  # Tests with coverage report

License

Apache-2.0


mcp-name: io.github.oborchers/tablestakes
