wick-formatter

Compact text formats for feeding structured data into LLM contexts. Python library, CLI, and stdio MCP server.

wick-formatter's primary format is LEAN (LLM-Efficient Adaptive Notation). This package is an independent Python implementation of a format originally designed by Denys Fiialko. On tabular structured data, LEAN output is typically much shorter than uncompressed JSON (45.7% fewer characters in the latest benchmark below) while remaining losslessly round-trippable.

Guarantees

100% lossless round-trip. For any supported value:

decode(encode(data)) == data

This is verified on every benchmark run. If round-trip fails on any item, the benchmark hard-fails before measuring anything else.
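The round-trip gate can be pictured as a small check that runs before any measurement. The sketch below is a hypothetical helper, not the project's harness: it accepts any encode/decode pair and uses the standard library's json module as a stand-in codec so that it runs without wick-formatter installed.

```python
import json
from typing import Any, Callable, Iterable


def assert_round_trip(
    items: Iterable[Any],
    encode: Callable[[Any], str],
    decode: Callable[[str], Any],
) -> int:
    """Raise on the first item that does not survive encode -> decode.

    Hypothetical helper for illustration; returns the number of items checked.
    """
    checked = 0
    for index, item in enumerate(items):
        restored = decode(encode(item))
        if restored != item:
            raise AssertionError(
                f"round-trip failed on item {index}: {item!r} -> {restored!r}"
            )
        checked += 1
    return checked


# Stand-in codec: json from the standard library. With wick-formatter
# installed, the pair would presumably be the LEAN encoder and decoder.
corpus = [
    [{"a": 1, "b": 2}, {"a": 3, "b": 4}],
    {"name": "Alice", "score": 95},
]
checked_count = assert_round_trip(corpus, json.dumps, json.loads)
print(f"{checked_count} items round-tripped")
```

Raising on the first mismatch mirrors the hard-fail behaviour described above: nothing else is measured once a single item fails.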

No accuracy loss. LLM task accuracy with LEAN-encoded data matches JSON-encoded data within measurement noise. The benchmark enforces a maximum delta of 3 percentage points between the two.

Latest benchmark

Metric               Value
Round-trip           100% lossless
Compression          45.7% fewer characters
LLM accuracy (JSON)  53.8%
LLM accuracy (LEAN)  51.9%
Accuracy delta       1.9pp
Items tested         52
Model                llama3.1:8b-instruct-q4_K_M
Corpus               WikiTableQuestions + custom

See docs/BENCHMARK_RESULTS.md for methodology, data sources, and replication instructions.
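For readers who want to sanity-check the reported numbers, here is a minimal sketch of how the two headline metrics could be computed. The function names are hypothetical, and the formulas are assumptions (character savings relative to the JSON baseline, and an absolute accuracy delta); docs/BENCHMARK_RESULTS.md remains the authority on the actual methodology.

```python
def compression_pct(baseline_chars: int, encoded_chars: int) -> float:
    """Percentage of characters saved relative to the baseline encoding."""
    if baseline_chars <= 0:
        raise ValueError("baseline_chars must be positive")
    return (1 - encoded_chars / baseline_chars) * 100


def accuracy_gate(
    json_acc_pct: float, lean_acc_pct: float, max_delta_pp: float = 3.0
) -> float:
    """Return the absolute accuracy delta in percentage points.

    Raises AssertionError when the delta exceeds the gate (3pp by default,
    the figure stated in the Guarantees section).
    """
    delta = abs(json_acc_pct - lean_acc_pct)
    if delta > max_delta_pp:
        raise AssertionError(
            f"accuracy delta {delta:.1f}pp exceeds {max_delta_pp:.1f}pp"
        )
    return delta


# Accuracy figures taken from the table above.
delta = accuracy_gate(53.8, 51.9)
print(f"accuracy delta: {delta:.1f}pp")
```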

Status

v0.1.0, pre-release. API and on-wire format are still stabilising. See docs/CHANGELOG.md for release notes and docs/ROADMAP.md for the v0.1.0 acceptance gates and deferred items.

Install

pip install wick-formatter

Runtime dependency: mcp (only required when using the MCP server).

What's in the box

  • wick_formatter — Python library with a pluggable format registry.
  • wick-formatter — CLI (--format=X {encode,decode}, stdin→stdout).
  • python -m wick_formatter.mcp — stdio MCP server exposing wf_encode and wf_decode.
  • tests/benchmarks/ — WikiTableQuestions-based harness comparing LEAN to a JSON baseline on llama3.1:8b-instruct-q4_K_M through an OpenAI-compatible endpoint (Ollama by default, API-provider swappable).

Quick example

Encode a small array of records into LEAN's tabular form:

echo '[{"a": 1, "b": 2}, {"a": 3, "b": 4}]' \
    | wick-formatter --format=lean encode

Decode it back, assuming the encoded output was saved to record.lean:

wick-formatter --format=lean decode < record.lean

The full round-trip contract — including the ~-marker semi-tabular path, dot-flatten, and block encodings — is documented in docs/SPEC.md.
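The same CLI calls can be driven from Python when a subprocess boundary is preferable to importing the library. This is a sketch under assumptions: the helper names are hypothetical, and it relies only on the `--format=X {encode,decode}` shape and the stdin-to-stdout behaviour described above.

```python
import shutil
import subprocess


def cli_argv(action: str, fmt: str = "lean") -> list[str]:
    """Build the argument vector for the CLI invocation shown above."""
    if action not in ("encode", "decode"):
        raise ValueError(f"unknown action: {action!r}")
    return ["wick-formatter", f"--format={fmt}", action]


def run_cli(action: str, text: str, fmt: str = "lean") -> str:
    """Pipe text through the CLI on stdin and return its stdout."""
    result = subprocess.run(
        cli_argv(action, fmt),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if shutil.which("wick-formatter"):  # only run when the CLI is on PATH
    encoded = run_cli("encode", '[{"a": 1, "b": 2}, {"a": 3, "b": 4}]')
    print(run_cli("decode", encoded))
```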

Python API

from wick_formatter import get, decode

# Get the LEAN format encoder
lean = get("lean")

# Encode tabular data
data = [{"name": "Alice", "score": 95}, {"name": "Bob", "score": 87}]
encoded = lean.encode(data)
print(encoded)
# name|score
# Alice|95
# Bob|87

# Decode back to original
decoded = decode(encoded)
assert decoded == data
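The output above suggests where the savings on tabular data come from: keys appear once in a header row instead of once per record. The toy function below illustrates only that idea. It is not the LEAN encoder, it ignores escaping, value types and nesting, and it assumes every record has the same keys in the same order.

```python
import json


def toy_tabular(records: list[dict]) -> str:
    """Toy header-plus-rows encoder for illustration; NOT the LEAN encoder."""
    if not records:
        return ""
    columns = list(records[0])
    for record in records:
        if list(record) != columns:
            raise ValueError("toy encoder needs identical keys in every record")
    lines = ["|".join(columns)]  # keys are written once, as a header
    lines += ["|".join(str(record[col]) for col in columns) for record in records]
    return "\n".join(lines)


data = [{"name": "Alice", "score": 95}, {"name": "Bob", "score": 87}]
compact_json = json.dumps(data, separators=(",", ":"))
table = toy_tabular(data)
print(len(compact_json), len(table))  # JSON length vs. tabular length
```

Because the per-record key overhead disappears, the gap should widen as the number of rows grows. Real LEAN output also has to carry enough information to restore types on decode, which this toy does not attempt.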

Resource Limits

The decoder enforces configurable limits to prevent denial-of-service when processing untrusted input:

Limit            Default    Environment variable
Input size       1 GB       WICK_MAX_INPUT_BYTES
Recursion depth  100        WICK_MAX_RECURSION_DEPTH
Collection size  10M items  WICK_MAX_COLLECTION_SIZE

Python API

from wick_formatter.formats.lean import decode, DecodeLimits

# `text` is a LEAN-encoded string, e.g. the output of lean.encode(data).
# Custom limits
result = decode(text, limits=DecodeLimits(
    max_input_bytes=10 * 1024 * 1024,  # 10 MB
    max_recursion_depth=50,
    max_collection_size=100_000,
))

# Disable limits (not recommended for untrusted input)
result = decode(text, limits=DecodeLimits(
    max_input_bytes=None,
    max_recursion_depth=None,
    max_collection_size=None,
))

MCP Server

Set environment variables before starting the server:

export WICK_MAX_INPUT_BYTES=10485760  # 10 MB
export WICK_MAX_RECURSION_DEPTH=50
python -m wick_formatter.mcp

Set a variable to 0 to disable that limit (not recommended for untrusted input).
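The relationship between the environment variables and the DecodeLimits keyword arguments can be sketched as follows. The helper is hypothetical and says nothing about how the server parses its environment internally. It only encodes the two conventions documented here: 0 disables a limit in the environment, and None disables it in the Python API.

```python
import os
from typing import Mapping, Optional

# Environment variable -> DecodeLimits keyword, as listed in the table above.
ENV_TO_KWARG = {
    "WICK_MAX_INPUT_BYTES": "max_input_bytes",
    "WICK_MAX_RECURSION_DEPTH": "max_recursion_depth",
    "WICK_MAX_COLLECTION_SIZE": "max_collection_size",
}


def limit_kwargs_from_env(
    environ: Mapping[str, str] = os.environ,
) -> dict[str, Optional[int]]:
    """Hypothetical helper: translate WICK_* variables into keyword arguments.

    Unset variables are left out so the library default applies;
    0 maps to None (limit disabled).
    """
    kwargs: dict[str, Optional[int]] = {}
    for env_name, kwarg in ENV_TO_KWARG.items():
        raw = environ.get(env_name)
        if raw is None:
            continue
        value = int(raw)
        if value < 0:
            raise ValueError(f"{env_name} must be >= 0, got {value}")
        kwargs[kwarg] = None if value == 0 else value
    return kwargs


example = limit_kwargs_from_env(
    {"WICK_MAX_INPUT_BYTES": "10485760", "WICK_MAX_RECURSION_DEPTH": "0"}
)
print(example)
```

The resulting dictionary could then be passed on as `DecodeLimits(**example)`.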

MCP Server Setup

For Claude Code or Codex integration:

git clone https://github.com/p6rguvyrst/wick-formatter
cd wick-formatter
make client-claude   # or: make client-codex

Check status: make client-status

Remove: make client-clean

The make targets require jq for JSON manipulation. Install it with brew install jq (macOS) or apt install jq (Debian/Ubuntu).

Format specification

The complete LEAN format specification, including the encoder strategy selection rules and error cases, lives at docs/SPEC.md.

License and attribution

wick-formatter is released under the MIT License.

The LEAN format was originally designed and implemented by Denys Fiialko; see NOTICE for attribution, clean-room posture, and a specific credit for the semi-tabular encoding path first implemented in his toon-mcp-server repository.
