A compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage
TOON Format for Python
⚠️ Beta Status (v0.9.x): This library is in active development and is working towards spec compliance. A beta is published to PyPI. The API may change before the 1.0.0 release.
Compact, human-readable serialization format for LLM contexts with 30-60% token reduction vs JSON. Combines YAML-like indentation with CSV-like tabular arrays. Working towards full compatibility with the official TOON specification.
Key Features: Minimal syntax • Tabular arrays for uniform data • Array length validation • Python 3.8+ • Comprehensive test coverage.
# Beta published to PyPI - install from source:
git clone https://github.com/toon-format/toon-python.git
cd toon-python
uv sync
# Or install directly from GitHub:
pip install git+https://github.com/toon-format/toon-python.git
Quick Start
from toon_format import encode, decode
# Simple object
encode({"name": "Alice", "age": 30})
# name: Alice
# age: 30
# Tabular array (uniform objects)
encode([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
# [2,]{id,name}:
#   1,Alice
#   2,Bob
# Decode back to Python
decode("items[2]: apple,banana")
# {'items': ['apple', 'banana']}
CLI Usage
# Auto-detect format by extension
toon input.json -o output.toon # Encode
toon data.toon -o output.json # Decode
echo '{"x": 1}' | toon - # Stdin/stdout
# Options
toon data.json --encode --delimiter "\t" --length-marker
toon data.toon --decode --no-strict --indent 4
Options: -e/--encode -d/--decode -o/--output --delimiter --indent --length-marker --no-strict
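The extension-based auto-detection described above can be pictured with a small sketch. The helper name `detect_mode` and its exact rules are assumptions made for illustration; the CLI's real detection logic is not shown in this README and may differ.

```python
from pathlib import Path


def detect_mode(path: str) -> str:
    """Guess the conversion direction from a file extension.

    Hypothetical helper: .json input is encoded to TOON, .toon input is
    decoded to JSON. Anything else (including "-" for stdin) is ambiguous.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".json":
        return "encode"
    if suffix == ".toon":
        return "decode"
    raise ValueError(f"cannot infer mode from {path!r}; pass --encode or --decode")


print(detect_mode("input.json"))  # encode
print(detect_mode("data.toon"))   # decode
```

When the input has no recognizable extension, an explicit `--encode` or `--decode` flag removes the ambiguity.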
API Reference
encode(value, options=None) → str
encode({"id": 123}, {"delimiter": "\t", "indent": 4, "lengthMarker": "#"})
Options:
- delimiter: "," (default), "\t", "|"
- indent: Spaces per level (default: 2)
- lengthMarker: "" (default) or "#" to prefix array lengths
decode(input_str, options=None) → Any
decode("id: 123", {"indent": 2, "strict": True})
Options:
- indent: Expected indent size (default: 2)
- strict: Validate syntax, lengths, delimiters (default: True)
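To make the length validation in strict mode concrete, here is a minimal sketch for a single inline array line such as `items[2]: apple,banana`. The function `parse_inline_array` is a hypothetical name, not part of the library, and the sketch ignores quoting and nested structures.

```python
import re
from typing import List

# Matches "key[N]: v1,v2,..." for a simple inline array (assumed layout).
HEADER = re.compile(r"^(?P<key>[^\[\]:]*)\[(?P<count>\d+)\]:\s*(?P<body>.*)$")


def parse_inline_array(line: str, delimiter: str = ",", strict: bool = True) -> List[str]:
    """Split an inline array and, in strict mode, check the declared length."""
    match = HEADER.match(line)
    if match is None:
        raise ValueError(f"not an inline array line: {line!r}")
    declared = int(match.group("count"))
    body = match.group("body")
    values = body.split(delimiter) if body else []
    if strict and len(values) != declared:
        raise ValueError(f"declared {declared} items, found {len(values)}")
    return values


print(parse_inline_array("items[2]: apple,banana"))  # ['apple', 'banana']
```

With `strict=False` the sketch returns whatever values it finds, which mirrors the idea behind `--no-strict` without claiming to match its exact behavior.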
Token Counting & Comparison
Measure token efficiency and compare formats:
from toon_format import estimate_savings, compare_formats, count_tokens
# Measure savings
data = {"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}
result = estimate_savings(data)
print(f"Saves {result['savings_percent']:.1f}% tokens") # e.g. Saves 37.8% tokens
# Visual comparison
print(compare_formats(data))
# Format Comparison
# ────────────────────────────────────────────────
# Format Tokens Size (chars)
# JSON 45 123
# TOON 28 85
# ────────────────────────────────────────────────
# Savings: 17 tokens (37.8%)
# Count tokens directly
toon_str = encode(data)
tokens = count_tokens(toon_str) # Uses tiktoken (gpt5/gpt5-mini)
Requires tiktoken: uv add tiktoken (benchmark features are optional)
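The savings figure in the comparison output is plain arithmetic on the two token counts. The sketch below reproduces it from the sample numbers (45 JSON tokens, 28 TOON tokens); the standalone function is hypothetical and only borrows its name from the `savings_percent` key shown above.

```python
def savings_percent(json_tokens: int, toon_tokens: int) -> float:
    """Percentage of tokens saved by TOON relative to JSON."""
    if json_tokens <= 0:
        raise ValueError("json_tokens must be positive")
    return (json_tokens - toon_tokens) / json_tokens * 100


saved = 45 - 28
print(f"Savings: {saved} tokens ({savings_percent(45, 28):.1f}%)")
# Savings: 17 tokens (37.8%)
```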
Format Specification
| Type | Example Input | TOON Output |
|---|---|---|
| Object | {"name": "Alice", "age": 30} | name: Alice<br>age: 30 |
| Primitive Array | [1, 2, 3] | [3]: 1,2,3 |
| Tabular Array | [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}] | [2,]{id,name}:<br>1,A<br>2,B |
| Mixed Array | [{"x": 1}, 42, "hi"] | [3]:<br>- x: 1<br>- 42<br>- hi |
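The tabular row is where most of the token savings come from: field names are written once in a header instead of once per object. The sketch below is an illustrative re-implementation, not the library's encoder. The header layout is copied from the examples in this README, the two-space row indent is an assumption based on the default indent, and values are written with `str()` without any quoting.

```python
from typing import Any, Dict, List


def encode_tabular(rows: List[Dict[str, Any]], delimiter: str = ",") -> str:
    """Encode uniform, flat dicts as a header line plus one line per row."""
    if not rows:
        raise ValueError("expected at least one row")
    fields = list(rows[0].keys())
    for row in rows:
        # Every row must have the same keys in the same order.
        if list(row.keys()) != fields:
            raise ValueError("rows are not uniform")
    header = "[%d%s]{%s}:" % (len(rows), delimiter, delimiter.join(fields))
    lines = [header]
    for row in rows:
        lines.append("  " + delimiter.join(str(row[name]) for name in fields))
    return "\n".join(lines)


print(encode_tabular([{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]))
# [2,]{id,name}:
#   1,A
#   2,B
```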
Quoting: Only when necessary (empty, keywords, numeric strings, whitespace, structural chars, delimiters)
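The quoting conditions listed above can be read as a single predicate. This is a rough sketch under assumptions: the keyword set, the numeric pattern, the structural character set, and the treatment of whitespace as leading or trailing only are illustrative guesses, and the TOON specification remains the authority on the exact rules.

```python
import re

KEYWORDS = {"true", "false", "null"}                       # assumed keyword set
NUMERIC = re.compile(r"^-?\d+(\.\d+)?([eE][+-]?\d+)?$")    # assumed numeric form
STRUCTURAL = set(':[]{}"\\')                               # assumed structural chars


def needs_quotes(text: str, delimiter: str = ",") -> bool:
    """Return True when a string could not be written bare without ambiguity."""
    if text == "":
        return True
    if text in KEYWORDS:
        return True
    if NUMERIC.match(text):
        return True
    if text != text.strip():  # leading or trailing whitespace
        return True
    if any(ch in STRUCTURAL for ch in text):
        return True
    return delimiter in text


for sample in ["Alice", "", "true", "42", " padded", "a:b", "a,b"]:
    print(repr(sample), needs_quotes(sample))
```

Note that the active delimiter matters: `"a,b"` needs quotes with the default comma, but not when the delimiter is `"|"`.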
Type Normalization: Infinity/NaN/Functions → null • Decimal → float • datetime → ISO 8601 • -0 → 0
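As an illustration of the normalization rules above, the following sketch maps each listed case to its normalized value. The function `normalize` is a hypothetical helper, not the library's implementation, and it handles scalar values only.

```python
import math
from datetime import datetime
from decimal import Decimal
from typing import Any


def normalize(value: Any) -> Any:
    """Apply the listed normalization rules to a single scalar value."""
    if callable(value):
        return None  # functions have no data representation
    if isinstance(value, Decimal):
        value = float(value)  # then fall through to the float checks
    if isinstance(value, float):
        if math.isnan(value) or math.isinf(value):
            return None
        if value == 0:
            return 0  # collapses -0.0 (and 0.0) to plain 0
        return value
    if isinstance(value, datetime):
        return value.isoformat()  # ISO 8601 string
    return value


print(normalize(float("inf")))                      # None
print(normalize(Decimal("1.5")))                    # 1.5
print(normalize(-0.0))                              # 0
print(normalize(datetime(2024, 1, 2, 3, 4, 5)))     # 2024-01-02T03:04:05
```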
Development
# Setup (requires uv: https://docs.astral.sh/uv/)
git clone https://github.com/toon-format/toon-python.git
cd toon-python
uv sync
# Run tests (792 tests, 91% coverage, 85% enforced)
uv run pytest --cov=toon_format --cov-report=term
# Code quality
uv run ruff check src/ tests/ # Lint
uv run ruff format src/ tests/ # Format
uv run mypy src/ # Type check
CI/CD: GitHub Actions • Python 3.8-3.14 • Coverage enforcement • PR coverage comments
Project Status & Roadmap
Following semantic versioning towards 1.0.0:
- v0.8.x - Initial code set, tests, documentation ✅
- v0.9.x - Serializer improvements, spec compliance testing, publishing setup (current)
- v1.0.0-rc.x - Release candidates for production readiness
- v1.0.0 - First stable release with full spec compliance
See CONTRIBUTING.md for detailed guidelines.
Documentation
- 📘 Full Documentation - Complete guides and references
- 🔧 API Reference - Detailed function documentation
- 📋 Format Specification - TOON syntax and rules
- 🤖 LLM Integration - Best practices for LLM usage
- 📜 TOON Spec - Official specification
- 🐛 Issues - Bug reports and features
- 🤝 Contributing - Contribution guidelines
License
MIT License – see LICENSE for details