Token-Oriented Object Notation: A compact format for passing structured data to LLMs with 30-60% fewer tokens than JSON
toon-py
Token-Oriented Object Notation (TOON) for Python
A compact, human-readable format for passing structured data to LLMs with 30-60% fewer tokens than JSON.
Python port of @byjohann/toon.
Why TOON?
LLM tokens cost money. TOON reduces token usage by:
- Removing redundant punctuation (braces, brackets, most quotes)
- Using indentation for structure
- Tabularizing arrays of objects
- Writing inline primitive arrays without spaces
Installation
pip install toon-py
Or with uv:
uv add toon-py
Quick Start
Python API
from toon_py import encode
data = {
    "user": {
        "id": 123,
        "name": "Ada",
        "tags": ["reading", "gaming"],
        "active": True
    }
}
print(encode(data))
Output:
user:
  id: 123
  name: Ada
  tags[2]: reading,gaming
  active: true
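To make the mapping from Python data to TOON lines concrete, here is a minimal sketch that handles only nested dicts, primitives, and lists of primitives. It is not the `toon_py` implementation: `render_primitive` and `sketch_encode` are hypothetical names, quoting is ignored, and the two-space indent is assumed from the documented CLI default.

```python
def render_primitive(value):
    """Render a primitive the way the Quick Start output shows it."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def sketch_encode(obj, indent=2, depth=0):
    """Toy encoder (hypothetical): returns a list of TOON-style lines."""
    pad = " " * (indent * depth)
    lines = []
    for key, value in obj.items():
        if isinstance(value, dict):
            # Nested object: emit the key, then its fields one level deeper.
            lines.append(f"{pad}{key}:")
            lines.extend(sketch_encode(value, indent, depth + 1))
        elif isinstance(value, list):
            # Primitive array: length in brackets, items joined without spaces.
            items = ",".join(render_primitive(v) for v in value)
            suffix = f" {items}" if value else ""
            lines.append(f"{pad}{key}[{len(value)}]:{suffix}")
        else:
            lines.append(f"{pad}{key}: {render_primitive(value)}")
    return lines


data = {"user": {"id": 123, "name": "Ada", "tags": ["reading", "gaming"], "active": True}}
print("\n".join(sketch_encode(data)))
```

For real data, use `encode` from `toon_py`; the sketch only shows where the indentation and the `[N]` length come from.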
CLI
# From file
toon data.json
# From stdin
cat data.json | toon
# From string
toon '{"tags": ["foo", "bar"]}'
# With options
toon data.json --delimiter tab --length-marker -o output.toon
Token Savings
| Example | JSON Tokens | TOON Tokens | Saved | Reduction |
|---|---|---|---|---|
| Simple user | 31 | 18 | 13 | 41.9% |
| User with tags | 48 | 28 | 20 | 41.7% |
| Product catalog | 117 | 49 | 68 | 58.1% |
| API response | 123 | 53 | 70 | 56.9% |
| Analytics data | 209 | 94 | 115 | 55.0% |
| Large dataset (50 records) | 2159 | 762 | 1397 | 64.7% |
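The Saved and Reduction columns follow directly from the two token counts: saved = JSON tokens minus TOON tokens, and reduction = saved divided by JSON tokens. A small sketch that recomputes them from the figures in the table (the `savings` helper is a hypothetical name, not part of the package):

```python
# (example, JSON tokens, TOON tokens) copied from the table above
ROWS = [
    ("Simple user", 31, 18),
    ("User with tags", 48, 28),
    ("Product catalog", 117, 49),
    ("API response", 123, 53),
    ("Analytics data", 209, 94),
    ("Large dataset (50 records)", 2159, 762),
]


def savings(json_tokens, toon_tokens):
    """Return (tokens saved, percent reduction rounded to one decimal)."""
    saved = json_tokens - toon_tokens
    return saved, round(100 * saved / json_tokens, 1)


for name, json_tokens, toon_tokens in ROWS:
    saved, reduction = savings(json_tokens, toon_tokens)
    print(f"{name}: saved {saved} tokens ({reduction}%)")
```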
Features
Objects
encode({"id": 1, "name": "Ada"})
id: 1
name: Ada
Primitive Arrays (Inline)
encode({"tags": ["admin", "ops", "dev"]})
tags[3]: admin,ops,dev
Arrays of Objects (Tabular)
encode({
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B2", "qty": 1, "price": 14.5}
    ]
})
items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
Encoding Options
from toon_py import encode, EncodeOptions
data = {"items": [{"id": 1, "name": "Widget"}]}
# Tab delimiter
options = EncodeOptions(delimiter="\t")
print(encode(data, options))
# Pipe delimiter
options = EncodeOptions(delimiter="|")
print(encode(data, options))
# Length marker
options = EncodeOptions(length_marker="#")
print(encode(data, options))
# Output: items[#1]{id,name}: ...
# Custom indent
options = EncodeOptions(indent=4)
print(encode(data, options))
CLI Options
toon [INPUT] [OPTIONS]
Arguments:
  INPUT                  JSON file, JSON string, or stdin
Options:
  -i, --indent INT       Spaces per indent level (default: 2)
  -d, --delimiter TEXT   Delimiter: comma, tab, or pipe (default: comma)
  -l, --length-marker    Add '#' prefix to array lengths
  -o, --output PATH      Output file (default: stdout)
  --help                 Show help message
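For readers who want to see how the documented flags relate to the Python options, the following `argparse` sketch mirrors the option list above. It is an illustration only: the real `toon` command may be built with a different library, and `build_parser` and `DELIMITERS` are hypothetical names.

```python
import argparse

# Delimiter names accepted on the command line, mapped to the characters
# used with EncodeOptions(delimiter=...) in the Python API examples.
DELIMITERS = {"comma": ",", "tab": "\t", "pipe": "|"}


def build_parser():
    """Build a parser that accepts the same flags as the documented CLI."""
    parser = argparse.ArgumentParser(prog="toon-sketch")
    parser.add_argument("input", nargs="?", default=None,
                        help="JSON file, JSON string, or omitted for stdin")
    parser.add_argument("-i", "--indent", type=int, default=2,
                        help="spaces per indent level")
    parser.add_argument("-d", "--delimiter", choices=sorted(DELIMITERS),
                        default="comma", help="comma, tab, or pipe")
    parser.add_argument("-l", "--length-marker", action="store_true",
                        help="add '#' prefix to array lengths")
    parser.add_argument("-o", "--output", default=None,
                        help="output file (stdout when omitted)")
    return parser


args = build_parser().parse_args(
    ["data.json", "--delimiter", "tab", "--length-marker", "-o", "output.toon"]
)
print(args.input, repr(DELIMITERS[args.delimiter]), args.length_marker, args.output)
```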
Format Rules
Quoting
Keys and values are quoted only when necessary:
# Unquoted
{"name": "hello world"} # -> name: hello world
# Quoted (contains comma)
{"note": "hello, world"} # -> note: "hello, world"
# Quoted (looks like number)
{"code": "123"} # -> code: "123"
# Quoted (key with space)
{"full name": "Ada"} # -> "full name": Ada
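The four cases above can be reproduced with a small decision function. This is a sketch of only those cases, not the library's full quoting rules: the number-matching regex and the names `quote_value`, `quote_key`, and `format_pair` are assumptions made for illustration.

```python
import json
import re

# Assumption: "looks like a number" means an optional sign, digits,
# an optional fraction, and an optional exponent.
NUMBER_LIKE = re.compile(r"-?\d+(\.\d+)?([eE][+-]?\d+)?")


def quote_value(text, delimiter=","):
    """Quote a string value if it contains the delimiter or looks numeric."""
    if delimiter in text or NUMBER_LIKE.fullmatch(text):
        return json.dumps(text, ensure_ascii=False)
    return text


def quote_key(key):
    """Quote a key if it contains a space (the only key case documented)."""
    return json.dumps(key, ensure_ascii=False) if " " in key else key


def format_pair(key, value):
    return f"{quote_key(key)}: {quote_value(value)}"


for key, value in [("name", "hello world"), ("note", "hello, world"),
                   ("code", "123"), ("full name", "Ada")]:
    print(format_pair(key, value))
```

Quoting `"123"` is what keeps the string distinguishable from the number `123` when the output is read back.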
Tabular Format
Arrays of objects use tabular format when:
- All elements are objects
- All objects have identical keys
- All values are primitives (no nested arrays/objects)
encode({
    "users": [
        {"id": 1, "name": "Alice", "active": True},
        {"id": 2, "name": "Bob", "active": False}
    ]
})
users[2]{id,name,active}:
  1,Alice,true
  2,Bob,false
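The three conditions can be expressed as a short eligibility check followed by a header-plus-rows rendering. The sketch below is an assumption-laden illustration rather than the package's code: `is_tabular` and `render_table` are hypothetical names, key order is taken from the first row, cell quoting is skipped, and rows are indented by an assumed two spaces.

```python
PRIMITIVES = (str, int, float, bool, type(None))


def is_tabular(rows):
    """Check the three documented conditions for tabular output."""
    if not rows or not all(isinstance(row, dict) for row in rows):
        return False  # empty list or non-object elements
    keys = set(rows[0])
    if any(set(row) != keys for row in rows):
        return False  # keys differ between objects
    return all(isinstance(v, PRIMITIVES) for row in rows for v in row.values())


def cell(value):
    """Render one table cell (no quoting in this sketch)."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def render_table(name, rows, indent=2):
    """Render eligible rows as a header line plus one line per object."""
    fields = list(rows[0])
    header = f"{name}[{len(rows)}]" + "{" + ",".join(fields) + "}:"
    body = [" " * indent + ",".join(cell(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])


users = [
    {"id": 1, "name": "Alice", "active": True},
    {"id": 2, "name": "Bob", "active": False},
]
if is_tabular(users):
    print(render_table("users", users))
```

Because the field names appear once in the header instead of once per object, the savings grow with the number of rows.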
Empty Containers
encode({})              # -> (empty output)
encode({"items": []})   # -> items[0]:
encode({"config": {}})  # -> config:
Type Conversions
| Python Type | TOON Output |
|---|---|
| None | null |
| True/False | true/false |
| 123 | 123 |
| -0.0 | 0 |
| float('nan') | null |
| float('inf') | null |
| datetime(...) | "2025-01-01T00:00:00Z" |
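A sketch of these conversions as a single function may help when checking what a given value will turn into. It covers only the rows in the table. The name `to_toon_primitive` is hypothetical, and treating naive datetimes as UTC is an assumption, not documented behavior.

```python
import math
from datetime import datetime, timezone


def to_toon_primitive(value):
    """Map a Python value to the text shown in the table (sketch only)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must be checked before int
        return "true" if value else "false"
    if isinstance(value, float):
        if math.isnan(value) or math.isinf(value):
            return "null"  # NaN and infinities have no JSON equivalent
        if value == 0:
            return "0"  # also normalizes -0.0
        return repr(value)
    if isinstance(value, int):
        return str(value)
    if isinstance(value, datetime):
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        stamp = value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        return f'"{stamp}"'
    if isinstance(value, str):
        return value
    raise TypeError(f"unsupported type: {type(value).__name__}")


for sample in [None, True, 123, -0.0, float("nan"), float("inf"),
               datetime(2025, 1, 1, tzinfo=timezone.utc)]:
    print(repr(sample), "->", to_toon_primitive(sample))
```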
Use in LLM Prompts
Wrap TOON data in code blocks:
Here's the data in TOON format:
```
user:
  id: 123
  tags[2]: reading,gaming
  active: true
```
Please analyze this data...
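Assembling such a prompt in code is a matter of joining the intro line, the fenced TOON text, and the instruction. A minimal sketch, where `build_prompt` is a hypothetical helper and the TOON text is passed in as a plain string (in practice it would come from `encode`):

```python
# The fence is built programmatically so this example can itself sit
# inside a fenced code block.
FENCE = "`" * 3


def build_prompt(toon_text, instruction):
    """Wrap TOON text in a code fence and append the instruction."""
    return "\n".join([
        "Here's the data in TOON format:",
        FENCE,
        toon_text.strip("\n"),
        FENCE,
        instruction,
    ])


toon_text = "user:\n  id: 123\n  tags[2]: reading,gaming\n  active: true"
print(build_prompt(toon_text, "Please analyze this data..."))
```

Keeping the payload inside a fence preserves its indentation, which carries the structure.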
Development
# Clone and setup
git clone https://github.com/shammianand/toon-py.git
cd toon-py
uv sync --all-extras
# Run tests
uv run pytest
# Format code
uv run black src/
uv run ruff check src/
License
MIT License - see LICENSE
Credits
Python port of @byjohann/toon by Johann Schopplich