Token-Oriented Object Notation: A compact format for passing structured data to LLMs with 30-60% fewer tokens than JSON

toon-py

Token-Oriented Object Notation (TOON) for Python

A compact, human-readable format for passing structured data to LLMs with 30-60% fewer tokens than JSON.

Python port of @byjohann/toon.

Why TOON?

LLM tokens cost money. TOON reduces token usage by:

  • Removing redundant punctuation (braces, brackets, most quotes)
  • Using indentation for structure
  • Tabularizing arrays of objects
  • Writing inline primitive arrays without spaces

Installation

pip install toon-py

Or with uv:

uv add toon-py

Quick Start

Python API

from toon_py import encode

data = {
    "user": {
        "id": 123,
        "name": "Ada",
        "tags": ["reading", "gaming"],
        "active": True
    }
}

print(encode(data))

Output:

user:
  id: 123
  name: Ada
  tags[2]: reading,gaming
  active: true

CLI

# From file
toon data.json

# From stdin
cat data.json | toon

# From string
toon '{"tags": ["foo", "bar"]}'

# With options
toon data.json --delimiter tab --length-marker -o output.toon

Token Savings

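| Example                    | JSON Tokens | TOON Tokens | Saved | Reduction |
|----------------------------|-------------|-------------|-------|-----------|
| Simple user                | 31          | 18          | 13    | 41.9%     |
| User with tags             | 48          | 28          | 20    | 41.7%     |
| Product catalog            | 117         | 49          | 68    | 58.1%     |
| API response               | 123         | 53          | 70    | 56.9%     |
| Analytics data             | 209         | 94          | 115   | 55.0%     |
| Large dataset (50 records) | 2159        | 762         | 1397  | 64.7%     |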
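
The Saved and Reduction figures follow directly from the two token counts. The sketch below reproduces that arithmetic; `token_reduction` is a hypothetical helper written for illustration, not part of toon-py, and it takes the token counts as given rather than running a tokenizer.

```python
def token_reduction(json_tokens: int, toon_tokens: int) -> tuple[int, float]:
    """Return (tokens saved, percentage reduction rounded to one decimal)."""
    saved = json_tokens - toon_tokens
    return saved, round(100 * saved / json_tokens, 1)


# Counts taken from the "Simple user" and "Large dataset (50 records)" rows.
print(token_reduction(31, 18))
print(token_reduction(2159, 762))
```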

Features

Objects

encode({"id": 1, "name": "Ada"})

Output:

id: 1
name: Ada

Primitive Arrays (Inline)

encode({"tags": ["admin", "ops", "dev"]})

Output:

tags[3]: admin,ops,dev
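
To make the inline layout concrete, here is a minimal sketch of how such a line can be assembled: the key, the element count in brackets, then the values joined by commas with no spaces. `inline_array` is a hypothetical helper, not the library's implementation, and it skips the quoting rules described under Format Rules.

```python
def inline_array(key: str, values: list) -> str:
    """Render a list of primitives as one TOON-style line (illustrative only)."""

    def fmt(value) -> str:
        # Map Python literals to the spellings shown in this README.
        if value is None:
            return "null"
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)

    if not values:
        # Matches the empty-array form shown under "Empty Containers".
        return f"{key}[0]:"
    return f"{key}[{len(values)}]: {','.join(fmt(v) for v in values)}"


print(inline_array("tags", ["admin", "ops", "dev"]))
```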

Arrays of Objects (Tabular)

encode({
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B2", "qty": 1, "price": 14.5}
    ]
})

Output:

items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
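
The tabular layout states the field names once in the header and then emits one row of values per object. The sketch below builds that shape by hand for the comma delimiter only; `tabular` is a hypothetical function, it assumes every row has the same keys in the same order, and it does no quoting, so it is an illustration of the layout rather than the library's encoder.

```python
def tabular(key: str, rows: list[dict], indent: int = 2) -> str:
    """Render uniform dicts as a TOON-style table (illustrative only)."""

    def fmt(value) -> str:
        if value is None:
            return "null"
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)

    fields = list(rows[0])              # column order taken from the first row
    columns = ",".join(fields)
    lines = [f"{key}[{len(rows)}]{{{columns}}}:"]
    pad = " " * indent
    for row in rows:
        lines.append(pad + ",".join(fmt(row[name]) for name in fields))
    return "\n".join(lines)


print(tabular("items", [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5},
]))
```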

Encoding Options

from toon_py import encode, EncodeOptions

data = {"items": [{"id": 1, "name": "Widget"}]}

# Tab delimiter
options = EncodeOptions(delimiter="\t")
print(encode(data, options))

# Pipe delimiter
options = EncodeOptions(delimiter="|")
print(encode(data, options))

# Length marker
options = EncodeOptions(length_marker="#")
print(encode(data, options))
# Output: items[#1]{id,name}: ...

# Custom indent
options = EncodeOptions(indent=4)
print(encode(data, options))

CLI Options

toon [INPUT] [OPTIONS]

Arguments:
  INPUT                 JSON file, JSON string, or stdin

Options:
  -i, --indent INT      Spaces per indent level (default: 2)
  -d, --delimiter TEXT  Delimiter: comma, tab, or pipe (default: comma)
  -l, --length-marker   Add '#' prefix to array lengths
  -o, --output PATH     Output file (default: stdout)
  --help                Show help message

Format Rules

Quoting

Keys and values are quoted only when necessary:

# Unquoted
{"name": "hello world"}  # -> name: hello world

# Quoted (contains comma)
{"note": "hello, world"}  # -> note: "hello, world"

# Quoted (looks like number)
{"code": "123"}  # -> code: "123"

# Quoted (key with space)
{"full name": "Ada"}  # -> "full name": Ada
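
As a rough illustration of the "only when necessary" idea, the sketch below approximates the four cases above. Both functions are hypothetical and are not taken from toon-py. The checks for empty strings, surrounding whitespace, colons, and the literals `true`/`false`/`null`, as well as the identifier-style pattern for keys, are assumptions that go beyond what this README states.

```python
import re

_NUMBER = re.compile(r"-?\d+(\.\d+)?([eE][+-]?\d+)?")
_PLAIN_KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")  # assumed pattern for bare keys


def value_needs_quotes(text: str, delimiter: str = ",") -> bool:
    """Guess whether a string value must be quoted to stay unambiguous."""
    if text == "" or text != text.strip():
        return True                      # assumption: empty or padded strings
    if text in ("true", "false", "null"):
        return True                      # assumption: would read as a literal
    if _NUMBER.fullmatch(text):
        return True                      # README case: "123" looks like a number
    # README case: the delimiter (comma); the other characters are assumptions.
    return any(ch in text for ch in (delimiter, ":", '"', "\n"))


def key_needs_quotes(key: str) -> bool:
    """Guess whether a key must be quoted (README case: key with a space)."""
    return _PLAIN_KEY.fullmatch(key) is None


for sample in ("hello world", "hello, world", "123"):
    print(repr(sample), value_needs_quotes(sample))
print(repr("full name"), key_needs_quotes("full name"))
```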

Tabular Format

Arrays of objects use tabular format when:

  • All elements are objects
  • All objects have identical keys
  • All values are primitives (no nested arrays/objects)

encode({
    "users": [
        {"id": 1, "name": "Alice", "active": True},
        {"id": 2, "name": "Bob", "active": False}
    ]
})

Output:

users[2]{id,name,active}:
  1,Alice,true
  2,Bob,false
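
The three conditions can be expressed as a small predicate. This is a sketch, not the check toon-py actually runs: `is_tabular` is a hypothetical name, comparing keys as sets (ignoring order) is an assumption, and so is returning `False` for an empty list.

```python
def is_tabular(items: list) -> bool:
    """Check the three tabular conditions listed above (illustrative only)."""
    # Condition 1: all elements are objects (dicts). Empty lists are excluded
    # here by assumption.
    if not items or not all(isinstance(item, dict) for item in items):
        return False
    expected_keys = set(items[0])
    for item in items:
        # Condition 2: identical keys (order ignored in this sketch).
        if set(item) != expected_keys:
            return False
        # Condition 3: only primitive values, no nested arrays or objects.
        if any(isinstance(v, (dict, list, tuple)) for v in item.values()):
            return False
    return True


print(is_tabular([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]))
print(is_tabular([{"id": 1}, {"id": 2, "name": "Bob"}]))
print(is_tabular([{"id": 1, "tags": ["a"]}]))
```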

Empty Containers

encode({})              # -> (empty output)
encode({"items": []})   # -> items[0]:
encode({"config": {}})  # -> config:

Type Conversions

| Python Type   | TOON Output            |
|---------------|------------------------|
| None          | null                   |
| True/False    | true/false             |
| 123           | 123                    |
| -0.0          | 0                      |
| float('nan')  | null                   |
| float('inf')  | null                   |
| datetime(...) | "2025-01-01T00:00:00Z" |
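
The mapping above can be mirrored in a few lines of plain Python. The sketch below is an approximation written for this page: `to_toon_scalar` is a hypothetical name, it assumes datetimes are timezone-aware, and how other floats (for example `1.0`) or naive datetimes are rendered by toon-py is not stated here, so those branches are guesses.

```python
import math
from datetime import datetime, timezone


def to_toon_scalar(value) -> str:
    """Mirror the conversion table above for single values (illustrative only)."""
    if value is None:
        return "null"
    if isinstance(value, bool):          # must come before the int/float checks
        return "true" if value else "false"
    if isinstance(value, float):
        if math.isnan(value) or math.isinf(value):
            return "null"
        if value == 0:
            return "0"                   # covers -0.0; 0.0 treated the same by assumption
        return repr(value)               # assumption: other floats keep Python's repr
    if isinstance(value, datetime):
        # Assumes a timezone-aware datetime; normalise to UTC with a "Z" suffix.
        iso = value.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
        return f'"{iso}"'
    return str(value)


print(to_toon_scalar(-0.0))
print(to_toon_scalar(float("nan")))
print(to_toon_scalar(datetime(2025, 1, 1, tzinfo=timezone.utc)))
```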

Use in LLM Prompts

Wrap TOON data in code blocks:

Here's the data in TOON format:

```
user:
  id: 123
  tags[2]: reading,gaming
  active: true
```

Please analyze this data...
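
A prompt like the one above can be assembled with a small helper. This is a minimal sketch: `build_prompt` is a hypothetical function, and it takes already-encoded TOON text as a string so that it runs without toon-py installed. In practice the text would come from `encode(data)`.

```python
FENCE = "`" * 3  # a code fence, built this way to keep this example easy to embed


def build_prompt(toon_text: str, instruction: str) -> str:
    """Wrap TOON text in a code block and append the instruction."""
    body = toon_text.strip("\n")         # keep indentation, drop stray blank lines
    return (
        "Here's the data in TOON format:\n\n"
        f"{FENCE}\n{body}\n{FENCE}\n\n"
        f"{instruction}"
    )


toon_text = "user:\n  id: 123\n  tags[2]: reading,gaming\n  active: true"
print(build_prompt(toon_text, "Please analyze this data..."))
```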

Development

# Clone and setup
git clone https://github.com/shammianand/toon-py.git
cd toon-py
uv sync --all-extras

# Run tests
uv run pytest

# Format code
uv run black src/
uv run ruff check src/

License

MIT License - see LICENSE

Credits

Python port of @byjohann/toon by Johann Schopplich
