
Project description

TOON (Token-Oriented Object Notation)

A compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage.


Overview

TOON achieves CSV-like compactness while adding explicit structure, making it ideal for:

  • Reducing token costs in LLM API calls
  • Improving context window efficiency
  • Maintaining human readability
  • Preserving data structure and types

Key Features

  • Compact: 30-60% smaller than JSON for structured data
  • Readable: Clean, indentation-based syntax
  • Structured: Preserves nested objects and arrays
  • Type-safe: Supports strings, numbers, booleans, null
  • Flexible: Multiple delimiter options (comma, tab, pipe)
  • Smart: Automatic tabular format for uniform arrays
  • Efficient: Key folding for deeply nested objects

Installation

pip install toonify

For development:

pip install "toonify[dev]"

Quick Start

Python API

from toon import encode, decode

# Encode Python dict to TOON
data = {
    'users': [
        {'id': 1, 'name': 'Alice', 'role': 'admin'},
        {'id': 2, 'name': 'Bob', 'role': 'user'}
    ]
}

toon_string = encode(data)
print(toon_string)
# Output:
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user

# Decode TOON back to Python
result = decode(toon_string)
assert result == data

Command Line

# Encode JSON to TOON
toon input.json -o output.toon

# Decode TOON to JSON
toon input.toon -o output.json

# Use with pipes
cat data.json | toon -e > data.toon

# Show token statistics
toon data.json --stats

TOON Format Specification

Basic Syntax

# Simple key-value pairs
name: Alice
age: 30
active: true

Arrays

Primitive arrays (inline):

numbers: [1,2,3,4,5]
tags: [python,serialization,llm]

Tabular arrays (uniform objects with header):

users[3]{id,name,email}:
  1,Alice,alice@example.com
  2,Bob,bob@example.com
  3,Charlie,charlie@example.com

List arrays (non-uniform or nested):

items[2]:
  value1
  value2
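
The tabular rule above (uniform objects collapse into a header plus delimited rows) can be sketched in a few lines of plain Python. `toon_table` is a hypothetical helper for illustration, not part of the toonify API:

```python
def toon_table(key, rows):
    """Render a uniform list of dicts in TOON's tabular-array form."""
    fields = list(rows[0].keys())
    # The tabular form only applies when every row has the same keys.
    assert all(list(r.keys()) == fields for r in rows)
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(r[f]) for f in fields) for r in rows]
    return "\n".join([header] + body)

print(toon_table("users", [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]))
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user
```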

Nested Objects

user:
  name: Alice
  profile:
    age: 30
    city: NYC

Quoting Rules

Strings are quoted only when necessary:

  • Contains special characters (comma, colon, double quote, or newlines)
  • Has leading/trailing whitespace
  • Looks like a literal (true, false, null)
  • Is empty
simple: Alice
quoted: "Hello, World"
escaped: "He said \"hello\""
multiline: "Line 1\nLine 2"
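
The four rules above can be expressed as a small predicate. `needs_quotes` is a hypothetical helper that mirrors the listed rules, not the toonify implementation:

```python
def needs_quotes(s: str) -> bool:
    """Return True when a string must be quoted under the rules above."""
    if s == "":
        return True                       # empty string
    if s != s.strip():
        return True                       # leading/trailing whitespace
    if any(c in s for c in ',:"\n'):
        return True                       # special characters
    if s in ("true", "false", "null"):
        return True                       # looks like a literal
    return False

print(needs_quotes("Alice"))         # False -> emitted bare
print(needs_quotes("Hello, World"))  # True  -> must be quoted
```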

API Reference

encode(data, options=None)

Convert Python object to TOON string.

Parameters:

  • data: Python dict or list
  • options: Optional dict with:
    • delimiter: 'comma' (default), 'tab', or 'pipe'
    • indent: Number of spaces per level (default: 2)
    • key_folding: 'off' (default) or 'safe'
    • flatten_depth: Max depth for key folding (default: None)

Example:

toon = encode(data, {
    'delimiter': 'tab',
    'indent': 4,
    'key_folding': 'safe'
})

decode(toon_string, options=None)

Convert TOON string to Python object.

Parameters:

  • toon_string: TOON formatted string
  • options: Optional dict with:
    • strict: Validate structure strictly (default: True)
    • expand_paths: 'off' (default) or 'safe'
    • default_delimiter: Default delimiter (default: ',')

Example:

data = decode(toon_string, {
    'expand_paths': 'safe',
    'strict': False
})

CLI Usage

usage: toon [-h] [-o OUTPUT] [-e] [-d] [--delimiter {comma,tab,pipe}]
            [--indent INDENT] [--stats] [--no-strict]
            [--key-folding {off,safe}] [--flatten-depth DEPTH]
            [--expand-paths {off,safe}]
            [input]

TOON (Token-Oriented Object Notation) - Convert between JSON and TOON formats

positional arguments:
  input                 Input file path (or "-" for stdin)

optional arguments:
  -h, --help            show this help message and exit
  -o, --output OUTPUT   Output file path (default: stdout)
  -e, --encode          Force encode mode (JSON to TOON)
  -d, --decode          Force decode mode (TOON to JSON)
  --delimiter {comma,tab,pipe}
                        Array delimiter (default: comma)
  --indent INDENT       Indentation size (default: 2)
  --stats               Show token statistics
  --no-strict           Disable strict validation (decode only)
  --key-folding {off,safe}
                        Key folding mode (encode only)
  --flatten-depth DEPTH Maximum key folding depth (encode only)
  --expand-paths {off,safe}
                        Path expansion mode (decode only)

Advanced Features

Key Folding

Collapse single-key chains into dotted paths:

data = {
    'response': {
        'data': {
            'user': {
                'name': 'Alice'
            }
        }
    }
}

# With key_folding='safe'
toon = encode(data, {'key_folding': 'safe'})
# Output: response.data.user.name: Alice
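
The folding of single-key chains can be sketched in plain Python; `fold_keys` is a hypothetical illustration of the idea, not the toonify implementation:

```python
def fold_keys(obj, prefix=""):
    """Collapse chains of single-key dicts into dotted paths."""
    out = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and len(value) == 1:
            out.update(fold_keys(value, path))  # keep folding the chain
        else:
            out[path] = value                   # chain ends here
    return out

data = {"response": {"data": {"user": {"name": "Alice"}}}}
print(fold_keys(data))  # {'response.data.user.name': 'Alice'}
```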

Path Expansion

Expand dotted keys into nested objects:

toon = 'user.profile.age: 30'

# With expand_paths='safe'
data = decode(toon, {'expand_paths': 'safe'})
# Result: {'user': {'profile': {'age': 30}}}
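
The inverse operation is straightforward to sketch as well; `expand_paths` below is a hypothetical helper showing the mechanics, not the toonify implementation:

```python
def expand_paths(flat):
    """Expand dotted keys into nested dicts."""
    out = {}
    for dotted, value in flat.items():
        node = out
        *parents, leaf = dotted.split(".")
        for part in parents:
            node = node.setdefault(part, {})  # create levels on demand
        node[leaf] = value
    return out

print(expand_paths({"user.profile.age": 30}))
# {'user': {'profile': {'age': 30}}}
```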

Custom Delimiters

Choose the delimiter that best fits your data:

# Tab delimiter (better for spreadsheet-like data)
toon = encode(data, {'delimiter': 'tab'})

# Pipe delimiter (when data contains commas)
toon = encode(data, {'delimiter': 'pipe'})
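
To see why the delimiter choice matters: with the default comma delimiter, values that contain commas must be quoted, while a pipe delimiter leaves them bare. The row-joining below is a sketch of the format's behavior, not the toonify encoder:

```python
row = ["1", "Acme, Inc.", "admin"]

# Comma delimiter: quote any value that contains the delimiter.
comma_row = ",".join(f'"{v}"' if "," in v else v for v in row)
# Pipe delimiter: no quoting needed for these values.
pipe_row = "|".join(row)

print(comma_row)  # 1,"Acme, Inc.",admin
print(pipe_row)   # 1|Acme, Inc.|admin
```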

Format Comparison

JSON vs TOON

JSON (225 bytes):

{
  "users": [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Charlie", "role": "guest"}
  ]
}

TOON (90 bytes, 60% reduction):

users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Charlie,guest
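
The comparison can be reproduced with the standard library; exact byte counts will vary slightly from the figures quoted above depending on `json.dumps` whitespace settings:

```python
import json

data = {"users": [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Charlie", "role": "guest"},
]}

json_text = json.dumps(data, indent=2)
toon_text = (
    "users[3]{id,name,role}:\n"
    "  1,Alice,admin\n"
    "  2,Bob,user\n"
    "  3,Charlie,guest"
)

saving = 1 - len(toon_text) / len(json_text)
print(f"JSON: {len(json_text)} bytes, TOON: {len(toon_text)} bytes, "
      f"saving: {saving:.0%}")
```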

When to Use TOON

Use TOON when:

  • ✅ Passing data to LLM APIs (reduce token costs)
  • ✅ Working with uniform tabular data
  • ✅ Context window is limited
  • ✅ Human readability matters

Use JSON when:

  • ❌ Maximum compatibility is required
  • ❌ Data is highly irregular/nested
  • ❌ Working with existing JSON-only tools

Development

Setup

git clone https://github.com/ScrapeGraphAI/toonify.git
cd toonify
pip install -e ".[dev]"

Running Tests

pytest
pytest --cov=toon --cov-report=term-missing

Running Examples

python examples/basic_usage.py
python examples/advanced_features.py

Performance

TOON typically achieves:

  • 30-60% size reduction vs JSON for structured data
  • 40-70% token reduction with tabular data
  • Minimal overhead in encoding/decoding (<1ms for typical payloads)
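
A quick way to sanity-check the token claims without a model-specific tokenizer is the common rule of thumb of roughly four characters per token. This is only an approximation; real counts depend on the LLM's tokenizer:

```python
def approx_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token (heuristic only)."""
    return max(1, len(text) // 4)

json_text = '{"users":[{"id":1,"name":"Alice","role":"admin"}]}'
toon_text = "users[1]{id,name,role}:\n  1,Alice,admin"

print(approx_tokens(json_text), approx_tokens(toon_text))
```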

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes with tests
  4. Run tests (pytest)
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

License

MIT License - see LICENSE file for details.

Credits

Python implementation inspired by the TypeScript TOON library at toon-format/toon.

Made with love by the ScrapeGraph team

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

toonify-0.0.2.tar.gz (58.9 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

toonify-0.0.2-py3-none-any.whl (16.2 kB)

Uploaded Python 3

File details

Details for the file toonify-0.0.2.tar.gz.

File metadata

  • Download URL: toonify-0.0.2.tar.gz
  • Upload date:
  • Size: 58.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.4.20

File hashes

Hashes for toonify-0.0.2.tar.gz:

  • SHA256: 8cda398d5ae6d44696acecc9e6c20dab2f891be7b2f85f0800620b6598ba3fd0
  • MD5: aeea0803c416c6da0956c82bb76f6550
  • BLAKE2b-256: 57984420909dd0a1509edc996094af0d0dd153fe478171c935e47620266771c9


File details

Details for the file toonify-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: toonify-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 16.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.4.20

File hashes

Hashes for toonify-0.0.2-py3-none-any.whl:

  • SHA256: db10301758f6d40b7338c07100dc39d0a287fa6ed8040903cf53a81f2232dfb2
  • MD5: 2e9904476d3e5acb6d2487c0a7591bec
  • BLAKE2b-256: f7bd6a81d281e3dd35a8df5eb711128b2735b79865d20964b82de322b3f42d9c

