TOON (Token-Oriented Object Notation) encoder/decoder for Python - Bidirectional JSON-to-TOON converter optimized for LLMs

Maintainers

xaviviro

Project description

python-toon encoder/decoder

Token-Oriented Object Notation for Python

A compact data format optimized for transmitting structured information to Large Language Models (LLMs) with 30-60% fewer tokens than JSON.

Installation

pip install python-toon

What is TOON?

TOON (Token-Oriented Object Notation) combines YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, optimized specifically for token efficiency in LLM contexts.

This is a faithful Python implementation maintaining 100% output compatibility with the official TOON specification.

Key Features

30-60% token reduction compared to standard JSON
Minimal syntax: Eliminates redundant punctuation (braces, brackets, most quotes)
Tabular arrays: CSV-like row format for uniform object collections
Explicit metadata: Array length indicators [N] for validation
LLM-friendly: Maintains semantic clarity while reducing token count
100% compatible with the original TypeScript implementation

Quick Start

from toon import encode

# Simple object
data = {"name": "Alice", "age": 30}
print(encode(data))
# Output:
# name: Alice
# age: 30

# Tabular array (uniform objects)
users = [
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Charlie", "age": 35},
]
print(encode(users))
# Output:
# [3,]{id,name,age}:
#   1,Alice,30
#   2,Bob,25
#   3,Charlie,35

# Complex nested structure
data = {
    "metadata": {"version": 1, "author": "test"},
    "items": [
        {"id": 1, "name": "Item1"},
        {"id": 2, "name": "Item2"},
    ],
    "tags": ["alpha", "beta", "gamma"],
}
print(encode(data))
# Output:
# metadata:
#   version: 1
#   author: test
# items[2,]{id,name}:
#   1,Item1
#   2,Item2
# tags[3]: alpha,beta,gamma

CLI Usage

Command-line tool for converting between JSON and TOON formats.

# Encode JSON to TOON (auto-detected by .json extension)
toon input.json -o output.toon

# Decode TOON to JSON (auto-detected by .toon extension)
toon data.toon -o output.json

# Use stdin/stdout
echo '{"name": "Ada"}' | toon -
# Output: name: Ada

# Force encode mode
toon data.json --encode

# Force decode mode
toon data.toon --decode

# Custom delimiter
toon data.json --delimiter "\t" -o output.toon

# With length markers
toon data.json --length-marker -o output.toon

# Lenient decoding (disable strict validation)
toon data.toon --no-strict -o output.json

CLI Options

Option	Description
`-o, --output <file>`	Output file path (prints to stdout if omitted)
`-e, --encode`	Force encode mode (overrides auto-detection)
`-d, --decode`	Force decode mode (overrides auto-detection)
`--delimiter <char>`	Array delimiter: `,` (comma), `\t` (tab), `|` (pipe)
`--indent <number>`	Indentation size (default: 2)
`--length-marker`	Add `#` prefix to array lengths (e.g., `items[#3]`)
`--no-strict`	Disable strict validation when decoding

API Reference

`encode(value, options=None)`

Converts a Python value to TOON format.

Parameters:

value (Any): JSON-serializable value to encode
options (dict, optional): Encoding options

Returns: str - TOON-formatted string

Example:

from toon import encode

data = {"id": 123, "name": "Ada"}
toon_str = encode(data)
print(toon_str)
# Output:
# id: 123
# name: Ada

`decode(input_str, options=None)`

Converts a TOON-formatted string back to Python values.

Parameters:

input_str (str): TOON-formatted string to parse
options (DecodeOptions, optional): Decoding options

Returns: Python value (dict, list, or primitive)

Example:

from toon import decode

toon_str = """items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5"""

data = decode(toon_str)
print(data)
# Output: {'items': [{'sku': 'A1', 'qty': 2, 'price': 9.99}, {'sku': 'B2', 'qty': 1, 'price': 14.5}]}

Encoding Options

from toon import encode

encode(data, {
    "indent": 2,           # Spaces per indentation level (default: 2)
    "delimiter": ",",      # Delimiter for arrays: "," | "\t" | "|" (default: ",")
    "lengthMarker": "#"    # Optional marker prefix: "#" | False (default: False)
})

Decoding Options

from toon import decode, DecodeOptions

options = DecodeOptions(
    indent=2,    # Expected number of spaces per indentation level (default: 2)
    strict=True  # Enable strict validation (default: True)
)

data = decode(toon_str, options)

Strict Mode:

By default, the decoder validates input strictly:

Invalid escape sequences: Raises an error on "\x" and on unterminated strings
Syntax errors: Raises an error on missing colons and malformed headers
Array length mismatches: Raises an error when the declared length doesn't match the actual count
Delimiter mismatches: Raises an error when row delimiters don't match the header

Set strict=False to allow lenient parsing.
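
The array length check can be pictured with a small standalone sketch. It does not use python-toon's internals: the `HEADER` pattern and the `check_declared_length` helper are hypothetical names introduced here, and the sketch only handles a single header followed by its rows.

```python
import re

# Hypothetical header pattern (an assumption, not the library's parser):
# optional key, [N] with optional "#" marker and optional delimiter,
# optional {fields}, then a closing colon.
HEADER = re.compile(
    r"^(?P<key>[^\[\]]*)\[#?(?P<n>\d+)(?P<delim>[,|\t]?)\](?:\{(?P<fields>[^}]*)\})?:$"
)


def check_declared_length(block: str) -> int:
    """Return the row count if it matches the declared length, else raise ValueError."""
    lines = block.splitlines()
    match = HEADER.match(lines[0])
    if match is None:
        raise ValueError(f"malformed header: {lines[0]!r}")
    declared = int(match.group("n"))
    # Rows are the non-empty lines that follow the header.
    rows = [line for line in lines[1:] if line.strip()]
    if len(rows) != declared:
        raise ValueError(f"declared {declared} rows, found {len(rows)}")
    return declared


print(check_declared_length("items[2]{sku,qty,price}:\n  A1,2,9.99\n  B2,1,14.5"))  # 2
```

A lenient parser in the spirit of strict=False could catch the ValueError and keep the rows it found instead of failing.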

Delimiter Options

You can use string literals directly:

data = [1, 2, 3, 4, 5]

# Comma (default)
print(encode(data))
# [5]: 1,2,3,4,5

# Tab
print(encode(data, {"delimiter": "\t"}))
# [5	]: 1	2	3	4	5

# Pipe
print(encode(data, {"delimiter": "|"}))
# [5|]: 1|2|3|4|5

Or use the string keys:

encode(data, {"delimiter": "comma"})   # Default
encode(data, {"delimiter": "tab"})     # Tab-separated
encode(data, {"delimiter": "pipe"})    # Pipe-separated

Length Markers

Add the # prefix to array length indicators:

users = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]

# Without marker (default)
print(encode(users))
# [2,]{id,name}:
#   1,Alice
#   2,Bob

# With marker
print(encode(users, {"lengthMarker": "#"}))
# [#2,]{id,name}:
#   1,Alice
#   2,Bob

Format Rules

Objects

Key-value pairs with primitives or nested structures:

{"name": "Alice", "age": 30}
# =>
# name: Alice
# age: 30

Primitive Arrays

Arrays always include length [N]:

[1, 2, 3, 4, 5]
# => [5]: 1,2,3,4,5

["alpha", "beta", "gamma"]
# => [3]: alpha,beta,gamma

Tabular Arrays

Uniform objects with identical primitive-only fields use CSV-like format:

[
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]
# =>
# [2,]{id,name}:
#   1,Alice
#   2,Bob

Note: The delimiter appears in the length bracket [2,] for tabular arrays.
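
As a rough illustration of the uniformity rule above, the following sketch decides whether a list qualifies for the tabular form and builds the comma-delimited header. `is_tabular` and `tabular_header` are hypothetical helpers, not part of the python-toon API, and treating key order as irrelevant is an assumption.

```python
from typing import Any

# Value types treated as primitives in this sketch.
PRIMITIVES = (str, int, float, bool, type(None))


def is_tabular(rows: list[Any]) -> bool:
    """True when every item is a dict with the same keys and only primitive values."""
    if not rows or not all(isinstance(row, dict) for row in rows):
        return False
    fields = set(rows[0].keys())
    if not fields:
        return False
    return all(
        set(row.keys()) == fields  # assumption: key order does not matter
        and all(isinstance(value, PRIMITIVES) for value in row.values())
        for row in rows
    )


def tabular_header(rows: list[dict]) -> str:
    """Build the comma-delimited header line from the first row's fields."""
    fields = ",".join(rows[0].keys())
    return f"[{len(rows)},]{{{fields}}}:"


users = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
print(is_tabular(users))                           # True
print(tabular_header(users))                       # [2,]{id,name}:
print(is_tabular([{"name": "Alice"}, 42, "hello"]))  # False
```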

Mixed Arrays

Non-uniform data uses the list format with - markers:

[{"name": "Alice"}, 42, "hello"]
# =>
# [3]:
#   - name: Alice
#   - 42
#   - hello

Array Length Format

The length bracket format depends on the array type:

Tabular arrays (with fields):

Delimiter always shown: [2,]{fields}: or [2|]{fields}: or [2\t]{fields}:

Primitive arrays (no fields):

Comma: [3]: (delimiter hidden)
Other: [3|]: or [3\t]: (delimiter shown)
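
The bracket rules above fit in a few lines. This is a minimal sketch of the rule as stated in this section; `length_bracket` is a hypothetical helper name, not a function exported by the package.

```python
def length_bracket(count: int, delimiter: str = ",", tabular: bool = False, marker: str = "") -> str:
    """Format the [N] part of an array header.

    Tabular arrays always show the delimiter; primitive arrays hide it
    when it is the default comma.
    """
    shown = delimiter if (tabular or delimiter != ",") else ""
    return f"[{marker}{count}{shown}]"


print(length_bracket(3))                            # [3]
print(length_bracket(2, tabular=True))              # [2,]
print(length_bracket(5, "|"))                       # [5|]
print(length_bracket(2, tabular=True, marker="#"))  # [#2,]
```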

Quoting Rules

Strings are quoted only when necessary (following the TOON specification):

Empty strings
Keywords: null, true, false
Numeric strings: 42, -3.14
Leading or trailing whitespace
Contains structural characters: :, [, ], {, }, -, "
Contains current delimiter (,, |, or tab)
Contains control characters (newline, carriage return, tab) or a backslash

"hello"          # => hello (no quotes)
"hello world"    # => hello world (internal spaces OK)
" hello"         # => " hello" (leading space requires quotes)
"null"           # => "null" (keyword)
"42"             # => "42" (looks like number)
""               # => "" (empty)

Type Conversions

Non-JSON types are normalized automatically:

Numbers: Decimal form (no scientific notation)
Dates/DateTime: ISO 8601 strings (quoted)
Decimal: Converted to float
Infinity/NaN: Converted to null
Functions/Callables: Converted to null
-0: Normalized to 0
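
The conversions listed above can be sketched with the standard library alone. `normalize` is a hypothetical helper written for illustration; the package's actual normalization code may differ in order and in edge cases.

```python
import math
from datetime import date, datetime
from decimal import Decimal
from typing import Any


def normalize(value: Any) -> Any:
    """Map non-JSON Python values onto JSON-compatible ones, recursively."""
    if isinstance(value, (date, datetime)):
        return value.isoformat()  # ISO 8601 string
    if isinstance(value, Decimal):
        value = float(value)  # falls through to the float checks below
    if isinstance(value, float):
        if math.isnan(value) or math.isinf(value):
            return None
        if value == 0:
            return 0.0  # folds -0.0 into positive zero
        return value
    if callable(value):
        return None
    if isinstance(value, dict):
        return {key: normalize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(item) for item in value]
    return value


print(normalize({"when": date(2025, 11, 4), "price": Decimal("1.5"), "ratio": float("inf")}))
# {'when': '2025-11-04', 'price': 1.5, 'ratio': None}
```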

LLM Integration Best Practices

When using TOON with LLMs:

Wrap in code blocks for clarity:

````
```toon
name: Alice
age: 30
```
````

Instruct the model about the format: "Respond using TOON format (Token-Oriented Object Notation). Use key: value syntax, indentation for nesting, and tabular format [N,]{fields}: for uniform arrays."

Leverage length markers for validation:

```
encode(data, {"lengthMarker": "#"})
```

Tell the model: "Array lengths are marked with [#N]. Ensure your response matches these counts."

Acknowledge tokenizer variance: Token savings depend on the specific tokenizer and model being used.
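
The first two practices can be combined in a small prompt builder. This is a sketch only: `build_prompt` and its wording are hypothetical, and the TOON text is passed in as a plain string so the example runs without the package installed.

```python
# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def build_prompt(toon_text: str, question: str) -> str:
    """Wrap TOON data in a labelled code block, preceded by a short format note."""
    instructions = (
        "The data below is in TOON format (Token-Oriented Object Notation): "
        "key: value pairs, indentation for nesting, and tabular rows "
        "introduced by [N,]{fields}: headers."
    )
    return f"{instructions}\n\n{FENCE}toon\n{toon_text}\n{FENCE}\n\n{question}"


print(build_prompt("name: Alice\nage: 30", "How old is Alice?"))
```

In practice the `toon_text` argument would come from `encode(data)`.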

Token Efficiency Example

import json
from toon import encode

data = {
    "users": [
        {"id": 1, "name": "Alice", "age": 30, "active": True},
        {"id": 2, "name": "Bob", "age": 25, "active": True},
        {"id": 3, "name": "Charlie", "age": 35, "active": False},
    ]
}

json_str = json.dumps(data)
toon_str = encode(data)

print(f"JSON: {len(json_str)} characters")
print(f"TOON: {len(toon_str)} characters")
print(f"Reduction: {100 * (1 - len(toon_str) / len(json_str)):.1f}%")

# Output:
# JSON: 177 characters
# TOON: 85 characters
# Reduction: 52.0%

JSON output:

{"users": [{"id": 1, "name": "Alice", "age": 30, "active": true}, {"id": 2, "name": "Bob", "age": 25, "active": true}, {"id": 3, "name": "Charlie", "age": 35, "active": false}]}

TOON output:

users[3,]{id,name,age,active}:
  1,Alice,30,true
  2,Bob,25,true
  3,Charlie,35,false

Development

This project uses uv for fast, reliable package and environment management.

Setup with uv (Recommended)

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone the repository
git clone https://github.com/toon-format/toon-python.git
cd toon-python

# Create virtual environment and install dependencies
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install package in editable mode with dev dependencies
uv pip install -e ".[dev]"

Setup with pip (Alternative)

# Clone the repository
git clone https://github.com/toon-format/toon-python.git
cd toon-python

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e .

# Install development dependencies
pip install -r requirements-dev.txt

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=toon --cov-report=term

Type Checking

mypy src/toon

Linting

ruff check src/toon tests

Credits

This project is a Python implementation of the TOON format.

License

MIT License - see LICENSE file for details

Related

TOON Format Specification - Official specification with normative encoding rules
TOON Format Organization - Official TOON format organization

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

When contributing, please:

Add tests for new features
Update documentation as needed
Ensure compatibility with the TOON specification

Support

For bugs and feature requests, please open an issue.

Release history

0.1.3 (this version): Nov 4, 2025
0.1.2: Oct 30, 2025
0.1.1: Oct 30, 2025
0.1.0: Oct 27, 2025

Download files

Source Distribution

python_toon-0.1.3.tar.gz (31.3 kB)

Uploaded Nov 4, 2025 Source

Built Distribution

python_toon-0.1.3-py3-none-any.whl (21.8 kB)

Uploaded Nov 4, 2025 Python 3

File details

Details for the file python_toon-0.1.3.tar.gz.

File metadata

Download URL: python_toon-0.1.3.tar.gz
Upload date: Nov 4, 2025
Size: 31.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for python_toon-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`ca348b214c4f1cdad3579fd83dd60032d9eb87eb349c2d430ad9eb6371f174bf`
MD5	`f5d72a72c4c91651f3fff8c8e897fdce`
BLAKE2b-256	`4e92640c83ca46d5fe9c49895449a8932f55252537dd13dd22186cbac3a1ce59`

Provenance

The following attestation bundles were made for python_toon-0.1.3.tar.gz:

Publisher: publish.yml on xaviviro/python-toon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: python_toon-0.1.3.tar.gz
- Subject digest: ca348b214c4f1cdad3579fd83dd60032d9eb87eb349c2d430ad9eb6371f174bf
- Sigstore transparency entry: 666599005
- Sigstore integration time: Nov 4, 2025
Source repository:
- Permalink: xaviviro/python-toon@9e22e08676f3716925572cdab5fa895e4dd79bec
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/xaviviro
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9e22e08676f3716925572cdab5fa895e4dd79bec
- Trigger Event: release

File details

Details for the file python_toon-0.1.3-py3-none-any.whl.

File metadata

Download URL: python_toon-0.1.3-py3-none-any.whl
Upload date: Nov 4, 2025
Size: 21.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for python_toon-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a27b0ee4a729e730d1037d0a63eb8b344b3e5a26e3dc9a173067b6c31a868ee6`
MD5	`16fbfff8d0a3a613157e93de2399acfa`
BLAKE2b-256	`26a42f2def0378b44f913d2d6cb3bc5b1a15267b363937ab1cb9afb07ce2313c`

Provenance

The following attestation bundles were made for python_toon-0.1.3-py3-none-any.whl:

Publisher: publish.yml on xaviviro/python-toon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: python_toon-0.1.3-py3-none-any.whl
- Subject digest: a27b0ee4a729e730d1037d0a63eb8b344b3e5a26e3dc9a173067b6c31a868ee6
- Sigstore transparency entry: 666599052
- Sigstore integration time: Nov 4, 2025
Source repository:
- Permalink: xaviviro/python-toon@9e22e08676f3716925572cdab5fa895e4dd79bec
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/xaviviro
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9e22e08676f3716925572cdab5fa895e4dd79bec
- Trigger Event: release
