TOON (Token-Oriented Object Notation) Python API - A compact data format optimized for LLM token usage
TOON Python API
TOON (Token-Oriented Object Notation) is a compact, human-readable data format designed to reduce token usage when passing data to large language models. Compared with equivalent JSON, TOON can reduce token usage by 30-60%.
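As a rough illustration of where the savings come from, the sketch below compares the character count of a JSON dump with the TOON text for the same records. It uses only the standard library; the TOON text is copied by hand from the Table Format Arrays example in this README (assuming two-space row indentation), and character count is only a crude proxy for token count, so the printed ratio should not be read as a benchmark.

```python
import json

# Same records as the Table Format Arrays example in this README.
data = {
    "items": [
        {"sku": "LAPTOP-15", "qty": 5, "price": 899.99},
        {"sku": "MOUSE-BT", "qty": 25, "price": 29.99},
        {"sku": "KEYBOARD-MX", "qty": 12, "price": 149.00},
    ]
}

# TOON text copied by hand from the documented output (not produced by the library here).
toon_text = "\n".join([
    "items[3]{sku,qty,price}:",
    "  LAPTOP-15,5,899.99",
    "  MOUSE-BT,25,29.99",
    "  KEYBOARD-MX,12,149",
])

json_text = json.dumps(data)

# Character counts are a proxy only; real token counts depend on the tokenizer.
saving = 1 - len(toon_text) / len(json_text)
print(f"JSON: {len(json_text)} chars, TOON: {len(toon_text)} chars, saved {saving:.0%}")
```

Most of the difference comes from stating the field names once in the header instead of repeating them in every record.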
This project provides a Python API library with a Rust backend, delivering high-performance Python bindings through PyO3.
Features
- ✅ Encode and Decode: Bidirectional conversion between Python objects and TOON format
- ✅ Table Format Optimization: Automatically detects uniform object arrays and compresses them using table format
- ✅ Multiple Array Formats: Supports inline arrays, table arrays, list arrays, and arrays of arrays
- ✅ Nested Structures: Full support for nested objects and arrays
- ✅ Custom Options: Supports custom indentation, delimiters, and length markers
- ✅ High Performance: Rust backend provides fast encoding/decoding performance
Installation
pip install tost
Requirements:
- Python 3.8+
Development Installation
If you need to install from source or for development:
# Install Rust (if not already installed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Install maturin
pip install maturin
# Install from source
pip install .
# Or install in development mode (recommended for development)
maturin develop
# Or build wheel files
maturin build --release
Usage Examples
Basic Encoding
from tost import encode
# Simple object
obj = {
    "id": 123,
    "name": "Ada Lovelace",
    "email": "ada@example.com",
    "active": True
}
result = encode(obj)
print(result)
# Output:
# id: 123
# name: Ada Lovelace
# email: ada@example.com
# active: true
Table Format Arrays
from tost import encode
# Table format array (auto-optimized)
products = {
    "items": [
        {"sku": "LAPTOP-15", "qty": 5, "price": 899.99},
        {"sku": "MOUSE-BT", "qty": 25, "price": 29.99},
        {"sku": "KEYBOARD-MX", "qty": 12, "price": 149.00}
    ]
}
result = encode(products)
print(result)
# Output:
# items[3]{sku,qty,price}:
#   LAPTOP-15,5,899.99
#   MOUSE-BT,25,29.99
#   KEYBOARD-MX,12,149
Inline Arrays
from tost import encode
# Inline array (primitive type array)
tags = {
    "tags": ["javascript", "typescript", "nodejs", "llm"]
}
result = encode(tags)
print(result)
# Output:
# tags[4]: javascript,typescript,nodejs,llm
Nested Structures
from tost import encode
order = {
    "orderId": "ORD-2025-001",
    "customer": {
        "name": "John Smith",
        "email": "john@example.com"
    },
    "items": [
        {"product": "Widget A", "quantity": 2, "price": 19.99},
        {"product": "Widget B", "quantity": 1, "price": 34.50}
    ],
    "total": 74.48,
    "tags": ["priority", "gift-wrap"]
}
result = encode(order)
print(result)
# Output:
# orderId: ORD-2025-001
# customer:
#   name: John Smith
#   email: john@example.com
# items[2]{product,quantity,price}:
#   Widget A,2,19.99
#   Widget B,1,34.5
# total: 74.48
# tags[2]: priority,gift-wrap
Decoding
from tost import decode
tost_str = """
id: 123
name: Ada Lovelace
active: true
items[2]{sku,qty}:
  A1,2
  B2,1
"""
result = decode(tost_str)
print(result)
# Output:
# {
#     'id': 123,
#     'name': 'Ada Lovelace',
#     'active': True,
#     'items': [
#         {'sku': 'A1', 'qty': 2},
#         {'sku': 'B2', 'qty': 1}
#     ]
# }
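To make the mapping from table rows back to Python dicts concrete, here is a minimal pure-Python sketch that parses a single table block. It is not the library's decoder (that lives in the Rust backend); `parse_table` and `parse_scalar` are hypothetical names, and the sketch assumes a comma delimiter, no quoting or escaping, and no length marker.

```python
import re

# Matches headers such as "items[2]{sku,qty}:" (comma delimiter, no length marker).
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")


def parse_scalar(text):
    """Convert one cell to bool, int, float, or leave it as a string."""
    if text == "true":
        return True
    if text == "false":
        return False
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def parse_table(block):
    """Hypothetical helper: parse one table block into {key: [row dicts]}."""
    lines = [line.strip() for line in block.splitlines() if line.strip()]
    match = HEADER.match(lines[0])
    if match is None:
        raise ValueError(f"not a table header: {lines[0]!r}")
    key = match.group(1)
    count = int(match.group(2))
    fields = match.group(3).split(",")
    rows = []
    for line in lines[1:]:
        cells = [parse_scalar(cell) for cell in line.split(",")]
        rows.append(dict(zip(fields, cells)))
    # The declared length lets a reader detect truncated or padded input.
    if len(rows) != count:
        raise ValueError(f"header declares {count} rows, found {len(rows)}")
    return {key: rows}


print(parse_table("items[2]{sku,qty}:\n  A1,2\n  B2,1"))
```

The length check shows one practical use of the `[N]` count in the header: a mismatch signals that the data was cut off or altered.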
Custom Options
from tost import encode
obj = {
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B2", "qty": 1}
    ]
}
# Custom indentation, delimiter, and length marker
result = encode(
    obj,
    indent=4,           # 4-space indentation
    delimiter="|",      # Use pipe as delimiter
    length_marker="#"   # Use # as length marker
)
print(result)
# Output:
# items[#2|]{sku|qty}:
#     A1|2
#     B2|1
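The header line is where the delimiter and length marker show up. The sketch below rebuilds that line in pure Python; `array_header` is a hypothetical helper, not part of the `tost` API. The rule that the delimiter appears inside the brackets only when it is not the default comma is inferred from the example outputs in this README, so treat it as an assumption.

```python
def array_header(key, length, fields=None, delimiter=",", length_marker=None):
    """Hypothetical helper: build a TOON-style array header line."""
    marker = length_marker or ""
    # Assumption inferred from the README examples: a non-default delimiter
    # is echoed after the length, while the default comma is left implicit.
    delimiter_hint = "" if delimiter == "," else delimiter
    header = f"{key}[{marker}{length}{delimiter_hint}]"
    if fields is not None:
        # Table arrays list their field names once, joined by the delimiter.
        header += "{" + delimiter.join(fields) + "}"
    return header + ":"


print(array_header("items", 2, ["sku", "qty"], delimiter="|", length_marker="#"))
print(array_header("items", 3, ["sku", "qty", "price"]))
print(array_header("tags", 4))
```

Passing an empty string as `key` yields the keyless header form used for root-level arrays.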
API Reference
encode(obj, indent=2, delimiter=",", length_marker=None)
Encode a Python object to TOON format string.
Parameters:
- obj: Python object to encode (dict, list, primitive types, etc.)
- indent (int, optional): Number of spaces per indentation level (default: 2)
- delimiter (str, optional): Delimiter for array values and table rows (default: ',')
- length_marker (str, optional): Prefix marker for array length, e.g. '#' (default: None)
Returns:
str: TOON format string
Examples:
result = encode({"id": 123, "name": "Alice"})
result = encode(obj, indent=4, delimiter="|", length_marker="#")
decode(tost_str)
Decode a TOON format string to Python object.
Parameters:
tost_str(str): TOON format string
Returns:
- Python object (dict, list, or primitive type)
Examples:
obj = decode("id: 123\nname: Alice")
TOON Format Specification
Object Format
key: value
Table Array Format
When all objects in an array have the same keys and all values are primitive types, table format is used:
items[N]{field1,field2,field3}:
  value1,value2,value3
  value4,value5,value6
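The layout above can be sketched in a few lines of pure Python. This is an illustration of the format, not the library's encoder: `encode_table` and `format_value` are hypothetical names, and the sketch assumes a comma delimiter, two-space indentation, and string values that need no quoting.

```python
def format_value(value):
    """Render one primitive the way the README examples show it."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, float) and value.is_integer():
        return str(int(value))  # 149.00 is shown as 149 in the examples
    return str(value)


def encode_table(key, rows):
    """Hypothetical helper: render uniform dicts as a table block."""
    if not rows or not all(isinstance(row, dict) for row in rows):
        raise ValueError("expected a non-empty list of dicts")
    fields = list(rows[0])
    for row in rows:
        if set(row) != set(fields):
            raise ValueError("all objects must have the same keys")
        if not all(isinstance(v, (str, int, float, bool)) for v in row.values()):
            raise ValueError("all values must be primitive")
    # Field names appear once in the header; each row carries values only.
    lines = [f"{key}[{len(rows)}]{{{','.join(fields)}}}:"]
    for row in rows:
        lines.append("  " + ",".join(format_value(row[f]) for f in fields))
    return "\n".join(lines)


print(encode_table("items", [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]))
```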
Inline Array Format
Primitive type arrays use inline format:
tags[N]: value1,value2,value3
List Format
Mixed or non-uniform arrays use list format:
items[N]:
  - value1
  - key: value
    other: value2
  - value3
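The choice between the inline, table, and list formats follows from the shape of the array. The sketch below expresses those rules as a small classifier; `classify_array` is a hypothetical name, and the rules are taken from the descriptions in this section rather than from the Rust implementation, which may handle edge cases (such as empty arrays) differently.

```python
PRIMITIVES = (str, int, float, bool)


def classify_array(values):
    """Hypothetical helper: pick the array format described in this section."""
    # Arrays containing only primitives use the inline format.
    if all(isinstance(v, PRIMITIVES) for v in values):
        return "inline"
    # Arrays of objects with identical keys and primitive values use the table format.
    if all(isinstance(v, dict) for v in values):
        keys = set(values[0])
        same_keys = all(set(v) == keys for v in values)
        flat = all(isinstance(x, PRIMITIVES) for v in values for x in v.values())
        if same_keys and flat:
            return "table"
    # Everything else (mixed or non-uniform content) falls back to the list format.
    return "list"


print(classify_array(["javascript", "typescript"]))
print(classify_array([{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]))
print(classify_array(["value1", {"key": "value"}, "value3"]))
```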
Array of Arrays Format
pairs[N]:
  - [M]: value1,value2
  - [M]: value3,value4
Root-Level Arrays
When the root-level value is an array, use a header form without a key name:
[N]{field1,field2}:
  value1,value2
  value3,value4
Or for primitive type arrays:
[N]: value1,value2,value3
Project Structure
tost/
├── Cargo.toml              # Rust workspace configuration
├── pyproject.toml          # Python package configuration
├── README.md               # Project documentation
├── rust/                   # Rust core library
│   ├── Cargo.toml
│   └── src/
│       ├── lib.rs          # Main library file (contains PyO3 bindings)
│       ├── encode.rs       # TOON encoding implementation
│       └── decode.rs       # TOON decoding implementation
└── python/                 # Python package
    ├── src/
    │   └── tost/           # Python package
    │       ├── __init__.py
    │       └── tost.py     # Python interface wrapper
    └── tests/              # Python tests
        └── test_tost.py
Development
Running Tests
# Rust tests (run from the repository root)
cd rust
cargo test
# Python tests (run from the repository root)
cd python
pytest tests/
Building
# Development mode
maturin develop
# Release mode
maturin build --release
License
MIT License