A Python library for working with the TOON format, a compact, human-readable serialization format optimized for LLM contexts.
toon_serializer: Token-Oriented Object Notation for Python
toon_serializer is a high-performance Python serializer/deserializer for TOON (Token-Oriented Object Notation).
TOON is a human-readable data format designed to minimize token usage for LLMs by removing redundant syntax (braces, quotes, repeated keys) while preserving structure. It excels at compressing lists of dictionaries into Tabular Arrays, often reducing payload sizes by 30-50% compared to JSON.
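To make the source of the savings concrete, the following minimal sketch compares the character count of a small list of dictionaries encoded as JSON with a hand-written tabular equivalent. It does not call toon_serializer, the sample records are made up for illustration, and it counts characters rather than tokens, so treat the resulting figure as a rough indication only.

```python
import json

# Three uniform records: every dictionary has the same keys.
rows = [
    {"id": 1, "name": "Alice", "is_active": True},
    {"id": 2, "name": "Bob", "is_active": False},
    {"id": 3, "name": "Carol", "is_active": True},
]

# JSON repeats every key, plus braces and quotes, for each record.
as_json = json.dumps({"users": rows})

# Hand-written tabular equivalent: the keys appear once, in the header.
as_table = "\n".join([
    "users[3]{id,name,is_active}:",
    "  1,Alice,true",
    "  2,Bob,false",
    "  3,Carol,true",
])

# Fraction of characters saved relative to JSON (characters, not tokens).
saving = 1 - len(as_table) / len(as_json)
print(f"JSON: {len(as_json)} chars, tabular: {len(as_table)} chars, saving: {saving:.0%}")
```

The saving grows with the number of rows, because the header cost is paid once while JSON pays for the keys on every record.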
Features
- 📉 Token Efficient: Replaces repetitive JSON keys with compact CSV-like headers.
- 🧠 Adaptive Schema: The decoder "learns" column types from the first row, so large tables parse quickly.
- ⚡ Fast Primitives: Optimized integer, float, and boolean parsing.
- CSV Compatible: Handles complex string escaping and quoting automatically.
- Lazy Decoding: Iterates over lines lazily, which is efficient for large datasets.
Installation
pip install toon_serializer
Usage
toon_serializer mimics the standard Python json API with loads and dumps.
- Encoding Data (Serialization)
toon_serializer automatically detects Uniform Lists of Dictionaries and compresses them into a tabular format.
Input:
import toon_serializer
data = {
    "model": "gpt-4",
    "parameters": {
        "temperature": 0.7,
        "stream": True
    },
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum physics."},
        {"role": "assistant", "content": "Quantum physics studies..."}
    ]
}
toon_str = toon_serializer.dumps(data)
print(toon_str)
Output:
model: gpt-4
parameters:
  temperature: 0.7
  stream: true
messages[3]{role,content}:
  system,"You are a helpful assistant."
  user,"Explain quantum physics."
  assistant,"Quantum physics studies..."
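The tabular compression shown above can be approximated in a few lines of plain Python. This is a minimal sketch, not the library's actual implementation: `is_uniform`, `format_cell`, and `encode_table` are hypothetical helper names, and the quoting rule (quote any string containing characters other than letters, digits, underscores, or hyphens) and the two-space row indent are assumptions chosen to match the sample output.

```python
import re

# Assumption: strings made only of these characters are written without quotes.
_BARE = re.compile(r"^[A-Za-z0-9_\-]+$")


def is_uniform(items):
    """True if items is a non-empty list of dicts sharing the same keys
    (in the same order) and holding only scalar values."""
    if not items or not all(isinstance(item, dict) for item in items):
        return False
    keys = list(items[0])
    if not keys:
        return False
    return all(
        list(item) == keys
        and all(isinstance(v, (str, int, float, bool)) for v in item.values())
        for item in items
    )


def format_cell(value):
    """Render one scalar as a table cell."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, str):
        if _BARE.match(value):
            return value
        # Escape backslashes and double quotes, then wrap in quotes.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return str(value)


def encode_table(name, items, indent="  "):
    """Encode a uniform list of dicts as a header line plus one row per dict."""
    keys = list(items[0])
    header = name + "[" + str(len(items)) + "]{" + ",".join(keys) + "}:"
    rows = [indent + ",".join(format_cell(item[k]) for k in keys) for item in items]
    return "\n".join([header] + rows)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain quantum physics."},
]
if is_uniform(messages):
    print(encode_table("messages", messages))
```

A production encoder would also need to quote strings that look like numbers or booleans (for example "true" or "42") so that they survive a round trip; the sketch omits that for brevity.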
- Decoding Data (Deserialization)
The decoder handles primitives, standard lists, and adaptive tabular arrays seamlessly.
import toon_serializer
toon_str = """
version: 1.0
users[2]{id,name,is_active}:
  1,Alice,true
  2,Bob,false
tags[3]: python, rust, go
"""
data = toon_serializer.loads(toon_str)
print(data["users"][0])
# {'id': 1, 'name': 'Alice', 'is_active': True}
print(data["tags"])
# ['python', 'rust', 'go']
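The type conversion seen in the `users` rows above (`1` becoming an int, `true` becoming a bool) can be illustrated with a first-row inference scheme. The sketch below is an assumption about how such a decoder might work, not the library's code: `infer_converter` and `decode_rows` are hypothetical names, and quoted cells are deliberately not handled.

```python
def infer_converter(cell):
    """Pick a converter for a column by inspecting one sample cell."""
    if cell in ("true", "false"):
        return lambda s: s == "true"
    try:
        int(cell)
        return int
    except ValueError:
        pass
    try:
        float(cell)
        return float
    except ValueError:
        return str


def decode_rows(fields, lines):
    """Yield one dict per row, reusing converters built from the first row."""
    converters = None
    for line in lines:
        # Simplification: a plain split, so quoted cells with commas are not supported.
        cells = [c.strip() for c in line.split(",")]
        if converters is None:
            converters = [infer_converter(c) for c in cells]
        yield {f: conv(c) for f, conv, c in zip(fields, converters, cells)}


rows = list(decode_rows(["id", "name", "is_active"], ["1,Alice,true", "2,Bob,false"]))
print(rows[0])
```

Because the converters are fixed after the first row, a later row whose cell does not fit the inferred type (for example a name in an int column) raises `ValueError` in this sketch; a real decoder would need a fallback for that case.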
Performance Notes
- Encoder: Recursively checks lists for "uniformity". If a list contains mixed types, it falls back to a standard bulleted list.
- Decoder: Uses a Pushback Iterator to parse line by line instead of processing the entire input at once.
- Adaptive Parsing: When decoding tables, toon_serializer inspects the first row to generate a specialized converter function (e.g., "Column 1 is int, Column 2 is string"), which speeds up parsing of the remaining rows.
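A pushback iterator lets a parser read one line too far, notice that the line belongs to the next construct, and hand it back. The following is a minimal sketch of the general technique, not the library's internals: `PushbackIterator`, `push_back`, and `read_block` are hypothetical names, and the sketch assumes that table rows are indented beneath their header.

```python
class PushbackIterator:
    """Iterator wrapper that allows items to be returned for re-reading."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._stack = []  # items handed back, most recent last

    def __iter__(self):
        return self

    def __next__(self):
        if self._stack:
            return self._stack.pop()
        return next(self._it)

    def push_back(self, item):
        self._stack.append(item)


def read_block(lines):
    """Collect consecutive indented lines; hand the first non-indented line back."""
    block = []
    for line in lines:
        if not line.startswith(" "):
            lines.push_back(line)  # belongs to the next construct
            break
        block.append(line.strip())
    return block


lines = PushbackIterator(["  1,Alice,true", "  2,Bob,false", "tags[3]: python, rust, go"])
rows = read_block(lines)
following = next(lines)  # the line that was pushed back
print(rows, following)
```

Since the wrapper only ever holds the pushed-back items, it works equally well over a generator or an open file, which is what makes lazy, line-by-line decoding possible.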
Contributing
- Fork the repository.
- Create a feature branch.
- Add tests.
- Submit a Pull Request.
Download files
File details
Details for the file toon_serializer-1.0.0.tar.gz.
File metadata
- Download URL: toon_serializer-1.0.0.tar.gz
- Upload date:
- Size: 40.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8928a25f618558fa58a6651de9b0438257bf4568ca3c5d0967684be61505069b |
| MD5 | 747b6851366edca055e8dd338f0e6834 |
| BLAKE2b-256 | 194b3ad0e56886fa207ac0d12c1ff9e1b6e43d364302b6dea8ed7c5c2b3334e3 |
File details
Details for the file toon_serializer-1.0.0-py3-none-any.whl.
File metadata
- Download URL: toon_serializer-1.0.0-py3-none-any.whl
- Upload date:
- Size: 12.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7e27e7f6b766758472ce4b8a019fa3ea6dc6277e8fbf9ea0a00ede76ee502af3 |
| MD5 | fd882ad8eabafa162d34d4b018d6e814 |
| BLAKE2b-256 | 139be643daf0ae261ac0b445b5f98c6afc3d29c5efba4fb56654f9db7240ad93 |