Zero Overhead Notation v8.0 (ClearText) - Human-readable data format with 30%+ compression over JSON
ZON Format v8.0 (ClearText)
Zero-Overhead Notation - A human-readable, LLM-optimized data format that achieves 30%+ compression over JSON while remaining visually clean and intuitive.
Why ZON?
ZON v8.0 "ClearText" combines the readability of YAML with better compression than TOON, producing output that looks like a structured document rather than an escaped protocol.
Performance
- ✅ 31.9% smaller than JSON on average
- ✅ 25.6% better than TOON across benchmarks
- ✅ Zero protocol overhead - no pipes, markers, or complex headers
- ✅ LLM-friendly - readable without knowing the format
Quick Example
Input (JSON):
{
"context": "Hiking Trip",
"friends": ["ana", "luis", "sam"],
"hikes": [
{"id": 1, "name": "Blue Lake Trail", "sunny": true},
{"id": 2, "name": "Ridge Overlook", "sunny": false}
]
}
Output (ZON v8.0):
context:Hiking Trip
friends:[ana,luis,sam]
@hikes(2):id,name,sunny
1,Blue Lake Trail,T
_,Ridge Overlook,F
Size: JSON: 201 bytes → ZON: 106 bytes (47% smaller)
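The percentage quoted above follows from the two byte counts. A minimal sketch of the arithmetic, where `byte_size` and `percent_smaller` are hypothetical helper names and not part of the zon package:

```python
def byte_size(text: str) -> int:
    """Size of a string in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def percent_smaller(baseline_bytes: int, encoded_bytes: int) -> float:
    """How much smaller the encoding is than the baseline, in percent."""
    return 100.0 * (baseline_bytes - encoded_bytes) / baseline_bytes


# Byte counts taken from the example above.
saving = percent_smaller(201, 106)
print(f"{saving:.1f}% smaller")  # 47.3% smaller
```

To measure your own payloads, pass `byte_size(json_text)` and `byte_size(zon_text)` to `percent_smaller`.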
Installation
pip install zon-format
Usage
import zon
# Encode
data = {"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}
encoded = zon.encode(data)
print(encoded)
# Output:
# @users(2):id,name
# 1,Alice
# _,Bob
# Decode
decoded = zon.decode(encoded)
assert decoded == data # Perfect roundtrip
Format Reference
Metadata (YAML-like)
key:value
nested.key:value
list:[item1,item2,item3]
- No spaces after : for compactness
- Dot notation for nested objects
- Minimal quoting (only when necessary)
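The dot-notation rule can be illustrated with a small flattening routine. This is a sketch under assumptions, not the library's encoder: `flatten` and `metadata_lines` are hypothetical names, and quoting and boolean tokens are left out for brevity.

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into {"a.b": value} pairs (illustrative helper)."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def metadata_lines(obj: dict) -> list[str]:
    """Render flattened pairs as key:value lines, with lists as [a,b,c]."""
    lines = []
    for path, value in flatten(obj).items():
        if isinstance(value, list):
            rendered = "[" + ",".join(str(item) for item in value) + "]"
        else:
            rendered = str(value)
        lines.append(f"{path}:{rendered}")
    return lines


example = {
    "context": "Hiking Trip",
    "trip": {"days": 3},
    "friends": ["ana", "luis", "sam"],
}
print("\n".join(metadata_lines(example)))
# context:Hiking Trip
# trip.days:3
# friends:[ana,luis,sam]
```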
Tables (@table syntax)
@tablename(count):col1,col2,col3
val1,val2,val3
val1,val2,val3
- @ marks table start
- (count) shows row count
- Columns separated by commas (no spaces)
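The table layout described above can be sketched for a list of dictionaries that share the same keys. `encode_table` and `encode_cell` are hypothetical names used for illustration; the sketch writes every value explicitly and does not handle values that contain commas.

```python
def encode_cell(value) -> str:
    """Render one cell; booleans become the T/F tokens."""
    if value is True:
        return "T"
    if value is False:
        return "F"
    return str(value)


def encode_table(name: str, rows: list[dict]) -> str:
    """Render rows as '@name(count):columns' followed by one line per row."""
    if not rows:
        raise ValueError("encode_table needs at least one row")
    columns = list(rows[0].keys())
    header = f"@{name}({len(rows)}):{','.join(columns)}"
    body = [",".join(encode_cell(row[col]) for col in columns) for row in rows]
    return "\n".join([header, *body])


hikes = [
    {"id": 1, "name": "Blue Lake Trail", "sunny": True},
    {"id": 2, "name": "Ridge Overlook", "sunny": False},
]
print(encode_table("hikes", hikes))
# @hikes(2):id,name,sunny
# 1,Blue Lake Trail,T
# 2,Ridge Overlook,F
```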
Compression Tokens
| Token | Meaning | Example |
|---|---|---|
| T | Boolean true | T instead of true |
| F | Boolean false | F instead of false |
Note: ZON v1.0.1 prioritizes explicit data. Compression tokens like ^ (repeat) and _ (auto-increment) are disabled to ensure every row contains its full, actual data.
Smart Quoting
Quotes are only added when necessary:
| Value | Encoded | Reason |
|---|---|---|
| ana | ana | No special chars |
| Blue Lake | Blue Lake | Spaces OK |
| a,b | "a,b" | Contains comma (delimiter) |
| Hello: World | Hello: World | Colons OK |
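The rows in this table can be reproduced with a short predicate. This is a sketch, assuming the only triggers are a comma or a control character (the latter is mentioned under How It Works); `needs_quotes` and `quote` are hypothetical names, and escaping of embedded double quotes is not covered.

```python
def needs_quotes(value: str) -> bool:
    """True when the value contains a comma or an ASCII control character."""
    return "," in value or any(ord(ch) < 32 for ch in value)


def quote(value: str) -> str:
    """Wrap the value in double quotes only when required."""
    return f'"{value}"' if needs_quotes(value) else value


for sample in ["ana", "Blue Lake", "a,b", "Hello: World"]:
    print(f"{sample!r} -> {quote(sample)}")
# 'ana' -> ana
# 'Blue Lake' -> Blue Lake
# 'a,b' -> "a,b"
# 'Hello: World' -> Hello: World
```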
Format Comparison
Random Users API (10 records)
JSON (15,026 bytes):
[
{
"gender": "female",
"name": {"title": "Ms", "first": "Sophia", "last": "Wilson"},
"location": {"city": "Austin", "state": "Texas"},
...
}
]
TOON (10,626 bytes):
results[50]{gender,name{title,first,last},location{city,state},...}
female,Ms,Sophia,Wilson,Austin,Texas,...
ZON v8.0 (6,767 bytes - 55% smaller than JSON):
@data(10):gender,location.city,location.state,name.first,name.last,name.title
female,Austin,Texas,Sophia,Wilson,Ms
^,^,^,Emma,Johnson,Mrs
male,Portland,Oregon,Liam,Brown,Mr
...
Benchmarks
Run the comprehensive benchmark suite:
python benchmarks/generate_datasets.py # Generate test data
python test_comprehensive.py # Run benchmarks
Results (318 records across 6 datasets)
| Dataset | Records | vs JSON | vs TOON |
|---|---|---|---|
| Random Users API | 50 | -42.4% | +40.4% |
| StackOverflow Q&A | 50 | -43.1% | +41.1% |
| JSONPlaceholder Posts | 100 | -13.4% | -0.1% |
| JSONPlaceholder Comments | 100 | -15.4% | +0.0% |
| JSONPlaceholder Users | 10 | -40.3% | +36.3% |
| GitHub Repos | 8 | -37.1% | +36.0% |
| AVERAGE | 318 | -31.9% | +25.6% |
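The AVERAGE row appears consistent with an unweighted mean over the six datasets, so a 10-record dataset counts as much as a 100-record one. A quick check using only the figures from the table:

```python
from statistics import mean

# Per-dataset results copied from the table above (percent).
vs_json = {
    "Random Users API": -42.4,
    "StackOverflow Q&A": -43.1,
    "JSONPlaceholder Posts": -13.4,
    "JSONPlaceholder Comments": -15.4,
    "JSONPlaceholder Users": -40.3,
    "GitHub Repos": -37.1,
}
vs_toon = {
    "Random Users API": 40.4,
    "StackOverflow Q&A": 41.1,
    "JSONPlaceholder Posts": -0.1,
    "JSONPlaceholder Comments": 0.0,
    "JSONPlaceholder Users": 36.3,
    "GitHub Repos": 36.0,
}

# Unweighted means across datasets.
mean_vs_json = mean(vs_json.values())
mean_vs_toon = mean(vs_toon.values())
print(f"vs JSON: {mean_vs_json:.2f}%  vs TOON: {mean_vs_toon:.2f}%")
```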
View Encoded Samples
Compare formats side-by-side:
python benchmarks/generate_samples.py
# Generates .json, .zon, and .toon files in benchmarks/encoded_samples/
Open any .zon file to see the clean, readable output!
How It Works
1. Root Promotion
ZON automatically separates metadata (context) from data (tables):
{"context": "Trip", "hikes": [{...}, {...}]}
↓
context:Trip
@hikes(2):...
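One way to picture root promotion is a pass over the top-level keys that sends lists of objects to tables and everything else to metadata. This is a sketch under that assumption; `split_root` is a hypothetical name and the library's actual criteria may differ.

```python
def split_root(data: dict) -> tuple[dict, dict]:
    """Separate top-level keys into (metadata, tables)."""
    metadata, tables = {}, {}
    for key, value in data.items():
        # Assumption: a non-empty list whose items are all dicts is a table.
        is_table = (
            isinstance(value, list)
            and len(value) > 0
            and all(isinstance(item, dict) for item in value)
        )
        (tables if is_table else metadata)[key] = value
    return metadata, tables


trip = {
    "context": "Trip",
    "friends": ["ana", "luis"],
    "hikes": [{"id": 1}, {"id": 2}],
}
metadata, tables = split_root(trip)
print(sorted(metadata))  # ['context', 'friends']
print(sorted(tables))    # ['hikes']
```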
3. Intelligent Compression
- Sequential IDs: 1,_,_ (auto-increment)
- Repetitive values: Uses ^ token
- Booleans: T/F (1 byte vs 4-5 bytes)
- No quotes: Unless the value contains a comma (,) or control chars
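The _ and ^ tokens can be understood by expanding them back into explicit values. The sketch below assumes _ means "previous row's value plus one" and ^ means "same as the previous row", as the examples in this document suggest; `expand_rows` is a hypothetical name. Note that the Compression Tokens section says these tokens are disabled in v1.0.1, so this shows the idea rather than current library behavior, and it cannot distinguish a token from a literal "_" or "^" value.

```python
def expand_rows(rows: list[list[str]]) -> list[list[str]]:
    """Replace '_' and '^' cells with the explicit values they stand for."""
    expanded: list[list[str]] = []
    for row in rows:
        previous = expanded[-1] if expanded else None
        out = []
        for index, cell in enumerate(row):
            if cell == "^" and previous is not None:
                out.append(previous[index])  # repeat the value above
            elif cell == "_" and previous is not None:
                out.append(str(int(previous[index]) + 1))  # auto-increment
            else:
                out.append(cell)
        expanded.append(out)
    return expanded


compact = [
    ["1", "Blue Lake Trail", "T"],
    ["_", "Ridge Overlook", "^"],
]
for row in expand_rows(compact):
    print(",".join(row))
# 1,Blue Lake Trail,T
# 2,Ridge Overlook,T
```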
Using with LLMs
ZON is token-efficient and integrates with modern LLM tooling. This repository keeps concise examples for the most common integrations.
LangChain
Compress structured payloads with zon.encode() before sending them through LangChain prompts. See BENCHMARKS_ALL.md for sample usage and token impact.
LangGraph
Attach ZON-encoded payloads as node metadata to reduce token footprint when traversing or querying graphs.
DSPy
Use zon.decode() to convert ZON strings back to Python objects and stream into dataframes or telemetry pipelines for analysis.
CLI Tool
# Encode
zon encode input.json output.zon
# Decode
zon decode input.zon output.json
# Benchmark
zon benchmark data.json
Development
# Install in development mode
pip install -e .
# Run tests
python -m pytest tests/
# Run benchmarks
python test_comprehensive.py
Version History
v1.0.1 (2025-11-24) - "ClearText"
- ✅ Removed protocol overhead (no more #Z:, pipes, or markers)
- ✅ YAML-like metadata syntax (key:value)
- ✅ Clean @table syntax
- ✅ Aggressive quote removal (spaces no longer trigger quoting)
- ✅ Compact array syntax: [item1,item2,item3]
- ✅ Optimized nested data: {key:val} syntax (no more JSON strings)
- ✅ 31.9% compression vs JSON, 25.6% better than TOON
v1.0.0 (2025-11-23)
- Initial release with pipe-based protocol syntax
License
Apache License 2.0 - see LICENSE file
Contributing
Contributions welcome! Please open an issue or PR on GitHub.
Made with ❤️ for efficient data transmission and LLM optimization