Zero Overhead Notation v1.0.2 (ClearText) - Human-readable data format with 30%+ compression over JSON

ZON v1.0 (Entropy Engine)

Zero Overhead Notation - A human-readable data serialization format optimized for LLM token efficiency. Think of it as JSON for LLMs.

🚀 24-40% better compression than TOON | 📊 30-42% compression vs JSON | 🔍 100% Human Readable


🚀 What is ZON?

ZON is a smart compression format designed specifically for transmitting structured data to Large Language Models. Unlike traditional compression (which creates binary data), ZON remains 100% human-readable while dramatically reducing token usage.

Why ZON?

Problem | Solution
💸 High LLM costs from verbose JSON | ZON reduces tokens by 30-42%
🔍 Binary formats aren't debuggable | ZON is plain text - you can read it!
🎯 One-size-fits-all compression | ZON auto-selects optimal strategy per column

Key Features

  • Entropy Tournament: Auto-selects best compression strategy per column
  • 100% Safe: Guaranteed lossless reconstruction
  • Zero Configuration: Works out of the box
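
The per-column selection described under Entropy Tournament can be pictured as a small competition among candidate encoders in which the shortest output wins. The sketch below is illustrative only: the strategy names (`plain`, `run_length`, `delta`) and the helper `pick_strategy` are assumptions made for this example and are not taken from the ZON source, and output length stands in for whatever entropy measure the library actually uses.

```python
from typing import Callable, Dict, List, Tuple


def plain(values: List[int]) -> str:
    """Baseline: comma-separated values, no transformation."""
    return ",".join(str(v) for v in values)


def run_length(values: List[int]) -> str:
    """Collapse runs of equal neighbours into value*count."""
    out = []
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        count = j - i
        out.append(f"{values[i]}*{count}" if count > 1 else str(values[i]))
        i = j
    return ",".join(out)


def delta(values: List[int]) -> str:
    """Keep the first value, then store differences between neighbours."""
    if not values:
        return ""
    diffs = [values[0]] + [b - a for a, b in zip(values, values[1:])]
    return ",".join(str(d) for d in diffs)


# Hypothetical candidate set; dict order decides ties (earliest wins).
STRATEGIES: Dict[str, Callable[[List[int]], str]] = {
    "plain": plain,
    "run_length": run_length,
    "delta": delta,
}


def pick_strategy(values: List[int]) -> Tuple[str, str]:
    """Encode one column with every strategy and return the shortest result."""
    best_name, best_text = "", None
    for name, encoder in STRATEGIES.items():
        text = encoder(values)
        if best_text is None or len(text) < len(best_text):
            best_name, best_text = name, text
    return best_name, best_text


print(pick_strategy([7, 7, 7, 7, 7, 7]))        # ('run_length', '7*6')
print(pick_strategy([1000, 1001, 1002, 1003]))  # ('delta', '1000,1,1,1')
```

A real encoder would also have to record which strategy won for each column so that the decoder can reverse it.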

⚡ Quick Start

import zon

# Your data
users = {
  "context": {
    "task": "Our favorite hikes together",
    "location": "Boulder",
    "season": "spring_2025"
  },
  "friends": [
    "ana",
    "luis",
    "sam"
  ],
  "hikes": [
    {
      "id": 1,
      "name": "Blue Lake Trail",
      "distanceKm": 7.5,
      "elevationGain": 320,
      "companion": "ana",
      "wasSunny": true
    },
    {
      "id": 2,
      "name": "Ridge Overlook",
      "distanceKm": 9.2,
      "elevationGain": 540,
      "companion": "luis",
      "wasSunny": false
    },
    {
      "id": 3,
      "name": "Wildflower Loop",
      "distanceKm": 5.1,
      "elevationGain": 180,
      "companion": "sam",
      "wasSunny": true
    }
  ]
}

# Encode (compress)
compressed = zon.encode(users)
# Decode (decompress)
original = zon.decode(compressed)
assert original == users  # ✓ Perfect reconstruction!

  • ZON (96 tokens, 264 bytes):
context:"{task:Our favorite hikes together,location:Boulder,season:spring_2025}"
friends:"[ana,luis,sam]"

@hikes(3):companion,distanceKm,elevationGain,id,name,wasSunny
ana,7.5,320,1,Blue Lake Trail,T
luis,9.2,540,2,Ridge Overlook,F
sam,5.1,180,3,Wildflower Loop,T

Comparison with TOON:

  • TOON (104 tokens, 286 bytes):
context:
  task: Our favorite hikes together
  location: Boulder
  season: spring_2025
friends[3]: ana,luis,sam
hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
  1,Blue Lake Trail,7.5,320,ana,true
  2,Ridge Overlook,9.2,540,luis,false
  3,Wildflower Loop,5.1,180,sam,true

Compression summary:

  • JSON (compact) (139 tokens, 451 bytes)
  • ZON (96 tokens, 264 bytes)
  • TOON (104 tokens, 286 bytes)
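
The byte counts above depend on how the JSON baseline is serialized. A minimal sketch of measuring a compact JSON baseline with the standard library follows. It assumes that "compact" means no whitespace around separators, which the source does not spell out, and it uses a trimmed-down copy of the hikes data, so its numbers will not match the 451 bytes reported above. The helper name `json_sizes` is made up for this example.

```python
import json

# Trimmed-down sample for illustration; not the full Quick Start data.
sample = {
    "context": {"task": "Our favorite hikes together", "location": "Boulder"},
    "friends": ["ana", "luis", "sam"],
    "hikes": [{"id": 1, "name": "Blue Lake Trail", "wasSunny": True}],
}


def json_sizes(data):
    """Return UTF-8 byte sizes of (compact, pretty-printed) JSON."""
    compact = json.dumps(data, separators=(",", ":"), ensure_ascii=False)
    pretty = json.dumps(data, indent=2, ensure_ascii=False)
    return len(compact.encode("utf-8")), len(pretty.encode("utf-8"))


compact_bytes, pretty_bytes = json_sizes(sample)
print(f"compact: {compact_bytes} B, pretty: {pretty_bytes} B")
```

Token counts additionally depend on the tokenizer of the target model, so they have to be measured with that tokenizer rather than derived from byte sizes.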

📦 Installation

From PyPI (Recommended)

pip install zon-format

From Source

git clone https://github.com/yourusername/zon-format.git
cd zon-format
pip install -e .

Verify Installation

import zon
print("ZON installed successfully! ✅")
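
A slightly stricter check is to ask the packaging metadata which version is installed. This sketch uses only the standard library (`importlib.metadata`, Python 3.8+) and assumes the distribution name is `zon-format`, as in the `pip install` command above. The helper `installed_version` is a name chosen for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print(installed_version("zon-format") or "zon-format is not installed")
```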

Format Reference

Metadata (YAML-like)

key:value
nested.key:value
list:[item1,item2,item3]
  • No spaces after : for compactness
  • Dot notation for nested objects
  • Minimal quoting (only when necessary)
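
The metadata rules above can be sketched as a small flattening routine. This is an illustration of the notation only, not the library's encoder: `flatten_metadata` is a hypothetical helper, it ignores quoting and boolean tokens, and the Quick Start output shows that the library may emit nested objects inline rather than with dot notation.

```python
def flatten_metadata(obj, prefix=""):
    """Return 'key:value' lines, using dot notation for nested dicts."""
    lines = []
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: recurse and extend the dotted path.
            lines.extend(flatten_metadata(value, path))
        elif isinstance(value, list):
            # Flat list of scalars: [item1,item2,item3], no spaces.
            lines.append(f"{path}:[{','.join(str(v) for v in value)}]")
        else:
            lines.append(f"{path}:{value}")
    return lines


meta = {
    "context": {"location": "Boulder", "season": "spring_2025"},
    "friends": ["ana", "luis", "sam"],
}
print("\n".join(flatten_metadata(meta)))
# context.location:Boulder
# context.season:spring_2025
# friends:[ana,luis,sam]
```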

Tables (@table syntax)

@tablename(count):col1,col2,col3
val1,val2,val3
val1,val2,val3
  • @ marks table start
  • (count) shows row count
  • Columns separated by commas (no spaces)
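
To make the table layout concrete, here is a minimal sketch that writes a list of uniform dicts in the `@tablename(count):columns` form. The helpers `encode_cell` and `encode_table` are hypothetical names, not part of the `zon` API. The sketch assumes non-empty rows that share the same keys, sorts columns alphabetically because the Quick Start output happens to be ordered that way, uses the T/F boolean tokens listed under Compression Tokens, and does no quoting or escaping.

```python
def encode_cell(value):
    """Render one cell; booleans use the short T/F tokens."""
    if value is True:
        return "T"
    if value is False:
        return "F"
    return str(value)


def encode_table(name, rows):
    """Render rows (a non-empty list of dicts with identical keys) as a table."""
    columns = sorted(rows[0])
    header = f"@{name}({len(rows)}):{','.join(columns)}"
    body = [",".join(encode_cell(row[col]) for col in columns) for row in rows]
    return "\n".join([header, *body])


hikes = [
    {"id": 1, "name": "Blue Lake Trail", "distanceKm": 7.5,
     "elevationGain": 320, "companion": "ana", "wasSunny": True},
    {"id": 2, "name": "Ridge Overlook", "distanceKm": 9.2,
     "elevationGain": 540, "companion": "luis", "wasSunny": False},
]
print(encode_table("hikes", hikes))
# @hikes(2):companion,distanceKm,elevationGain,id,name,wasSunny
# ana,7.5,320,1,Blue Lake Trail,T
# luis,9.2,540,2,Ridge Overlook,F
```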

Compression Tokens

Token | Meaning | Example
T | Boolean true | T instead of true
F | Boolean false | F instead of false
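
Reading a table block back is the mirror image of the rules above. The following sketch parses the header, checks the row count, and turns T/F back into booleans. `decode_cell` and `decode_table` are hypothetical helpers, not the library's decoder. The type guessing is an assumption of this example: a string cell that is literally `T`, looks like a number, or contains a comma would be misread, which is presumably where the format's quoting rules come in.

```python
def decode_cell(text):
    """Guess the Python value for one cell: bool, int, float, then str."""
    if text == "T":
        return True
    if text == "F":
        return False
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def decode_table(block):
    """Parse '@name(count):col1,col2' plus rows into (name, list of dicts)."""
    lines = block.strip().splitlines()
    header, rows = lines[0], lines[1:]
    if not header.startswith("@"):
        raise ValueError("table header must start with '@'")
    name_part, cols_part = header.split(":", 1)
    name, count = name_part[1:].rstrip(")").split("(")
    columns = cols_part.split(",")
    if len(rows) != int(count):
        raise ValueError(f"expected {count} rows, found {len(rows)}")
    records = [
        dict(zip(columns, (decode_cell(cell) for cell in row.split(","))))
        for row in rows
    ]
    return name, records


block = """@hikes(2):companion,distanceKm,elevationGain,id,name,wasSunny
ana,7.5,320,1,Blue Lake Trail,T
luis,9.2,540,2,Ridge Overlook,F"""
name, records = decode_table(block)
print(name, records[0])
```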

🤖 LLM Framework Integration

OpenAI Integration

import zon
from openai import OpenAI

# Client for openai>=1.0; reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

# Prepare your data
users = [{"id": i, "name": f"User{i}", "active": True} for i in range(100)]

# Compress with ZON (saves tokens = saves money!)
zon_data = zon.encode(users)

# Use in prompt (the older openai.ChatCompletion.create call was removed in openai 1.0)
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You will receive data in ZON format. Decode mentally and analyze."},
        {"role": "user", "content": f"Analyze this user data:\n\n{zon_data}\n\nHow many active users?"}
    ]
)

print(response.choices[0].message.content)

Cost Savings: ~30-40% fewer tokens vs JSON!

LangChain Integration

# Current import paths; requires the langchain-core and langchain-openai packages
from langchain_openai import OpenAI
from langchain_core.prompts import PromptTemplate
import zon

# Prepare data
products = [
    {"name": "Laptop", "price": 999, "rating": 4.5},
    {"name": "Mouse", "price": 29, "rating": 4.2},
    # ... 100 more products
]

# Compress
zon_products = zon.encode(products)

# Create prompt template
template = """
You have access to product data in ZON format (a compressed JSON format).

Product Data:
{zon_data}

Question: {question}

Please analyze the data and answer.
"""

prompt = PromptTemplate(
    input_variables=["zon_data", "question"],
    template=template
)

# Use with LangChain
llm = OpenAI(temperature=0)
chain = prompt | llm

result = chain.invoke({
    "zon_data": zon_products,
    "question": "What's the average price of products with rating > 4?"
})

print(result)

📊 Benchmark Results

Unified Benchmark Results

JSON vs ZON

Dataset | Records | JSON Size | ZON Size | Compression | JSON tokens | ZON tokens
analytics | 60 | 5.9 KB | 2.1 KB | +63.6% | 2343 | 1396
complex_nested | 1000 | 381.3 KB | 296.8 KB | +22.2% | 121213 | 108563
employees | 100 | 13.7 KB | 5.9 KB | +56.9% | 3624 | 2083
github-repos | 100 | 33.7 KB | 21.0 KB | +37.8% | 12124 | 8693
hikes | 1 | 451.0 B | 264.0 B | +41.5% | 139 | 96
internet_github_repos | 100 | 411.4 KB | 345.6 KB | +16.0% | 113357 | 98980
internet_posts | 100 | 24.0 KB | 20.5 KB | +14.6% | 6093 | 5249
internet_random_users | 50 | 53.4 KB | 44.5 KB | +16.7% | 19860 | 18637
internet_users | 10 | 4.0 KB | 3.1 KB | +23.8% | 1225 | 1093
mongodb_irregular | 50 | 16.0 KB | 13.5 KB | +15.6% | 5832 | 5570
orders | 50 | 20.1 KB | 14.1 KB | +29.9% | 6906 | 5814

Summary

  • Total JSON (compact) size: 963.9 KB
  • Total ZON size: 767.3 KB
  • Overall compression: 20.4%
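
The summary reports byte totals only. The same arithmetic can be applied to the token columns, as in this small sketch. The figures are copied from the JSON vs ZON table, and `percent_saved` is a helper name introduced for this example.

```python
# (dataset, JSON tokens, ZON tokens), copied from the JSON vs ZON table
TOKENS = [
    ("analytics", 2343, 1396),
    ("complex_nested", 121213, 108563),
    ("employees", 3624, 2083),
    ("github-repos", 12124, 8693),
    ("hikes", 139, 96),
    ("internet_github_repos", 113357, 98980),
    ("internet_posts", 6093, 5249),
    ("internet_random_users", 19860, 18637),
    ("internet_users", 1225, 1093),
    ("mongodb_irregular", 5832, 5570),
    ("orders", 6906, 5814),
]


def percent_saved(before, after):
    """Relative reduction in percent, as used in the Compression column."""
    return 100.0 * (before - after) / before


json_total = sum(row[1] for row in TOKENS)
zon_total = sum(row[2] for row in TOKENS)

print(f"bytes:  {percent_saved(963.9, 767.3):.1f}% smaller")
# bytes:  20.4% smaller
print(f"tokens: {json_total} -> {zon_total} ({percent_saved(json_total, zon_total):.1f}% fewer)")
# tokens: 292716 -> 256174 (12.5% fewer)
```

Because the totals are dominated by the two largest datasets, the aggregate is lower than the per-dataset figures for small tabular data such as analytics or employees.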

TOON Comparison

(datasets with .toon files)

Dataset | Records | JSON Size | ZON Size | TOON Size | ZON vs TOON | JSON tokens | ZON tokens | TOON tokens
hikes | 3 | 451.0 B | 264.0 B | 286.0 B | +7.7% | 139 | 96 | 104

📚 API Reference

zon.encode(data)

Encodes a Python object (dict or list) into a ZON-formatted string.

Parameters:

  • data (Any): The input data to encode. Must be JSON-serializable (dict, list, str, int, float, bool, None).

Returns:

  • str: The ZON-encoded string.

Example:

import zon
data = {"id": 1, "name": "Alice"}
zon_str = zon.encode(data)
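
Since `data` must be JSON-serializable, it can help to test an object before encoding it. The sketch below uses the standard `json` module as a stand-in check. `is_json_serializable` is a hypothetical helper, and the check is looser than the type list above, because `json.dumps` also accepts tuples and non-string dict keys.

```python
import json
from datetime import date


def is_json_serializable(obj):
    """Return True if json.dumps accepts obj, False otherwise."""
    try:
        json.dumps(obj)
        return True
    except (TypeError, ValueError):
        # TypeError: unsupported type; ValueError: e.g. circular reference.
        return False


print(is_json_serializable({"id": 1, "name": "Alice"}))       # True
print(is_json_serializable({"joined": date(2025, 1, 1)}))     # False
```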

zon.decode(zon_str)

Decodes a ZON-formatted string back into a Python object.

Parameters:

  • zon_str (str): The ZON-encoded string to decode.

Returns:

  • Any: The decoded Python object (dict or list).

Example:

import zon
zon_str = zon.encode({"id": 1, "name": "Alice"})  # same data as in the zon.encode example
data = zon.decode(zon_str)
print(data["name"])  # "Alice"
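
Because `encode` and `decode` are meant to be exact inverses, a round-trip check over representative samples is a cheap regression test. The sketch below takes the two functions as parameters. `find_roundtrip_failures` is a name invented for this example, and `json.dumps`/`json.loads` stand in for the codec so the snippet runs without the package. With ZON installed, pass `zon.encode` and `zon.decode` instead.

```python
import json


def find_roundtrip_failures(samples, encode, decode):
    """Return the samples for which decode(encode(sample)) != sample."""
    failures = []
    for sample in samples:
        try:
            ok = decode(encode(sample)) == sample
        except Exception:
            # An exception during encode or decode counts as a failure.
            ok = False
        if not ok:
            failures.append(sample)
    return failures


samples = [
    {"id": 1, "name": "Alice"},
    [{"id": 1, "active": True}, {"id": 2, "active": False}],
]

# json is only a stand-in codec here; swap in zon.encode / zon.decode.
print(find_roundtrip_failures(samples, json.dumps, json.loads))  # []
```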

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new features
  4. Submit a pull request

📄 License

Proprietary License - Free for Production Use

You CAN:

  • Use ZON in production (commercial or non-commercial)
  • Integrate into your applications and services
  • Deploy at any scale

You CANNOT:

  • Redistribute or sell the source code
  • Modify and redistribute
  • Create competing products

Copyright (c) 2025 Roni Bhakta. All Rights Reserved.

See LICENSE for full terms. For custom licensing: ronibhakta1@gmail.com


🙏 Acknowledgments

  • Inspired by TOON format for LLM token efficiency
  • Benchmark datasets from JSONPlaceholder, GitHub API, Random User Generator, StackExchange API
  • Community feedback and testing

Made with ❤️ for the LLM community

ZON v1.0+ - Compression that scales with complexity
