Skip to main content

High-performance JSON ↔ TOON converter. Reduce LLM token usage by 30-60% with Rust-powered Python bindings.

Project description

TOONify Python Bindings

High-performance JSON ↔ TOON converter with native Rust bindings for Python.

Reduce LLM token usage by 30-60% with TOON (Token-Oriented Object Notation).

Installation

pip install toonify

Quick Start

from toonify import json_to_toon, toon_to_json
import json

# Convert JSON to TOON format
data = {"users": [{"id": 1, "name": "Alice", "role": "admin"}]}
toon = json_to_toon(json.dumps(data))
print("TOON:", toon)
# Output: users[1]{id,name,role}:
#         1,Alice,admin

# Convert TOON back to JSON
json_result = toon_to_json(toon)
parsed = json.loads(json_result)
print("JSON:", parsed)
# Output: {'users': [{'id': 1, 'name': 'Alice', 'role': 'admin'}]}

What is TOON?

TOON (Token-Oriented Object Notation) is a compact data format designed to minimize token usage for AI and LLM applications.

Comparison:

// JSON (25 tokens)
{
  "users": [
    {
      "id": 1,
      "name": "Alice",
      "role": "admin"
    }
  ]
}
# TOON (3 tokens - 88% reduction)
users[1]{id,name,role}:
1,Alice,admin

API Reference

json_to_toon(json_data: str) -> str

Converts a JSON string to TOON format.

Parameters:

  • json_data (str): A valid JSON string

Returns:

  • str: TOON formatted string

Raises:

  • ToonError: If JSON is invalid or conversion fails

Example:

from toonify import json_to_toon

json_str = '{"products":[{"sku":"ABC","price":19.99}]}'
toon = json_to_toon(json_str)
print(toon)
# products[1]{sku,price}:
# ABC,19.99

toon_to_json(toon_data: str) -> str

Converts a TOON formatted string to JSON.

Parameters:

  • toon_data (str): A valid TOON formatted string

Returns:

  • str: JSON string

Raises:

  • ToonError: If TOON format is invalid or conversion fails

Example:

from toonify import toon_to_json

toon = '''products[1]{sku,price}:
ABC,19.99'''
json_str = toon_to_json(toon)
print(json_str)
# {"products":[{"sku":"ABC","price":19.99}]}

Error Handling

from toonify import json_to_toon, ToonError

try:
    result = json_to_toon('invalid json')
except ToonError as e:
    print(f"Conversion failed: {e}")

High-Performance Caching

For repeated conversions, use CachedConverter for 10-330x speedup:

from toonify import CachedConverter

# Create cached converter (Moka + Sled)
converter = CachedConverter(
    cache_size=100,              # Max 100 entries in memory
    cache_ttl_secs=3600,         # 1 hour TTL (None = forever)
    persistent_path="./cache.db" # Persistent storage (None = memory only)
)

# First conversion (cache miss)
json_data = '{"users": [{"id": 1, "name": "Alice"}]}'
toon1 = converter.json_to_toon(json_data)  # ~1ms

# Second conversion (cache hit)
toon2 = converter.json_to_toon(json_data)  # <100ns (330x faster!)

# Check cache stats
print(converter.cache_stats())
# Cache Statistics:
#   Moka entries: 1
#   Moka weighted size: 1 bytes
#   Sled entries: 1

# Clear cache
converter.clear_cache()

Cache Architecture:

  • Moka: Lock-free concurrent in-memory cache (hot path)
  • Sled: Embedded persistent database (survives restarts)
  • Lookup: Moka → Sled → Conversion

Use Cases

LLM API Cost Reduction

Before (JSON):

import openai
import json

prompt = {"users": [...]}  # 1000 tokens
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": json.dumps(prompt)}]
)
# Cost: $0.03 per 1K tokens = $0.03

After (TOON):

import openai
from toonify import json_to_toon

prompt = {"users": [...]}
toon_prompt = json_to_toon(json.dumps(prompt))  # 350 tokens
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": toon_prompt}]
)
# Cost: $0.03 per 1K tokens = $0.0105 (65% savings!)

Roundtrip Conversion

from toonify import json_to_toon, toon_to_json
import json

# Original data
original = {
    "products": [
        {"sku": "ABC123", "name": "Widget", "price": 19.99},
        {"sku": "DEF456", "name": "Gadget", "price": 29.99}
    ]
}

# JSON → TOON
toon = json_to_toon(json.dumps(original))
print("TOON format:")
print(toon)

# TOON → JSON
json_str = toon_to_json(toon)
result = json.loads(json_str)

# Verify roundtrip
assert original == result  # Perfect preservation

Data Pipeline Integration

from toonify import json_to_toon
import gzip

# Convert and compress for storage
data = {"records": [...]}
toon = json_to_toon(json.dumps(data))
compressed = gzip.compress(toon.encode())

# Massive size reduction
print(f"Original JSON: {len(json.dumps(data))} bytes")
print(f"TOON: {len(toon)} bytes")
print(f"TOON + gzip: {len(compressed)} bytes")

Performance

Payload Size Conversion Time
< 1KB < 1ms
1-100KB 1-10ms
> 100KB 10-100ms

Token Savings

Data Type JSON Tokens TOON Tokens Savings
User list (3 items) 45 12 73%
Product catalog (10 items) 180 48 73%
API response (nested) 120 35 71%
Time series (100 points) 600 150 75%

Requirements

  • Python 3.8+
  • Works on macOS, Linux, and Windows

Features

  • Blazing Fast: Native Rust implementation
  • Zero Dependencies: Pure Rust + Python ctypes
  • Type Safe: Full error handling with ToonError
  • Roundtrip Safe: Perfect data preservation
  • Memory Efficient: Minimal allocations
  • Production Ready: Comprehensive test coverage

Platform Support

Platform Library File Status
macOS libtoonify.dylib ✓ Supported
Linux libtoonify.so ✓ Supported
Windows toonify.dll ✓ Supported

Advanced Usage

Batch Processing

from toonify import json_to_toon
import json
import os

# Convert multiple JSON files
for filename in os.listdir("data/"):
    if filename.endswith(".json"):
        with open(f"data/{filename}") as f:
            data = json.load(f)
        
        toon = json_to_toon(json.dumps(data))
        
        with open(f"data/{filename}.toon", "w") as f:
            f.write(toon)

Integration with LLM Libraries

from toonify import json_to_toon, toon_to_json
import anthropic

client = anthropic.Anthropic()

# Prepare data in TOON format for token efficiency
data = {"users": [...], "orders": [...]}
toon_data = json_to_toon(json.dumps(data))

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": f"Analyze this data: {toon_data}"
    }]
)

print(f"Tokens saved: ~{len(json.dumps(data)) - len(toon_data)} characters")

Development

Built with:

  • Rust - High-performance core implementation
  • UniFFI - Automatic FFI bindings generation (Mozilla)
  • nom - Parser combinators for TOON parsing

Links

License

MIT License - see LICENSE

Contributing

Contributions welcome! Please see the main repository for contribution guidelines.


Questions? Open an issue or check the documentation.

Like this project? Star the repo and share with your AI engineering team!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

toonifypy-1.1.0.tar.gz (787.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

toonifypy-1.1.0-py3-none-any.whl (786.2 kB view details)

Uploaded Python 3

File details

Details for the file toonifypy-1.1.0.tar.gz.

File metadata

  • Download URL: toonifypy-1.1.0.tar.gz
  • Upload date:
  • Size: 787.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for toonifypy-1.1.0.tar.gz
Algorithm Hash digest
SHA256 88e810ed16de973a1089db24183587f267966198947d554a9c57ae226b036757
MD5 48d548a60dbd789030f8f6bc97907afb
BLAKE2b-256 571b0b93b774c20438f8b826c9d2b00d85922e64741881e6817d3c9a56a7f4c8

See more details on using hashes here.

File details

Details for the file toonifypy-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: toonifypy-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 786.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for toonifypy-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e64ba9695bb51cb1372fee93d771f65a2590546a8732c019fe4479b2e9276d9b
MD5 206265f6cfa082280e66bd337bbc38f4
BLAKE2b-256 580db8b23d09945dcc978d8aec8bfab4e092049d1c7dce53beaac57465624734

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page