Skip to main content

High-performance JSON ↔ TOON converter. Reduce LLM token usage by 30-60% with Rust-powered Python bindings.

Project description

TOONify Python Bindings

High-performance JSON ↔ TOON converter with native Rust bindings for Python.

Reduce LLM token usage by 30-60% with TOON (Token-Oriented Object Notation).

Installation

pip install toonifypy

Note: The package is installed as toonifypy, but you import it as toonify:

pip install toonifypy  # Install command
from toonify import ... # Import statement

This follows common Python packaging practice (like pip install pillowfrom PIL import ...).

Quick Start

from toonify import json_to_toon, toon_to_json
import json

# Convert JSON to TOON format
data = {"users": [{"id": 1, "name": "Alice", "role": "admin"}]}
toon = json_to_toon(json.dumps(data))
print("TOON:", toon)
# Output: users[1]{id,name,role}:
#         1,Alice,admin

# Convert TOON back to JSON
json_result = toon_to_json(toon)
parsed = json.loads(json_result)
print("JSON:", parsed)
# Output: {'users': [{'id': 1, 'name': 'Alice', 'role': 'admin'}]}

What is TOON?

TOON (Token-Oriented Object Notation) is a compact data format designed to minimize token usage for AI and LLM applications.

Comparison:

// JSON (25 tokens)
{
  "users": [
    {
      "id": 1,
      "name": "Alice",
      "role": "admin"
    }
  ]
}
# TOON (3 tokens - 88% reduction)
users[1]{id,name,role}:
1,Alice,admin

API Reference

json_to_toon(json_data: str) -> str

Converts a JSON string to TOON format.

Parameters:

  • json_data (str): A valid JSON string

Returns:

  • str: TOON formatted string

Raises:

  • ToonError: If JSON is invalid or conversion fails

Example:

from toonify import json_to_toon

json_str = '{"products":[{"sku":"ABC","price":19.99}]}'
toon = json_to_toon(json_str)
print(toon)
# products[1]{sku,price}:
# ABC,19.99

toon_to_json(toon_data: str) -> str

Converts a TOON formatted string to JSON.

Parameters:

  • toon_data (str): A valid TOON formatted string

Returns:

  • str: JSON string

Raises:

  • ToonError: If TOON format is invalid or conversion fails

Example:

from toonify import toon_to_json

toon = '''products[1]{sku,price}:
ABC,19.99'''
json_str = toon_to_json(toon)
print(json_str)
# {"products":[{"sku":"ABC","price":19.99}]}

Error Handling

from toonify import json_to_toon, ToonError

try:
    result = json_to_toon('invalid json')
except ToonError as e:
    print(f"Conversion failed: {e}")

High-Performance Caching

For repeated conversions, use CachedConverter for 10-330x speedup:

from toonify import CachedConverter

# Create cached converter (Moka + Sled)
converter = CachedConverter(
    cache_size=100,              # Max 100 entries in memory
    cache_ttl_secs=3600,         # 1 hour TTL (None = forever)
    persistent_path="./cache.db" # Persistent storage (None = memory only)
)

# First conversion (cache miss)
json_data = '{"users": [{"id": 1, "name": "Alice"}]}'
toon1 = converter.json_to_toon(json_data)  # ~1ms

# Second conversion (cache hit)
toon2 = converter.json_to_toon(json_data)  # <100ns (330x faster!)

# Check cache stats
print(converter.cache_stats())
# Cache Statistics:
#   Moka entries: 1
#   Moka weighted size: 1 bytes
#   Sled entries: 1

# Clear cache
converter.clear_cache()

Cache Architecture:

  • Moka: Lock-free concurrent in-memory cache (hot path)
  • Sled: Embedded persistent database (survives restarts)
  • Lookup: Moka → Sled → Conversion

Use Cases

LLM API Cost Reduction

Before (JSON):

import openai
import json

prompt = {"users": [...]}  # 1000 tokens
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": json.dumps(prompt)}]
)
# Cost: $0.03 per 1K tokens = $0.03

After (TOON):

import openai
from toonify import json_to_toon

prompt = {"users": [...]}
toon_prompt = json_to_toon(json.dumps(prompt))  # 350 tokens
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": toon_prompt}]
)
# Cost: $0.03 per 1K tokens = $0.0105 (65% savings!)

Roundtrip Conversion

from toonify import json_to_toon, toon_to_json
import json

# Original data
original = {
    "products": [
        {"sku": "ABC123", "name": "Widget", "price": 19.99},
        {"sku": "DEF456", "name": "Gadget", "price": 29.99}
    ]
}

# JSON → TOON
toon = json_to_toon(json.dumps(original))
print("TOON format:")
print(toon)

# TOON → JSON
json_str = toon_to_json(toon)
result = json.loads(json_str)

# Verify roundtrip
assert original == result  # Perfect preservation

Data Pipeline Integration

from toonify import json_to_toon
import gzip

# Convert and compress for storage
data = {"records": [...]}
toon = json_to_toon(json.dumps(data))
compressed = gzip.compress(toon.encode())

# Massive size reduction
print(f"Original JSON: {len(json.dumps(data))} bytes")
print(f"TOON: {len(toon)} bytes")
print(f"TOON + gzip: {len(compressed)} bytes")

Performance

Payload Size Conversion Time
< 1KB < 1ms
1-100KB 1-10ms
> 100KB 10-100ms

Token Savings

Data Type JSON Tokens TOON Tokens Savings
User list (3 items) 45 12 73%
Product catalog (10 items) 180 48 73%
API response (nested) 120 35 71%
Time series (100 points) 600 150 75%

Requirements

  • Python 3.8+
  • Works on macOS, Linux, and Windows

Features

  • Blazing Fast: Native Rust implementation
  • Zero Dependencies: Pure Rust + Python ctypes
  • Type Safe: Full error handling with ToonError
  • Roundtrip Safe: Perfect data preservation
  • Memory Efficient: Minimal allocations
  • Production Ready: Comprehensive test coverage

Platform Support

Platform Library File Status
macOS libtoonify.dylib ✓ Supported
Linux libtoonify.so ✓ Supported
Windows toonify.dll ✓ Supported

Advanced Usage

Batch Processing

from toonify import json_to_toon
import json
import os

# Convert multiple JSON files
for filename in os.listdir("data/"):
    if filename.endswith(".json"):
        with open(f"data/{filename}") as f:
            data = json.load(f)
        
        toon = json_to_toon(json.dumps(data))
        
        with open(f"data/{filename}.toon", "w") as f:
            f.write(toon)

Integration with LLM Libraries

from toonify import json_to_toon, toon_to_json
import anthropic

client = anthropic.Anthropic()

# Prepare data in TOON format for token efficiency
data = {"users": [...], "orders": [...]}
toon_data = json_to_toon(json.dumps(data))

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": f"Analyze this data: {toon_data}"
    }]
)

print(f"Tokens saved: ~{len(json.dumps(data)) - len(toon_data)} characters")

Development

Built with:

  • Rust - High-performance core implementation
  • UniFFI - Automatic FFI bindings generation (Mozilla)
  • nom - Parser combinators for TOON parsing

Links

License

MIT License - see LICENSE

Contributing

Contributions welcome! Please see the main repository for contribution guidelines.


Questions? Open an issue or check the documentation.

Like this project? Star the repo and share with your AI engineering team!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

toonifypy-1.1.7.tar.gz (790.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

toonifypy-1.1.7-py3-none-any.whl (788.4 kB view details)

Uploaded Python 3

File details

Details for the file toonifypy-1.1.7.tar.gz.

File metadata

  • Download URL: toonifypy-1.1.7.tar.gz
  • Upload date:
  • Size: 790.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for toonifypy-1.1.7.tar.gz
Algorithm Hash digest
SHA256 d501343d3e19af573d4d8792f206fc94e6ffbc0a4842131d3b6b960d16b10f69
MD5 b63373da0427babb3d250fc9614d49c4
BLAKE2b-256 e97c162a8480dd24d06f2fc6564c977bd4de86df1ff53ae0488c64930f38150c

See more details on using hashes here.

File details

Details for the file toonifypy-1.1.7-py3-none-any.whl.

File metadata

  • Download URL: toonifypy-1.1.7-py3-none-any.whl
  • Upload date:
  • Size: 788.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for toonifypy-1.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 fb49f3a04604ac869f7179ed7a421e27fb51631d39ff4137c0d8fc2081128601
MD5 37d45e0184f25be6b061f60a9313e536
BLAKE2b-256 b18fb62d6b62a0798b2fbaf9f109104a00f66149eaa2bfdd6c16593e1f56dd2e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page