A comprehensive Python library for encoding, decoding, and converting data using the Token-Oriented Object Notation (TOON) format - optimized for LLM contexts and human readability

TOON - Token-Oriented Object Notation

A Python library for working with TOON (Token-Oriented Object Notation), a compact, human-readable serialization format optimized for Large Language Model (LLM) contexts.

Features

  • TOON Encoder & Decoder - Convert Python data structures to/from TOON format
  • JSON Conversion - Transform between JSON and TOON
  • CSV Conversion - Convert tabular CSV to/from TOON
  • XML Conversion - Transform between XML and TOON
  • Zero External Dependencies - Everything built from scratch
  • TDD - 146 tests with 100% pass rate
  • Clean Code - Modular, well-documented code
  • Type-Safe - Complete type hints for better developer experience

Installation

pip install toon-formatter

Or with Poetry:

poetry add toon-formatter

Quick Start

Basic Encoder and Decoder

from toon import ToonEncoder, ToonDecoder

# Encode Python data to TOON
encoder = ToonEncoder()
data = {
    "id": 123,
    "name": "Ada",
    "tags": ["admin", "user"],
    "active": True
}
toon_str = encoder.encode(data)
print(toon_str)
# Output:
# id: 123
# name: Ada
# tags[2]: admin,user
# active: true

# Decode TOON back to Python
decoder = ToonDecoder()
decoded_data = decoder.decode(toon_str)
print(decoded_data)
# Output: {'id': 123, 'name': 'Ada', 'tags': ['admin', 'user'], 'active': True}

Tabular Arrays

from toon import ToonEncoder

encoder = ToonEncoder()
data = [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5}
]
toon_str = encoder.encode(data)
print(toon_str)
# Output:
# [2]{sku,qty,price}:
#   A1,2,9.99
#   B2,1,14.5
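
To make the tabular layout concrete, here is a minimal pure-Python sketch that reproduces the output above for a list of dicts sharing the same keys. It is not the library's implementation: `format_cell` and `encode_table` are hypothetical names, and the sketch ignores quoting, nesting, and empty lists.

```python
def format_cell(value):
    """Render one primitive the way the README examples show it."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    return str(value)


def encode_table(rows, indent=2):
    """Encode a non-empty list of same-keyed dicts as a tabular array (hypothetical helper)."""
    fields = list(rows[0].keys())
    if any(list(row.keys()) != fields for row in rows):
        raise ValueError("rows must share the same keys in the same order")
    # Header: row count in brackets, then field names in braces.
    header = f"[{len(rows)}]{{{','.join(fields)}}}:"
    pad = " " * indent
    lines = [pad + ",".join(format_cell(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


table = encode_table([
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5},
])
print(table)
```

The field names appear once in the header instead of once per row, which is where the size saving for tabular data comes from.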

JSON ↔ TOON Conversion

from toon import ToonConverter

converter = ToonConverter()

# JSON to TOON
json_str = '{"id": 1, "name": "Alice", "roles": ["admin", "user"]}'
toon = converter.json_to_toon(json_str)
print(toon)
# Output:
# id: 1
# name: Alice
# roles[2]: admin,user

# TOON to JSON
json_result = converter.toon_to_json(toon, indent=2)
print(json_result)
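
The README does not show what the last `print` produces. As a rough guide, the sketch below uses only the standard library and assumes that `toon_to_json` formats its result the way `json.dumps` does when given the same `indent`; that is an assumption, not documented behavior.

```python
import json

# The Python value that the TOON text above represents.
data = {"id": 1, "name": "Alice", "roles": ["admin", "user"]}

# Assumption: indent=2 is applied the same way json.dumps applies it.
json_result = json.dumps(data, indent=2)
print(json_result)
# {
#   "id": 1,
#   "name": "Alice",
#   "roles": [
#     "admin",
#     "user"
#   ]
# }
```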

CSV ↔ TOON Conversion

from toon import ToonConverter

converter = ToonConverter()

# CSV to TOON
csv_data = """id,name,active
1,Ada,true
2,Bob,false"""

toon = converter.csv_to_toon(csv_data)
print(toon)
# Output:
# [2]{id,name,active}:
#   1,Ada,true
#   2,Bob,false

# TOON to CSV
csv_result = converter.toon_to_csv(toon)
print(csv_result)
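
The reverse direction can be sketched with the standard `csv` module. This is an illustration only, under the assumption of a comma-delimited table with unquoted cells; `table_to_csv` is a hypothetical helper, and the real `toon_to_csv` may differ in line endings and quoting.

```python
import csv
import io
import re

# Matches a tabular header such as "[2]{id,name,active}:" and captures the field list.
HEADER = re.compile(r"^\[\d+\]\{([^}]*)\}:$")


def table_to_csv(toon_text):
    """Turn a comma-delimited tabular TOON array into CSV text (hypothetical helper)."""
    first, *body = toon_text.splitlines()
    match = HEADER.match(first.strip())
    if match is None:
        raise ValueError("expected a tabular header such as [2]{id,name}:")
    fields = match.group(1).split(",")
    rows = [line.strip().split(",") for line in body if line.strip()]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)  # the header fields become the CSV header row
    writer.writerows(rows)
    return out.getvalue()


toon_table = "[2]{id,name,active}:\n  1,Ada,true\n  2,Bob,false"
csv_text = table_to_csv(toon_table)
print(csv_text, end="")
```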

XML ↔ TOON Conversion

from toon import ToonConverter

converter = ToonConverter()

# XML to TOON
xml_data = """<users>
    <user>
        <id>1</id>
        <name>Ada</name>
    </user>
</users>"""

toon = converter.xml_to_toon(xml_data)
print(toon)

# TOON to XML
xml_result = converter.toon_to_xml(toon, root_name="users")
print(xml_result)

Configuration Options

Encoder

from toon import ToonEncoder

# Customize indentation, delimiter, and length marker
encoder = ToonEncoder(
    indent=4,              # Spaces per level (default: 2)
    delimiter="|",         # Delimiter: ",", "\t", or "|" (default: ",")
    length_marker=True     # Include # marker (default: False)
)

data = [1, 2, 3]
print(encoder.encode(data))
# Output: [#3|]: 1|2|3

Decoder

from toon import ToonDecoder

# Strict or permissive mode
decoder = ToonDecoder(
    indent=2,      # Expected spaces per level (default: 2)
    strict=True    # Strict validation mode (default: True)
)
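
The README does not list what strict mode validates. One plausible check, given that every array header declares its length, is that the declared count matches the number of items found. The sketch below illustrates that idea for inline arrays only; `decode_inline_array` is a hypothetical helper, it raises a plain `ValueError` rather than the library's `ToonDecodeError`, and it returns items as strings without type inference.

```python
import re

# Matches an inline array such as "[3]: a,b,c" and captures the count and the body.
INLINE = re.compile(r"^\[(\d+)\]: ?(.*)$")


def decode_inline_array(line, strict=True):
    """Split an inline array; in strict mode, enforce the declared length."""
    match = INLINE.match(line)
    if match is None:
        raise ValueError(f"not an inline array: {line!r}")
    declared = int(match.group(1))
    body = match.group(2)
    items = body.split(",") if body else []
    if strict and len(items) != declared:
        raise ValueError(f"declared {declared} items, found {len(items)}")
    return items


print(decode_inline_array("[3]: a,b,c"))
print(decode_inline_array("[3]: a,b", strict=False))  # permissive: mismatch tolerated
try:
    decode_inline_array("[3]: a,b")  # strict: mismatch rejected
except ValueError as err:
    print("rejected:", err)
```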

TOON Format Features

Primitives

from toon import ToonEncoder

encoder = ToonEncoder()

# Numbers, strings, booleans, and null
encoder.encode(42)        # → "42"
encoder.encode("hello")   # → "hello"
encoder.encode(True)      # → "true"
encoder.encode(None)      # → "null"

Objects

# Simple and nested objects
data = {
    "user": {
        "id": 1,
        "name": "Ada"
    }
}
print(encoder.encode(data))
# Output:
# user:
#   id: 1
#   name: Ada
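
The indentation rule for nested objects can be sketched as a short recursive function. This is a simplified illustration that handles only dicts and primitives, not the library's encoder; `format_primitive` and `encode_object` are hypothetical names.

```python
def format_primitive(value):
    """Render None, booleans, numbers, and plain strings as in the examples above."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    return str(value)


def encode_object(obj, indent=2, level=0):
    """Return the lines for a dict, indenting nested dicts one level deeper."""
    pad = " " * (indent * level)
    lines = []
    for key, value in obj.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(encode_object(value, indent, level + 1))
        elif isinstance(value, (list, tuple)):
            raise TypeError("arrays are outside the scope of this sketch")
        else:
            lines.append(f"{pad}{key}: {format_primitive(value)}")
    return lines


text = "\n".join(encode_object({"user": {"id": 1, "name": "Ada"}}))
print(text)
```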

Primitive Arrays (Inline)

[1, 2, 3, 4, 5]
# → [5]: 1,2,3,4,5

Object Arrays (Tabular)

[
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Bob"}
]
# → [2]{id,name}:
#     1,Ada
#     2,Bob

Mixed Arrays (Expanded)

[1, {"a": 1}, "text"]
# → [3]:
#     - 1
#     - a: 1
#     - text
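
The three array forms shown above suggest a simple decision rule: all primitives go inline, objects with identical keys and primitive values go tabular, and everything else is expanded. The sketch below encodes that rule as inferred from the examples; `array_form` is a hypothetical helper, and comparing key sets (rather than key order) is an assumption.

```python
def is_primitive(value):
    """True for the scalar kinds the README lists: null, booleans, numbers, strings."""
    return value is None or isinstance(value, (bool, int, float, str))


def array_form(items):
    """Pick 'inline', 'tabular', or 'expanded' for a list (rule inferred from the examples)."""
    if all(is_primitive(item) for item in items):
        return "inline"
    if all(isinstance(item, dict) for item in items):
        first_keys = set(items[0])
        same_keys = all(set(item) == first_keys for item in items)
        flat = all(is_primitive(v) for item in items for v in item.values())
        if same_keys and flat:
            return "tabular"
    return "expanded"


print(array_form([1, 2, 3, 4, 5]))
print(array_form([{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]))
print(array_form([1, {"a": 1}, "text"]))
```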

Alternative Delimiters

# Tab-delimited
encoder = ToonEncoder(delimiter="\t")
print(encoder.encode([1, 2, 3]))
# → [3	]: 1	2	3

# Pipe-delimited
encoder = ToonEncoder(delimiter="|")
print(encoder.encode([1, 2, 3]))
# → [3|]: 1|2|3
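
Judging from the examples in this README, a non-comma delimiter is repeated inside the brackets of the header, and the optional length marker puts `#` before the count. A minimal sketch of that header logic for inline arrays of numbers or plain strings follows; `encode_inline` is a hypothetical helper, not the library's code.

```python
def encode_inline(items, delimiter=",", length_marker=False):
    """Encode a list of numbers or plain strings as an inline array."""
    marker = "#" if length_marker else ""
    # The default comma is implied; any other delimiter is shown in the header.
    shown = "" if delimiter == "," else delimiter
    header = f"[{marker}{len(items)}{shown}]:"
    body = delimiter.join(str(item) for item in items)
    return f"{header} {body}" if items else header


print(encode_inline([1, 2, 3]))                                    # [3]: 1,2,3
print(encode_inline([1, 2, 3], delimiter="|"))                     # [3|]: 1|2|3
print(encode_inline([1, 2, 3], delimiter="|", length_marker=True)) # [#3|]: 1|2|3
```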

Project Structure

toon/
├── src/
│   └── toon/
│       ├── __init__.py      # Main exports
│       ├── encoder.py       # ToonEncoder
│       ├── decoder.py       # ToonDecoder, ToonDecodeError
│       └── converter.py     # ToonConverter (JSON/CSV/XML)
├── tests/
│   ├── test_encoder.py      # Encoder tests (57 tests)
│   ├── test_decoder.py      # Decoder tests (56 tests)
│   └── test_converter.py    # Converter tests (33 tests)
├── examples.py              # Usage examples
├── pyproject.toml
├── README.md
└── SPEC.md                  # Complete TOON specification

Testing

# Run all tests
poetry run pytest

# Tests with coverage
poetry run pytest --cov=toon tests/

# Specific tests
poetry run pytest tests/test_encoder.py
poetry run pytest tests/test_decoder.py -v

Running Examples

# Run the examples file to see TOON in action
poetry run python examples.py

Command-Line Interface (CLI)

TOON includes a CLI for converting, validating, formatting, and comparing files:

# Convert between formats
poetry run toon convert data.json data.toon --from json --to toon
poetry run toon convert inventory.csv inventory.toon --from csv --to toon

# Validate TOON files
poetry run toon validate data.toon

# Format/pretty-print TOON files
poetry run toon format data.toon

# Optimize for minimum token count
poetry run toon minify data.toon

# Show file information
poetry run toon info data.toon --verbose

# Compare TOON files
poetry run toon diff file1.toon file2.toon --semantic

Available Commands:

  • convert - Convert between JSON/CSV/XML/TOON formats
  • validate - Validate TOON file syntax
  • format - Format/pretty-print TOON files
  • minify - Optimize for minimum token count
  • info - Show detailed file information
  • diff - Compare two TOON files

See CLI.md for complete CLI documentation and examples.

Benefits of TOON

  1. Token Reduction: 30-60% fewer tokens than JSON for tabular data
  2. Readability: Indentation-based layout with little punctuation
  3. Deterministic: Consistent and predictable encoding
  4. Strict Validation: Strict mode to ensure data integrity
  5. Interoperability: Conversion to and from JSON, CSV, and XML
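
A quick way to get a feel for the size difference on tabular data is to compare string lengths. The sketch below uses character counts as a rough stand-in for tokens; real token counts depend on the tokenizer, so this does not verify the percentage quoted above. The TOON text is copied from the Tabular Arrays example.

```python
import json

rows = [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5},
]

# Compact JSON (no spaces after separators) for a fair comparison.
as_json = json.dumps(rows, separators=(",", ":"))

# The same data in tabular TOON, as shown in the Tabular Arrays example.
as_toon = "[2]{sku,qty,price}:\n  A1,2,9.99\n  B2,1,14.5"

saving = 1 - len(as_toon) / len(as_json)
print(f"JSON: {len(as_json)} chars, TOON: {len(as_toon)} chars, saving: {saving:.0%}")
```

The gap grows with the number of rows, because JSON repeats every key in every object while the tabular form states the keys once.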

Use Cases

  • 📊 Efficient serialization of tabular data for LLMs
  • 🔄 Data format transformation (JSON ↔ CSV ↔ XML ↔ TOON)
  • 💾 Compact storage of configurations
  • 📡 Transmission of structured data
  • 🤖 Prompt contexts for language models

License

MIT

Specification

For complete details about the TOON format, see SPEC.md.

Development

Developed following TDD (Test-Driven Development) and Clean Code principles:

  • ✅ 146 tests passing
  • ✅ Complete edge case coverage
  • ✅ Type hints throughout the codebase
  • ✅ Comprehensive documentation
  • ✅ Zero external dependencies
  • ✅ Modular and maintainable code

Author

Juan Manuel Panozzo Zenere
Email: juanmanuel.panozzozenere@alumnos.uai.edu.ar

Acknowledgments

TOON specification by Johann Schopplich (@johannschopplich)
