A comprehensive Python library for encoding, decoding, and converting data using the Token-Oriented Object Notation (TOON) format - optimized for LLM contexts and human readability
TOON - Token-Oriented Object Notation
A Python library for working with TOON (Token-Oriented Object Notation) format, a compact and human-readable serialization format optimized for Large Language Model (LLM) contexts.
Features
- ✅ TOON Encoder & Decoder - Convert Python data structures to/from TOON format
- ✅ JSON Conversion - Transform between JSON and TOON
- ✅ CSV Conversion - Convert tabular CSV to/from TOON
- ✅ XML Conversion - Transform between XML and TOON
- ✅ Zero External Dependencies - Everything built from scratch
- ✅ TDD - 146 tests with 100% pass rate
- ✅ Clean Code - Clean, modular, and well-documented code
- ✅ Type-Safe - Complete type hints for better developer experience
Installation
pip install toon-formatter
Or with Poetry:
poetry add toon-formatter
Quick Start
Basic Encoder and Decoder
from toon import ToonEncoder, ToonDecoder
# Encode Python data to TOON
encoder = ToonEncoder()
data = {
    "id": 123,
    "name": "Ada",
    "tags": ["admin", "user"],
    "active": True
}
toon_str = encoder.encode(data)
print(toon_str)
# Output:
# id: 123
# name: Ada
# tags[2]: admin,user
# active: true
# Decode TOON back to Python
decoder = ToonDecoder()
decoded_data = decoder.decode(toon_str)
print(decoded_data)
# Output: {'id': 123, 'name': 'Ada', 'tags': ['admin', 'user'], 'active': True}
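To make the mapping above concrete, here is a minimal standard-library sketch of how a flat dictionary could be rendered as the output shown. `format_scalar` and `encode_flat` are hypothetical helpers written for illustration; they are not part of the library's API, and they skip quoting, escaping, and nesting.

```python
def format_scalar(value):
    """Render a Python scalar the way the outputs above show it."""
    if value is None:
        return "null"
    # Check bool before anything else: True and False are also ints in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def encode_flat(obj):
    """Encode a dict of scalars and scalar lists (no nesting, no quoting)."""
    lines = []
    for key, value in obj.items():
        if isinstance(value, list):
            # Lists of primitives go inline, prefixed with their length.
            items = ",".join(format_scalar(v) for v in value)
            lines.append(f"{key}[{len(value)}]: {items}")
        else:
            lines.append(f"{key}: {format_scalar(value)}")
    return "\n".join(lines)


data = {"id": 123, "name": "Ada", "tags": ["admin", "user"], "active": True}
print(encode_flat(data))
```

The real encoder also has to handle strings that contain the delimiter or a colon, which this sketch deliberately ignores.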
Tabular Arrays
from toon import ToonEncoder
encoder = ToonEncoder()
data = [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5}
]
toon_str = encoder.encode(data)
print(toon_str)
# Output:
# [2]{sku,qty,price}:
#   A1,2,9.99
#   B2,1,14.5
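The tabular form works because every object in the array shares the same keys, so the keys can be written once in a header. A minimal sketch of that idea follows; `encode_tabular` is a hypothetical helper, not the library's implementation, and it assumes comma-delimited output with rows indented one level under the header and no quoting.

```python
def encode_tabular(rows, indent=2):
    """Encode a non-empty list of dicts with identical keys as one tabular block."""
    if not rows:
        raise ValueError("this sketch only handles non-empty arrays")
    fields = list(rows[0].keys())
    # The tabular form is only valid when every row has the same keys in the same order.
    if any(list(row.keys()) != fields for row in rows):
        raise ValueError("tabular form needs identical keys in every row")
    header = f"[{len(rows)}]{{{','.join(fields)}}}:"
    pad = " " * indent
    body = [pad + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])


rows = [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5},
]
print(encode_tabular(rows))
```

When the keys differ between rows, a real encoder would presumably fall back to the expanded list form instead of raising.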
JSON ↔ TOON Conversion
from toon import ToonConverter
converter = ToonConverter()
# JSON to TOON
json_str = '{"id": 1, "name": "Alice", "roles": ["admin", "user"]}'
toon = converter.json_to_toon(json_str)
print(toon)
# Output:
# id: 1
# name: Alice
# roles[2]: admin,user
# TOON to JSON
json_result = converter.toon_to_json(toon, indent=2)
print(json_result)
CSV ↔ TOON Conversion
from toon import ToonConverter
converter = ToonConverter()
# CSV to TOON
csv_data = """id,name,active
1,Ada,true
2,Bob,false"""
toon = converter.csv_to_toon(csv_data)
print(toon)
# Output:
# [2]{id,name,active}:
#   1,Ada,true
#   2,Bob,false
# TOON to CSV
csv_result = converter.toon_to_csv(toon)
print(csv_result)
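The reverse direction is mostly a matter of reading the field names out of the tabular header. The sketch below uses only the standard library; `tabular_to_csv` and the `HEADER` pattern are hypothetical names, and the code assumes a single comma-delimited tabular block with no quoted values. It does not check the declared row count.

```python
import csv
import io
import re

# Matches a header such as "[2]{id,name,active}:" and captures count and fields.
HEADER = re.compile(r"^\[(\d+)\]\{([^}]*)\}:$")


def tabular_to_csv(toon_text):
    """Convert one comma-delimited tabular TOON block to CSV text."""
    lines = [ln.strip() for ln in toon_text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty input")
    match = HEADER.match(lines[0])
    if match is None:
        raise ValueError("not a tabular TOON block")
    fields = match.group(2).split(",")
    rows = [ln.split(",") for ln in lines[1:]]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)
    writer.writerows(rows)
    return out.getvalue()


toon_text = "[2]{id,name,active}:\n  1,Ada,true\n  2,Bob,false"
print(tabular_to_csv(toon_text))
```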
XML ↔ TOON Conversion
from toon import ToonConverter
converter = ToonConverter()
# XML to TOON
xml_data = """<users>
  <user>
    <id>1</id>
    <name>Ada</name>
  </user>
</users>"""
toon = converter.xml_to_toon(xml_data)
print(toon)
# TOON to XML
xml_result = converter.toon_to_xml(toon, root_name="users")
print(xml_result)
Configuration Options
Encoder
from toon import ToonEncoder
# Customize indentation, delimiter, and length marker
encoder = ToonEncoder(
    indent=4,            # Spaces per level (default: 2)
    delimiter="|",       # Delimiter: ",", "\t", or "|" (default: ",")
    length_marker=True   # Include # marker (default: False)
)
data = [1, 2, 3]
print(encoder.encode(data))
# Output: [#3|]: 1|2|3
Decoder
from toon import ToonDecoder
# Strict or permissive mode
decoder = ToonDecoder(
    indent=2,     # Expected spaces per level (default: 2)
    strict=True   # Strict validation mode (default: True)
)
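This README does not list what strict mode checks. One plausible check, given that every array carries its length in the header, is that the declared length matches the number of items actually present. The sketch below illustrates that single check under that assumption; `decode_inline_array` and `LengthMismatch` are hypothetical names, not the library's API, and items are returned as raw strings.

```python
import re

# Matches an inline array such as "[3]: a,b,c" and captures count and body.
INLINE_ARRAY = re.compile(r"^\[(\d+)\]: ?(.*)$")


class LengthMismatch(ValueError):
    """Raised when the declared length and the item count disagree."""


def decode_inline_array(line, strict=True):
    """Parse a comma-delimited inline array, optionally validating its length."""
    match = INLINE_ARRAY.match(line)
    if match is None:
        raise ValueError(f"not an inline array: {line!r}")
    declared = int(match.group(1))
    body = match.group(2)
    items = body.split(",") if body else []
    # In strict mode a mismatch is an error; in permissive mode it is tolerated.
    if strict and len(items) != declared:
        raise LengthMismatch(f"declared {declared} items, found {len(items)}")
    return items


print(decode_inline_array("[3]: a,b,c"))
print(decode_inline_array("[3]: a,b", strict=False))
```

Calling `decode_inline_array("[3]: a,b")` with the default `strict=True` raises `LengthMismatch`, which is the kind of truncation a length header makes detectable.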
TOON Format Features
Primitives
# Numbers, strings, booleans, and null
encoder.encode(42) # → "42"
encoder.encode("hello") # → "hello"
encoder.encode(True) # → "true"
encoder.encode(None) # → "null"
Objects
# Simple and nested objects
data = {
    "user": {
        "id": 1,
        "name": "Ada"
    }
}
# Output:
# user:
#   id: 1
#   name: Ada
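Nesting is expressed purely through indentation, so an encoder can walk the object recursively and increase the padding at each level. A minimal sketch, assuming only dicts, integers, and plain strings as values; `encode_object` is a hypothetical helper, not the library's implementation.

```python
def encode_object(obj, indent=2, level=0):
    """Return the lines for a (possibly nested) dict, indented by nesting depth."""
    pad = " " * (indent * level)
    lines = []
    for key, value in obj.items():
        if isinstance(value, dict):
            # A nested object gets a bare "key:" line, then its fields one level deeper.
            lines.append(f"{pad}{key}:")
            lines.extend(encode_object(value, indent, level + 1))
        else:
            # Only ints and plain strings are handled here; no quoting or booleans.
            lines.append(f"{pad}{key}: {value}")
    return lines


print("\n".join(encode_object({"user": {"id": 1, "name": "Ada"}})))
```

Passing `indent=4` gives the wider layout that the encoder's `indent` option describes.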
Primitive Arrays (Inline)
[1, 2, 3, 4, 5]
# → [5]: 1,2,3,4,5
Object Arrays (Tabular)
[
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Bob"}
]
# → [2]{id,name}:
#     1,Ada
#     2,Bob
Mixed Arrays (Expanded)
[1, {"a": 1}, "text"]
# → [3]:
#     - 1
#     - a: 1
#     - text
Alternative Delimiters
# Tab-delimited (tab characters are shown as \t in the output below)
encoder = ToonEncoder(delimiter="\t")
encoder.encode([1, 2, 3])
# → [3\t]: 1\t2\t3
# Pipe-delimited
encoder = ToonEncoder(delimiter="|")
encoder.encode([1, 2, 3])
# → [3|]: 1|2|3
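Judging from the outputs in this README, a non-comma delimiter is repeated inside the brackets after the length, and the optional `#` marker goes before it. The following sketch reproduces that header layout for inline arrays of primitives; `encode_inline` is a hypothetical helper and the rule is inferred from the examples here, not taken from the library's source.

```python
def encode_inline(values, delimiter=",", length_marker=False):
    """Encode a list of primitives inline, declaring length and delimiter in the header."""
    marker = "#" if length_marker else ""
    # The comma is the default, so only non-comma delimiters appear in the header.
    shown = "" if delimiter == "," else delimiter
    body = delimiter.join(str(v) for v in values)
    return f"[{marker}{len(values)}{shown}]: {body}"


print(encode_inline([1, 2, 3]))
print(encode_inline([1, 2, 3], delimiter="|"))
print(encode_inline([1, 2, 3], delimiter="|", length_marker=True))
# repr() makes the tab characters visible as \t
print(repr(encode_inline([1, 2, 3], delimiter="\t")))
```

Declaring the delimiter in the header lets a decoder know how to split the row before it has read any values.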
Project Structure
toon/
├── src/
│   └── toon/
│       ├── __init__.py        # Main exports
│       ├── encoder.py         # ToonEncoder
│       ├── decoder.py         # ToonDecoder, ToonDecodeError
│       └── converter.py       # ToonConverter (JSON/CSV/XML)
├── tests/
│   ├── test_encoder.py        # Encoder tests (57 tests)
│   ├── test_decoder.py        # Decoder tests (56 tests)
│   └── test_converter.py      # Converter tests (33 tests)
├── examples.py                # Usage examples
├── pyproject.toml
├── README.md
└── SPEC.md                    # Complete TOON specification
Testing
# Run all tests
poetry run pytest
# Tests with coverage
poetry run pytest --cov=toon tests/
# Specific tests
poetry run pytest tests/test_encoder.py
poetry run pytest tests/test_decoder.py -v
Running Examples
# Run the examples file to see TOON in action
poetry run python examples.py
Command-Line Interface (CLI)
TOON includes a powerful CLI for file operations:
# Convert between formats
poetry run toon convert data.json data.toon --from json --to toon
poetry run toon convert inventory.csv inventory.toon --from csv --to toon
# Validate TOON files
poetry run toon validate data.toon
# Format/pretty-print TOON files
poetry run toon format data.toon
# Optimize for minimum token count
poetry run toon minify data.toon
# Show file information
poetry run toon info data.toon --verbose
# Compare TOON files
poetry run toon diff file1.toon file2.toon --semantic
Available Commands:
- convert - Convert between JSON/CSV/XML/TOON formats
- validate - Validate TOON file syntax
- format - Format/pretty-print TOON files
- minify - Optimize for minimum token count
- info - Show detailed file information
- diff - Compare two TOON files
See CLI.md for complete CLI documentation and examples.
Benefits of TOON
- Token Reduction: 30-60% fewer tokens than JSON for tabular data
- Readability: Clear and easy-to-read format
- Deterministic: Consistent and predictable encoding
- Strict Validation: Strict mode to ensure data integrity
- Interoperability: Easy conversion between JSON, CSV, and XML
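The token-reduction figure depends on the tokenizer and on the shape of the data. As a rough, tokenizer-free proxy, the sketch below compares character counts for the tabular example from this README in JSON and in TOON. Character counts are not token counts, so treat the result as an indication of where the savings come from (keys written once, no braces or quotes per row) and not as a benchmark.

```python
import json

rows = [
    {"sku": "A1", "qty": 2, "price": 9.99},
    {"sku": "B2", "qty": 1, "price": 14.5},
]

as_json = json.dumps(rows)
# The TOON text is written out by hand here so the sketch needs no third-party package.
as_toon = "[2]{sku,qty,price}:\n  A1,2,9.99\n  B2,1,14.5"

saving = 1 - len(as_toon) / len(as_json)
print(f"JSON: {len(as_json)} chars, TOON: {len(as_toon)} chars, saving: {saving:.0%}")
```

The gap generally widens as the number of rows grows, because JSON repeats every key in every row while the tabular header is written only once.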
Use Cases
- 📊 Efficient serialization of tabular data for LLMs
- 🔄 Data format transformation (JSON ↔ CSV ↔ XML ↔ TOON)
- 💾 Compact storage of configurations
- 📡 Transmission of structured data
- 🤖 Prompt contexts for language models
License
MIT
Specification
For complete details about the TOON format, see SPEC.md.
Development
Developed following TDD (Test-Driven Development) and Clean Code principles:
- ✅ 146 tests passing
- ✅ Complete edge case coverage
- ✅ Type hints throughout the codebase
- ✅ Comprehensive documentation
- ✅ Zero external dependencies
- ✅ Modular and maintainable code
Author
Juan Manuel Panozzo Zenere
Email: juanmanuel.panozzozenere@alumnos.uai.edu.ar
Acknowledgments
TOON specification by Johann Schopplich (@johannschopplich)
Download files
File details
Details for the file toon_formatter-1.0.1.tar.gz.
File metadata
- Download URL: toon_formatter-1.0.1.tar.gz
- Upload date:
- Size: 20.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.10.18 Darwin/25.0.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 806600392b71adc6077fe56cd947640977073f42f7abd82d09c64c7714082967 |
| MD5 | f67c00e09b1392f06237d5dd3cb5180c |
| BLAKE2b-256 | dc4165ed3d940405ccbcfb7923ec9c3238afd6b6e1a837ecd46a1eeef530304c |
File details
Details for the file toon_formatter-1.0.1-py3-none-any.whl.
File metadata
- Download URL: toon_formatter-1.0.1-py3-none-any.whl
- Upload date:
- Size: 20.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.4 CPython/3.10.18 Darwin/25.0.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a884832acdd8660d2c5d745995eb1066989c79a25b970bf603dcb2c1edefb9c9 |
| MD5 | b1691a5e12df2226f3dfc18c862ceedd |
| BLAKE2b-256 | 3a10e0b34dba206d01eee690d2f652442e92a71a055ad07fff3c724af01dcf2a |