Conversion utilities between JSON and TOON (Token-Oriented Object Notation)

🔄 toonpy

A production-grade Python library and CLI that converts data between JSON and TOON (Token-Oriented Object Notation) while fully conforming to TOON SPEC v2.0. Perfect for developers and data engineers who need efficient, token-optimized data serialization.

✅ Full TOON SPEC v2.0 Compliance - This library implements all examples from the official TOON specification repository, ensuring complete compatibility with the standard.

✨ Features

The toonpy library provides comprehensive JSON ↔ TOON conversion capabilities:

🔧 1. Lossless Conversion

Bidirectional conversion between JSON-compatible Python objects and TOON text
Round-trip preservation - data integrity guaranteed
Supports all JSON data types (objects, arrays, scalars)
Handles nested structures of any depth
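
The round-trip property can be checked mechanically. The sketch below does not depend on toonpy: `round_trips` is a hypothetical helper that accepts any encode/decode pair, and the standard `json` module stands in for the codec so the snippet runs on its own. With the library installed, `to_toon` and `from_toon` could presumably be passed in instead.

```python
import json
from typing import Any, Callable


def round_trips(value: Any,
                encode: Callable[[Any], str],
                decode: Callable[[str], Any]) -> bool:
    """Return True if decode(encode(value)) reproduces value exactly."""
    return decode(encode(value)) == value


# json.dumps/json.loads stand in for the codec under test.
sample = {"crew": [{"id": 1, "name": "Luz"}], "active": True, "ship": None}
print(round_trips(sample, json.dumps, json.loads))
```

Values outside the JSON data model do not survive such a check: with `json` as the codec, a tuple comes back as a list, so the helper reports a mismatch.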

📊 2. Advanced Parser & Lexer

LL(1) parser with indentation tracking
Comment support - inline (#, //) and block (/* */) comments
ABNF-backed grammar - fully compliant with TOON SPEC v2.0
Error reporting with line and column numbers
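
As a rough illustration of line-and-column error reporting, the sketch below maps a character offset in the source text to a 1-indexed position. The function name `line_and_column` is an assumption made for this example and is not part of the toonpy API; how the library tracks positions internally is not described here.

```python
def line_and_column(text: str, offset: int) -> tuple[int, int]:
    """Map a 0-based character offset to a 1-indexed (line, column) pair."""
    if not 0 <= offset <= len(text):
        raise ValueError("offset out of range")
    # Every newline before the offset pushes the position one line down.
    line = text.count("\n", 0, offset) + 1
    # rfind returns -1 on the first line, which makes the column 1-indexed.
    last_newline = text.rfind("\n", 0, offset)
    column = offset - last_newline
    return line, column


source = "a: 1\nb: ?"
print(line_and_column(source, source.index("?")))
```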

🚀 3. Automatic Tabular Detection

Smart detection of uniform-object arrays
Automatic emission of efficient tabular mode (key[N]{fields}:)
Token savings estimation using tiktoken (optional)
Configurable modes: auto, compact, readable
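
To make the idea concrete, here is a minimal sketch of detecting a uniform-object array and emitting it in the `key[N]{fields}:` form. It is not the library's implementation: `uniform_keys` and `emit_table` are hypothetical names, the uniformity rule (same keys in the same order, scalar values only) is an assumption, and quoting or escaping of cell values is left out.

```python
from typing import Any, Optional


def uniform_keys(items: Any) -> Optional[list]:
    """Return the shared key list if items is a non-empty list of dicts
    with identical keys (same order) and only scalar values, else None."""
    if not isinstance(items, list) or not items:
        return None
    if not all(isinstance(item, dict) for item in items):
        return None
    keys = list(items[0])
    if not keys:
        return None
    for item in items:
        if list(item) != keys:
            return None
        if any(isinstance(v, (dict, list)) for v in item.values()):
            return None
    return keys


def emit_table(name: str, items: list, keys: list, indent: int = 2) -> str:
    """Render a header line plus one comma-separated row per item.
    Cell values are written with str(); no quoting is applied."""
    header = f"{name}[{len(items)}]{{{','.join(keys)}}}:"
    rows = [" " * indent + ",".join(str(item[k]) for k in keys) for item in items]
    return "\n".join([header, *rows])


crew = [{"id": 1, "name": "Luz"}, {"id": 2, "name": "Amity"}]
keys = uniform_keys(crew)
if keys is not None:
    print(emit_table("crew", crew, keys))
```

An array whose elements differ in their keys, or that contains nested containers, makes `uniform_keys` return `None`, in which case a serializer would fall back to a non-tabular layout.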

🛠️ 4. CLI & Utilities

Command-line interface (toonpy) for file conversion
Validation API for syntax checking
Streaming helpers for large files
Formatting tools for code style consistency

📦 Installation

# Clone the repository
git clone https://github.com/shinjidev/toonpy.git
cd toonpy

# Install the package
pip install .

# Or install with optional extras
pip install ".[tests]"      # Include testing dependencies
pip install ".[examples]"   # Include tiktoken for token counting

Requirements: Python 3.9+

🚀 Quick Start

from toontools import to_toon, from_toon

# Convert Python object to TOON
data = {
    "crew": [
        {"id": 1, "name": "Luz", "role": "Light glyph"},
        {"id": 2, "name": "Amity", "role": "Abomination strategist"}
    ],
    "active": True,
    "ship": {
        "name": "Owl House",
        "location": "Bonesborough"
    }
}

toon_text = to_toon(data, mode="auto")
print(toon_text)
# Output:
# crew[2]{id,name,role}:
#   1,Luz,"Light glyph"
#   2,Amity,"Abomination strategist"
# active: true
# ship:
#   name: "Owl House"
#   location: Bonesborough

# Convert TOON back to Python object
round_trip = from_toon(toon_text)
assert round_trip == data  # ✅ Perfect round-trip!

📖 Detailed Usage

Python API

Basic Conversion

from toontools import to_toon, from_toon

# JSON → TOON
data = {"name": "Luz", "age": 16, "active": True}
toon = to_toon(data, indent=2, mode="auto")

# TOON → JSON
parsed = from_toon(toon)
assert parsed == data

Validation

from toontools import validate_toon

toon_text = """
crew[2]{id,name}:
  1,Luz
  2,Amity
"""

is_valid, errors = validate_toon(toon_text, strict=True)
if not is_valid:
    for error in errors:
        print(f"Error: {error}")

Tabular Suggestions

from toontools import suggest_tabular

crew = [
    {"id": 1, "name": "Luz"},
    {"id": 2, "name": "Amity"}
]

suggestion = suggest_tabular(crew)
if suggestion.use_tabular:
    print(f"Use tabular format! Estimated savings: {suggestion.estimated_savings} tokens")
    print(f"Fields: {suggestion.keys}")

Streaming Large Files

from toontools import stream_to_toon

with open("large_data.json", "r") as fin, open("output.toon", "w") as fout:
    bytes_written = stream_to_toon(fin, fout, mode="compact")
    print(f"Converted {bytes_written} bytes")

Command-Line Interface

Convert JSON to TOON

toonpy to --in data.json --out data.toon --mode readable --indent 2

Convert TOON to JSON

toonpy from --in data.toon --out data.json --permissive

Format a TOON File

toonpy fmt --in data.toon --out data.formatted.toon --mode readable

Exit Codes:

0 - Success
2 - TOON syntax error
3 - General error
4 - I/O error
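
A script that shells out to the CLI may want to turn these statuses into readable labels. The sketch below encodes the four codes listed above; the `ToonpyExit` class, its member names, and `describe_exit` are illustrative names, not something the package is known to export.

```python
from enum import IntEnum


class ToonpyExit(IntEnum):
    """Exit codes as listed above; the names are illustrative."""
    SUCCESS = 0
    SYNTAX_ERROR = 2
    GENERAL_ERROR = 3
    IO_ERROR = 4


def describe_exit(code: int) -> str:
    """Translate an exit status into a short label."""
    try:
        return ToonpyExit(code).name
    except ValueError:
        # Any status not in the documented list is reported as unknown.
        return f"UNKNOWN({code})"


print(describe_exit(2))
```

In practice the argument would be the `returncode` of a `subprocess.run([...])` call that invokes `toonpy`.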

🧪 Testing

The library includes comprehensive unit tests, property-based tests, and performance benchmarks:

# Run all tests
pytest

# Run with coverage
pytest --cov=toonpy --cov-report=html

# Run performance benchmarks
pytest tests/test_benchmark.py -v -s

# Run specific test file
pytest tests/test_parser.py -v

Test Coverage:

✅ Unit tests for parser, serializer, API, and CLI
✅ Property-based tests with Hypothesis for round-trip verification
✅ Performance benchmarks for speed validation
✅ Edge cases: multiline strings, comments, empty containers
✅ Error handling and validation

Example Test Output:

============================= test session starts =============================
tests/test_parser.py::test_parse_object_and_array PASSED
tests/test_parser.py::test_parse_table_block PASSED
tests/test_serializer.py::test_round_trip_simple PASSED
tests/test_benchmark.py::test_serialize_small_data PASSED
...
============================== 20+ passed in 3.45s ==============================

⚡ Performance

toonpy is optimized for speed and efficiency. The library includes comprehensive performance benchmarks:

Benchmark Results

Run the benchmarks to see real-time performance metrics:

pytest tests/test_benchmark.py -v -s

Typical Performance (on modern hardware):

Operation	Dataset Size	Time	Throughput
Serialize small data	3 fields	~0.01 ms	~100K ops/s
Parse small data	3 fields	~0.02 ms	~50K ops/s
Serialize tabular	100 rows	~1-2 ms	~500-1000 ops/s
Parse tabular	100 rows	~2-3 ms	~300-500 ops/s
Round-trip	500 rows	~15 ms	~65 ops/s
Large file (1000 rows)	1K records	~4-5 ms	~200 ops/s
Nested structures	Depth 10	< 1 ms	> 1000 ops/s

Performance Characteristics:

⚡ Fast serialization and parsing - Optimized serializer and parser with minimal overhead
🚀 Efficient tabular format - Automatic detection reduces token count by 30-50%
📊 Reasonable performance - Typically 10-15x slower than JSON serialization for small datasets, but more token-efficient for large tabular data
🔄 Fast round-trips - Complete JSON → TOON → JSON conversion in milliseconds
💾 Token savings - Tabular format can reduce token count significantly, making it ideal for LLM applications

Example Benchmark Output (illustrative; figures vary between runs and machines):

[Benchmark] Small data serialization: 0.008 ms/op
[Benchmark] Small data parsing: 0.004 ms/op
[Benchmark] Tabular data serialization (100 rows): 8.234 ms
[Benchmark] Tabular data parsing (100 rows): 4.567 ms
[Benchmark] Round-trip (500 rows): 18.901 ms
[Benchmark] Performance comparison (100 rows):
  JSON:  2.345 ms
  TOON:  5.678 ms
  Ratio: 2.42x

📊 Example Output

Input JSON:

{
  "crew": [
    {"id": 1, "name": "Luz", "role": "Light glyph"},
    {"id": 2, "name": "Amity", "role": "Abomination strategist"}
  ],
  "active": true,
  "ship": {
    "name": "Owl House",
    "location": "Bonesborough"
  }
}

Output TOON (auto mode):

crew[2]{id,name,role}:
  1,Luz,"Light glyph"
  2,Amity,"Abomination strategist"
active: true
ship:
  name: "Owl House"
  location: Bonesborough

Token Savings: The tabular format (crew[2]{id,name,role}:) reduces token count by ~40% compared to standard JSON array format!
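
For a rough sense of the size difference, the sketch below compares the compact JSON form of the crew array with the tabular TOON text shown above. It counts characters, which is only a loose proxy for tokens; actual token savings depend on the tokenizer and on the JSON formatting used as the baseline, so the percentage it prints should not be read as a token figure.

```python
import json

crew = [
    {"id": 1, "name": "Luz", "role": "Light glyph"},
    {"id": 2, "name": "Amity", "role": "Abomination strategist"},
]

# Compact JSON for the crew array, wrapped in its key.
json_text = json.dumps({"crew": crew}, separators=(",", ":"))

# The tabular TOON rendering from the example output, copied verbatim.
toon_text = (
    'crew[2]{id,name,role}:\n'
    '  1,Luz,"Light glyph"\n'
    '  2,Amity,"Abomination strategist"'
)

# Fraction of characters saved relative to the JSON baseline.
saving = 1 - len(toon_text) / len(json_text)
print(f"JSON: {len(json_text)} chars, TOON: {len(toon_text)} chars, "
      f"saving: {saving:.0%}")
```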

🛠️ API Reference

Core Functions

`to_toon(obj, *, indent=2, mode="auto") -> str`

Convert a Python object to TOON format string.

Parameters:

obj (Any): Python object compatible with JSON model
indent (int): Number of spaces per indentation level (default: 2)
mode (str): Serialization mode - "auto", "compact", or "readable"

Returns: str - TOON-formatted string

Example:

data = {"name": "Luz", "active": True}
toon = to_toon(data, mode="auto")

`from_toon(source, *, mode="strict") -> Any`

Parse a TOON string into a Python object.

Parameters:

source (str): TOON-formatted string to parse
mode (str): Parsing mode - "strict" or "permissive"

Returns: Any - Python object (dict, list, or scalar)

Raises: ToonSyntaxError if TOON string is malformed

Example:

toon = 'name: "Luz"\nactive: true'
data = from_toon(toon)
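
To give a feel for one small piece of what parsing involves, the sketch below splits a tabular header line such as `crew[2]{id,name,role}:` into its parts. It is a simplified regular-expression illustration, not the library's LL(1) parser; `parse_table_header` and the pattern are assumptions made for this example and cover only plain word-like names.

```python
import re

# name[count]{field1,field2,...}:  (a simplified pattern for illustration only)
_HEADER = re.compile(r"^(?P<name>\w+)\[(?P<count>\d+)\]\{(?P<fields>[^}]*)\}:$")


def parse_table_header(line: str) -> tuple[str, int, list[str]]:
    """Split a header line into (name, declared row count, field names)."""
    match = _HEADER.match(line.strip())
    if match is None:
        raise ValueError(f"not a tabular header: {line!r}")
    fields = [f.strip() for f in match["fields"].split(",") if f.strip()]
    return match["name"], int(match["count"]), fields


print(parse_table_header("crew[2]{id,name,role}:"))
```

A parser can use the declared count to check that the expected number of rows follows the header.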

`validate_toon(source, *, strict=True) -> tuple[bool, List[ValidationError]]`

Validate a TOON string for syntax errors.

Parameters:

source (str): TOON-formatted string to validate
strict (bool): If True, use strict parsing mode

Returns: tuple[bool, List[ValidationError]] - (is_valid, list_of_errors)

`suggest_tabular(obj) -> TabularSuggestion`

Suggest whether an array should use tabular format.

Parameters:

obj (Sequence): Sequence to analyze

Returns: TabularSuggestion - Recommendation with estimated savings

`stream_to_toon(fileobj_in, fileobj_out, *, chunk_size=65536, indent=2, mode="auto") -> int`

Stream JSON from input file to TOON output file.

Parameters:

fileobj_in (TextIO): Input file object containing JSON
fileobj_out (TextIO): Output file object for TOON
chunk_size (int): Size of chunks to read (default: 65536)
indent (int): Indentation level
mode (str): Serialization mode

Returns: int - Number of bytes written
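
The shape of such a helper can be sketched without the library. In the example below, `stream_convert` is a hypothetical stand-in that takes the encoder as a parameter. It reads the input in `chunk_size` pieces, but because the standard `json` module needs the complete document before it can decode, the chunks are joined first; how toonpy itself handles large inputs is not described here. The return value is what a text-mode `write` reports, which is a character count.

```python
import io
import json
from typing import Any, Callable, TextIO


def stream_convert(fileobj_in: TextIO, fileobj_out: TextIO,
                   encode: Callable[[Any], str],
                   chunk_size: int = 65536) -> int:
    """Read JSON text in chunks, encode the decoded value, and write it out.
    Returns the number of characters written."""
    chunks = []
    while True:
        chunk = fileobj_in.read(chunk_size)
        if not chunk:  # empty string signals end of input
            break
        chunks.append(chunk)
    value = json.loads("".join(chunks))
    return fileobj_out.write(encode(value))


# In-memory files and a JSON pretty-printer stand in for real files and to_toon.
fin = io.StringIO('{"name": "Luz", "age": 16}')
fout = io.StringIO()
written = stream_convert(fin, fout, lambda v: json.dumps(v, indent=2), chunk_size=8)
print(written)
```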

Error Classes

`ToonSyntaxError`

Raised when TOON input does not conform to the grammar.

Attributes:

message (str): Error message
line (int | None): Line number (1-indexed)
column (int | None): Column number (1-indexed)

Example:

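# Assumes ToonSyntaxError is importable from the same package as from_toon.
from toontools import from_toon, ToonSyntaxError

try:
    data = from_toon("invalid syntax")
except ToonSyntaxError as e:
    print(f"Error at line {e.line}, column {e.column}: {e.message}")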

📝 Requirements

Python >= 3.9
No external dependencies (pure Python)
Optional: tiktoken >= 0.5.2 for token counting (install with pip install ".[examples]")

📚 Documentation

Comprehensive documentation is available in the docs/ directory:

docs/spec_summary.md – Concise TOON SPEC v2.0 overview with ABNF notes
docs/examples.md – JSON⇄TOON conversion examples
docs/assumptions.md – Documented gaps/assumptions + strict vs. permissive behavior

Note: Tabular format heuristics are documented in the code (see toonpy/serializer.py and toonpy/utils.py). The library automatically detects uniform arrays and uses tabular format when it saves tokens.

🌟 Use Cases

Data Serialization: Efficient storage and transmission of structured data
API Development: Lightweight data format for REST APIs
Configuration Files: Human-readable config format with comment support
Data Pipelines: Stream processing of large JSON datasets
ML/AI Projects: Token-optimized format for LLM training data
Documentation: Self-documenting data format with inline comments

📖 Examples

This library includes examples covering every case in the official TOON specification's example set. See the examples/ directory:

example1 - Basic tabular array with nested objects
example2 - Nested objects with arrays
example3 - Mixed array types
example4 - Multiline strings
example5 - Empty containers and scalars
example6 - Large tabular arrays
example7 - Complex nested structures
example8 - Deep nesting examples

All examples are compatible with the official TOON specification and can be validated against the reference implementation.

Try them with the CLI:

toonpy to --in examples/example1.json --out examples/example1.generated.toon
toonpy from --in examples/example1.toon --out examples/example1.generated.json

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

Guidelines:

Follow PEP 8 style guidelines
Add tests for new features
Update documentation as needed
Ensure all tests pass: pytest
Keep additions aligned with TOON SPEC v2.0

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👨‍💻 Author

Christian Palomares - @shinjidev

☕ Support

If you find this project helpful, consider supporting my work:

Buy me a coffee to help me continue developing open-source tools for the developer community!

🙏 Acknowledgments

Built following TOON SPEC v2.0
Inspired by the need for efficient, token-optimized data serialization
Uses property-based testing with Hypothesis for robust validation

⭐ Star this repository if you find it useful! ⭐

Release history

0.5.0 - Nov 27, 2025
0.4.0 - Nov 25, 2025
0.3.0 - Nov 25, 2025
0.2.0 - Nov 19, 2025
0.1.0 - Nov 19, 2025 (this version)

Download files

Source Distribution

toontools-0.1.0.tar.gz (32.4 kB view details)

Uploaded Nov 19, 2025 Source

Built Distribution

toontools-0.1.0-py3-none-any.whl (26.9 kB view details)

Uploaded Nov 19, 2025 Python 3

File details

Details for the file toontools-0.1.0.tar.gz.

File metadata

Download URL: toontools-0.1.0.tar.gz
Upload date: Nov 19, 2025
Size: 32.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.6

File hashes

Hashes for toontools-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`0cb14d0eecdad18473135b625d35a4f24797683b87568188fa7465f41fc04f59`
MD5	`c723b8072609df3abe9df8591e51208b`
BLAKE2b-256	`54fff4945e29824f72b8d376a504711e78947aabe98e164f0cae40f1abeff719`

File details

Details for the file toontools-0.1.0-py3-none-any.whl.

File metadata

Download URL: toontools-0.1.0-py3-none-any.whl
Upload date: Nov 19, 2025
Size: 26.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.6

File hashes

Hashes for toontools-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`39872ceb1c4593671cc96fb2cdb469629e5d66d50e7fe2a5b1dacde6e7157382`
MD5	`15172d0ff06f4159bf06cd0bcc9743de`
BLAKE2b-256	`f09755824f389eb9294545179941d2006da4297bf3068d5106c132b35a06c724`
