Conversion utilities between JSON, YAML, TOML and TOON (Token-Oriented Object Notation)

🔄 toonpy

A production-grade Python library and CLI that converts data between JSON, YAML, TOML, and TOON (Token-Oriented Object Notation) while fully conforming to TOON SPEC v2.0. Perfect for developers and data engineers who need efficient, token-optimized data serialization.

📦 Current Version: 0.5.0 - TOML support added! Now supporting JSON, YAML, and TOML formats. See What's New in v0.5.0 and Performance sections for details.

✅ Full TOON SPEC v2.0 Compliance - This library implements all examples from the official TOON specification repository, ensuring complete compatibility with the standard.

🚀 What's New in v0.5.0

TOML Support Release (November 2025) - This version adds comprehensive TOML support, completing our multi-format ecosystem:

  • 🔄 TOML ↔ TOON conversion - Bidirectional conversion with streaming support
  • 📦 Optional dependency model - Zero-dependency core, install TOML support only if needed: pip install toontools[toml]
  • 🎯 CLI commands - New toml-to-toon and toon-to-toml commands
  • ⚡ Modest overhead - TOML conversion adds 124-196% overhead vs pure TOML parsing
  • 💾 Space efficiency - TOON format saves up to 53% space for array-of-tables data
  • ✅ 29 new tests - Comprehensive TOML test coverage with 100% pass rate
  • 📊 Benchmark suite - Detailed performance analysis for all conversion scenarios

Key Performance Metrics:

| Scenario | TOML → TOON Time | Space Savings | Round-trip |
| --- | --- | --- | --- |
| Simple table | 0.025 ms | -15.6% | ✅ YES |
| Nested tables | 0.046 ms | -16.4% | ✅ YES |
| Array of tables | 0.080 ms | +47.1% | ✅ YES |
| Complex config | 0.103 ms | -22.4% | ✅ YES |
| Large dataset (100 entries) | 1.39 ms | +53.1% | ✅ YES |

Why Optional Dependencies?

  • Lightweight core: Keep toontools dependency-free for JSON ↔ TOON workflows
  • Install what you need: Only add PyYAML or tomli/tomli-w if you need YAML/TOML support
  • Best of both worlds: Zero-dependency simplicity + extended format support
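In practice, this optional-dependency pattern usually boils down to a guarded import. A minimal sketch of the idea, not toontools' actual internals (the `HAS_YAML` flag and helper name here are illustrative):

```python
# Guarded import: the core works without PyYAML; YAML helpers are only
# usable when the optional dependency is actually installed.
try:
    import yaml  # provided by the optional PyYAML extra
    HAS_YAML = True
except ImportError:
    yaml = None
    HAS_YAML = False

def to_dict_from_yaml(text: str):
    """Parse YAML input, failing with a clear hint when the extra is missing."""
    if not HAS_YAML:
        raise ImportError("YAML support requires: pip install toontools[yaml]")
    return yaml.safe_load(text)
```

The failure mode matters: instead of a bare ModuleNotFoundError deep inside the library, the user gets an actionable install hint at the call site.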

Previous Releases:

  • v0.4.0 (November 2025): YAML support with optional dependencies
  • v0.3.0 (November 2025): Major performance improvements (20-70% faster)

Backward Compatibility: ✅ 100% compatible with all previous versions - drop-in replacement, no code changes required!

See RELEASE_NOTES.md for complete details and CHANGELOG.md for the full changelog.

✨ Features

The toonpy library provides comprehensive JSON / YAML / TOML ↔ TOON conversion capabilities:

🔧 1. Lossless Conversion

  • Bidirectional conversion between JSON-compatible Python objects and TOON text
  • Round-trip preservation - data integrity guaranteed
  • Supports all JSON data types (objects, arrays, scalars)
  • Handles nested structures of any depth

📊 2. Advanced Parser & Lexer

  • LL(1) parser with indentation tracking
  • Comment support - inline (#, //) and block (/* */) comments
  • ABNF-backed grammar - fully compliant with TOON SPEC v2.0
  • Error reporting with line and column numbers

🚀 3. Automatic Tabular Detection

  • Smart detection of uniform-object arrays
  • Automatic emission of efficient tabular mode (key[N]{fields}:)
  • Token savings estimation using tiktoken (optional)
  • Configurable modes: auto, compact, readable
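The detection step can be pictured as a uniformity check: an array qualifies for tabular mode only when every element is an object with the same fields, in the same order. A simplified sketch of that idea (not the library's actual heuristic):

```python
def is_uniform_object_array(items) -> bool:
    """True when every element is a dict sharing one key set, so the array
    can be emitted as key[N]{fields}: followed by comma-separated rows."""
    if not items or not all(isinstance(item, dict) for item in items):
        return False
    fields = tuple(items[0])                # dicts preserve insertion order
    return all(tuple(item) == fields for item in items[1:])

crew = [{"id": 1, "name": "Luz"}, {"id": 2, "name": "Amity"}]
mixed = [{"id": 1}, {"name": "Amity"}]
```

Only arrays passing a check like this can drop per-row field names, which is where the token savings come from.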

๐Ÿ› ๏ธ 4. CLI & Utilities

  • Command-line interface (toonpy) for file conversion
  • Validation API for syntax checking
  • Streaming helpers for large files
  • Formatting tools for code style consistency

🔄 5. Multi-Format Support (Optional)

YAML Support:

  • YAML ↔ TOON conversion with optimized performance
  • Streaming YAML to TOON for large files
  • CLI commands for YAML file conversion
  • Full Unicode support and proper type handling

TOML Support:

  • TOML ↔ TOON conversion with high performance
  • Up to 53% space savings for array-of-tables data
  • CLI commands for TOML file conversion
  • Round-trip data integrity preservation

📦 Installation

Install from PyPI (Recommended)

pip install toontools

Or install a specific version:

pip install toontools==0.5.0

📦 PyPI Package: toontools on PyPI | Latest: v0.5.0

Install from Source

# Clone the repository
git clone https://github.com/shinjidev/toonpy.git
cd toonpy

# Install the package
pip install .

# Or install with optional extras
pip install .[tests]      # Include testing dependencies
pip install .[examples]   # Include tiktoken for token counting
pip install .[yaml]       # Include PyYAML for YAML support
pip install .[toml]       # Include tomli/tomli-w for TOML support

Requirements: Python 3.9+

Core Philosophy: toontools follows a "zero-dependency core" design. The base installation requires no external packages, ensuring fast installs and minimal footprint. Additional format support (YAML, etc.) is available as optional dependencies.

Optional: YAML Support

To enable YAML ↔ TOON conversion:

pip install toontools[yaml]
# or
pip install PyYAML>=6.0

Optional: TOML Support

To enable TOML ↔ TOON conversion:

pip install toontools[toml]
# or
pip install tomli>=2.0 tomli-w>=1.0

Why optional? YAML and TOML support are opt-in to keep the core library lightweight (~60KB, 0 dependencies). Most users only need JSON ↔ TOON conversion. If you need YAML or TOML support, simply install the matching extra and the corresponding functions become available automatically.

🚀 Quick Start

from toontools import to_toon, from_toon

# Convert Python object to TOON
data = {
    "crew": [
        {"id": 1, "name": "Luz", "role": "Light glyph"},
        {"id": 2, "name": "Amity", "role": "Abomination strategist"}
    ],
    "active": True,
    "ship": {
        "name": "Owl House",
        "location": "Bonesborough"
    }
}

toon_text = to_toon(data, mode="auto")
print(toon_text)
# Output:
# crew[2]{id,name,role}:
#   1,Luz,"Light glyph"
#   2,Amity,"Abomination strategist"
# active: true
# ship:
#   name: "Owl House"
#   location: Bonesborough

# Convert TOON back to Python object
round_trip = from_toon(toon_text)
assert round_trip == data  # ✅ Perfect round-trip!

📖 Detailed Usage

Python API

Basic Conversion

from toontools import to_toon, from_toon

# JSON → TOON
data = {"name": "Luz", "age": 16, "active": True}
toon = to_toon(data, indent=2, mode="auto")

# TOON → JSON
parsed = from_toon(toon)
assert parsed == data

Validation

from toontools import validate_toon

toon_text = """
crew[2]{id,name}:
  1,Luz
  2,Amity
"""

is_valid, errors = validate_toon(toon_text, strict=True)
if not is_valid:
    for error in errors:
        print(f"Error: {error}")

Tabular Suggestions

from toontools import suggest_tabular

crew = [
    {"id": 1, "name": "Luz"},
    {"id": 2, "name": "Amity"}
]

suggestion = suggest_tabular(crew)
if suggestion.use_tabular:
    print(f"Use tabular format! Estimated savings: {suggestion.estimated_savings} tokens")
    print(f"Fields: {suggestion.keys}")

Streaming Large Files

from toontools import stream_to_toon

with open("large_data.json", "r") as fin, open("output.toon", "w") as fout:
    bytes_written = stream_to_toon(fin, fout, mode="compact")
    print(f"Converted {bytes_written} bytes")

YAML Support

Convert YAML to TOON:

from toontools import to_toon_from_yaml

yaml_str = """
crew:
  - id: 1
    name: Luz
    role: Magic user
  - id: 2
    name: Amity
    role: Strategist
"""

toon_str = to_toon_from_yaml(yaml_str, mode="auto")
print(toon_str)
# Output:
# crew[2]{id,name,role}:
#   1,Luz,"Magic user"
#   2,Amity,Strategist

Convert TOON to YAML:

from toontools import to_yaml_from_toon

toon_str = """
crew[2]{id,name}:
  1,Luz
  2,Amity
active: true
"""

yaml_str = to_yaml_from_toon(toon_str)
print(yaml_str)
# Output:
# crew:
# - id: 1
#   name: Luz
# - id: 2
#   name: Amity
# active: true

Stream YAML to TOON:

from toontools import stream_yaml_to_toon

with open("data.yaml", "r") as fin, open("output.toon", "w") as fout:
    bytes_written = stream_yaml_to_toon(fin, fout, mode="auto")
    print(f"Converted {bytes_written} bytes")

Note: Requires pip install toontools[yaml] or pip install PyYAML>=6.0

TOML Support

Convert TOML to TOON:

from toontools import to_toon_from_toml

toml_str = """
[user]
name = "Luz Noceda"
age = 14
active = true

[[crew]]
id = 1
name = "Luz"
role = "Human"

[[crew]]
id = 2
name = "Eda"
role = "Witch"
"""

toon_str = to_toon_from_toml(toml_str, mode="auto")
print(toon_str)
# Output uses compact tabular format for crew array

Convert TOON to TOML:

from toontools import to_toml_from_toon

toon_str = """
user:
  name: "Luz Noceda"
  age: 14
  active: true

crew[2]{id,name,role}:
  1,Luz,Human
  2,Eda,Witch
"""

toml_str = to_toml_from_toon(toon_str)
print(toml_str)
# Output:
# [user]
# name = "Luz Noceda"
# age = 14
# active = true
# 
# crew = [
#     { id = 1, name = "Luz", role = "Human" },
#     { id = 2, name = "Eda", role = "Witch" },
# ]

Stream TOML to TOON:

from toontools import stream_toml_to_toon

with open("config.toml", "r") as fin, open("output.toon", "w") as fout:
    bytes_written = stream_toml_to_toon(fin, fout, mode="auto")
    print(f"Converted {bytes_written} bytes")

Note: Requires pip install toontools[toml] or pip install tomli>=2.0 tomli-w>=1.0

Command-Line Interface

Convert JSON to TOON

toonpy to --in data.json --out data.toon --mode readable --indent 2

Convert TOON to JSON

toonpy from --in data.toon --out data.json --permissive

Format a TOON File

toonpy fmt --in data.toon --out data.formatted.toon --mode readable

Convert YAML to TOON

toonpy yaml-to-toon --in data.yaml --out data.toon --mode auto

Convert TOON to YAML

toonpy toon-to-yaml --in data.toon --out data.yaml

Note: YAML commands require pip install toontools[yaml]

Convert TOML to TOON

toonpy toml-to-toon --in config.toml --out config.toon --mode auto

Convert TOON to TOML

toonpy toon-to-toml --in data.toon --out data.toml

Note: TOML commands require pip install toontools[toml]

Exit Codes:

  • 0 - Success
  • 2 - TOON syntax error
  • 3 - General error
  • 4 - I/O error

🧪 Testing

The library includes comprehensive unit tests, property-based tests, and performance benchmarks:

# Run all tests
pytest

# Run with coverage
pytest --cov=toonpy --cov-report=html

# Run performance benchmarks
pytest tests/test_benchmark.py -v -s

# Run specific test file
pytest tests/test_parser.py -v

Test Coverage:

  • ✅ Unit tests for parser, serializer, API, and CLI
  • ✅ Property-based tests with Hypothesis for round-trip verification
  • ✅ Performance benchmarks for speed validation
  • ✅ Edge cases: multiline strings, comments, empty containers
  • ✅ Error handling and validation

Example Test Output:

============================= test session starts =============================
tests/test_parser.py::test_parse_object_and_array PASSED
tests/test_parser.py::test_parse_table_block PASSED
tests/test_serializer.py::test_round_trip_simple PASSED
tests/test_benchmark.py::test_serialize_small_data PASSED
...
============================== 20+ passed in 3.45s ==============================

⚡ Performance

toonpy v0.3.0 delivers exceptional performance with major speed improvements across all components. This release represents a comprehensive optimization effort with measurable gains of 20-70% in key operations.

🚀 Performance Highlights (v0.3.0)

| Component | Key Operation | Improvement | Impact |
| --- | --- | --- | --- |
| Parser | Comment-free files | +70% | Dramatically faster parsing when no comments present |
| Parser | Literal parsing | +30-40% | Common values (true, false, null) cached |
| Parser | Overall parsing | +20-50% | Comprehensive optimizations across all operations |
| Serializer | Key serialization paths | +70% | Type checking streamlined |
| Serializer | Container handling | +35-40% | Reduced redundant isinstance() checks |
| Utils | Number parsing | +10-15% | Try/except approach with regex fallback |
| Utils | Row splitting | Significant | String slicing instead of char-by-char building |
| Parallel | Memory usage | Improved | executor.map() for better efficiency |

Benchmark Results

Run the benchmarks to see real-time performance metrics:

# Run comprehensive benchmark suite
pytest tests/test_benchmark.py -v -s

# Run module-specific benchmarks
python benchmark_optimizations.py    # Parser benchmarks
python benchmark_serializer.py       # Serializer benchmarks
python benchmark_parallel.py         # Parallel module benchmarks

Typical Performance (v0.3.0 on modern hardware):

| Operation | Dataset Size | Time | Throughput | vs v0.2.0 |
| --- | --- | --- | --- | --- |
| Serialize small data | 3 fields | ~0.010 ms | ~100K ops/s | +30% faster |
| Parse small data | 3 fields | ~0.012 ms | ~83K ops/s | +40% faster |
| Serialize tabular | 100 rows | ~0.30 ms | ~3,300 ops/s | ~70% faster |
| Parse tabular | 100 rows | ~1.20 ms | ~830 ops/s | ~40% faster |
| Round-trip | 500 rows | ~8.5 ms | ~118 ops/s | ~40% faster |
| Large file (1000 rows) | 1K records | ~3-4 ms | ~250-330 ops/s | ~50% faster |
| Nested structures | Depth 10 | ~0.25 ms | ~4,000 ops/s | ~74% faster |
| Comment removal | Comment-free | ~0.05 ms | ~20K ops/s | ~70% faster |

Performance Characteristics:

  • ⚡ Blazing fast serialization - Optimized with literal caching and streamlined logic
  • 🚀 Efficient tabular format - Automatic detection reduces token count by 30-50%
  • 📊 Competitive with JSON - Now only 3-5x slower than JSON (vs 7-12x in v0.2.0)
  • 🔄 Fast round-trips - Complete JSON → TOON → JSON conversion in single-digit milliseconds
  • 💾 Token savings - Tabular format ideal for LLM applications
  • 🎯 Production-ready - Optimized for real-world workloads

Example Benchmark Output (v0.3.0):

[Benchmark] Small data serialization: 0.010 ms/op (30% faster)
[Benchmark] Small data parsing: 0.012 ms/op (40% faster)
[Benchmark] Tabular data serialization (100 rows): 0.300 ms (70% faster)
[Benchmark] Tabular data parsing (100 rows): 1.200 ms (40% faster)
[Benchmark] Round-trip (500 rows): 8.500 ms (40% faster)
[Benchmark] Performance comparison (100 rows):
  JSON:  0.080 ms
  TOON:  0.350 ms (v0.3.0)
  Ratio: 4.37x (vs 7.41x in v0.2.0)

🚀 Performance Optimizations

The v0.3.0 release includes comprehensive optimizations across all modules. Below are the key improvements:

New in v0.3.0: Core Parser & Serializer Optimizations

1. Literal Caching (~30-40% improvement for common values)

What was done:

  • Implemented _LITERAL_CACHE dictionary for frequently used tokens
  • Pre-stores parsed values for "true", "false", "null", "[]", "{}"
  • Early return pattern in _parse_token() to check cache first

Why it's faster:

  • Before: Every literal required string processing, type detection, and conversion
  • After: Common literals return cached value instantly, skipping all parsing logic
  • Impact: Massive speedup for files with many boolean/null values

Code example:

# Before (slow):
if token.lower() == "true":
    return True
elif token.lower() == "false":
    return False
# ... more checks

# After (fast):
_MISSING = object()  # sentinel, because cached values include None ("null")
cached = _LITERAL_CACHE.get(token.lower(), _MISSING)
if cached is not _MISSING:
    return cached  # Instant return, single dict lookup
2. StringIO-based Comment Removal (~70% improvement for comment-free files)

What was done:

  • Refactored _remove_block_comments() to use io.StringIO
  • Added early return if no block comments detected
  • Eliminated character-by-character string building

Why it's faster:

  • Before: Always processed entire file character-by-character, building result with string concatenation
  • After: Early exit if no /* found, uses efficient StringIO when needed
  • Impact: Most TOON files have no block comments, so they skip processing entirely
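A minimal sketch of the same technique (a simplified stand-in for the library's `_remove_block_comments`): early-exit when no `/*` appears, and write surviving segments to a StringIO instead of concatenating strings character by character:

```python
import io

def remove_block_comments(text: str) -> str:
    """Strip /* ... */ comments; fast path when there are none."""
    if "/*" not in text:            # early exit: most files have no block comments
        return text
    out = io.StringIO()             # efficient incremental writes
    i = 0
    while i < len(text):
        start = text.find("/*", i)
        if start == -1:             # no more comments: copy the tail
            out.write(text[i:])
            break
        out.write(text[i:start])    # copy everything before the comment
        end = text.find("*/", start + 2)
        if end == -1:               # unterminated comment: drop the rest
            break
        i = end + 2                 # resume after the closing */
    return out.getvalue()
```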
3. Try/Except Number Parsing (~10-15% improvement in utils)

What was done:

  • Changed guess_number() to use try/except for int() and float()
  • Regex used only for strict validation, not primary parsing
  • Early rejection based on first character

Why it's faster:

  • Before: Regex pattern matching for every number, which is relatively slow
  • After: Native Python int/float conversion (fast path), regex only for edge cases
  • Impact: Number-heavy files parse significantly faster
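The idea can be sketched as follows (a simplified stand-in for the library's `guess_number`; the exact regex is illustrative):

```python
import re

# Strict pattern, used only as a guard for float-like edge cases
_NUM_RE = re.compile(r"^-?\d+(\.\d+)?([eE][+-]?\d+)?$")

def guess_number(token: str):
    """Return int/float for numeric-looking tokens, else the token itself."""
    if not token or token[0] not in "-0123456789":
        return token                  # early rejection on first character
    try:
        return int(token)             # fast path: native conversion
    except ValueError:
        pass
    if _NUM_RE.match(token):          # regex only for strict validation
        return float(token)           # guard rejects "-inf", "1_000", etc.
    return token
```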
4. Streamlined Type Checking (~35-40% improvement in serializer)

What was done:

  • Optimized _inline_container_repr() to minimize isinstance() calls
  • Removed redundant type checks in _write_value()
  • Better code flow to avoid repeated checks

Why it's faster:

  • Before: Multiple isinstance() checks for same object
  • After: Check once, remember result, use efficient logic flow
  • Impact: Especially noticeable when serializing many objects
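One common way to realize "check once" is a dispatch table keyed by type, so each value pays for a single lookup instead of a chain of isinstance() tests. A sketch of the pattern (my illustration of the idea, not toonpy's actual `_inline_container_repr`):

```python
# A single type lookup per value replaces a chain of isinstance() checks.
def _repr_dict(value):
    return "{" + ",".join(f"{k}:{inline_repr(v)}" for k, v in value.items()) + "}"

def _repr_list(value):
    return "[" + ",".join(inline_repr(v) for v in value) + "]"

_DISPATCH = {dict: _repr_dict, list: _repr_list}

def inline_repr(value):
    """Serialize a container inline, dispatching on exact type once."""
    handler = _DISPATCH.get(type(value))
    return handler(value) if handler else str(value)
```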
5. String Slicing for Row Parsing (Significant improvement in utils)

What was done:

  • Replaced character-by-character list building in split_escaped_row()
  • Used efficient string slicing to extract segments
  • Eliminated intermediate list and join() overhead

Why it's faster:

  • Before: Loop through each char, append to list, join at end
  • After: Slice string directly at split points
  • Impact: Much faster for tabular data with many rows
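The slicing approach can be sketched like this (a simplified stand-in for `split_escaped_row`; it leaves escape sequences in place rather than unescaping them):

```python
def split_row(row: str, sep: str = ",") -> list[str]:
    """Split a tabular row on unescaped separators using string slicing,
    instead of building each cell character by character."""
    cells, start, i = [], 0, 0
    while i < len(row):
        if row[i] == "\\":               # skip the escaped character
            i += 2
            continue
        if row[i] == sep:
            cells.append(row[start:i])   # slice the whole segment at once
            start = i + 1
        i += 1
    cells.append(row[start:])            # final segment
    return cells
```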

From v0.2.0: Base Optimizations

6. Indentation Caching (~15-20% improvement in nested structures)

What was done:

  • Implemented a cache for indentation strings (0-20 levels)
  • Pre-computes common indentation strings instead of creating them repeatedly
  • Uses _get_indent() method with _indent_cache dictionary

Why it's faster:

  • Before: Each line required creating a new string with " " * (level * indent), which allocates memory and performs string multiplication repeatedly
  • After: Common indentation levels are computed once and reused, eliminating redundant string creation
  • Impact: Most noticeable in deeply nested structures where the same indentation levels are used many times

Code example:

# Before (slow):
lines.append(" " * level + content)  # Creates new string every time

# After (fast):
indent_str = self._get_indent(level)  # Uses cache
lines.append(indent_str + content)
7. String Concatenation Optimization (~5-10% general, ~60% in tabular)

What was done:

  • Eliminated string concatenation with + operator in loops
  • Pre-compute common prefixes (like "-" for arrays)
  • Use join() once at the end instead of multiple concatenations
  • Build rows as lists and join once per row

Why it's faster:

  • Before: Python's + operator for strings creates new string objects each time, which is O(n) for each concatenation
  • After: Building a list and using join() is O(n) total, much more efficient
  • Impact: Especially noticeable in tabular format where many rows are processed

Code example:

# Before (slow):
row = ""
for cell in cells:
    row += cell + ","  # Creates new string each iteration

# After (fast):
row_str = ",".join(cells)  # Single join operation
8. Compiled Regular Expressions (~3-5% improvement in parsing)

What was done:

  • Compiled regex patterns as class attributes instead of compiling them on each call
  • Patterns are compiled once when the class is defined, not per instance

Why it's faster:

  • Before: re.match(pattern, text) compiles the pattern every time it's called
  • After: Pre-compiled patterns stored as _QUOTED_TABLE_PATTERN and _UNQUOTED_TABLE_PATTERN are reused
  • Impact: Most noticeable when parsing many table headers

Code example:

# Before (slow):
match = re.match(r'^"([^"]+)"\[(\d+)\]\{([^}]+)\}:$', content)

# After (fast):
match = self._QUOTED_TABLE_PATTERN.match(content)  # Pre-compiled
9. Line Ending Normalization Optimization (~1-2% improvement)

What was done:

  • Only normalize line endings if \r is present in the source
  • Avoids unnecessary string operations on Unix-style text

Why it's faster:

  • Before: Always performed replace("\r\n", "\n").replace("\r", "\n") even when not needed
  • After: Checks for \r first, only normalizes if necessary
  • Impact: Small but consistent improvement, especially for large files
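The whole optimization fits in a few lines (a sketch of the technique, not the library's exact code):

```python
def normalize_newlines(text: str) -> str:
    """Normalize CRLF/CR to LF, skipping both replace() calls on the
    common case of Unix-style input."""
    if "\r" not in text:          # fast path: one scan, no new allocation
        return text
    return text.replace("\r\n", "\n").replace("\r", "\n")
```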
10. Optional Parallelism Module (2-4x for large arrays >10K elements)

What was done:

  • Created toonpy.parallel module with parallel_serialize_chunks()
  • Uses concurrent.futures (ThreadPoolExecutor or ProcessPoolExecutor)
  • Allows processing large arrays in parallel chunks

Why it's faster:

  • Before: Large arrays processed sequentially on a single core
  • After: Arrays divided into chunks, each processed in parallel
  • Impact: Significant speedup for very large datasets (>10K elements) on multi-core systems

Usage:

from toonpy.parallel import parallel_serialize_chunks, chunk_sequence
from toonpy import ToonSerializer

large_array = [{"id": i} for i in range(50000)]
chunks = chunk_sequence(large_array, chunk_size=5000)
serializer = ToonSerializer()

results = parallel_serialize_chunks(
    chunks,
    serializer.dumps,
    use_threads=False,  # Use processes for CPU-bound work
    max_workers=4
)

Performance Comparison Summary

| Optimization | Improvement | Best For | Version |
| --- | --- | --- | --- |
| Literal Caching | 30-40% | Files with many booleans/nulls | v0.3.0 |
| StringIO Comment Removal | 70% | Comment-free files (most common) | v0.3.0 |
| Try/Except Number Parsing | 10-15% | Number-heavy data | v0.3.0 |
| Streamlined Type Checking | 35-40% | Object serialization | v0.3.0 |
| String Slicing Row Parsing | Significant | Tabular data with many rows | v0.3.0 |
| Indentation Caching | 15-20% | Nested structures, deep hierarchies | v0.2.0 |
| String Concatenation | 5-10% general, 60% tabular | Tabular arrays, large datasets | v0.2.0 |
| Compiled Regex | 3-5% | Table parsing, repeated patterns | v0.2.0 |
| Line Ending Optimization | 1-2% | Large files, Unix-style text | v0.2.0 |
| Parallelism | 2-4x | Arrays >10K elements | v0.2.0 |

Overall Impact (v0.3.0 vs v0.2.0):

  • Parser: 20-50% faster overall, 70% faster for comment-free files
  • Serializer: Up to 70% faster in key paths, 35-40% faster container handling
  • Utils: 10-15% faster number parsing, significant row parsing improvement
  • Tabular serialization: ~70% faster (0.30 ms vs 0.55 ms)
  • Tabular parsing: ~40% faster (1.20 ms vs 1.70 ms)
  • Round-trip: ~40% faster (8.5 ms vs 11.9 ms)
  • Nested structures: ~74% faster throughput (4,000 ops/s vs 2,300 ops/s)

v0.3.0 vs v0.1.0 (Initial Release):

  • Parser: ~100-150% faster (2-2.5x speedup)
  • Serializer: ~200% faster (3x speedup)
  • Overall throughput: ~140% improvement

These optimizations maintain full TOON SPEC v2.0 compliance while dramatically improving performance. All improvements are production-tested with 24/24 tests passing.

📊 Example Output

Input JSON:

{
  "crew": [
    {"id": 1, "name": "Luz", "role": "Light glyph"},
    {"id": 2, "name": "Amity", "role": "Abomination strategist"}
  ],
  "active": true,
  "ship": {
    "name": "Owl House",
    "location": "Bonesborough"
  }
}

Output TOON (auto mode):

crew[2]{id,name,role}:
  1,Luz,"Light glyph"
  2,Amity,"Abomination strategist"
active: true
ship:
  name: "Owl House"
  location: Bonesborough

Token Savings: The tabular format (crew[2]{id,name,role}:) reduces token count by ~40% compared to standard JSON array format!
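As a rough sanity check using only the standard library, you can compare character counts (a coarse proxy for tokens; exact token counts need a tokenizer such as tiktoken). The TOON string below is hand-written to match the example above:

```python
import json

data = {"crew": [
    {"id": 1, "name": "Luz", "role": "Light glyph"},
    {"id": 2, "name": "Amity", "role": "Abomination strategist"},
]}

json_text = json.dumps(data)

# Tabular TOON: field names appear once in the header, not once per row.
rows = "\n".join(f'  {r["id"]},{r["name"]},"{r["role"]}"' for r in data["crew"])
toon_text = f"crew[2]{{id,name,role}}:\n{rows}\n"

saved = 1 - len(toon_text) / len(json_text)
print(f"JSON: {len(json_text)} chars, TOON: {len(toon_text)} chars ({saved:.0%} smaller)")
```

The saving grows with row count, since JSON repeats every key in every element while the tabular header is paid only once.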

๐Ÿ› ๏ธ API Reference

Core Functions

to_toon(obj, *, indent=2, mode="auto") -> str

Convert a Python object to TOON format string.

Parameters:

  • obj (Any): Python object compatible with JSON model
  • indent (int): Number of spaces per indentation level (default: 2)
  • mode (str): Serialization mode - "auto", "compact", or "readable"

Returns: str - TOON-formatted string

Example:

data = {"name": "Luz", "active": True}
toon = to_toon(data, mode="auto")

from_toon(source, *, mode="strict") -> Any

Parse a TOON string into a Python object.

Parameters:

  • source (str): TOON-formatted string to parse
  • mode (str): Parsing mode - "strict" or "permissive"

Returns: Any - Python object (dict, list, or scalar)

Raises: ToonSyntaxError if TOON string is malformed

Example:

toon = 'name: "Luz"\nactive: true'
data = from_toon(toon)

validate_toon(source, *, strict=True) -> tuple[bool, List[ValidationError]]

Validate a TOON string for syntax errors.

Parameters:

  • source (str): TOON-formatted string to validate
  • strict (bool): If True, use strict parsing mode

Returns: tuple[bool, List[ValidationError]] - (is_valid, list_of_errors)


suggest_tabular(obj) -> TabularSuggestion

Suggest whether an array should use tabular format.

Parameters:

  • obj (Sequence): Sequence to analyze

Returns: TabularSuggestion - Recommendation with estimated savings


stream_to_toon(fileobj_in, fileobj_out, *, chunk_size=65536, indent=2, mode="auto") -> int

Stream JSON from input file to TOON output file.

Parameters:

  • fileobj_in (TextIO): Input file object containing JSON
  • fileobj_out (TextIO): Output file object for TOON
  • chunk_size (int): Size of chunks to read (default: 65536)
  • indent (int): Indentation level
  • mode (str): Serialization mode

Returns: int - Number of bytes written


Error Classes

ToonSyntaxError

Raised when TOON input does not conform to the grammar.

Attributes:

  • message (str): Error message
  • line (int | None): Line number (1-indexed)
  • column (int | None): Column number (1-indexed)

Example:

try:
    data = from_toon("invalid syntax")
except ToonSyntaxError as e:
    print(f"Error at line {e.line}, column {e.column}: {e.message}")

๐Ÿ“ Requirements

  • Python >= 3.9
  • No external dependencies (pure Python)
  • Optional: tiktoken >= 0.5.2 for token counting (install with pip install .[examples])

📚 Documentation

Comprehensive documentation is available in the repository:

Core Documentation

  • docs/spec_summary.md – Concise TOON SPEC v2.0 overview with ABNF notes
  • docs/examples.md – JSON ⇄ TOON conversion examples
  • docs/assumptions.md – Documented gaps/assumptions + strict vs. permissive behavior
  • DESIGN_PHILOSOPHY.md – Architecture decisions and design principles (why zero-dependency core, optional features, etc.)

Release Documentation

  • RELEASE_NOTES.md – Complete release notes with upgrade guide
  • CHANGELOG.md – Traditional changelog with version history
  • YAML_SUPPORT_SUMMARY.md – Complete YAML support implementation details

Performance Optimization Documentation

  • OPTIMIZATION_README.md – Quick start guide to optimization docs
  • OPTIMIZATIONS_DOCUMENTED.md – 23-page detailed technical analysis
  • ALL_OPTIMIZATIONS_SUMMARY.md – Comprehensive optimization overview
  • SERIALIZER_OPTIMIZATIONS.md – Serializer-specific optimizations
  • UTILS_OPTIMIZATIONS.md – Utils module improvements
  • PARALLEL_OPTIMIZATIONS.md – Parallel processing enhancements
  • OPTIMIZATION_PROJECT_SUMMARY.md – Executive summary of optimization project

Benchmark Scripts

  • benchmark_optimizations.py – Parser performance benchmarks
  • benchmark_serializer.py – Serializer performance benchmarks
  • benchmark_parallel.py – Parallel module benchmarks
  • benchmark_summary.py – Visual benchmark summary generator

Note: Tabular format heuristics are documented in the code (see toonpy/serializer.py and toonpy/utils.py). The library automatically detects uniform arrays and uses tabular format when it saves tokens.

🌟 Use Cases

  • Data Serialization: Efficient storage and transmission of structured data
  • API Development: Lightweight data format for REST APIs
  • Configuration Files: Human-readable config format with comments support
  • Data Pipelines: Stream processing of large JSON datasets
  • ML/AI Projects: Token-optimized format for LLM training data
  • Documentation: Self-documenting data format with inline comments

📖 Examples

This library includes comprehensive examples covering all use cases from the official TOON specification examples. Check out the examples/ directory:

  • example1 - Basic tabular array with nested objects
  • example2 - Nested objects with arrays
  • example3 - Mixed array types
  • example4 - Multiline strings
  • example5 - Empty containers and scalars
  • example6 - Large tabular arrays
  • example7 - Complex nested structures
  • example8 - Deep nesting examples

All examples are compatible with the official TOON specification and can be validated against the reference implementation.

Try them with the CLI:

toonpy to --in examples/example1.json --out examples/example1.generated.toon
toonpy from --in examples/example1.toon --out examples/example1.generated.json

๐Ÿค Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Guidelines:

  • Follow PEP 8 style guidelines
  • Add tests for new features
  • Update documentation as needed
  • Ensure all tests pass: pytest
  • Keep additions aligned with TOON SPEC v2.0

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👨‍💻 Author

Christian Palomares - @shinjidev

☕ Support

If you find this project helpful, consider supporting my work:

Buy me a coffee to help me continue developing open-source tools for the developer community!

๐Ÿ™ Acknowledgments

  • Built following TOON SPEC v2.0
  • Inspired by the need for efficient, token-optimized data serialization
  • Uses property-based testing with Hypothesis for robust validation

โญ Star this repository if you find it useful! โญ
