Fast Python bindings for TOON format parser
toonpy
High-performance Python bindings for the TOON format parser, built with PyO3 and Rust.
On average 5.82x faster than a pure Python implementation (toon-llm) in the benchmarks below; optimized for tabular data and LLM applications.
Features
- ⚡ Blazing Fast: 5.82x average speedup (2.98x - 9.68x range)
- 🔧 Zero Dependencies: Pure PyO3/Rust implementation
- 🎯 Optimized for Tabular Data: Inline primitive conversions for common patterns
- 🔄 Async Support: Native asyncio integration via the atoonpy module
- 🐍 Python 3.8+: abi3 wheels for broad compatibility
- 📦 Drop-in Replacement: Compatible API with other TOON libraries
Installation
pip install toonpy
Or build from source:
pip install maturin
maturin build --release
pip install target/wheels/toonpy-*.whl
Quick Start
Synchronous API
import toonpy
# Encode Python data to TOON
data = {"name": "Alice", "age": 30, "active": True}
toon_str = toonpy.encode(data)
# Output: 'active: true\nage: 30\nname: Alice\n'
# Decode TOON to Python
result = toonpy.decode(toon_str)
# Output: {'active': True, 'age': 30, 'name': 'Alice'}
# Batch operations
data_list = [{"id": i, "name": f"User{i}"} for i in range(100)]
toon_strs = toonpy.encode_batch(data_list)
results = toonpy.decode_batch(toon_strs)
Asynchronous API
import asyncio
import atoonpy
async def main():
    # Async encode/decode
    data = {"name": "Bob", "age": 25}
    toon_str = await atoonpy.encode(data)
    result = await atoonpy.decode(toon_str)
    # Concurrent batch operations
    data_list = [{"id": i} for i in range(1000)]
    toon_strs = await atoonpy.encode_batch(data_list)
    results = await atoonpy.decode_batch(toon_strs)
asyncio.run(main())
API Reference
Synchronous (toonpy)
encode(data, delimiter=None, strict=None) -> str
Encode Python data to TOON format string.
Parameters:
- data: Python object (dict, list, str, int, float, bool, None)
- delimiter: Optional delimiter ('comma', 'tab', 'pipe'). Default: 'comma'
- strict: Optional strict mode. Default: False
Returns: TOON-formatted string
decode(toon_str, delimiter=None, strict=None) -> Any
Decode TOON format string to Python data.
Parameters:
- toon_str: TOON-formatted string
- delimiter: Optional delimiter hint ('comma', 'tab', 'pipe'). Auto-detected if not specified
- strict: Optional strict mode. Default: False
Returns: Python object
encode_batch(data_list, delimiter=None, strict=None) -> list
Encode multiple Python objects.
decode_batch(toon_strs, delimiter=None, strict=None) -> list
Decode multiple TOON strings.
dumps(data, **kwargs) -> str
Alias for encode().
loads(toon_str, **kwargs) -> Any
Alias for decode().
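The documented signatures can be summarized as a small typing protocol with a round-trip check. This is an illustrative sketch only: ToonCodec, JsonStandIn, and roundtrips are hypothetical names, and the JSON stand-in exists solely so the snippet runs without toonpy installed. It does not produce TOON output.

```python
import json
from typing import Any, Optional, Protocol


class ToonCodec(Protocol):
    """Shape of the encode/decode pair as listed in the API reference."""

    def encode(self, data: Any, delimiter: Optional[str] = None,
               strict: Optional[bool] = None) -> str: ...

    def decode(self, toon_str: str, delimiter: Optional[str] = None,
               strict: Optional[bool] = None) -> Any: ...


class JsonStandIn:
    """Hypothetical stand-in codec: emits JSON, not TOON, and ignores options."""

    def encode(self, data, delimiter=None, strict=None):
        return json.dumps(data, sort_keys=True)

    def decode(self, toon_str, delimiter=None, strict=None):
        return json.loads(toon_str)


def roundtrips(codec: ToonCodec, data: Any, delimiter: Optional[str] = None) -> bool:
    """Encode then decode with the same delimiter and compare to the input."""
    text = codec.encode(data, delimiter=delimiter)
    return codec.decode(text, delimiter=delimiter) == data


ok = roundtrips(JsonStandIn(), {"name": "Alice", "age": 30, "active": True})
```

With toonpy installed, passing the module itself, for example roundtrips(toonpy, data, delimiter="tab"), should exercise the same call pattern, assuming the signatures listed above.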
Asynchronous (atoonpy)
All functions have the same signature as the sync API but return coroutines.
import atoonpy
# All functions are async
await atoonpy.encode(data)
await atoonpy.decode(toon_str)
await atoonpy.encode_batch(data_list)
await atoonpy.decode_batch(toon_strs)
Performance
Benchmark Results
Tested against toon-llm v1.0.0b6 (November 2025):
| Test | toonpy | toon-llm | Speedup |
|---|---|---|---|
| Small Object Decode | 16.1 μs | 94.7 μs | 5.9x |
| Tabular Small Decode | 46.0 μs | 144.2 μs | 3.1x |
| Tabular Large Decode (1k rows) | 220.2 μs | 905.9 μs | 4.1x |
| Mixed Array Decode | 21.1 μs | 102.8 μs | 4.9x |
| Small Object Encode | 36.3 μs | 278.1 μs | 7.7x |
| Tabular Large Encode (1k rows) | 325.4 μs | 969.9 μs | 3.0x |
Average: 5.82x faster (range: 2.98x - 9.68x)
See PERFORMANCE.md for detailed analysis.
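The Speedup column is the toon-llm time divided by the toonpy time. The sketch below recomputes it from the six rows shown above. Note that the mean of these six rows comes out near 4.8x, so the headline 5.82x average and the 9.68x maximum presumably include tests that are not listed in this table; that is an inference, not something the table states.

```python
# (toonpy µs, toon-llm µs), copied from the benchmark table above.
ROWS = {
    "Small Object Decode": (16.1, 94.7),
    "Tabular Small Decode": (46.0, 144.2),
    "Tabular Large Decode (1k rows)": (220.2, 905.9),
    "Mixed Array Decode": (21.1, 102.8),
    "Small Object Encode": (36.3, 278.1),
    "Tabular Large Encode (1k rows)": (325.4, 969.9),
}


def speedup(toonpy_us, baseline_us):
    """Ratio of baseline time to toonpy time; larger means toonpy is faster."""
    return baseline_us / toonpy_us


speedups = {name: speedup(*times) for name, times in ROWS.items()}
table_mean = sum(speedups.values()) / len(speedups)

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
print(f"mean of listed rows: {table_mean:.2f}x")
```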
Architecture
Core Components
Rust Core (src/lib.rs)
- PyO3 bindings for Python C API
- Custom
json_to_python()with inlined primitive conversions - Zero-copy operations where possible
- Optimized for TOON's common patterns (tabular data)
Async Wrapper (python/atoonpy.py)
- Pure Python asyncio wrapper
- Uses
asyncio.to_thread()to release GIL - Enables concurrent I/O operations
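The wrapper pattern described above can be sketched as follows. The contents of python/atoonpy.py are not shown here, so this is an assumption about its general shape rather than the actual code; encode_sync is a hypothetical stand-in for a blocking toonpy call, implemented with json so the snippet runs on its own. Whether worker threads truly run in parallel depends on the native code releasing the GIL, which this sketch does not demonstrate.

```python
import asyncio
import json


def encode_sync(data):
    """Hypothetical blocking encoder standing in for a synchronous toonpy call."""
    return json.dumps(data)


async def encode(data):
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker
    # thread, so the event loop stays free to service other tasks.
    return await asyncio.to_thread(encode_sync, data)


async def encode_batch(data_list):
    # Schedule every item concurrently; gather returns results in input order.
    return await asyncio.gather(*(encode(item) for item in data_list))


async def main():
    return await encode_batch([{"id": i} for i in range(3)])


results = asyncio.run(main())
```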
TOON Parser
- Based on toon-rs by Jimmy Stridh
- Features: SIMD string scanning (memchr), stack allocations (smallvec), fast float parsing
Optimization Techniques
- Inlined Primitive Conversions
  - 85% of TOON data is primitives in dicts/arrays
  - Avoid recursion overhead by inlining Null/Bool/Number/String conversions
  - Only recurse for nested structures
- Pre-allocated Collections
    let mut items = Vec::with_capacity(arr.len());
    Ok(PyList::new(py, items)?.into_any())
- Type-specific Fast Paths
  - .is_instance_of::<T>() for O(1) type checking
  - Direct conversions without dynamic dispatch
- SIMD Acceleration
  - memchr for string scanning (6.5x faster than stdlib)
  - AVX2 support on x86_64
- Link-time Optimization
    [profile.release]
    opt-level = 3
    lto = true
    codegen-units = 1
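The control flow behind the first two techniques (inlined primitive handling and pre-sized collections) can be illustrated in Python. This is only an analogue under stated assumptions: the real conversion lives in Rust in src/lib.rs and is not reproduced here, and to_python and PRIMITIVES are hypothetical names. The sketch copies a JSON-like tree and makes a recursive call only for nested lists and dicts.

```python
PRIMITIVES = (type(None), bool, int, float, str)


def to_python(value):
    """Copy a JSON-like tree, handling primitive children without recursion."""
    if isinstance(value, PRIMITIVES):
        return value
    if isinstance(value, list):
        out = [None] * len(value)  # pre-sized, analogous to Vec::with_capacity
        for i, item in enumerate(value):
            # Fast path: primitives are stored directly; only containers recurse.
            out[i] = item if isinstance(item, PRIMITIVES) else to_python(item)
        return out
    if isinstance(value, dict):
        return {
            key: item if isinstance(item, PRIMITIVES) else to_python(item)
            for key, item in value.items()
        }
    raise TypeError(f"unsupported type: {type(value).__name__}")


rows = [
    {"id": 1, "name": "Alice", "tags": ["a", "b"]},
    {"id": 2, "name": "Bob", "tags": []},
]
copied = to_python(rows)
```

For tabular rows like these, most values take the primitive branch inside the loop, which is the case the optimization targets.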
Dependencies
Production
- pyo3 = "0.27" - Python bindings
- serde_json = "1.0" - JSON handling
- once_cell = "1.20" - Static defaults
- smallvec = "1.13" - Stack allocations (transitive)
- toon - TOON parser by Jimmy Stridh
  - perf_memchr - SIMD string scanning
  - perf_smallvec - Stack allocations
  - perf_lexical - Fast float parsing
Development
- criterion = "0.5" - Micro-benchmarking
Building from Source
Requirements
- Rust 1.70+
- Python 3.8+
- maturin
Build Steps
# Install maturin
pip install maturin
# Development build
maturin develop
# Release build
maturin build --release
# Install wheel
pip install target/wheels/toonpy-*.whl
# Run tests
python test_toonpy.py
python test_async.py
# Run benchmarks
python benchmark.py
cargo bench
Testing
# Unit tests
python test_toonpy.py
# Async tests
python test_async.py
# Benchmarks
python benchmark.py
# Micro-benchmarks
cargo bench
Credits
Core TOON Parser
Built on toon-rs by Jimmy Stridh.
The excellent TOON Rust implementation provides:
- Fast TOON ↔ JSON conversion
- SIMD-optimized string scanning
- Efficient memory management
- Robust error handling
toonpy Author
magi8101 (sharmamagi0@gmail.com)
Acknowledgments
- PyO3 team for excellent Python-Rust bindings
- TOON format creators for the readable data format
- Rust community for performance-focused tools
License
MIT OR Apache-2.0
Related Projects
- toon-rs - Rust TOON parser (core dependency)
- toon-llm - Python TOON library with LLM features
- toon-format - Official Python placeholder
Roadmap
- PyO3 0.27 support
- Async API via asyncio
- Comprehensive benchmarking
- Micro-optimization for tabular data
- Streaming decoder for large files
- Columnar output for pandas/polars
- Python 3.13 free-threaded support
Contributing
Issues and PRs welcome! See PERFORMANCE.md for optimization internals.