TOON (Token-Oriented Object Notation) format support for pandas
pandas-toon
TOON (Token-Oriented Object Notation) format support for pandas DataFrames.
Overview
pandas-toon is a pandas plugin that brings native support for the TOON data serialization format. TOON is a compact, LLM-optimized alternative to JSON or CSV, specifically designed for scenarios such as LLM prompts, data validation, and token-efficient storage or exchange.
With pandas-toon, you can seamlessly integrate TOON into your pandas-based data workflows using familiar pandas syntax.
Features
- Native TOON support in pandas: Read and write TOON just like built-in formats
- LLM optimization: Designed for minimal token usage and high reliability in AI/LLM pipelines
- Easy installation: Simple pip installation with pandas integration
- Clean API: Follows pandas conventions with pd.read_toon() and df.to_toon()
- Type inference: Automatically handles strings, numbers, booleans, and null values
Installation
Install via pip:
pip install pandas-toon
Or, once pandas provides a toon extra (planned future support, not available yet):
pip install "pandas[toon]"
Quick Start
Reading TOON files
import pandas as pd
import pandas_toon
# Read a TOON file
df = pd.read_toon("data.toon")
Writing TOON files
import pandas as pd
import pandas_toon
# Create a DataFrame
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'age': [30, 25, 35],
'city': ['New York', 'London', 'Paris']
})
# Save as TOON
df.to_toon("output.toon")
# Or get TOON string
toon_str = df.to_toon()
print(toon_str)
Output:
name|age|city
---
Alice|30|New York
Bob|25|London
Charlie|35|Paris
Using table names
# Write with table name
df.to_toon("data.toon", table_name="users")
# The output will include the table name:
# @users
# name|age|city
# ---
# ...
TOON Format
TOON uses a simple, readable syntax optimized for token efficiency:
@table_name # Optional table identifier
column1|column2|column3
--- # Separator line
value1|value2|value3
value4|value5|value6
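To make the layout above concrete, here is a minimal sketch of how text in this shape could be split into a table name, column names, and rows using only the standard library. The function name parse_toon_table is hypothetical and is not part of pandas-toon; the sketch assumes one record per line, no escaping of the | character, and no inline annotations, and it leaves every cell as a string.

```python
def parse_toon_table(text):
    """Split text in the layout shown above into (table_name, columns, rows).

    Hypothetical helper for illustration only; not the pandas-toon parser.
    """
    # Assumption: blank lines carry no data and can be skipped.
    lines = [line for line in text.splitlines() if line.strip()]

    table_name = None
    if lines and lines[0].startswith("@"):
        table_name = lines[0][1:].strip()
        lines = lines[1:]

    if len(lines) < 2 or lines[1].strip() != "---":
        raise ValueError("expected a header line followed by a '---' separator")

    columns = lines[0].split("|")
    rows = [line.split("|") for line in lines[2:]]
    for row in rows:
        if len(row) != len(columns):
            raise ValueError(f"row has {len(row)} fields, expected {len(columns)}")
    return table_name, columns, rows


sample = "@users\nname|age|city\n---\nAlice|30|New York\nBob|25|London\n"
table_name, columns, rows = parse_toon_table(sample)
print(table_name, columns, rows)
```

Because blank lines are skipped, a single-column table with an empty value would lose that row; a real parser would need a rule for that case.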
Data Types
TOON automatically handles common data types:
- Strings: Plain text values
- Numbers: Integers and floating-point numbers (e.g., 42, 3.14)
- Booleans: true or false
- Null values: Empty values or null
Example:
name|age|score|active|notes
---
Alice|30|95.5|true|Great performance
Bob|25||false|
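The type rules listed above can be sketched as a small per-cell conversion. This is an illustration under stated assumptions, not the inference pandas-toon performs: the helper name infer_value is hypothetical, and the sketch assumes that true, false, and null are matched case-sensitively and that surrounding whitespace is ignored.

```python
def infer_value(cell):
    """Convert one raw cell string to None, bool, int, float, or str.

    Hypothetical helper for illustration only.
    """
    text = cell.strip()
    if text == "" or text == "null":
        return None
    if text == "true":
        return True
    if text == "false":
        return False
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return text  # anything else stays a plain string


raw_row = "Bob|25||false|".split("|")
typed_row = [infer_value(cell) for cell in raw_row]
print(typed_row)  # ['Bob', 25, None, False, None]
```

Note that Python's float() also accepts strings such as nan and inf, so a stricter implementation may want to check for those explicitly.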
Examples
Working with different data types
import pandas as pd
import pandas_toon  # imported as in Quick Start, before calling pd.read_toon
from io import StringIO
# Read TOON data from string
toon_data = """@employee_data
name|age|salary|active
---
Alice|30|75000.0|true
Bob|25|65000.0|true
Charlie|35||false
"""
df = pd.read_toon(StringIO(toon_data))
print(df)
#       name  age   salary  active
# 0    Alice   30  75000.0    True
# 1      Bob   25  65000.0    True
# 2  Charlie   35      NaN   False
Round-trip conversion
import pandas as pd
import pandas_toon
from io import StringIO
# Create DataFrame
df = pd.DataFrame({
'product': ['Laptop', 'Mouse', 'Keyboard'],
'price': [999.99, 29.99, 79.99],
'in_stock': [True, True, False]
})
# Convert to TOON and back
toon_str = df.to_toon()
df_restored = pd.read_toon(StringIO(toon_str))
# Verify data integrity
assert df.equals(df_restored)
Use Cases
LLM Prompts
TOON's compact format is ideal for including data in LLM prompts while minimizing token usage:
df = pd.DataFrame({
'question': ['What is Python?', 'What is pandas?'],
'answer': ['A programming language', 'A data analysis library']
})
# Include in prompt
prompt = f"""Here is the Q&A data:
{df.to_toon()}
Please analyze this data..."""
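As a rough way to see the size difference, the sketch below compares the character count of the same records serialized as JSON and as the pipe-delimited layout described in this README. It uses only the standard library, the helper to_pipe_table is hypothetical, and character count is only a loose proxy for token count: actual savings depend on the tokenizer and the data.

```python
import json

records = [
    {"question": "What is Python?", "answer": "A programming language"},
    {"question": "What is pandas?", "answer": "A data analysis library"},
]


def to_pipe_table(rows):
    """Render a list of dicts in the header / '---' / rows layout.

    Hypothetical helper; assumes all dicts share the same keys.
    """
    columns = list(rows[0])
    lines = ["|".join(columns), "---"]
    lines += ["|".join(str(row[col]) for col in columns) for row in rows]
    return "\n".join(lines)


json_text = json.dumps(records)
pipe_text = to_pipe_table(records)

# Column names appear once in the pipe layout but once per record in JSON.
print(f"JSON: {len(json_text)} characters")
print(f"Pipe layout: {len(pipe_text)} characters")
```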
Data Exchange
Use TOON for lightweight data exchange between systems:
# Export data
df.to_toon("export.toon")
# Share file or content
# Other system reads it back
df_received = pd.read_toon("export.toon")
API Reference
pd.read_toon(filepath_or_buffer, **kwargs)
Read a TOON format file into a DataFrame.
Parameters:
filepath_or_buffer (str, Path, or file-like object): Path to the TOON file, or a file-like object containing TOON data
Returns:
DataFrame: A pandas DataFrame containing the parsed TOON data
DataFrame.to_toon(path_or_buf=None, table_name=None, **kwargs)
Write a DataFrame to TOON format.
Parameters:
path_or_buf (str, Path, or None; optional): File path to write to. If None, the TOON string is returned
table_name (str; optional): Table name to include in the TOON output
Returns:
str or None: If path_or_buf is None, returns the TOON-formatted string. Otherwise, writes to the file and returns None
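The return convention described above (a string when no path is given, None after writing to a file) can be sketched as follows. This is an illustration of the calling convention only, not the library's implementation: write_table and format_cell are hypothetical names, and the trailing newline and lowercase booleans are assumptions based on the examples in this README.

```python
def format_cell(value):
    """Render one value; None becomes an empty field, booleans are lowercase."""
    if value is None:
        return ""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def write_table(columns, rows, path_or_buf=None, table_name=None):
    """Hypothetical writer mirroring the path_or_buf / table_name parameters."""
    lines = []
    if table_name is not None:
        lines.append(f"@{table_name}")
    lines.append("|".join(columns))
    lines.append("---")
    for row in rows:
        lines.append("|".join(format_cell(value) for value in row))
    text = "\n".join(lines) + "\n"

    if path_or_buf is None:
        return text  # no destination given: hand the string back
    with open(path_or_buf, "w", encoding="utf-8") as handle:
        handle.write(text)
    return None


result = write_table(["name", "active"], [["Alice", True], ["Bob", None]], table_name="users")
print(result)
```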
Development
Setup
Clone the repository and install in development mode:
git clone https://github.com/AMSeify/pandas-toon.git
cd pandas-toon
pip install -e ".[dev]"
Running Tests
pytest tests/
With coverage:
pytest --cov=pandas_toon tests/
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Links
- GitHub Repository: https://github.com/AMSeify/pandas-toon
- PyPI Package: https://pypi.org/project/pandas-toon/
- TOON Format Specification: https://github.com/toon-format/toon
Credits
This library builds upon:
- pandas - The powerful Python data analysis library
- TOON format - Token-Oriented Object Notation specification
About TOON
TOON (Token-Oriented Object Notation) is a data format specifically designed for Large Language Models. It aims to:
- Minimize token usage compared to JSON or CSV
- Provide clear, unambiguous structure
- Maintain human readability
- Support common data types efficiently
Learn more about TOON at the official TOON repository: https://github.com/toon-format/toon