Skip to main content

TOON (Token-Oriented Object Notation) format support for pandas

Project description

pandas-toon

Python Version PyPI version License Tests

TOON (Token-Oriented Object Notation) format support for pandas DataFrames.

Overview

pandas-toon is a pandas plugin that brings native support for the TOON data serialization format. TOON is a compact, LLM-optimized alternative to JSON or CSV, specifically designed for scenarios such as LLM prompts, data validation, and token-efficient storage or exchange.

With pandas-toon, you can seamlessly integrate TOON into your pandas-based data workflows using familiar pandas syntax.

Features

  • Native TOON support in pandas: Read and write TOON just like built-in formats
  • LLM optimization: Designed for minimal token usage and high reliability in AI/LLM pipelines
  • Easy installation: Simple pip installation with pandas integration
  • Clean API: Follows pandas conventions with pd.read_toon() and df.to_toon()
  • Type inference: Automatically handles strings, numbers, booleans, and null values

Installation

Install via pip:

pip install pandas-toon

Or install pandas with the toon extra (future support):

pip install pandas[toon]

Quick Start

Reading TOON files

import pandas as pd
import pandas_toon

# Read a TOON file
df = pd.read_toon("data.toon")

Writing TOON files

import pandas as pd
import pandas_toon

# Create a DataFrame
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [30, 25, 35],
    'city': ['New York', 'London', 'Paris']
})

# Save as TOON
df.to_toon("output.toon")

# Or get TOON string
toon_str = df.to_toon()
print(toon_str)

Output:

name|age|city
---
Alice|30|New York
Bob|25|London
Charlie|35|Paris

Using table names

# Write with table name
df.to_toon("data.toon", table_name="users")

# The output will include the table name:
# @users
# name|age|city
# ---
# ...

TOON Format

TOON uses a simple, readable syntax optimized for token efficiency:

@table_name          # Optional table identifier
column1|column2|column3
---                  # Separator line
value1|value2|value3
value4|value5|value6

Data Types

TOON automatically handles common data types:

  • Strings: Plain text values
  • Numbers: Integers and floating-point numbers (e.g., 42, 3.14)
  • Booleans: true or false
  • Null values: Empty values or null

Example:

name|age|score|active|notes
---
Alice|30|95.5|true|Great performance
Bob|25||false|

Examples

Working with different data types

import pandas as pd
from io import StringIO

# Read TOON data from string
toon_data = """@employee_data
name|age|salary|active
---
Alice|30|75000.0|true
Bob|25|65000.0|true
Charlie|35||false
"""

df = pd.read_toon(StringIO(toon_data))
print(df)
#       name  age   salary  active
# 0    Alice   30  75000.0    True
# 1      Bob   25  65000.0    True
# 2  Charlie   35      NaN   False

Round-trip conversion

# Create DataFrame
df = pd.DataFrame({
    'product': ['Laptop', 'Mouse', 'Keyboard'],
    'price': [999.99, 29.99, 79.99],
    'in_stock': [True, True, False]
})

# Convert to TOON and back
toon_str = df.to_toon()
df_restored = pd.read_toon(StringIO(toon_str))

# Verify data integrity
assert df.equals(df_restored)

Use Cases

LLM Prompts

TOON's compact format is ideal for including data in LLM prompts while minimizing token usage:

df = pd.DataFrame({
    'question': ['What is Python?', 'What is pandas?'],
    'answer': ['A programming language', 'A data analysis library']
})

# Include in prompt
prompt = f"""Here is the Q&A data:

{df.to_toon()}

Please analyze this data..."""

Data Exchange

Use TOON for lightweight data exchange between systems:

# Export data
df.to_toon("export.toon")

# Share file or content
# Other system reads it back
df_received = pd.read_toon("export.toon")

API Reference

pd.read_toon(filepath_or_buffer, **kwargs)

Read a TOON format file into a DataFrame.

Parameters:

  • filepath_or_buffer: str, Path, or file-like object
    • Path to the TOON file or a file-like object containing TOON data

Returns:

  • DataFrame: A pandas DataFrame containing the parsed TOON data

DataFrame.to_toon(path_or_buf=None, table_name=None, **kwargs)

Write a DataFrame to TOON format.

Parameters:

  • path_or_buf: str, Path, or None (optional)
    • File path to write to. If None, returns the TOON string
  • table_name: str (optional)
    • Optional table name to include in the TOON output

Returns:

  • str or None: If path_or_buf is None, returns the TOON-formatted string. Otherwise, writes to the file and returns None

Development

Setup

Clone the repository and install in development mode:

git clone https://github.com/AMSeify/pandas-toon.git
cd pandas-toon
pip install -e ".[dev]"

Running Tests

pytest tests/

With coverage:

pytest --cov=pandas_toon tests/

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Links

Credits

This library builds upon:

  • pandas - The powerful Python data analysis library
  • TOON format - Token-Oriented Object Notation specification

About TOON

TOON (Token-Oriented Object Notation) is a data format specifically designed for Large Language Models. It aims to:

  • Minimize token usage compared to JSON or CSV
  • Provide clear, unambiguous structure
  • Maintain human readability
  • Support common data types efficiently

Learn more about TOON at the official TOON repository.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pandas_toon-0.1.0.tar.gz (16.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pandas_toon-0.1.0-py3-none-any.whl (13.3 kB view details)

Uploaded Python 3

File details

Details for the file pandas_toon-0.1.0.tar.gz.

File metadata

  • Download URL: pandas_toon-0.1.0.tar.gz
  • Upload date:
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pandas_toon-0.1.0.tar.gz
Algorithm Hash digest
SHA256 0a3170db573ba3378a6f7902bbd90775b90c8b01ef864d3327f37f52ee5bf920
MD5 166dae713c56235dc3e78f30614764de
BLAKE2b-256 c636eee5499da99f639689eff6d8b26c22c2d0918fc0895bf955982341a66846

See more details on using hashes here.

Provenance

The following attestation bundles were made for pandas_toon-0.1.0.tar.gz:

Publisher: publish.yml on AMSeify/pandas-toon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pandas_toon-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pandas_toon-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 13.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pandas_toon-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 64f1703bc5e66925f604eab39f9492f01ce7640bfdffddca2a4e8c664db7467e
MD5 958c6b38ff2b2ebd39e3df65707b1de1
BLAKE2b-256 9dc2fdd3a9580c7832a230a968817b7a3e3789ab1620e16da678485e8675eaea

See more details on using hashes here.

Provenance

The following attestation bundles were made for pandas_toon-0.1.0-py3-none-any.whl:

Publisher: publish.yml on AMSeify/pandas-toon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page