
DSPy-TOON

Tests · Python 3.10+ · License: MIT · Buy Me a Coffee

DSPy adapter using TOON (Token-Oriented Object Notation) for 40%+ token reduction in structured LLM outputs.

DSPy-TOON provides a custom adapter for DSPy that uses TOON format instead of JSON for structured outputs. TOON is a compact, human-readable serialization format optimized for LLM contexts, achieving 65% fewer output tokens for tabular data.

Key Features

  • 40%+ Total Token Reduction - Significant savings on both input and output tokens
  • 65% Output Token Reduction - Tabular format dramatically reduces response tokens for lists
  • Seamless DSPy Integration - Drop-in replacement for JSONAdapter
  • Async & Streaming Support - Full support for dspy.asyncify() and dspy.streamify()
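The savings come mainly from not repeating keys for every element of a list. A rough, tokenizer-agnostic illustration (character counts as a proxy; actual token savings depend on the model's tokenizer):

```python
import json

# Three uniform records, like those in the TOON format comparison below.
people = [{"id": i, "name": f"Person {i}", "age": 20 + i} for i in (1, 2, 3)]

# Compact JSON repeats every key in every element.
as_json = json.dumps(people, separators=(",", ":"))

# TOON's tabular form declares the keys once, in a header row.
as_toon = (
    "[3]{id,name,age}:\n"
    "  1,Person 1,21\n"
    "  2,Person 2,22\n"
    "  3,Person 3,23"
)

print(len(as_json), len(as_toon))  # the TOON encoding is markedly shorter
```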

Installation

# Install from GitHub (recommended during beta)
pip install git+https://github.com/Archelunch/dspy-toon.git

# With benchmark dependencies
pip install "dspy-toon[benchmark] @ git+https://github.com/Archelunch/dspy-toon.git"

# For development
git clone https://github.com/Archelunch/dspy-toon.git
cd dspy-toon
pip install -e ".[dev]"

Quick Start

import dspy
from pydantic import BaseModel, Field
from dspy_toon import ToonAdapter

# Define your Pydantic models
class UserInfo(BaseModel):
    name: str = Field(description="Full name")
    age: int = Field(description="Age in years")
    occupation: str = Field(description="Job title")

# Define DSPy signature
class ExtractUser(dspy.Signature):
    """Extract user information from text."""
    text: str = dspy.InputField()
    user: UserInfo = dspy.OutputField()

# Configure DSPy with ToonAdapter
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm, adapter=ToonAdapter())

# Use as normal
extractor = dspy.Predict(ExtractUser)
result = extractor(text="Alice Johnson is a 35-year-old software engineer.")
print(result.user)
# UserInfo(name='Alice Johnson', age=35, occupation='software engineer')

TOON Format

TOON uses a compact syntax that LLMs can easily produce and parse:

JSON:

[{"id":1,"name":"Person 1","age":21},{"id":2,"name":"Person 2","age":22},{"id":3,"name":"Person 3","age":23}]

TOON (v3.0 format):

[3]{id,name,age}:
  1,Person 1,21
  2,Person 2,22
  3,Person 3,23

Lists of objects whose first field is an array:

items[2]:
  - users[2]{id,name}:
    1,Alice
    2,Bob
    status: active
  - users[1]{id,name}:
    3,Carol
    status: pending
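The tabular layout above can be hand-rolled in a few lines. This is an illustrative sketch for uniform lists of flat records only, not the library's encoder (ToonAdapter uses a full TOON implementation under the hood):

```python
def to_toon_table(rows: list[dict]) -> str:
    """Encode a uniform list of flat dicts in TOON's tabular layout."""
    fields = list(rows[0].keys())
    # Header: element count plus the field names, declared once.
    header = f"[{len(rows)}]{{{','.join(fields)}}}:"
    # Body: one comma-separated line per record, indented two spaces.
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])

people = [
    {"id": 1, "name": "Person 1", "age": 21},
    {"id": 2, "name": "Person 2", "age": 22},
    {"id": 3, "name": "Person 3", "age": 23},
]
print(to_toon_table(people))
# [3]{id,name,age}:
#   1,Person 1,21
#   2,Person 2,22
#   3,Person 3,23
```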

Benchmarks

Real token usage measured with DSPy's track_usage=True on gemini/gemini-2.5-flash-lite:

Token Usage by Test Case

| Test Case | ToonAdapter | BAMLAdapter | JSONAdapter | ChatAdapter |
|---|---|---|---|---|
| Single person | 214 🏆 | 219 | 326 | 310 |
| List of 3 people | 272 🏆 | 308 | 453 | 405 |
| List of 10 people | 389 🏆 | 599 | 729 | 597 |
| List of 5 products | 335 🏆 | 420 | 573 | 507 |
| Nested address | 334 | 313 🏆 | 512 | 472 |

Total Token Usage (All Tests)

| Adapter | Input | Output | Total | vs JSON |
|---|---|---|---|---|
| ToonAdapter | 1282 | 262 | 1544 🏆 | -40.5% |
| BAMLAdapter | 1181 | 678 | 1859 | -28.3% |
| ChatAdapter | 1779 | 512 | 2291 | -11.7% |
| JSONAdapter | 1855 | 738 | 2593 | - |

ToonAdapter wins with 40.5% total savings and 65% output token reduction vs JSONAdapter!

Run benchmarks:

python -m benchmarks.adapter_comparison --model gemini/gemini-2.5-flash-lite

Examples

Sentiment Analysis

import dspy
from pydantic import BaseModel, Field
from typing import Literal
from dspy_toon import ToonAdapter

class SentimentResult(BaseModel):
    sentiment: Literal["positive", "negative", "neutral"]
    confidence: float = Field(description="Confidence score 0-1")
    key_phrases: list[str] = Field(description="Key phrases that influenced sentiment")

class AnalyzeSentiment(dspy.Signature):
    """Analyze sentiment of the given text."""
    text: str = dspy.InputField()
    result: SentimentResult = dspy.OutputField()

lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm, adapter=ToonAdapter())

analyzer = dspy.Predict(AnalyzeSentiment)
result = analyzer(text="I absolutely love this product! Best purchase ever.")
print(result.result)

Extract Multiple Entities (Tabular)

import dspy
from pydantic import BaseModel
from dspy_toon import ToonAdapter

class Person(BaseModel):
    name: str
    age: int
    occupation: str

class ExtractPeople(dspy.Signature):
    """Extract all people mentioned in the text."""
    text: str = dspy.InputField()
    people: list[Person] = dspy.OutputField()

lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm, adapter=ToonAdapter())

extractor = dspy.Predict(ExtractPeople)
result = extractor(text="""
    Alice (35) is a software engineer. Bob is 28 and works as a designer.
    Carol, aged 42, is the project manager.
""")

# ToonAdapter uses tabular format for lists - saves 30%+ tokens
for person in result.people:
    print(f"{person.name}, {person.age}, {person.occupation}")
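For a `list[Person]` output like this, the completion the adapter parses would look roughly like the tabular form below (an illustrative sketch based on the format examples above; the exact wire format is determined by the adapter):

```
people[3]{name,age,occupation}:
  Alice,35,software engineer
  Bob,28,designer
  Carol,42,project manager
```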

Nested Models

import dspy
from pydantic import BaseModel, Field
from typing import Literal
from dspy_toon import ToonAdapter

class Address(BaseModel):
    street: str
    city: str
    country: Literal["US", "UK", "DE"]

class UserProfile(BaseModel):
    name: str = Field(description="Full name")
    email: str = Field(description="Email address")
    address: Address | None = Field(description="Home address")

class ExtractProfile(dspy.Signature):
    """Extract user profile from text."""
    text: str = dspy.InputField()
    profile: UserProfile = dspy.OutputField()

lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm, adapter=ToonAdapter())

extractor = dspy.Predict(ExtractProfile)
result = extractor(text="Contact John at john@example.com. He lives at 123 Main St, Boston, US.")
print(result.profile)

See the examples/ directory for complete working examples.

Async & Streaming

ToonAdapter fully supports DSPy's async operations and token-level streaming.

Async Operations

Use dspy.asyncify() for async operations:

import asyncio
import dspy
from dspy_toon import ToonAdapter

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"), adapter=ToonAdapter())

predict = dspy.Predict("question -> answer")
async_predict = dspy.asyncify(predict)

async def main():
    result = await async_predict(question="What is the capital of France?")
    print(result.answer)

asyncio.run(main())

Token-Level Streaming

For real-time token streaming, enable ToonAdapter streaming support:

import asyncio
import dspy
from dspy_toon import ToonAdapter, enable_toon_streaming

# Enable streaming support (call once at startup)
enable_toon_streaming()

# Configure DSPy
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini", cache=False), adapter=ToonAdapter())

predict = dspy.Predict("question -> answer")

# Create streaming predictor
stream_predict = dspy.streamify(
    predict,
    stream_listeners=[dspy.streaming.StreamListener(signature_field_name="answer")],
)

async def stream_response():
    async for chunk in stream_predict(question="Explain quantum computing briefly."):
        if isinstance(chunk, dspy.streaming.StreamResponse):
            print(chunk.chunk, end="", flush=True)  # Print tokens as they arrive
        elif isinstance(chunk, dspy.Prediction):
            print(f"\n\nFinal: {chunk.answer}")

asyncio.run(stream_response())

Synchronous Streaming

For sync streaming, set async_streaming=False:

import dspy
from dspy_toon import ToonAdapter, enable_toon_streaming

enable_toon_streaming()
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini", cache=False), adapter=ToonAdapter())

predict = dspy.Predict("question -> answer")
stream_predict = dspy.streamify(
    predict,
    stream_listeners=[dspy.streaming.StreamListener(signature_field_name="answer")],
    async_streaming=False,  # Sync mode
)

for chunk in stream_predict(question="What is 2+2?"):
    if isinstance(chunk, dspy.streaming.StreamResponse):
        print(chunk.chunk, end="", flush=True)

Development

# Clone repository
git clone https://github.com/Archelunch/dspy-toon.git
cd dspy-toon

# Install with dev dependencies
pip install -e ".[dev,benchmark]"

# Run tests
pytest tests/ -v

# Run tests with coverage
pytest tests/ --cov=dspy_toon --cov-report=term

# Type checking
mypy src/

# Linting
ruff check src/ tests/
ruff format src/ tests/

Roadmap

  • [x] Core ToonAdapter implementation
  • [x] Token usage benchmarks
  • [x] BAML adapter comparison benchmarks
  • [x] Async support via dspy.asyncify()
  • [x] Token-level streaming via enable_toon_streaming()
  • [ ] Integration with DSPy optimizers (MIPROv2, BootstrapFewShot)
  • [ ] More benchmarks on complex data and optimizations

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

MIT License

Acknowledgments

  • DSPy - The foundation framework
  • TOON Format - Original TOON specification
  • toon-python - Python TOON encoder/decoder
  • BAML - Inspiration for adapter approach

Project details


Download files

Download the file for your platform.

Source Distribution

dspy_toon-0.3.0.tar.gz (808.6 kB)

Uploaded Source

Built Distribution


dspy_toon-0.3.0-py3-none-any.whl (24.0 kB)

Uploaded Python 3

File details

Details for the file dspy_toon-0.3.0.tar.gz.

File metadata

  • Download URL: dspy_toon-0.3.0.tar.gz
  • Size: 808.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dspy_toon-0.3.0.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | `75c5588727c2a137b0f6d92ec1b6eb6c677d0153702262a74f4a669cb3f3a79c` |
| MD5 | `7781f5244da224ea44ec33837fcfb1d9` |
| BLAKE2b-256 | `e5cdda4927de8c12a714cdaf2b3b0312d32c9ea69dce247980f89134092199f5` |


Provenance

The following attestation bundles were made for dspy_toon-0.3.0.tar.gz:

Publisher: publish.yml on Archelunch/dspy-toon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dspy_toon-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: dspy_toon-0.3.0-py3-none-any.whl
  • Size: 24.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dspy_toon-0.3.0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | `200751256333e16df0d6fd03fcbe2599ab0748fbcfc5a0d9a78bc09a19b5b511` |
| MD5 | `db42686eb3b31fab8a2cbfed7ada650f` |
| BLAKE2b-256 | `836f7e923ca0d34e49db0012e5642604ef5292f1d2ea1c42bcca91ae7cb5f9f9` |


Provenance

The following attestation bundles were made for dspy_toon-0.3.0-py3-none-any.whl:

Publisher: publish.yml on Archelunch/dspy-toon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
