ison-py

ISON (Interchange Simple Object Notation) - A token-efficient data format optimized for AI/LLM workflows.

Requires Python 3.9+. License: MIT.

Features

  • 30-70% fewer tokens than JSON for structured data
  • ISONL streaming format for fine-tuning datasets and event streams
  • Native references for relational data (:-prefixed IDs)
  • Type inference for clean, minimal syntax
  • Zero dependencies - pure Python implementation

Installation

pip install ison-py

Quick Start

Basic Usage

import ison_parser

# Parse ISON
ison_text = """
table.users
id name email
1 Alice alice@example.com
2 Bob bob@example.com
"""

doc = ison_parser.loads(ison_text)

# Access data
users = doc['users']
print(users.rows[0]['name'])  # Alice

# Convert to JSON
json_data = doc.to_dict()

ISONL Streaming Format

ISONL is well suited to fine-tuning datasets, event streams, and logs, because every line is a self-contained record:

from ison_parser import loads_isonl, dumps_isonl, isonl_stream

# Parse ISONL
isonl_text = """table.examples|instruction response|"Summarize this" "Brief summary..."
table.examples|instruction response|"Translate to Spanish" "Hola mundo" """

doc = loads_isonl(isonl_text)

# Stream large files (constant memory)
with open("large_dataset.isonl", "r") as f:
    for record in isonl_stream(f):
        process(record)
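
To illustrate why line-oriented streaming stays at constant memory, here is a minimal sketch of a generator that handles one line at a time. It is not the library's isonl_stream implementation; the name stream_isonl_sketch, the '#' comment marker, and the naive pipe split are assumptions made for illustration.

```python
import io
import shlex


def stream_isonl_sketch(lines):
    """Yield (block, row_dict) one record at a time from an iterable of lines."""
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' are skipped.
        if not line or line.startswith("#"):
            continue
        # Naive split: this sketch does not handle '|' inside quoted values.
        parts = line.split("|")
        if len(parts) != 3:
            raise ValueError(f"line {lineno}: expected 3 pipe-separated sections")
        header, fields, values = parts
        # shlex.split keeps double-quoted values together as one item.
        yield header, dict(zip(fields.split(), shlex.split(values)))


sample = io.StringIO(
    'table.examples|instruction response|"Summarize this" "Brief summary..."\n'
    "\n"
    'table.examples|instruction response|"Translate to Spanish" "Hola mundo"\n'
)
for block, row in stream_isonl_sketch(sample):
    print(block, row["instruction"])
```

Because the generator only ever holds the current line, the same loop works unchanged on an open file handle of any size.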

Format Conversion

from ison_parser import ison_to_isonl, isonl_to_ison

# ISON to ISONL (one line per record)
isonl = ison_to_isonl(ison_text)

# ISONL to ISON (grouped blocks)
ison = isonl_to_ison(isonl_text)
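
The relationship between the two layouts can be sketched in a few lines: each row of an ISON table becomes one ISONL line that repeats the block header and field list. This is an illustrative sketch only, not the library's ison_to_isonl; the helper name table_block_to_isonl is hypothetical and it handles a single table block.

```python
def table_block_to_isonl(block_text):
    """Turn one ISON table block into a list of ISONL lines (one per row)."""
    lines = [ln.strip() for ln in block_text.strip().splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("a table block needs a header line and a field line")
    header, fields, rows = lines[0], lines[1], lines[2:]
    # Every output line is self-contained: header|fields|values
    return [f"{header}|{fields}|{row}" for row in rows]


block = """
table.users
id name email
1 Alice alice@example.com
2 Bob bob@example.com
"""
for line in table_block_to_isonl(block):
    print(line)
```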

ISON Format

Tables (Structured Data)

table.users
id name email active
1 Alice alice@example.com true
2 Bob bob@example.com false
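
To make the layout concrete, here is a minimal sketch that reads a table block into a list of row dictionaries: the first line names the block, the second lists the fields, and each remaining line is one row. It is not the ison_parser implementation; parse_table_block is a hypothetical helper, values are left as strings, and shlex is used as a stand-in for the real quoting rules.

```python
import shlex


def parse_table_block(text):
    """Parse one 'table.<name>' block into (name, list of row dicts)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    kind, _, name = lines[0].partition(".")
    if kind != "table":
        raise ValueError(f"expected a table block, got {kind!r}")
    fields = lines[1].split()
    rows = []
    for ln in lines[2:]:
        values = shlex.split(ln)  # keeps "quoted strings" together
        if len(values) != len(fields):
            raise ValueError(f"row {ln!r} does not match fields {fields}")
        rows.append(dict(zip(fields, values)))
    return name, rows


name, rows = parse_table_block("""
table.users
id name email active
1 Alice alice@example.com true
2 Bob bob@example.com false
""")
print(name, rows[0]["name"])
```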

Objects (Key-Value)

object.config
timeout 30
debug true
api_key "sk-xxx"
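
The object block above relies on type inference: 30 reads as a number, true as a boolean, and the quoted value as a string. The following is a rough sketch of how such inference could work, not the library's actual rules; infer_value and parse_object_block are hypothetical names, and treating null as a literal is an assumption.

```python
import re


def infer_value(token):
    """Map a raw token to a Python value (sketch of one possible rule set)."""
    if len(token) >= 2 and token[0] == token[-1] == '"':
        return token[1:-1]              # quoted: always a string
    if token == "true":
        return True
    if token == "false":
        return False
    if token == "null":                 # assumption: 'null' is the null literal
        return None
    if re.fullmatch(r"-?\d+", token):
        return int(token)
    if re.fullmatch(r"-?\d+\.\d+", token):
        return float(token)
    return token                        # anything else stays a bare string


def parse_object_block(text):
    """Parse one 'object.<name>' block into (name, dict)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    name = lines[0].partition(".")[2]
    obj = {}
    for ln in lines[1:]:
        key, raw = ln.split(None, 1)    # key, then the rest of the line
        obj[key] = infer_value(raw.strip())
    return name, obj


name, config = parse_object_block("""
object.config
timeout 30
debug true
api_key "sk-xxx"
""")
print(name, config)
```

Quoting is what lets a value such as "30" stay a string when the inferred type would otherwise be a number.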

References

table.orders
id customer_id total
O1 :C1 99.99
O2 :C2 149.50
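
A colon-prefixed value such as :C1 points at the record whose ID is C1. As a sketch of how a consumer might follow those references, the snippet below swaps each reference for the matching record from a lookup table. The customers data and the resolve_references helper are hypothetical and are not part of the library's Reference class.

```python
def resolve_references(rows, lookup):
    """Replace ':ID' values with the record found in lookup, if any."""
    resolved = []
    for row in rows:
        new_row = {}
        for key, value in row.items():
            if isinstance(value, str) and value.startswith(":"):
                # Unknown IDs are left as the raw ':ID' string.
                new_row[key] = lookup.get(value[1:], value)
            else:
                new_row[key] = value
        resolved.append(new_row)
    return resolved


# Hypothetical customers table, keyed by ID.
customers = {
    "C1": {"id": "C1", "name": "Alice"},
    "C2": {"id": "C2", "name": "Bob"},
}
orders = [
    {"id": "O1", "customer_id": ":C1", "total": 99.99},
    {"id": "O2", "customer_id": ":C2", "total": 149.50},
]
for order in resolve_references(orders, customers):
    print(order["id"], order["customer_id"]["name"])
```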

ISONL Format

Each line is a self-contained record:

kind.name|field1 field2 field3|value1 value2 value3

Example:

table.users|id name email|1 Alice alice@example.com
table.users|id name email|2 Bob bob@example.com
table.orders|id user total|O1 :1 99.99
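
Since each line carries its own header and field list, a single line can be decoded without any surrounding context. The sketch below splits a line on pipes that fall outside double quotes and pairs field names with values. It is an illustration under assumptions (backslash escapes inside quotes, shlex for value splitting), not the library's parser; both function names are hypothetical.

```python
import shlex


def split_unquoted_pipes(line):
    """Split on '|' characters that are not inside double quotes."""
    parts, buf = [], []
    in_quotes = escaped = False
    for ch in line:
        if escaped:
            buf.append(ch)
            escaped = False
        elif ch == "\\" and in_quotes:
            buf.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            buf.append(ch)
        elif ch == "|" and not in_quotes:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    return parts


def parse_isonl_line(line):
    """Return (kind, name, row_dict) for one 'kind.name|fields|values' line."""
    header, fields, values = split_unquoted_pipes(line.strip())
    kind, _, name = header.partition(".")
    names = fields.split()
    vals = shlex.split(values)
    if len(names) != len(vals):
        raise ValueError(f"{len(names)} fields but {len(vals)} values")
    return kind, name, dict(zip(names, vals))


print(parse_isonl_line("table.orders|id user total|O1 :1 99.99"))
```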

Token Efficiency

Records   JSON tokens   ISON tokens    Savings
10        ~200          ~60-140        30-70%
100       ~2000         ~600-1400      30-70%
1000      ~20000        ~6000-14000    30-70%
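
The savings come mostly from stating field names once per table instead of once per record. A rough way to see this is to serialize the same records both ways and compare sizes. The sketch below compares character counts, which are only a proxy for tokens; real token counts depend on the tokenizer, and the sample records are made up for illustration.

```python
import json


def to_ison_table(name, records):
    """Serialize a list of flat dicts as an ISON-style table block."""
    fields = list(records[0].keys())
    lines = [f"table.{name}", " ".join(fields)]
    for rec in records:
        # Sketch only: values are assumed to contain no spaces or quotes.
        lines.append(" ".join(str(rec[f]) for f in fields))
    return "\n".join(lines)


# Hypothetical sample data.
records = [
    {"id": i, "name": f"user{i}", "email": f"user{i}@example.com"}
    for i in range(1, 11)
]
json_text = json.dumps(records)
ison_text = to_ison_table("users", records)

saving = 1 - len(ison_text) / len(json_text)
print(f"JSON: {len(json_text)} chars, ISON: {len(ison_text)} chars, "
      f"saving: {saving:.0%}")
```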

API Reference

Core Functions

  • loads(text) - Parse ISON string to Document
  • dumps(doc) - Serialize Document to ISON string
  • load(path) - Load ISON from file
  • dump(doc, path) - Save Document to file

ISONL Functions

  • loads_isonl(text) - Parse ISONL string to Document
  • dumps_isonl(doc) - Serialize Document to ISONL
  • load_isonl(path) - Load ISONL from file
  • dump_isonl(doc, path) - Save Document to ISONL file
  • isonl_stream(file) - Stream ISONL records (generator)
  • ison_to_isonl(text) - Convert ISON to ISONL
  • isonl_to_ison(text) - Convert ISONL to ISON

Classes

  • Document - Container for ISON blocks
  • Block - Single data block (table/object)
  • Reference - Reference to another record
  • ISONLRecord - Single ISONL record

CLI Usage

# Convert JSON to ISON
ison input.json -o output.ison

# Convert ISON to JSON
ison input.ison --to-json -o output.json

# Validate ISON file
ison input.ison --validate

Database Plugins

Export database tables directly to ISON for LLM workflows:

SQLite (Zero Dependencies)

from ison_parser.plugins import SQLiteToISON

with SQLiteToISON('mydb.sqlite') as db:
    # Export entire database
    ison_text = db.export_all()

    # Export specific tables
    ison_text = db.export_tables(['users', 'orders'])

    # Stream large tables as ISONL
    for line in db.stream_table('logs'):
        print(line)

# Foreign keys auto-convert to ISON references (:id)
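
For readers curious what such an export involves, here is a minimal sketch built only on the standard-library sqlite3 module. It is not the SQLiteToISON plugin: the helper names are hypothetical, NULL is written as null by assumption, and foreign keys are not converted to references.

```python
import sqlite3


def quote_if_needed(value):
    """Render one cell; quote text that is empty or contains spaces or quotes."""
    if value is None:
        return "null"  # assumption: NULL is written as 'null'
    text = str(value)
    if text == "" or '"' in text or any(c.isspace() for c in text):
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return text


def table_to_ison(conn, table):
    """Dump one SQLite table as an ISON-style table block."""
    # The table name is interpolated, so it must come from a trusted source.
    cur = conn.execute(f'SELECT * FROM "{table}"')
    fields = [col[0] for col in cur.description]
    lines = [f"table.{table}", " ".join(fields)]
    for row in cur:
        lines.append(" ".join(quote_if_needed(v) for v in row))
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob Smith")])
print(table_to_ison(conn, "users"))
conn.close()
```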

PostgreSQL

pip install psycopg2-binary

from ison_parser.plugins import PostgreSQLToISON

with PostgreSQLToISON('postgresql://user:pass@localhost/mydb') as db:
    # Export all tables
    ison_text = db.export_all()

    # Export with related tables (follows foreign keys)
    ison_text = db.export_with_relations('orders')

    # Stream for large datasets
    for line in db.stream_table('events', batch_size=5000):
        process(line)

SQLAlchemy (Any Database)

Works with MySQL, MariaDB, Oracle, MS SQL Server, and more:

pip install sqlalchemy pymysql  # For MySQL

from ison_parser.plugins import SQLAlchemyToISON
from myapp.models import User, Order  # your own ORM models

# MySQL
with SQLAlchemyToISON('mysql+pymysql://user:pass@localhost/db') as db:
    ison_text = db.export_all()

    # Export ORM models directly (session is an existing SQLAlchemy Session)
    ison_text = db.export_models([User, Order], session)

    # Custom queries
    ison_text = db.export_query(
        "SELECT * FROM users WHERE active = true",
        block_name="active_users"
    )

Vector Database Plugins

Export vector search results directly to ISON for RAG pipelines:

ChromaDB

pip install chromadb

from ison_parser.plugins import ChromaToISON

with ChromaToISON() as db:
    # Export RAG context (optimized for LLM prompts)
    ison_context = db.export_for_rag(
        collection='documents',
        query='What is ISON?',
        n_results=5
    )

    # Export search results with scores
    ison_text = db.export_query_results(
        collection='documents',
        query_texts=['semantic search query'],
        n_results=10
    )

    # Stream large collections as ISONL
    for line in db.stream_collection('documents'):
        process(line)

Pinecone

pip install pinecone-client

from ison_parser.plugins import PineconeToISON

exporter = PineconeToISON(api_key='your-key')

# Export search results
ison_text = exporter.export_query_results(
    index='my-index',
    query_vector=embedding,
    top_k=10
)

# RAG context with custom embedding function
ison_context = exporter.export_for_rag(
    index='my-index',
    query='What is ISON?',
    embedding_fn=my_embed_function,
    top_k=5
)

Qdrant

pip install qdrant-client

from ison_parser.plugins import QdrantToISON

exporter = QdrantToISON(host='localhost', port=6333)

# Export search results
ison_text = exporter.export_search_results(
    collection='documents',
    query_vector=embedding,
    limit=10
)

# RAG context
ison_context = exporter.export_for_rag(
    collection='documents',
    query='What is ISON?',
    embedding_fn=my_embed_function,
    limit=5
)

LLM Framework Integrations

Native integrations for major LLM frameworks, providing 30-70% token savings.

LangChain

from ison_parser.integrations import ISONOutputParser

parser = ISONOutputParser()
prompt = f"List users. {parser.get_format_instructions()}"
doc = parser.parse(llm.predict(prompt))

LlamaIndex

from llama_index.core import VectorStoreIndex
from ison_parser.integrations import ISONReader

reader = ISONReader()
documents = reader.load_data("data.ison")
index = VectorStoreIndex.from_documents(documents)

MCP Server (for AI Assistants like Claude)

# Run ISON MCP server
python -m ison_parser.integrations.mcp_server

import asyncio
from ison_parser.integrations import ISONMCPServer, ISONMCPClient

# Server exposes: parse_ison, format_ison, validate_ison, query_ison
server = ISONMCPServer()

# Client with local fallback
async def main():
    async with ISONMCPClient() as client:
        result = await client.parse_ison(ison_text)

asyncio.run(main())

OpenAI Function Calling

from openai import OpenAI
from ison_parser.integrations import OpenAIISONTools

client = OpenAI()  # reads OPENAI_API_KEY from the environment
tools = OpenAIISONTools()
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    tools=tools.get_tool_definitions()
)
doc = tools.parse_response(response)

Anthropic Tool Use

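import anthropic
from ison_parser.integrations import AnthropicISONTools

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
tools = AnthropicISONTools()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,  # required by the Messages API
    messages=messages,
    tools=tools.get_tool_definitions()
)
doc = tools.parse_response(response)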

Use Cases

  • LLM Fine-tuning datasets - 30-70% smaller training files
  • RAG pipelines - Token-efficient context from vector DBs
  • Database-to-LLM - Direct export with SQL plugins
  • Vector search - Export results in compact format
  • Event streaming - Append-only logs
  • Configuration - Human-readable configs
  • API responses - Reduced bandwidth
  • MCP Tools - AI assistant integrations

Test Results

All tests passing:

============================= test session starts =============================
platform win32 -- Python 3.12.7, pytest-8.4.1

tests/test_ison_parser.py::test_basic_table PASSED
tests/test_ison_parser.py::test_quoted_strings PASSED
tests/test_ison_parser.py::test_escape_sequences PASSED
tests/test_ison_parser.py::test_type_inference PASSED
tests/test_ison_parser.py::test_references PASSED
tests/test_ison_parser.py::test_null_handling PASSED
tests/test_ison_parser.py::test_dot_path_fields PASSED
tests/test_ison_parser.py::test_comments PASSED
tests/test_ison_parser.py::test_multiple_blocks PASSED
tests/test_ison_parser.py::test_serialization_roundtrip PASSED
tests/test_ison_parser.py::test_to_json PASSED
tests/test_ison_parser.py::test_from_dict PASSED
tests/test_ison_parser.py::test_error_handling PASSED
tests/test_ison_parser.py::test_complete_example PASSED
tests/test_ison_parser.py::test_typed_fields PASSED
tests/test_ison_parser.py::test_relationship_references PASSED
tests/test_ison_parser.py::test_summary_rows PASSED
tests/test_ison_parser.py::test_computed_fields PASSED
tests/test_ison_parser.py::test_serialization_with_types PASSED
tests/test_ison_parser.py::test_isonl_basic_parsing PASSED
tests/test_ison_parser.py::test_isonl_type_inference PASSED
tests/test_ison_parser.py::test_isonl_references PASSED
tests/test_ison_parser.py::test_isonl_multiple_blocks PASSED
tests/test_ison_parser.py::test_isonl_comments_and_empty PASSED
tests/test_ison_parser.py::test_isonl_serialization PASSED
tests/test_ison_parser.py::test_isonl_roundtrip PASSED
tests/test_ison_parser.py::test_ison_to_isonl_conversion PASSED
tests/test_ison_parser.py::test_isonl_to_ison_conversion PASSED
tests/test_ison_parser.py::test_isonl_quoted_pipes PASSED
tests/test_ison_parser.py::test_isonl_error_handling PASSED
tests/test_ison_parser.py::test_isonl_fine_tuning_format PASSED

============================= 31 passed in 0.10s ==============================

Run tests with:

pytest tests/

License

MIT License - see LICENSE for details.
