ATON - Adaptive Token-Oriented Notation
A data serialization format optimized for Large Language Models achieving 50-60% token reduction compared to JSON while maintaining full data integrity and human readability.
Table of Contents
- Overview
- Installation
- Quick Start
- Core Concepts
- Features
- Performance
- API Reference
- Advanced Usage
- Use Cases
- Technical Details
- Examples
- Testing
- Contributing
- License
Overview
ATON (Adaptive Token-Oriented Notation) is a novel data serialization format specifically engineered for applications utilizing Large Language Models (LLMs). Unlike traditional formats like JSON, which were designed for general-purpose data interchange, ATON optimizes for the tokenization patterns of modern LLMs, resulting in significant reductions in token count without sacrificing data integrity or readability.
Key Metrics
- Token Reduction: 50-60% fewer tokens compared to JSON
- Type Safety: Explicit schema with full type definitions
- Data Integrity: Zero data loss in round-trip encoding/decoding
- Performance: Comparable encoding/decoding speed to JSON
- Human Readability: Clear, structured format suitable for manual inspection
Why ATON?
Traditional data formats like JSON introduce significant overhead when processed by LLMs:
- Repetitive Key Names: In arrays of objects, key names are repeated for every item
- Verbose Syntax: Brackets, braces, and quotes add unnecessary tokens
- Lack of Schema: Type information is implicit, requiring additional context
- No Default Values: Common values must be explicitly stated every time
ATON addresses these inefficiencies through:
- Schema Declaration: Define structure once, not per record
- Default Values: Declare common values once and omit from data rows
- Tabular Structure: Homogeneous data represented in compact tabular form
- Type Annotations: Explicit type information in schema definitions
Installation
From PyPI
pip install aton-format
From Source
git clone https://github.com/dagoSte/aton-format.git
cd aton-format
pip install -e .
Development Installation
pip install aton-format[dev]
This installs additional dependencies for development:
- pytest (testing framework)
- pytest-cov (coverage reporting)
- black (code formatting)
- flake8 (linting)
- mypy (type checking)
Requirements
- Python 3.8 or higher
- No external dependencies for core functionality
Quick Start
Basic Encoding
from aton import ATONEncoder
# Create encoder with optimization enabled
encoder = ATONEncoder(optimize=True)
# Define your data
data = {
"users": [
{"id": 1, "name": "Alice", "email": "alice@example.com", "active": True},
{"id": 2, "name": "Bob", "email": "bob@example.com", "active": True},
{"id": 3, "name": "Charlie", "email": "charlie@example.com", "active": False}
]
}
# Encode to ATON format
aton_string = encoder.encode(data)
print(aton_string)
Output:
@schema[id:int, name:str, email:str, active:bool]
@defaults[active:true]
users(3):
1, "Alice", "alice@example.com"
2, "Bob", "bob@example.com"
3, "Charlie", "charlie@example.com", false
Basic Decoding
from aton import ATONDecoder
decoder = ATONDecoder()
# Decode ATON string back to Python dictionary
original_data = decoder.decode(aton_string)
# Verify data integrity
assert data == original_data # True - zero data loss
Core Concepts
Schema Definition
ATON uses explicit schema declarations to define the structure and types of data. The schema is declared once at the beginning of each entity collection using the @schema directive.
Syntax:
@schema[field1:type1, field2:type2, field3:type3]
Supported Types:
- int - Integer values
- float - Floating-point numbers
- str - String values
- bool - Boolean (true/false)
- arr - Arrays/lists
- obj - Objects/dictionaries
- datetime - ISO 8601 datetime strings
- ref - References to other entities
Example:
@schema[user_id:int, username:str, balance:float, verified:bool, tags:arr]
Default Values
The @defaults directive allows you to specify common values that apply to multiple records. When a field has the default value, it can be omitted from the data row, significantly reducing token count for datasets with repetitive values.
Syntax:
@defaults[field1:value1, field2:value2]
Example:
@schema[id:int, name:str, status:str, role:str]
@defaults[status:"active", role:"user"]
users(3):
1, "Alice"
2, "Bob"
3, "Charlie", "inactive", "admin"
In this example:
- Records 1 and 2 use the default values for status and role
- Record 3 overrides both defaults with explicit values
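The default-filling step the decoder performs can be sketched as follows (a minimal illustration, not the library's actual implementation; omitted trailing fields and empty slots are modeled here as missing values or None):

```python
schema = ["id", "name", "status", "role"]
defaults = {"status": "active", "role": "user"}

def expand_row(values, schema, defaults):
    """Fill omitted trailing fields and empty slots (None) from defaults."""
    record = {}
    for i, field in enumerate(schema):
        if i < len(values) and values[i] is not None:
            record[field] = values[i]
        else:
            record[field] = defaults[field]
    return record

print(expand_row([1, "Alice"], schema, defaults))
# {'id': 1, 'name': 'Alice', 'status': 'active', 'role': 'user'}
print(expand_row([3, "Charlie", "inactive", "admin"], schema, defaults))
# {'id': 3, 'name': 'Charlie', 'status': 'inactive', 'role': 'admin'}
```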
Tabular Structure
ATON represents homogeneous collections (arrays of objects with the same structure) in a tabular format. Each row contains only the values, with the structure defined by the schema.
Entity Declaration:
entity_name(count):
value1, value2, value3
value1, value2, value3
...
This approach eliminates the need to repeat field names for every record, resulting in substantial token savings.
Native Relationships
ATON supports explicit relationships between entities using the -> notation, allowing you to reference entities in other collections directly.
Syntax:
->collection_name[entity_id]
Example:
@schema[order_id:int, customer_ref:ref, amount:float]
orders(2):
1001, ->customers[customer_42], 299.99
1002, ->customers[customer_17], 149.50
This creates an explicit link between orders and customers, making relationships clear to both humans and LLMs.
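The documentation above does not specify how the decoder represents references in Python; assuming they come back as plain "->collection[id]" strings, a hypothetical post-decode resolution helper might look like this (resolve_ref and REF_PATTERN are illustrative names, not part of the library):

```python
import re

REF_PATTERN = re.compile(r"->(\w+)\[(\w+)\]")

def resolve_ref(ref: str, data: dict, id_field: str):
    """Look up a '->collection[entity_id]' reference in decoded data."""
    m = REF_PATTERN.fullmatch(ref)
    if m is None:
        raise ValueError(f"not a reference: {ref!r}")
    collection, entity_id = m.groups()
    # Linear scan over the target collection for the matching id
    for item in data[collection]:
        if str(item.get(id_field)) == entity_id:
            return item
    raise KeyError(f"{entity_id!r} not found in {collection!r}")

data = {"customers": [{"customer_id": "customer_42", "name": "Acme"}]}
print(resolve_ref("->customers[customer_42]", data, "customer_id"))
# {'customer_id': 'customer_42', 'name': 'Acme'}
```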
Features
Type Safety
ATON provides explicit type information through schema declarations, enabling:
- Validation: Verify data conforms to expected types
- Auto-completion: IDEs can provide intelligent suggestions
- Documentation: Schema serves as self-documenting format
- Type Inference: Automatic type detection during encoding
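Because the schema carries explicit types, validation is straightforward. A minimal validator can be sketched like this (the ATON_TYPES mapping and validate helper are illustrative, not part of the library's API):

```python
# Mapping from ATON type names to Python types (illustrative)
ATON_TYPES = {"int": int, "float": float, "str": str, "bool": bool,
              "arr": list, "obj": dict}

def validate(record: dict, schema: dict) -> None:
    """Raise TypeError if a record does not conform to its schema."""
    for field, tname in schema.items():
        value = record[field]
        # bool is a subclass of int in Python, so reject it for int fields
        if tname == "int" and isinstance(value, bool):
            raise TypeError(f"{field}: expected int, got bool")
        if not isinstance(value, ATON_TYPES[tname]):
            raise TypeError(f"{field}: expected {tname}, got {type(value).__name__}")

validate({"id": 1, "name": "Alice", "active": True},
         {"id": "int", "name": "str", "active": "bool"})  # passes silently
```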
Human Readability
Unlike binary formats or highly compressed representations, ATON maintains excellent readability:
- Clean, structured layout
- Self-documenting through schemas
- Easy to inspect and debug
- Suitable for version control systems
- Can be manually edited when necessary
Zero Data Loss
ATON guarantees perfect round-trip encoding and decoding:
encoder = ATONEncoder()
decoder = ATONDecoder()
original = {"data": [{"id": 1, "value": 3.14159}]}
aton = encoder.encode(original)
recovered = decoder.decode(aton)
assert original == recovered # Always True
This makes ATON suitable for:
- Data persistence
- Inter-service communication
- Backup and restore operations
- Data migration
Configuration Flexibility
ATON encoders support multiple configuration options to suit different use cases:
encoder = ATONEncoder(
optimize=True, # Enable all optimizations
include_schema=True, # Generate @schema declarations
include_defaults=True, # Generate @defaults and omit values
min_items=1 # Minimum array size for optimization
)
Performance
Token Efficiency Comparison
ATON consistently achieves 50-60% token reduction across various data structures compared to JSON.
Example Dataset: Product Catalog (20 items)
| Format | Size | Tokens | Reduction |
|---|---|---|---|
| JSON | 1,847 bytes | 462 tokens | 0% (baseline) |
| ATON | 823 bytes | 206 tokens | 55.4% |
Example Dataset: User Records (100 items)
| Format | Size | Tokens | Reduction |
|---|---|---|---|
| JSON | 8,932 bytes | 2,233 tokens | 0% (baseline) |
| ATON | 3,891 bytes | 973 tokens | 56.4% |
Example Dataset: RAG System (50 chunks)
| Format | Size | Tokens | Reduction |
|---|---|---|---|
| JSON | 15,400 bytes | 3,850 tokens | 0% (baseline) |
| ATON | 6,600 bytes | 1,650 tokens | 57.1% |
Cost Savings
Based on illustrative LLM API pricing (GPT-4: $0.03 per 1K input tokens) and a 56% average token reduction:
| Daily Volume | JSON Cost | ATON Cost | Annual Savings |
|---|---|---|---|
| 1M tokens | $30 | $13.20 | $6,132 |
| 10M tokens | $300 | $132 | $61,320 |
| 100M tokens | $3,000 | $1,320 | $613,200 |
| 1B tokens | $30,000 | $13,200 | $6,132,000 |
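The table's figures follow from a simple calculation, sketched below using the $0.03 per 1K token price and 56% average reduction stated above:

```python
PRICE_PER_1K_TOKENS = 0.03  # GPT-4 input pricing used in the table
REDUCTION = 0.56            # average ATON token reduction

def annual_savings(daily_tokens: int) -> float:
    """Annual dollar savings from cutting daily token volume by REDUCTION."""
    daily_json_cost = daily_tokens / 1000 * PRICE_PER_1K_TOKENS
    return daily_json_cost * REDUCTION * 365

savings = annual_savings(1_000_000)  # ~$6,132/year, matching the first table row
```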
Encoding/Decoding Performance
ATON maintains comparable performance to JSON for encoding and decoding operations:
Benchmark Results (1,000 iterations, Python 3.11)
| Operation | JSON | ATON | Difference |
|---|---|---|---|
| Encode (10 items) | 0.42ms | 0.51ms | +21% |
| Decode (10 items) | 0.38ms | 0.44ms | +16% |
| Encode (100 items) | 3.21ms | 3.67ms | +14% |
| Decode (100 items) | 2.89ms | 3.12ms | +8% |
The slight overhead in encoding/decoding is negligible compared to the token savings during LLM processing.
API Reference
ATONEncoder
class ATONEncoder:
def __init__(
self,
optimize: bool = True,
include_schema: bool = True,
include_defaults: bool = True,
min_items: int = 1
)
Parameters:
- optimize (bool): Enable optimization features. Default: True
- include_schema (bool): Generate @schema declarations. Default: True
- include_defaults (bool): Generate @defaults and omit matching values. Default: True
- min_items (int): Minimum number of items in an array to apply optimizations. Default: 1
Methods:
encode(data: Dict[str, Any]) -> str
Encodes a Python dictionary to ATON format string.
Parameters:
data: Dictionary containing arrays of homogeneous objects
Returns:
- ATON formatted string
Raises:
TypeError: If data structure is invalid
Example:
encoder = ATONEncoder()
data = {"products": [{"id": 1, "name": "Widget"}]}
aton = encoder.encode(data)
estimate_tokens(text: str) -> int
Estimates the number of tokens in a text string using a rough approximation (4 characters per token).
Parameters:
text: Input text string
Returns:
- Estimated token count (integer)
Example:
encoder = ATONEncoder()
token_count = encoder.estimate_tokens(aton_string)
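The heuristic itself is simple enough to sketch (assuming integer division with a floor of one token; the library's exact rounding may differ):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic from the docs: ~4 characters per token
    return max(1, len(text) // 4)

print(estimate_tokens("x" * 40))  # 10
```

For billing-accurate counts against a specific model, an exact tokenizer should be used instead of this approximation.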
ATONDecoder
class ATONDecoder:
def __init__(self)
Methods:
decode(aton_str: str) -> Dict[str, Any]
Decodes an ATON format string to a Python dictionary.
Parameters:
aton_str: ATON formatted string
Returns:
- Python dictionary with original data structure
Raises:
- ValueError: If the ATON string is malformed
- SyntaxError: If the schema or data format is invalid
Example:
decoder = ATONDecoder()
data = decoder.decode(aton_string)
Advanced Usage
Custom Configuration Profiles
Production Profile (Maximum Savings)
encoder = ATONEncoder(
optimize=True,
include_schema=True,
include_defaults=True,
min_items=1
)
Use for production deployments where token efficiency is critical.
Development Profile (Easy Debugging)
encoder = ATONEncoder(
optimize=False,
include_schema=True,
include_defaults=False,
min_items=1
)
Use during development when you want explicit values in every record for easier inspection.
Minimal Profile (Maximum Compression)
encoder = ATONEncoder(
optimize=True,
include_schema=False,
include_defaults=True,
min_items=1
)
Use when schema is known externally and maximum compression is required.
Working with Complex Data Structures
Nested Objects
data = {
"transactions": [
{
"id": 1,
"metadata": {"ip": "192.168.1.1", "device": "mobile"},
"amount": 99.99
}
]
}
encoder = ATONEncoder()
aton = encoder.encode(data)
Output:
@schema[id:int, metadata:obj, amount:float]
transactions(1):
1, {ip:"192.168.1.1",device:"mobile"}, 99.99
Arrays
data = {
"users": [
{
"id": 1,
"name": "Alice",
"permissions": ["read", "write", "admin"]
}
]
}
encoder = ATONEncoder()
aton = encoder.encode(data)
Output:
@schema[id:int, name:str, permissions:arr]
users(1):
1, "Alice", ["read","write","admin"]
Relationships
data = {
"documents": [
{"doc_id": "doc_001", "title": "Report"},
{"doc_id": "doc_002", "title": "Analysis"}
],
"chunks": [
{"chunk_id": "ch_001", "doc_id": "doc_001", "content": "..."},
{"chunk_id": "ch_002", "doc_id": "doc_001", "content": "..."},
{"chunk_id": "ch_003", "doc_id": "doc_002", "content": "..."}
]
}
Output:
@schema[doc_id:str, title:str]
documents(2):
"doc_001", "Report"
"doc_002", "Analysis"
@schema[chunk_id:str, doc_id:str, content:str]
chunks(3):
"ch_001", "doc_001", "..."
"ch_002", "doc_001", "..."
"ch_003", "doc_002", "..."
Token Comparison Workflow
import json
from aton import ATONEncoder
encoder = ATONEncoder(optimize=True)
# Your data
data = {"items": [{"id": i, "value": i*10} for i in range(100)]}
# JSON representation
json_str = json.dumps(data)
json_tokens = encoder.estimate_tokens(json_str)
# ATON representation
aton_str = encoder.encode(data)
aton_tokens = encoder.estimate_tokens(aton_str)
# Calculate savings
reduction = (1 - aton_tokens / json_tokens) * 100
saved_tokens = json_tokens - aton_tokens
print(f"JSON: {json_tokens} tokens")
print(f"ATON: {aton_tokens} tokens")
print(f"Reduction: {reduction:.1f}%")
print(f"Saved: {saved_tokens} tokens")
Use Cases
RAG (Retrieval-Augmented Generation) Systems
ATON is particularly effective for RAG systems where document chunks and metadata must be efficiently passed to LLMs.
Scenario: Document retrieval system with 50 chunks
Traditional JSON Approach:
- Average: 3,850 tokens per query
- Cost per 1,000 queries: $115.50
ATON Approach:
- Average: 1,650 tokens per query (57% reduction)
- Cost per 1,000 queries: $49.50
- Annual Savings (1,000 queries/day): $24,090
Example Structure:
@schema[chunk_id:str, doc_id:ref, page:int, confidence:float, content:str]
@defaults[confidence:0.95]
chunks(50):
"ch_001", ->documents[doc_123], 1, , "Content here..."
"ch_002", ->documents[doc_123], 2, 0.98, "More content..."
...
Multi-Agent Systems
Efficient state management for multiple AI agents communicating with each other.
Scenario: 10 agents with frequent state updates
Benefits:
- Reduced message sizes between agents
- Faster state synchronization
- Lower bandwidth requirements
- Clearer agent relationships
Example Structure:
@schema[agent_id:str, type:str, status:str, task_ref:ref, metrics:obj]
@defaults[status:"active", type:"processor"]
agents(10):
"agent_001", , , ->tasks[task_42], {cpu:45,mem:2048}
"agent_002", "analyzer", "busy", ->tasks[task_43], {cpu:78,mem:4096}
...
E-commerce Product Catalogs
Efficient product data management for LLM-powered recommendation systems.
Scenario: 1,000 products with detailed attributes
- Traditional JSON: ~140,000 tokens
- ATON Format: ~62,000 tokens
- Reduction: 55.7%
Use Case Benefits:
- More products fit in context window
- Faster product search and filtering
- Lower API costs for recommendations
- Better performance for catalog updates
Time-Series Data Analytics
Efficient representation of sensor data, metrics, and logs.
Scenario: IoT sensors reporting every minute (1,440 readings/day)
Benefits:
- Compact representation of repeated structure
- Easy addition of new sensor types
- Efficient querying by LLMs
- Reduced storage requirements
Example Structure:
@schema[timestamp:datetime, sensor_id:str, temperature:float, humidity:float, status:str]
@defaults[status:"normal"]
readings(1440):
2025-11-18T00:00:00Z, "sensor_01", 22.5, 45.2
2025-11-18T00:01:00Z, "sensor_01", 22.6, 45.1
...
API Response Optimization
Reduce bandwidth and improve response times for LLM-powered APIs.
Scenario: API serving 10M requests/day
Traditional JSON Response:
- Average size: 2.1 KB per response
- Total daily: 21 GB
- Token count: ~525 per response
ATON Response:
- Average size: 0.92 KB per response
- Total daily: 9.2 GB
- Token count: ~230 per response
Benefits:
- 56% bandwidth reduction
- Faster response times
- Lower cloud egress costs
- Improved API scalability
Technical Details
Format Specification
Schema Declaration
@schema[field1:type1, field2:type2, ..., fieldN:typeN]
Rules:
- Must appear before entity declaration
- Fields defined in order
- Types must be valid ATON types
- Whitespace around colons and commas is optional
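A parser for the schema line can be sketched directly from these rules (an illustrative sketch, not the library's parser; parse_schema is a hypothetical name):

```python
import re

def parse_schema(line: str):
    """Parse an '@schema[f1:t1, ...]' line into ordered (field, type) pairs.
    Whitespace around colons and commas is optional, per the rules above."""
    m = re.fullmatch(r"@schema\[(.*)\]", line.strip())
    if m is None:
        raise ValueError(f"not a schema declaration: {line!r}")
    pairs = []
    for part in m.group(1).split(","):
        field, _, tname = part.strip().partition(":")
        pairs.append((field.strip(), tname.strip()))
    return pairs

print(parse_schema("@schema[id:int, name:str, active:bool]"))
# [('id', 'int'), ('name', 'str'), ('active', 'bool')]
```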
Defaults Declaration
@defaults[field1:value1, field2:value2, ..., fieldN:valueN]
Rules:
- Must appear after schema, before entity data
- Only fields defined in schema can have defaults
- String values must be quoted
- Boolean values are lowercase (true/false)
Entity Declaration
entity_name(count):
value1, value2, ..., valueN
value1, value2, ..., valueN
Rules:
- Entity name must be alphanumeric (+ underscore)
- Count must match number of data rows
- Each row must have correct number of values
- Empty values (defaults) represented by empty string between commas
- Values must conform to schema types
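The header and count rules can be checked with a short sketch (parse_entity_block and HEADER are hypothetical names, not the library's parser):

```python
import re

HEADER = re.compile(r"(\w+)\((\d+)\):")

def parse_entity_block(lines):
    """Split an entity block into (name, rows), verifying the declared count."""
    m = HEADER.fullmatch(lines[0].strip())
    if m is None:
        raise ValueError(f"bad entity header: {lines[0]!r}")
    name, count = m.group(1), int(m.group(2))
    rows = [ln.strip() for ln in lines[1:] if ln.strip()]
    if len(rows) != count:
        raise ValueError(f"{name}: declared {count} rows, found {len(rows)}")
    return name, rows

block = ['users(2):', '  1, "Alice"', '  2, "Bob"']
print(parse_entity_block(block))
# ('users', ['1, "Alice"', '2, "Bob"'])
```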
Value Formatting
Strings: Enclosed in double quotes, escaped quotes allowed
"simple string"
"string with \"quotes\""
Numbers: No quotes, decimal notation
42
3.14159
-17.5
Booleans: Lowercase, no quotes
true
false
Arrays: Square brackets, comma-separated
["item1","item2","item3"]
[1,2,3,4,5]
Objects: Curly braces, colon-separated key:value pairs
{key1:"value1",key2:42,key3:true}
References: Arrow notation pointing to collection and ID
->collection_name[entity_id]
Datetime: ISO 8601 format
2025-11-18T10:30:00Z
2025-11-18T10:30:00+01:00
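Scalar value parsing follows mechanically from these formatting rules. The sketch below covers the scalar types only (parse_value is a hypothetical helper; empty slots are returned as None so a caller can substitute the declared default):

```python
def parse_value(token: str, tname: str):
    """Parse a single ATON scalar value given its schema type name."""
    token = token.strip()
    if token == "":
        return None  # empty slot between commas: fall back to the default
    if tname == "str":
        return token[1:-1].replace('\\"', '"')  # strip quotes, unescape
    if tname == "int":
        return int(token)
    if tname == "float":
        return float(token)
    if tname == "bool":
        return token == "true"
    if tname == "datetime":
        return token  # keep as ISO 8601 string
    raise ValueError(f"unsupported scalar type: {tname}")

print(parse_value('"Alice"', "str"))   # Alice
print(parse_value("3.14159", "float")) # 3.14159
```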
Tokenization Efficiency
ATON achieves superior tokenization through several mechanisms:
1. Eliminates Key Repetition
- JSON: Every object repeats all keys
- ATON: Keys declared once in schema
2. Reduces Syntax Overhead
- JSON: {"key": "value"} = 5 tokens
- ATON: "value" = 1 token
3. Leverages Default Values
- JSON: Must state every value explicitly
- ATON: Omit values matching defaults
4. Tabular Layout
- JSON: Nested structures with brackets
- ATON: Flat row structure
Comparison with Other Formats
ATON vs JSON
| Aspect | JSON | ATON |
|---|---|---|
| Token Efficiency | Baseline | 50-60% reduction |
| Type Safety | Implicit | Explicit schemas |
| Human Readable | Yes | Yes |
| Default Values | No | Yes |
| Relationships | Implicit | Explicit |
| Browser Support | Native | Requires parser |
| Ecosystem | Mature | Emerging |
When to use ATON:
- LLM-intensive applications
- Token cost is significant
- Data has repetitive structure
- Type safety is important
When to use JSON:
- Browser-based applications
- Public REST APIs
- Existing tooling required
- Single small objects
ATON vs Protocol Buffers
| Aspect | Protocol Buffers | ATON |
|---|---|---|
| Token Efficiency | N/A (binary) | 50-60% vs JSON |
| Human Readable | No | Yes |
| Schema Required | Yes | Optional |
| LLM Optimization | No | Yes |
| Type Safety | Strong | Strong |
When to use ATON:
- LLM applications
- Human inspection needed
- Debugging required
- Text-based workflows
When to use Protocol Buffers:
- Binary protocols
- Maximum compression
- No human inspection
- Non-LLM services
ATON vs CSV
| Aspect | CSV | ATON |
|---|---|---|
| Token Efficiency | High | Higher |
| Type Safety | No | Yes |
| Nested Data | No | Yes |
| Relationships | No | Yes |
| Multiple Entities | No | Yes |
When to use ATON:
- Complex data structures
- Type safety required
- Multiple related entities
- LLM applications
When to use CSV:
- Simple tabular data
- Excel compatibility
- Single flat entity
- Data analysis tools
Examples
Example 1: Basic Product Catalog
Python Code:
from aton import ATONEncoder
encoder = ATONEncoder(optimize=True)
data = {
"products": [
{"id": 1, "name": "Laptop", "price": 999.99, "stock": 15, "category": "electronics"},
{"id": 2, "name": "Mouse", "price": 29.99, "stock": 150, "category": "electronics"},
{"id": 3, "name": "Desk", "price": 299.99, "stock": 8, "category": "furniture"},
{"id": 4, "name": "Chair", "price": 199.99, "stock": 12, "category": "furniture"}
]
}
aton = encoder.encode(data)
print(aton)
Output:
@schema[id:int, name:str, price:float, stock:int, category:str]
products(4):
1, "Laptop", 999.99, 15, "electronics"
2, "Mouse", 29.99, 150, "electronics"
3, "Desk", 299.99, 8, "furniture"
4, "Chair", 199.99, 12, "furniture"
Token Comparison:
- JSON: 142 tokens
- ATON: 67 tokens
- Reduction: 52.8%
Example 2: User Management with Defaults
Python Code:
from aton import ATONEncoder
encoder = ATONEncoder(optimize=True)
data = {
"users": [
{"id": 1, "username": "alice", "role": "admin", "active": True, "verified": True},
{"id": 2, "username": "bob", "role": "user", "active": True, "verified": True},
{"id": 3, "username": "charlie", "role": "user", "active": True, "verified": False},
{"id": 4, "username": "diana", "role": "user", "active": False, "verified": True}
]
}
aton = encoder.encode(data)
print(aton)
Output:
@schema[id:int, username:str, role:str, active:bool, verified:bool]
@defaults[role:"user", active:true, verified:true]
users(4):
1, "alice", "admin"
2, "bob"
3, "charlie", , , false
4, "diana", , false
Token Comparison:
- JSON: 168 tokens
- ATON: 58 tokens
- Reduction: 65.5%
Example 3: RAG System Documents and Chunks
Python Code:
from aton import ATONEncoder
encoder = ATONEncoder(optimize=True)
data = {
"documents": [
{"doc_id": "doc_001", "filename": "report.pdf", "pages": 25, "processed": True},
{"doc_id": "doc_002", "filename": "analysis.pdf", "pages": 40, "processed": True}
],
"chunks": [
{"chunk_id": "ch_001", "doc_id": "doc_001", "page": 1, "content": "Executive summary..."},
{"chunk_id": "ch_002", "doc_id": "doc_001", "page": 2, "content": "Introduction..."},
{"chunk_id": "ch_003", "doc_id": "doc_002", "page": 1, "content": "Methodology..."}
]
}
aton = encoder.encode(data)
print(aton)
Output:
@schema[doc_id:str, filename:str, pages:int, processed:bool]
@defaults[processed:true]
documents(2):
"doc_001", "report.pdf", 25
"doc_002", "analysis.pdf", 40
@schema[chunk_id:str, doc_id:str, page:int, content:str]
chunks(3):
"ch_001", "doc_001", 1, "Executive summary..."
"ch_002", "doc_001", 2, "Introduction..."
"ch_003", "doc_002", 1, "Methodology..."
Token Comparison:
- JSON: 189 tokens
- ATON: 92 tokens
- Reduction: 51.3%
Testing
Running Tests
# Install with dev dependencies
pip install aton-format[dev]
# Run all tests
pytest tests/
# Run with coverage
pytest tests/ --cov=aton
# Run specific test file
pytest tests/test_encoder.py
# Run with verbose output
pytest tests/ -v
Test Structure
tests/
├── test_encoder.py # Encoder functionality tests
├── test_decoder.py # Decoder functionality tests
├── test_roundtrip.py # End-to-end round-trip tests
└── test_performance.py # Performance benchmarks
Writing Custom Tests
import pytest
from aton import ATONEncoder, ATONDecoder
def test_custom_data_structure():
encoder = ATONEncoder(optimize=True)
decoder = ATONDecoder()
data = {
"items": [
{"id": 1, "value": "test"},
{"id": 2, "value": "example"}
]
}
# Encode
aton = encoder.encode(data)
# Verify schema is present
assert "@schema" in aton
# Decode
result = decoder.decode(aton)
# Verify round-trip
assert result == data
Contributing
We welcome contributions to ATON! Here's how you can help:
Reporting Issues
- Use GitHub Issues for bug reports and feature requests
- Provide minimal reproducible examples
- Include Python version and ATON version
- Describe expected vs actual behavior
Pull Requests
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes
- Add tests for new functionality
- Ensure all tests pass: pytest tests/
- Follow PEP 8 style guidelines
- Commit with clear messages: git commit -m "Add amazing feature"
- Push to your fork: git push origin feature/amazing-feature
- Open a Pull Request
Code Style
- Follow PEP 8 conventions
- Use type hints where appropriate
- Add docstrings to all public functions
- Keep line length to 100 characters maximum
- Use meaningful variable names
Testing Requirements
- All new features must include tests
- Maintain or improve code coverage
- Tests must pass on Python 3.8+
- Include both positive and negative test cases
License
MIT License
Copyright (c) 2025 Stefano D'Agostino
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Links
- Web: https://www.atonformat.com
- GitHub Repository: https://github.com/dagoSte/aton-format
- PyPI Package: https://pypi.org/project/aton-format/
- Documentation: https://www.atonformat.com/documentation.html
- Issue Tracker: https://github.com/dagoSte/aton-format/issues
Citation
If you use ATON in your research or project, please cite:
D'Agostino, S. (2025). ATON: Adaptive Token-Oriented Notation -
A Data Serialization Format Optimized for Large Language Models.
https://github.com/dagoSte/aton-format
ATON - Optimized for the age of Large Language Models