Arkadia Data Format (AK-DATA) - A versatile data serialization format optimized for AI applications.
Arkadia Data Format (AKD)
The High-Density, Token-Efficient Data Protocol for Large Language Models.
Arkadia Data Format (AKD) is a schema-first protocol designed specifically to optimize communication with LLMs. By stripping away redundant syntax (like repeated JSON keys) and enforcing strict typing, AKD offers up to 30% token savings, faster parsing, and a metadata layer invisible to your application logic but fully accessible to AI models.
This Python package includes the full core library and the akd CLI tool.
Key Features
- Token Efficiency: Reduces context window usage by replacing verbose JSON objects with dense Positional Records (Tuples).
- Type Safety: Enforces types (int, float, bool, string) explicitly in the schema before data reaches the LLM.
- Metadata Injection: Use #tags and $attributes to pass context (e.g., source confidence, deprecation warnings) to the LLM without polluting your data structure.
- Powerful CLI: Includes the akd terminal tool for encoding, decoding, and benchmarking files or streams.
- Zero Dependencies: Pure Python implementation, lightweight and fast.
Installation
Install directly from PyPI:
pip install arkadia-data
Quick Start (Library)
Basic Usage
import arkadia.data as akd
# 1. Encode: Python Dict -> AKD String
data = { "id": 1, "name": "Alice", "active": True }
encoded = akd.encode(data)
print(encoded)
# Output: <id:number,name:string,active:bool>(1,"Alice",true)
# 2. Decode: AKD String -> Python Dict
input_str = '<score:number>(98.5)'
result = akd.decode(input_str)
if not result.errors:
    print(result.node.value)  # 98.5
else:
    print("Errors:", result.errors)
CLI Usage
The Python package installs the akd (alias: ak-data) command globally.
USAGE:
akd / ak-data <command> [flags]
COMMANDS:
enc [ENCODE] Convert JSON/YAML to AK Data format
dec [DECODE] Parse AK Data format back to JSON
benchmark [BENCHMARK] Run performance and token usage tests
Examples
1. Pipe JSON to AKD (Compact Mode):
echo '{ "data": 2}' | akd enc - -c
# Output: <data:number>(2)
2. Decode AKD file to JSON:
akd dec payload.akd -f json
3. Run Benchmarks on a directory:
akd benchmark ./data_samples
Benchmarks
Why switch? Because every token counts. In the benchmark below, AKCD (Arkadia Compressed Data) uses the fewest tokens of the four formats tested, although JSON records the lowest time.
BENCHMARK SUMMARY:
JSON  6921 tok  0.15 ms
AKCD  5416 tok  4.40 ms
AKD   6488 tok  4.29 ms
TOON  8198 tok  2.36 ms
FORMAT TOKENS VS JSON
---------------------------------
AKCD 5416 -21.7%
AKD 6488 -6.3%
JSON 6921 +0.0%
TOON 8198 +18.5%
CONCLUSION: Switching to AKCD saves 1505 tokens (21.7%) compared to JSON.
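The percentages in the summary follow directly from the raw token counts. As a quick check, the sketch below recomputes them; the helper name pct_vs_json is hypothetical and is not part of the arkadia-data package.

```python
# Token counts copied from the benchmark summary above.
TOKENS = {"JSON": 6921, "AKCD": 5416, "AKD": 6488, "TOON": 8198}


def pct_vs_json(fmt):
    """Relative token change versus JSON, in percent (negative = fewer tokens)."""
    base = TOKENS["JSON"]
    return (TOKENS[fmt] - base) / base * 100


# Print the formats from fewest to most tokens, as in the summary table.
for fmt in sorted(TOKENS, key=TOKENS.get):
    print(f"{fmt:<5} {TOKENS[fmt]:>5} {pct_vs_json(fmt):+.1f}%")

# Absolute saving of AKCD over JSON.
saved = TOKENS["JSON"] - TOKENS["AKCD"]
print("saved:", saved)  # saved: 1505
```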
Syntax Specification
AKD separates structure (Schema) from content (Data).
1. Primitives
Primitive values are automatically typed. Strings are quoted, numbers and booleans are bare.
| Type | Input | Encoded Output |
|---|---|---|
| Integer | 123 | <number>123 |
| String | "hello" | <string>"hello" |
| Boolean | true | <bool>true |
| Null | null | <null>null |
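To make the mapping in the table concrete, here is a minimal sketch that reproduces it in plain Python. It is an illustration only, not the library's encoder: the function name encode_primitive and the string-escaping rule are assumptions.

```python
def encode_primitive(value):
    """Illustrative only: mirrors the primitives table, not the real encoder."""
    if value is None:
        return "<null>null"
    # bool must be tested before int, because True is an instance of int in Python.
    if isinstance(value, bool):
        return "<bool>" + ("true" if value else "false")
    if isinstance(value, (int, float)):
        return f"<number>{value}"
    if isinstance(value, str):
        # Assumption: backslashes and double quotes are backslash-escaped.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'<string>"{escaped}"'
    raise TypeError(f"not a primitive: {type(value).__name__}")


for sample in (123, "hello", True, None):
    print(encode_primitive(sample))
```

Checking bool before int matters here: without it, True would be encoded as a number.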
2. Schema Definition (@Type)
Define the structure once to avoid repeating keys.
/* Define a User type */
@User <
id: number,
name: string,
role: string
>
3. Data Structures
Positional Records (Tuples)
The most efficient way to represent objects. Values must match the schema order.
/* Schema: <x:number, y:number> */
(10, 20)
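The idea behind positional records can be sketched without the library: the keys are stated once, in schema order, and every record then carries only its values. The helper to_positional below is hypothetical and shows the principle, not the package's implementation.

```python
def to_positional(records):
    """Split uniform dicts into one key list (the schema order) and value tuples."""
    if not records:
        return [], []
    keys = list(records[0])
    rows = []
    for rec in records:
        # Positional data is only unambiguous if every record has the same layout.
        if list(rec) != keys:
            raise ValueError("all records must share the same keys in the same order")
        rows.append(tuple(rec[k] for k in keys))
    return keys, rows


points = [{"x": 10, "y": 20}, {"x": 30, "y": 40}]
keys, rows = to_positional(points)
print(keys)  # ['x', 'y']
print(rows)  # [(10, 20), (30, 40)]
```

The keys appear once regardless of how many records follow, which is where the saving over repeated JSON keys comes from.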
Named Records (Objects)
Flexible key-value pairs, similar to JSON, used when the schema is loose or the data is sparse.
{
id: 1,
name: "Admin"
}
Lists
Dense arrays. Can be homogeneous (for example, a list of strings) or mixed.
[ "active", "pending", "closed" ]
4. Metadata System
AKD allows you to inject metadata that is visible to the LLM but ignored by the parser when decoding back to your application.
Attributes ($key=value) & Tags (#flag)
@Product <
$version="2.0"
sku: string,
/* Tagging a field as deprecated */
#deprecated
legacy_id: int
>
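As a rough illustration of that separation, the sketch below splits the @Product schema above into the fields an application would see and the metadata intended for the LLM. It is a simplified line-based filter written for this example, not the AKD parser; the function name split_metadata and the one-item-per-line assumption are hypothetical.

```python
SCHEMA = """@Product <
$version="2.0"
sku: string,
/* Tagging a field as deprecated */
#deprecated
legacy_id: int
>"""


def split_metadata(schema_text):
    """Separate field definitions from $attributes and #tags (one item per line)."""
    fields, attributes, tags = [], {}, []
    for raw in schema_text.splitlines():
        line = raw.strip()
        if line.startswith("$"):
            key, _, value = line[1:].partition("=")
            attributes[key] = value.strip('"')
        elif line.startswith("#"):
            tags.append(line[1:])
        elif ":" in line and not line.startswith("/*"):
            name, _, type_name = line.rstrip(",").partition(":")
            fields.append((name.strip(), type_name.strip()))
    return fields, attributes, tags


fields, attributes, tags = split_metadata(SCHEMA)
print(fields)      # [('sku', 'string'), ('legacy_id', 'int')]
print(attributes)  # {'version': '2.0'}
print(tags)        # ['deprecated']
```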
Roadmap
- Binary Types: Hex (~[hex]1A...~) and Base64 (~[b64]...~) support.
- Pointers: Reference existing objects by ID (*User[1]).
- Ranges: Numeric range validation in schema (score: 0..100).
License
This project is licensed under the MIT License.