Python implementation of GCF (Graph Compact Format): token-optimized wire format for LLM tool responses
gcf-python
Python implementation of GCF (Graph Compact Format).
84% fewer tokens than JSON. 32% fewer than TOON. 100% LLM comprehension accuracy at 500 symbols, where JSON scores 66.7%.
Install
pip install gcf-python
Zero dependencies. Pure Python. Python 3.9+. Includes CLI.
CLI
gcf encode < payload.json # JSON to GCF
gcf decode < payload.gcf # GCF to JSON
gcf stats < payload.json # token comparison with visual bar
Payload: 50 symbols, 20 edges
JSON ██████████████████████████████ 4,200 tokens
GCF ████████░░░░░░░░░░░░░░░░░░░░░░ 1,150 tokens
Savings: 73% fewer tokens with GCF
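The savings figure and bar lengths in the sample output follow from the two token counts. A minimal sketch of that arithmetic, assuming a 30-character bar scaled against the larger count and rounded to the nearest block (the bar rule is inferred from the sample, not taken from the CLI source, and the helper names are illustrative):

```python
def savings_pct(baseline_tokens: int, tokens: int) -> float:
    """Percentage of tokens saved relative to the baseline."""
    return 100.0 * (baseline_tokens - tokens) / baseline_tokens


def bar(tokens: int, max_tokens: int, width: int = 30) -> str:
    """Render a fixed-width bar; the rounding rule is an assumption."""
    filled = round(width * tokens / max_tokens)
    return "█" * filled + "░" * (width - filled)


json_tokens, gcf_tokens = 4200, 1150
print("JSON", bar(json_tokens, json_tokens), f"{json_tokens:,} tokens")
print("GCF ", bar(gcf_tokens, json_tokens), f"{gcf_tokens:,} tokens")
print(f"Savings: {savings_pct(json_tokens, gcf_tokens):.0f}% fewer tokens with GCF")
```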
Library
Quick Start
from gcf import encode, Payload, Symbol, Edge
p = Payload(
    tool="context_for_task",
    token_budget=5000,
    tokens_used=1847,
    symbols=[
        Symbol(qualified_name="pkg.AuthMiddleware", kind="function", score=0.78, provenance="lsp_resolved", distance=0),
        Symbol(qualified_name="pkg.NewServer", kind="function", score=0.54, provenance="lsp_resolved", distance=1),
    ],
    edges=[
        Edge(source="pkg.NewServer", target="pkg.AuthMiddleware", edge_type="calls"),
    ],
)
output = encode(p)
Output:
GCF tool=context_for_task budget=5000 tokens=1847 symbols=2
## targets
@0 fn pkg.AuthMiddleware 0.78 lsp_resolved
## related
@1 fn pkg.NewServer 0.54 lsp_resolved
## edges
@0<@1 calls
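To make the sample output easier to read, here is a hand-rolled reader for exactly that text. It is a sketch, not the library's decoder: the layout rules (a `key=value` header, `## section` lines, `@N` ids, and `target<source` edge notation) are inferred only from the sample above and the Quick Start payload, and the function name `read_sample` is hypothetical.

```python
SAMPLE = """\
GCF tool=context_for_task budget=5000 tokens=1847 symbols=2
## targets
@0 fn pkg.AuthMiddleware 0.78 lsp_resolved
## related
@1 fn pkg.NewServer 0.54 lsp_resolved
## edges
@0<@1 calls
"""


def read_sample(text: str):
    """Hand-rolled reader for the sample above; NOT the library's decoder."""
    lines = text.strip().splitlines()
    # Header: "GCF key=value key=value ..."
    header = dict(field.split("=", 1) for field in lines[0].split()[1:])
    symbols, edges, section = {}, [], None
    for line in lines[1:]:
        if line.startswith("## "):
            section = line[3:]
        elif section == "edges":
            # "@0<@1 calls": @1 is the source and @0 the target,
            # matching Edge(source=pkg.NewServer, target=pkg.AuthMiddleware).
            pair, edge_type = line.split()
            target, source = pair.split("<")
            edges.append((symbols[source][1], symbols[target][1], edge_type))
        else:
            ref, kind, name, score, provenance = line.split()
            symbols[ref] = (kind, name, float(score), provenance, section)
    return header, symbols, edges


header, symbols, edges = read_sample(SAMPLE)
print(header["tool"], len(symbols), "symbols", len(edges), "edges")
print(edges)
```

In the sample, the symbol with distance=0 lands under targets and the one with distance=1 under related; whether that mapping holds in general is not stated here.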
Decode
from gcf import decode
p = decode(input_text)
print(p.tool, len(p.symbols), "symbols", len(p.edges), "edges")
Session Deduplication
Track transmitted symbols across multiple tool responses. Previously-sent symbols become bare references instead of full declarations:
from gcf import encode_with_session, Session, Payload, Symbol
sess = Session()
out1 = encode_with_session(payload1, sess) # full declarations
out2 = encode_with_session(payload2, sess) # reused symbols as "@N # previously transmitted"
By the 5th call in a session: 92.7% token savings vs JSON.
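The idea behind session deduplication can be illustrated with a toy model. This is a sketch under assumptions, not the library's Session: the class `ToySession`, the simplified declaration format (kind hard-coded to `fn`), and the choice to keep ids stable across calls are all illustrative.

```python
class ToySession:
    """Toy stand-in for session tracking; not the library's Session class."""

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}

    def encode(self, names: list[str]) -> list[str]:
        lines = []
        for name in names:
            if name in self._ids:
                # Already sent in an earlier response: emit a bare reference.
                lines.append(f"@{self._ids[name]} # previously transmitted")
            else:
                # First sighting: assign the next id and emit a full declaration.
                self._ids[name] = len(self._ids)
                lines.append(f"@{self._ids[name]} fn {name}")
        return lines


sess = ToySession()
first = sess.encode(["pkg.AuthMiddleware", "pkg.NewServer"])
second = sess.encode(["pkg.NewServer", "pkg.Logger"])  # pkg.Logger is a made-up name
print("\n".join(first))
print("\n".join(second))
```

The second response repeats only an id for pkg.NewServer, which is where the savings on later calls come from.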
Delta Encoding
When the consumer already has a prior context pack, send only what changed:
from gcf import encode_delta, DeltaPayload, Symbol, Edge
delta = DeltaPayload(
    tool="context_for_task",
    base_root="aaa111",
    new_root="bbb222",
    removed=[Symbol(qualified_name="pkg.OldFunc", kind="function")],
    added=[Symbol(qualified_name="pkg.NewFunc", kind="function", score=0.85, provenance="rwr")],
    delta_tokens=30,
    full_tokens=200,
)
output = encode_delta(delta)
81.2% savings on re-queries where the pack changed slightly.
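How the added and removed lists might be derived from two packs can be sketched with plain set arithmetic. The helper `diff_packs` is hypothetical and works on qualified names only; how the library or its callers actually compute a delta is not described here.

```python
def diff_packs(base: set[str], new: set[str]) -> tuple[list[str], list[str]]:
    """Return (added, removed) qualified names between two packs."""
    added = sorted(new - base)      # present now, absent before
    removed = sorted(base - new)    # present before, absent now
    return added, removed


base_pack = {"pkg.AuthMiddleware", "pkg.NewServer", "pkg.OldFunc"}
new_pack = {"pkg.AuthMiddleware", "pkg.NewServer", "pkg.NewFunc"}

added, removed = diff_packs(base_pack, new_pack)
print("added:", added)
print("removed:", removed)
```

Unchanged symbols never appear in either list, so a small edit to a large pack yields a small delta.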
Generic Encoding
Encode any Python value (not just graph payloads) into GCF tabular format:
from gcf import encode_generic
output = encode_generic({
    "employees": [
        {"id": 1, "name": "Alice", "department": "Engineering", "salary": 95000},
        {"id": 2, "name": "Bob", "department": "Sales", "salary": 72000},
    ],
})
Output:
## employees [2]{id,name,department,salary}
1|Alice|Engineering|95000
2|Bob|Sales|72000
Works on dicts, lists, and primitives. Lists of uniform dicts get tabular rows. Nested dicts use ## key section headers.
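The tabular layout for a list of uniform dicts can be reproduced with a few lines of plain Python. This is a toy encoder, assuming the header shape shown in the output above; it is not the library's encode_generic, the name `encode_table` is hypothetical, and it does not escape `|` inside values.

```python
def encode_table(key: str, rows: list[dict]) -> str:
    """Toy encoder for a list of uniform dicts; not the library's encode_generic."""
    fields = list(rows[0])
    if any(list(row) != fields for row in rows):
        raise ValueError("rows must share the same keys in the same order")
    columns = ",".join(fields)
    # Header: section name, row count, then the column names once.
    header = f"## {key} [{len(rows)}]{{{columns}}}"
    # Rows carry values only, so keys are not repeated per record.
    body = ["|".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])


table = encode_table("employees", [
    {"id": 1, "name": "Alice", "department": "Engineering", "salary": 95000},
    {"id": 2, "name": "Bob", "department": "Sales", "salary": 72000},
])
print(table)
```

Stating the keys once in the header instead of once per record is the main source of savings over JSON for uniform data.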
API
| Function | Description |
|---|---|
| encode(p: Payload) -> str | Encode a graph payload to GCF text |
| encode_generic(data: Any) -> str | Encode any value to GCF tabular format |
| decode(input_text: str) -> Payload | Parse GCF text back to a Payload |
| encode_with_session(p: Payload, s: Session) -> str | Encode with session deduplication |
| encode_delta(d: DeltaPayload) -> str | Encode a delta (added/removed only) |
| Session() | Create a new session tracker (thread-safe) |
Types
| Type | Purpose |
|---|---|
| Payload | Full GCF payload: tool, budget, symbols, edges, pack root |
| Symbol | Graph node: qualified name, kind, score, provenance, distance |
| Edge | Directed relationship: source, target, edge type |
| DeltaPayload | Diff between two packs: added/removed symbols and edges |
| Session | Thread-safe tracker for multi-call deduplication |
| KIND_ABBREV / KIND_EXPAND | Bidirectional kind abbreviation dicts |
Comprehension Eval
Rigorous 3-way benchmark (GCF vs TOON vs JSON) at 500 symbols, 200 edges. Six structured extraction questions sent to an LLM:
| Format | Accuracy | Tokens | vs JSON |
|---|---|---|---|
| GCF | 100% (6/6) | 11,090 | 79% fewer |
| TOON | 100% (6/6) | 16,378 | 69% fewer |
| JSON | 66.7% (4/6) | 53,341 | baseline |
JSON failed on counting tasks. GCF and TOON both achieved perfect accuracy, and GCF does it in 32% fewer tokens than TOON.
Token Efficiency (TOON's Own Benchmark)
Running TOON's benchmark harness with GCF inserted (their datasets, their tokenizer):
| Track | GCF | TOON | Result |
|---|---|---|---|
| Mixed-structure (nested, semi-uniform) | 169,554 | 227,896 | GCF 26% smaller (TOON 34% larger) |
| Flat-only (tabular) | 66,026 | 67,837 | GCF 3% smaller |
| Semi-uniform event logs | 107,269 | 154,032 | GCF 30% smaller (TOON 44% larger) |
GCF wins on every dataset except deeply nested config, where it trails by 75 tokens on a 618-token payload. On semi-uniform data, GCF uses 30% fewer tokens than TOON (equivalently, TOON uses 44% more).
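A percentage difference depends on which format is the denominator. The sketch below recomputes both directions from the token counts in the table above; the helper names are illustrative.

```python
def pct_fewer(a: int, b: int) -> float:
    """How many percent fewer tokens a uses than b (b is the denominator)."""
    return 100.0 * (b - a) / b


def pct_more(a: int, b: int) -> float:
    """How many percent more tokens a uses than b (b is the denominator)."""
    return 100.0 * (a - b) / b


# (GCF tokens, TOON tokens) per track, copied from the table.
tracks = {
    "Mixed-structure": (169_554, 227_896),
    "Flat-only": (66_026, 67_837),
    "Semi-uniform event logs": (107_269, 154_032),
}

for name, (gcf, toon) in tracks.items():
    print(
        f"{name}: GCF uses {pct_fewer(gcf, toon):.1f}% fewer tokens than TOON; "
        f"TOON uses {pct_more(toon, gcf):.1f}% more than GCF"
    )
```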
Reproducible: blackwell-systems/toon@gcf-comparison
Other Implementations
- Go: github.com/blackwell-systems/gcf-go
- TypeScript: github.com/blackwell-systems/gcf-typescript
- Specification: github.com/blackwell-systems/gcf
License
MIT