agent-first-data
Agent-First Data (AFDATA) — Suffix-driven output formatting and protocol templates for AI agents.
The field name is the schema: an agent reads latency_ms and knows the value is in milliseconds, reads api_key_secret and knows to redact it. No external schema needed.
Installation
pip install agent-first-data
Quick Example
A backup tool invoked from the CLI — flags, env vars, and config all use the same suffixes:
API_KEY_SECRET=sk-1234 cloudback --timeout-s 30 --max-file-size-bytes 10737418240 /data/backup.tar.gz
For CLI diagnostics, enable log categories explicitly:
--log startup,request,progress,retry,redirect
--verbose # shorthand for all categories
Without these flags, startup diagnostics should stay off by default.
The tool reads env vars, flags, and config — all with AFDATA suffixes — and can emit a startup diagnostic event:
from agent_first_data import *
import os
startup = build_json(
    "log",
    {
        "event": "startup",
        "config": {"timeout_s": 30, "max_file_size_bytes": 10737418240},
        "args": {"input_path": "/data/backup.tar.gz"},
        "env": {"API_KEY_SECRET": os.environ.get("API_KEY_SECRET")},
    },
    trace=None,
)
Three output formats, same data:
JSON: {"code":"log","event":"startup","args":{"input_path":"/data/backup.tar.gz"},"config":{"max_file_size_bytes":10737418240,"timeout_s":30},"env":{"API_KEY_SECRET":"***"}}
YAML:
code: "log"
event: "startup"
args:
  input_path: "/data/backup.tar.gz"
config:
  max_file_size: "10.0GB"
  timeout: "30s"
env:
  API_KEY: "***"
Plain: args.input_path=/data/backup.tar.gz code=log event=startup config.max_file_size=10.0GB config.timeout=30s env.API_KEY=***
--timeout-s → timeout_s → timeout: 30s. API_KEY_SECRET → API_KEY: "***". The suffix is the schema.
API Reference
Total: 13 public APIs (3 protocol builders + 4 output functions + 1 internal tool + 1 utility + 4 CLI helpers) and 2 types (OutputFormat and RedactionPolicy), plus AFDATA logging.
Protocol Builders (return dict)
Build AFDATA protocol structures. Each returns a plain dict suitable for transport payloads.
# Success (result)
build_json_ok(result: Any, trace: Any = None) -> dict
# Error (simple message, optional hint)
build_json_error(message: str, hint: str = None, trace: Any = None) -> dict
# Generic (any code + fields)
build_json(code: str, fields: Any, trace: Any = None) -> dict
Use case: structured protocol payloads (frameworks automatically serialize)
Example:
from agent_first_data import *
# Startup
startup = build_json(
    "log",
    {
        "event": "startup",
        "config": {"api_key_secret": "sk-123", "timeout_s": 30},
        "args": {"config_path": "config.yml"},
        "env": {"RUST_LOG": "info"},
    },
    trace=None,
)

# Success (always include trace)
response = build_json_ok(
    {"user_id": 123},
    trace={"duration_ms": 150, "source": "db"},
)

# Error
err = build_json_error("user not found", trace={"duration_ms": 5})

# Error with hint
err_hint = build_json_error(
    "wallet not found",
    hint="list wallets with: afpay wallet list",
    trace={"duration_ms": 5},
)

# Specific error code
not_found = build_json(
    "not_found",
    {"resource": "user", "id": 123},
    trace={"duration_ms": 8},
)
CLI/Log Output (return str)
Format values for CLI output and logs. output_json uses full _secret redaction by default. output_json_with supports explicit scoped policies. YAML and Plain always redact _secret and apply human-readable formatting.
output_json(value: Any) -> str # Single-line JSON, original keys, for programs/logs
output_json_with(value: Any, redaction_policy: RedactionPolicy) -> str
output_yaml(value: Any) -> str # Multi-line YAML, keys stripped, values formatted
output_plain(value: Any) -> str # Single-line logfmt, keys stripped, values formatted
class RedactionPolicy(enum.Enum):
    RedactionTraceOnly = "RedactionTraceOnly"
    RedactionNone = "RedactionNone"
Example:
from agent_first_data import *
data = {
    "user_id": 123,
    "api_key_secret": "sk-1234567890abcdef",
    "created_at_epoch_ms": 1738886400000,
    "file_size_bytes": 5242880,
}
# JSON (secrets redacted, original keys, raw values)
print(output_json(data))
# {"api_key_secret":"***","created_at_epoch_ms":1738886400000,"file_size_bytes":5242880,"user_id":123}
# YAML (keys stripped, values formatted, secrets redacted)
print(output_yaml(data))
# ---
# api_key: "***"
# created_at: "2025-02-07T00:00:00.000Z"
# file_size: "5.0MB"
# user_id: 123
# Plain logfmt (keys stripped, values formatted, secrets redacted)
print(output_plain(data))
# api_key=*** created_at=2025-02-07T00:00:00.000Z file_size=5.0MB user_id=123
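output_json_with pairs the same serialization with an explicit policy. A minimal sketch, assuming RedactionNone disables redaction entirely and RedactionTraceOnly restricts it to the trace block (the precise scope of each policy is an assumption, not confirmed above):
from agent_first_data import output_json_with, RedactionPolicy

payload = {"api_key_secret": "sk-123", "trace": {"token_secret": "t-1"}}
# No redaction at all; only for trusted sinks (assumed semantics)
print(output_json_with(payload, RedactionPolicy.RedactionNone))
# Redaction limited to the trace block (assumed semantics)
print(output_json_with(payload, RedactionPolicy.RedactionTraceOnly))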
Internal Tools
internal_redact_secrets(value: Any) -> None # Manually redact secrets in-place
Most users don't need this. Output functions automatically protect secrets.
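If you do need it, a minimal sketch of manual in-place redaction, with the post-condition inferred from the redaction behavior shown above:
from agent_first_data import internal_redact_secrets

payload = {"api_key_secret": "sk-123", "user_name": "alice"}
internal_redact_secrets(payload)  # mutates in place, returns None
print(payload)  # expected: {'api_key_secret': '***', 'user_name': 'alice'}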
Utility Functions
parse_size(s: str) -> int | None # Parse "10M" → bytes
Example:
from agent_first_data import *
assert parse_size("10M") == 10485760
assert parse_size("1.5K") == 1536
assert parse_size("512") == 512
CLI Helpers (for tools built on AFDATA)
Shared helpers that prevent flag-parsing drift between CLI tools. Use these instead of reimplementing --output and --log handling in each tool.
class OutputFormat(enum.Enum): # JSON="json", YAML="yaml", PLAIN="plain"
cli_parse_output(s: str) -> OutputFormat # Parse --output flag; raises ValueError on unknown
cli_parse_log_filters(entries: list[str]) -> list[str] # Normalize --log: trim, lowercase, dedup, remove empty
cli_output(value: Any, format: OutputFormat) -> str # Dispatch to output_json/yaml/plain
build_cli_error(message: str, hint: str = None) -> dict # {code:"error", error_code:"invalid_request", hint?, retryable:False, trace:{duration_ms:0}}
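For instance, the --log normalization described above should behave roughly as follows (order preservation after dedup is an assumption):
assert cli_parse_log_filters([" Startup", "retry", "Retry", ""]) == ["startup", "retry"]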
Canonical pattern — parse all flags before doing work, emit JSONL errors to stdout:
import sys
from agent_first_data import (
    OutputFormat, cli_parse_output, cli_parse_log_filters,
    cli_output, build_cli_error, output_json,
)

try:
    fmt = cli_parse_output(args.output)
except ValueError as e:
    print(output_json(build_cli_error(str(e))))
    sys.exit(2)

log = cli_parse_log_filters(args.log.split(",") if args.log else [])
# ... do work ...
print(cli_output(result, fmt))
See examples/agent_cli.py for the complete working example (pytest examples/agent_cli.py).
Usage Examples
Example 1: REST API
from agent_first_data import *
from fastapi import FastAPI

app = FastAPI()

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    response = build_json_ok(
        {"user_id": user_id, "name": "alice"},
        trace={"duration_ms": 150, "source": "db"},
    )
    # API returns raw JSON — no output processing, no key stripping
    return response
Example 2: CLI Tool (Complete Lifecycle)
from agent_first_data import *
# 1. Startup
startup = build_json(
    "log",
    {
        "event": "startup",
        "config": {"api_key_secret": "sk-sensitive-key", "timeout_s": 30},
        "args": {"input_path": "data.json"},
        "env": {"RUST_LOG": "info"},
    },
    trace=None,
)
print(output_yaml(startup))
# ---
# code: "log"
# event: "startup"
# args:
#   input_path: "data.json"
# config:
#   api_key: "***"
#   timeout: "30s"
# env:
#   RUST_LOG: "info"
# 2. Progress
progress = build_json(
    "progress",
    {"current": 3, "total": 10, "message": "processing"},
    trace={"duration_ms": 1500},
)
print(output_plain(progress))
# code=progress current=3 message=processing total=10 trace.duration=1.5s
# 3. Result
result = build_json_ok(
    {
        "records_processed": 10,
        "file_size_bytes": 5242880,
        "created_at_epoch_ms": 1738886400000,
    },
    trace={"duration_ms": 3500, "source": "file"},
)
print(output_yaml(result))
# ---
# code: "ok"
# result:
#   created_at: "2025-02-07T00:00:00.000Z"
#   file_size: "5.0MB"
#   records_processed: 10
# trace:
#   duration: "3.5s"
#   source: "file"
Example 3: JSONL Output
from agent_first_data import *
result = build_json_ok(
    {"status": "success"},
    trace={"duration_ms": 250, "api_key_secret": "sk-123"},
)
# Print JSONL to stdout (secrets redacted, one JSON object per line)
# Channel policy: machine-readable protocol/log events must not use stderr.
print(output_json(result))
# {"code":"ok","result":{"status":"success"},"trace":{"api_key_secret":"***","duration_ms":250}}
Complete Suffix Example
from agent_first_data import *
data = {
    "created_at_epoch_ms": 1738886400000,
    "request_timeout_ms": 5000,
    "cache_ttl_s": 3600,
    "file_size_bytes": 5242880,
    "payment_msats": 50000000,
    "price_usd_cents": 9999,
    "success_rate_percent": 95.5,
    "api_key_secret": "sk-1234567890abcdef",
    "user_name": "alice",
    "count": 42,
}
# YAML output (keys stripped, values formatted, secrets redacted)
print(output_yaml(data))
# ---
# api_key: "***"
# cache_ttl: "3600s"
# count: 42
# created_at: "2025-02-07T00:00:00.000Z"
# file_size: "5.0MB"
# payment: "50000000msats"
# price: "$99.99"
# request_timeout: "5.0s"
# success_rate: "95.5%"
# user_name: "alice"
# Plain logfmt output (same transformations, single line)
print(output_plain(data))
# api_key=*** cache_ttl=3600s count=42 created_at=2025-02-07T00:00:00.000Z file_size=5.0MB payment=50000000msats price=$99.99 request_timeout=5.0s success_rate=95.5% user_name=alice
AFDATA Logging
AFDATA-compliant structured logging via Python's logging module. Every log line is formatted using the library's own output_json/output_plain/output_yaml functions. Span fields are carried via contextvars (async-safe), automatically flattened into each log line.
API
from agent_first_data import init_logging_json, init_logging_plain, init_logging_yaml
from agent_first_data.afdata_logging import AfdataHandler, get_logger, span
# Convenience initializers — set up the root logger with AFDATA output to stdout
init_logging_json(level="INFO") # Single-line JSONL (secrets redacted, original keys)
init_logging_plain(level="INFO") # Single-line logfmt (keys stripped, values formatted)
init_logging_yaml(level="INFO") # Multi-line YAML (keys stripped, values formatted)
# Low-level — create a handler for custom logger stacks
AfdataHandler(format="json") # format: "json" | "plain" | "yaml"
# Logger with default fields (returns logging.LoggerAdapter)
get_logger(name, **fields)
# Span context manager — adds fields to all log events within the block
span(**fields)
Setup
from agent_first_data import init_logging_json, init_logging_plain, init_logging_yaml
# JSON output for production (one JSONL line per event, secrets redacted)
init_logging_json("INFO")
# Plain logfmt for development (keys stripped, values formatted)
init_logging_plain("DEBUG")
# YAML for detailed inspection (multi-line, keys stripped, values formatted)
init_logging_yaml("DEBUG")
Log Output
Standard logging calls work unchanged. Output format depends on the init function used.
import logging
logger = logging.getLogger("myapp")
logger.info("Server started")
# JSON: {"timestamp_epoch_ms":1739000000000,"message":"Server started","target":"myapp","code":"info"}
# Plain: code=info message="Server started" target=myapp timestamp_epoch_ms=1739000000000
# YAML: ---
# code: "info"
# message: "Server started"
# target: "myapp"
# timestamp_epoch_ms: 1739000000000
logger.warning("DNS lookup failed")
# JSON: {"timestamp_epoch_ms":...,"message":"DNS lookup failed","target":"myapp","code":"warn"}
Span Support
Use the span context manager to add fields to all log events within the block. Spans nest and work with both sync and async code.
from agent_first_data import span
with span(request_id="abc-123"):
logger.info("Processing")
# {"timestamp_epoch_ms":...,"message":"Processing","target":"myapp","request_id":"abc-123","code":"info"}
with span(step="validate"):
logger.info("Validating input")
# {"timestamp_epoch_ms":...,"message":"Validating input","target":"myapp","request_id":"abc-123","step":"validate","code":"info"}
Logger with Default Fields
Use get_logger for per-component fields that appear on every log line:
from agent_first_data import get_logger
logger = get_logger("myapp.auth", component="auth")
logger.info("Token verified")
# {"timestamp_epoch_ms":...,"message":"Token verified","target":"myapp.auth","component":"auth","code":"info"}
Custom Code Override
The code field defaults to the log level. Override with an explicit field:
from agent_first_data import get_logger
logger = get_logger("myapp")
logger.info("Server ready", extra={"code": "log", "event": "startup"})
# {"timestamp_epoch_ms":...,"message":"Server ready","target":"myapp","code":"log","event":"startup"}
Output Fields
Every log line contains:
| Field | Type | Description |
|---|---|---|
| timestamp_epoch_ms | number | Unix milliseconds |
| message | string | Log message |
| target | string | Logger name |
| code | string | Level (debug/info/warn/error) or explicit override |
| span fields | any | From span() context manager |
| event fields | any | From extra= or get_logger fields |
Log Output Formats
All three formats use the library's own output functions, so AFDATA suffix processing applies to log fields too:
| Format | Function | Keys | Values | Use case |
|---|---|---|---|---|
| JSON | init_logging_json | original (with suffix) | raw | production, log aggregation |
| Plain | init_logging_plain | stripped | formatted | development, compact scanning |
| YAML | init_logging_yaml | stripped | formatted | debugging, detailed inspection |
All formats automatically redact _secret fields in log output.
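Putting the logging pieces together: a logger with a default field plus suffixed event fields passed via extra. The output line is inferred from the field table and redaction rules above, not taken from the library's docs:
from agent_first_data import init_logging_json, get_logger

init_logging_json("INFO")
logger = get_logger("myapp.pay", component="payments")
logger.info("Charge created", extra={"amount_usd_cents": 1299, "api_key_secret": "sk-live-1"})
# expected shape: {"timestamp_epoch_ms":...,"message":"Charge created","target":"myapp.pay",
#                  "component":"payments","amount_usd_cents":1299,"api_key_secret":"***","code":"info"}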
Output Formats
Three output formats for different use cases:
| Format | Structure | Keys | Values | Use case |
|---|---|---|---|---|
| JSON | single-line | original (with suffix) | raw | programs, logs |
| YAML | multi-line | stripped | formatted | human inspection |
| Plain | single-line logfmt | stripped | formatted | compact scanning |
All formats automatically redact _secret fields.
Supported Suffixes
- Duration: _ms, _s, _ns, _us, _minutes, _hours, _days
- Timestamps: _epoch_ms, _epoch_s, _epoch_ns, _rfc3339
- Size: _bytes (auto-scales to KB/MB/GB/TB), _size (config input, passed through)
- Currency: _msats, _sats, _btc, _usd_cents, _eur_cents, _jpy, _{code}_cents
- Other: _percent, _secret (auto-redacted in all formats)
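As a quick sanity check tying the size suffix back to parse_size, the parsed byte count feeds a _bytes field; the rendered value below is inferred from the earlier _bytes examples (5242880 → "5.0MB"):
from agent_first_data import parse_size, output_plain

n = parse_size("10M")                        # 10485760
print(output_plain({"file_size_bytes": n}))  # expected: file_size=10.0MB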
Repository
This package is part of the agent-first-data repository, which also contains:
- spec/ — Full AFDATA specification with suffix definitions, protocol format rules, and cross-language test fixtures
- skills/ — AI coding agent skill for working with AFDATA conventions
To run tests, clone the full repository (tests use shared cross-language fixtures from spec/fixtures/):
git clone https://github.com/cmnspore/agent-first-data
cd agent-first-data/python
python -m pytest
License
MIT