Budget‑constrained JSON preview renderer (Python bindings)

These details have not been verified by PyPI

Operating System
- OS Independent
Programming Language
- Python
- Python :: 3
- Rust

Project description

head/tail for JSON, YAML — but structure‑aware. Get a compact preview that shows both the shape and representative values of your data, all within a strict byte budget. (Just like head/tail, hson can also work with unstructured text files.)

Available as:

CLI (see Usage)
Python library (see Python Bindings)

Install

Using Cargo:

cargo install headson

Note: the CLI installs as hson. All examples below use hson ....

From source:

cargo build --release
target/release/hson --help

Features

Budgeted output: specify exactly how much you want to see
Output formats: auto | json | yaml | text
- Styles: strict | default | detailed
  - JSON family: strict → strict JSON, default → human‑friendly Pseudo, detailed → JS with inline comments
  - YAML: always YAML; strict has no comments, default uses “# …”, detailed uses “# N more …”
  - Text: prints raw lines. In default style, omissions are shown as a single line …; in detailed, as … N more lines …. strict omits array‑level summaries.
Multiple inputs: preview many files at once with a shared or per‑file budget
Fast: processes gigabyte‑scale files in seconds (mostly disk‑bound)
Available as a CLI app and as a Python library

Fits into command line workflows

If you’re comfortable with tools like head and tail, use hson when you want a quick, structured peek into a JSON file without dumping the entire thing.

head/tail operate on bytes/lines - their output is not optimized for tree structures
jq: you need to craft filters to preview large JSON files
hson: head/tail for trees—zero‑config by default; force text with -i text when you want raw lines

Usage

hson [FLAGS] [INPUT...]

INPUT (optional, repeatable): file path(s). If omitted, reads from stdin. Multiple input files are supported.
Prints the preview to stdout. On parse errors, exits non‑zero and prints an error to stderr.

Common flags:

-c, --bytes <BYTES>: per‑file output budget (bytes). For multiple inputs, default total budget is <BYTES> * number_of_inputs.
-u, --chars <CHARS>: per‑file output budget (Unicode code points). Behaves like --bytes but counts characters instead of bytes.
-C, --global-bytes <BYTES>: total output budget across all inputs. With --bytes, the effective total is the smaller of the two.
-f, --format <auto|json|yaml|text>: output format (default: auto).
- Auto: stdin → JSON family; filesets → per‑file based on extension (.json → JSON family, .yaml/.yml → YAML, unknown → Text).
-t, --template <strict|default|detailed>: output style (default: default).
- JSON family: strict → strict JSON; default → Pseudo; detailed → JS with inline comments.
- YAML: always YAML; style only affects comments (strict none, default “# …”, detailed “# N more …”).
-i, --input-format <json|yaml|text>: ingestion format (default: json). For filesets in auto format, ingestion is chosen by extensions.
-m, --compact: no indentation, no spaces, no newlines
--no-newline: single line output
--no-header: suppress fileset section headers (useful when embedding output in scripts)
--no-space: no space after : in objects
--indent <STR>: indentation unit (default: two spaces)
--string-cap <N>: max graphemes to consider per string (default: 500)
--head: prefer the beginning of arrays when truncating (keep first N). Strings are unaffected. Display styles place omission markers accordingly; strict JSON remains unannotated. Mutually exclusive with --tail.
--tail: prefer the end of arrays when truncating (keep last N). Strings are unaffected. Display styles place omission markers accordingly; strict JSON remains unannotated. Mutually exclusive with --head.
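
The budget arithmetic described for -c/--bytes and -C/--global-bytes can be sketched in a few lines of Python. This only illustrates the documented rules and is not hson's code: effective_total_bytes is a hypothetical helper, and letting a lone global cap apply by itself is an assumption drawn from the statement that the default scales by input count when no globals are set.

```python
from typing import Optional

DEFAULT_PER_FILE_BYTES = 500  # default per-file budget stated under Budget Modes


def effective_total_bytes(
    n_inputs: int,
    per_file: Optional[int] = None,
    global_cap: Optional[int] = None,
) -> int:
    """Total byte budget implied by --bytes and --global-bytes (hypothetical helper)."""
    if n_inputs < 1:
        raise ValueError("need at least one input")
    if per_file is None and global_cap is not None:
        # Assumption: with only a global cap, that cap alone bounds the output.
        return global_cap
    per_file_bytes = DEFAULT_PER_FILE_BYTES if per_file is None else per_file
    scaled = per_file_bytes * n_inputs  # <BYTES> * number_of_inputs
    # With both flags set, the smaller total wins.
    return scaled if global_cap is None else min(scaled, global_cap)


print(effective_total_bytes(3, per_file=400))                   # 1200
print(effective_total_bytes(3, per_file=400, global_cap=1000))  # 1000
```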

Notes:

Multiple inputs:
- With newlines enabled, file sections are rendered with human‑readable headers (pass --no-header to suppress them). In compact/single‑line modes, headers are omitted.
- Order: inputs are sorted by git frecency (via frecenfile) when available, then by mtime; pass --no-sort to keep the original input order without repo scanning.
In --format auto, each file uses its own best format: JSON family for .json, YAML for .yaml/.yml.
- Unknown extensions are treated as Text (raw lines) — safe for logs and .txt files.
- --global-bytes may truncate or omit entire files to respect the total budget.
- The tool finds the largest preview that fits the budget; even if extremely tight, you still get a minimal, valid preview.
- Directories and binary files are ignored; a notice is printed to stderr for each. Stdin reads the stream as‑is.
- Head vs Tail sampling: these options bias which part of arrays are kept before rendering. Display styles may still insert internal gap markers to honor very small budgets; strict JSON stays unannotated.
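
To make the head/tail bias and the gap markers concrete, here is a minimal Python sketch. It is not the tool's implementation: keep_indices and render_with_gaps are hypothetical names, and the marker placement simply follows the description above (a marker wherever original indices are skipped).

```python
from typing import List, Sequence

GAP = "\u2026"  # the "…" omission marker used by the default display style


def keep_indices(length: int, n: int, skew: str = "head") -> List[int]:
    """Pick surviving original indices: the first n for head, the last n for tail."""
    n = max(0, min(n, length))
    if skew == "head":
        return list(range(n))
    if skew == "tail":
        return list(range(length - n, length))
    raise ValueError(f"unknown skew: {skew!r}")


def render_with_gaps(items: Sequence[object], kept: Sequence[int]) -> List[str]:
    """Emit kept items in order, inserting GAP wherever indices were skipped."""
    out: List[str] = []
    prev = -1
    for i in sorted(kept):
        if i != prev + 1:
            out.append(GAP)  # at least one item between prev and i was omitted
        out.append(repr(items[i]))
        prev = i
    if prev != len(items) - 1:
        out.append(GAP)  # items after the last kept index were omitted
    return out


items = list(range(10))
print(render_with_gaps(items, keep_indices(10, 3, "head")))  # ['0', '1', '2', '…']
print(render_with_gaps(items, keep_indices(10, 3, "tail")))  # ['…', '7', '8', '9']
print(render_with_gaps(items, [0, 1, 8, 9]))                 # ['0', '1', '…', '8', '9']
```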

Working with multiple files

Budgets: per-file caps (--bytes/--chars/--lines) apply to each input; global caps (--global-*) constrain the combined output. Default byte budget scales by input count when no globals are set.
Sorting: inputs are pre-sorted by git frecency (frecenfile) with last-modified-time fallback so recently touched files appear first. Pass --no-sort to preserve the order you provided and skip repo scanning.
Headers: fileset sections get ==> headers when newlines are enabled; hide them with --no-header. Compact and single-line modes omit headers automatically.
Formats: in --format auto, each file picks JSON/YAML/Text based on extension; unknowns fall back to Text so mixed filesets “just work.”
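
The extension rule for --format auto is small enough to write down. The sketch below mirrors the mapping stated above; auto_format is a hypothetical name, and the exact, case-sensitive suffix match is an assumption, since the text does not say how hson treats upper-case extensions.

```python
from pathlib import PurePath


def auto_format(path: str) -> str:
    """Map a file path to "json", "yaml", or "text" by its extension."""
    suffix = PurePath(path).suffix  # assumption: exact, case-sensitive match
    if suffix == ".json":
        return "json"
    if suffix in (".yaml", ".yml"):
        return "yaml"
    return "text"  # unknown or missing extensions fall back to Text


for name in ("logs/a.json", "config.yml", "notes.txt", "Makefile"):
    print(name, "->", auto_format(name))
```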

Budget Modes

Bytes (-c/--bytes, -C/--global-bytes)
- Measures UTF‑8 bytes in the output.
- Default per‑file budget is 500 bytes when neither --lines nor --chars is provided.
- Multiple inputs: total default budget is <BYTES> * number_of_inputs; --global-bytes caps the total.
Characters (-u/--chars)
- Measures Unicode code points (not grapheme clusters).
Lines (-n/--lines, -N/--global-lines)
- Caps the number of lines in the output.
- Incompatible with --no-newline.
- Multiple inputs: defaults to <LINES> * number_of_inputs; --global-lines caps the total.
- Fileset headers, blank separators, and summary lines do not count toward the line cap; only actual content lines are considered.
Interactions and precedence
- All active budgets are enforced simultaneously. The render must satisfy all of: bytes (if set), chars (if set), and lines (if set). The strictest cap wins.
- When only lines are specified, no implicit byte cap applies. When neither lines nor chars are specified, a 500‑byte default applies.
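
The three units are easy to confuse, so here is a small Python sketch of how a candidate preview could be measured and checked against every active cap at once. measure and fits are hypothetical helpers, not part of headson, and the sketch counts every line (it does not model the rule that fileset headers and summary lines are excluded from the line cap).

```python
from typing import Dict, Optional


def measure(text: str) -> Dict[str, int]:
    """Return the size of text in UTF-8 bytes, Unicode code points, and lines."""
    lines = text.count("\n") + (1 if text and not text.endswith("\n") else 0)
    return {
        "bytes": len(text.encode("utf-8")),
        "chars": len(text),  # Python str length counts code points, not graphemes
        "lines": lines,
    }


def fits(
    text: str,
    *,
    max_bytes: Optional[int] = None,
    max_chars: Optional[int] = None,
    max_lines: Optional[int] = None,
) -> bool:
    """True only if text satisfies every cap that is set (None means inactive)."""
    sizes = measure(text)
    caps = {"bytes": max_bytes, "chars": max_chars, "lines": max_lines}
    return all(cap is None or sizes[unit] <= cap for unit, cap in caps.items())


sample = "na\u00efve \u2026\n"  # "naïve …" plus a newline
print(measure(sample))                          # {'bytes': 11, 'chars': 8, 'lines': 1}
print(fits(sample, max_bytes=11, max_chars=7))  # False: the char cap is stricter
```

The same text is 11 bytes but only 8 code points, which is why a --chars budget can admit more non-ASCII content than a --bytes budget of the same number.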

Quick one‑liners:

Peek a big JSON stream (keeps structure):

zstdcat huge.json.zst | hson -c 800 -f json -t default

Many files with a fixed overall size:

hson -C 1200 -f json -t strict logs/*.json

Glance at a file, JavaScript‑style comments for omissions:
```
hson -c 400 -f json -t detailed data.json
```

YAML with detailed comments:

hson -c 400 -f yaml -t detailed config.yaml

Text mode

Single file (auto):
```
hson -c 200 notes.txt
```

Force Text ingest/output (useful when mixing with other extensions, or when the extension suggests JSON/YAML):

hson -c 200 -i text -f text notes.txt
# Force text ingest even if the file looks like JSON
hson -i text notes.json

Styles on Text:
- default: omission as a standalone … line.
- detailed: omission as … N more lines ….
- strict: no array‑level omission line (individual long lines may still truncate with …).

Note: Filesets always render with per-file auto templates. When you need to preview a directory of mixed formats, skip -f text and let -f auto pick the right renderer for each entry.

Show help:

hson --help

Note: flags align with head/tail conventions (-c/--bytes, -C/--global-bytes).

Examples: head vs hson

Input:

{"users":[{"id":1,"name":"Ana","roles":["admin","dev"]},{"id":2,"name":"Bo"}],"meta":{"count":2,"source":"db"}}

Naive cut (can break mid‑token):

jq -c . users.json | head -c 80
# {"users":[{"id":1,"name":"Ana","roles":["admin","dev"]},{"id":2,"name":"Bo"}],"m

Structured preview with hson (JSON family, default style → Pseudo):

hson -c 120 -f json -t default users.json
# {
#   users: [
#     { id: 1, name: "Ana", roles: [ "admin", … ] },
#     …
#   ]
#   meta: { count: 2, … }
# }

Machine‑readable preview (JSON family, strict style → strict JSON):

hson -c 120 -f json -t strict users.json
# {"users":[{"id":1,"name":"Ana","roles":["admin"]}],"meta":{"count":2}}
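
The practical difference is easy to check from Python with only the standard library. The snippet below cuts the sample input at 80 bytes the way head -c would, then compares it with the strict preview string shown above (copied from this example, not produced by calling hson); is_valid_json is a hypothetical helper.

```python
import json

raw = (
    '{"users":[{"id":1,"name":"Ana","roles":["admin","dev"]},'
    '{"id":2,"name":"Bo"}],"meta":{"count":2,"source":"db"}}'
)


def is_valid_json(text: str) -> bool:
    """Return True if text parses as a complete JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Byte-oriented cut, as head -c 80 would do; it stops in the middle of a key.
naive = raw.encode("utf-8")[:80].decode("utf-8", errors="ignore")

# Strict-style preview copied from the example above.
strict_preview = '{"users":[{"id":1,"name":"Ana","roles":["admin"]}],"meta":{"count":2}}'

print(is_valid_json(naive))           # False
print(is_valid_json(strict_preview))  # True
```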

Terminal Demos

Regenerate locally:

Place tapes under docs/tapes (e.g., docs/tapes/demo.tape)
Run: cargo make tapes
Outputs are written to docs/assets/tapes

Python Bindings

A thin Python extension module is available on PyPI as headson.

Install: pip install headson (ABI3 wheels for Python 3.10+ on Linux/macOS/Windows).
API:
- headson.summarize(text: str, *, format: str = "auto", style: str = "default", input_format: str = "json", byte_budget: int | None = None, skew: str = "balanced") -> str
  - format: "auto" | "json" | "yaml" (auto maps to JSON family for single inputs)
  - style: "strict" | "default" | "detailed"
  - input_format: "json" | "yaml" (ingestion)
  - byte_budget: maximum output size in bytes (default: 500)
  - skew: "balanced" | "head" | "tail" (affects display styles; strict JSON remains unannotated)

Examples:

import json
import headson

data = {"foo": [1, 2, 3], "bar": {"x": "y"}}
preview = headson.summarize(json.dumps(data), format="json", style="strict", byte_budget=200)
print(preview)

# Prefer the tail of arrays (annotations show with style="default"/"detailed")
print(
    headson.summarize(
        json.dumps(list(range(100))),
        format="json",
        style="detailed",
        byte_budget=80,
        skew="tail",
    )
)

# YAML support
doc = "root:\n  items: [1,2,3,4,5,6,7,8,9,10]\n"
print(headson.summarize(doc, format="yaml", style="default", input_format="yaml", byte_budget=60))

Source Code Support

Source code support is a challenging area. headson's algorithm and code structure could accommodate fully accurate, language-specific parsing via tree-sitter, but doing so would increase the application's complexity and its number of dependencies.

Instead of attempting a deep parse of source code files, we convert them into nested arrays based on a heuristic that understands indentation patterns in the file.

When headson detects a code-like file, it uses a set of additional heuristics:

Atomic line ingest: each line is treated as an atomic string so omission markers never split a code line.
Depth-aware sampling:
- We attempt to include more of the top level of the source code to give a good overview of the classes, functions, and constants defined there.
- Nested blocks (function bodies, loops) prefer to omit lines in the middle, to preserve natural "block" boundaries.
Header priority: lines that introduce a nested block (e.g., def foo():) get a small priority boost to ensure they survive tight budgets.
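
As a rough illustration of the indentation idea, the following Python sketch turns source text into nested lists in which every line stays an atomic string. It is an assumption-laden toy, not headson's heuristic: nest_by_indent and indent_of are hypothetical names, tabs are expanded to four columns, and blank lines are skipped.

```python
from typing import List, Tuple


def indent_of(line: str) -> int:
    """Width of the leading whitespace, with tabs expanded to 4 columns (assumed)."""
    expanded = line.expandtabs(4)
    return len(expanded) - len(expanded.lstrip(" "))


def nest_by_indent(source: str) -> list:
    """Group lines into nested lists: a deeper indent opens a child list."""
    root: list = []
    stack: List[Tuple[int, list]] = [(0, root)]  # (block indent, container)
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no indentation signal in this sketch
        indent = indent_of(line)
        # Leave every block that this line has dedented out of.
        while len(stack) > 1 and indent < stack[-1][0]:
            stack.pop()
        if indent > stack[-1][0]:
            child: list = []
            stack[-1][1].append(child)  # nested block lives inside its parent
            stack.append((indent, child))
        stack[-1][1].append(line)  # the line itself is never split
    return root


code = "class A:\n    def f(self):\n        return 1\n    x = 2\ny = 3\n"
print(nest_by_indent(code))
# ['class A:', ['    def f(self):', ['        return 1'], '    x = 2'], 'y = 3']
```

Once the file is in this shape, the same array truncation used for JSON can drop whole lines from the middle of a nested list while keeping the lines that introduce each block.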

Algorithm

Algorithm overview
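
Footnote 1 describes an arena-style tree kept in flat buffers. The Python sketch below shows the general layout only: nodes live in one list, and each node points at a contiguous range in shared child and key arrays. The Node and Arena classes are hypothetical, the real implementation is in Rust, and the array pre-sampling described in the footnote is left out.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Node:
    kind: str             # "object", "array", or "scalar"
    value: Any = None     # payload for scalars
    child_start: int = 0  # offset of the first child in Arena.children
    child_count: int = 0  # number of direct children


@dataclass
class Arena:
    nodes: List[Node] = field(default_factory=list)
    children: List[int] = field(default_factory=list)        # child node ids
    keys: List[Optional[str]] = field(default_factory=list)  # parallel to children

    def add(self, value: Any) -> int:
        """Store value and its descendants; return the id of the new node."""
        node_id = len(self.nodes)
        if isinstance(value, dict):
            node, pairs = Node("object"), list(value.items())
        elif isinstance(value, list):
            node, pairs = Node("array"), [(None, v) for v in value]
        else:
            self.nodes.append(Node("scalar", value))
            return node_id
        self.nodes.append(node)
        # Build subtrees first so this node's own child ids end up contiguous.
        built = [(key, self.add(child)) for key, child in pairs]
        node.child_start = len(self.children)
        node.child_count = len(built)
        for key, child_id in built:
            self.children.append(child_id)
            self.keys.append(key)
        return node_id

    def to_python(self, node_id: int = 0) -> Any:
        """Rebuild the plain Python value rooted at node_id."""
        node = self.nodes[node_id]
        if node.kind == "scalar":
            return node.value
        span = range(node.child_start, node.child_start + node.child_count)
        if node.kind == "array":
            return [self.to_python(self.children[i]) for i in span]
        return {self.keys[i]: self.to_python(self.children[i]) for i in span}


arena = Arena()
root = arena.add({"users": [{"id": 1, "name": "Ana"}], "meta": {"count": 2}})
print(len(arena.nodes), arena.to_python(root))
```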

Footnotes

^[1] Optimized tree representation: An arena‑style tree stored in flat, contiguous buffers. Each node records its kind and value plus index ranges into shared child and key arrays. Arrays are ingested in a single pass and may be deterministically pre‑sampled: the first element is always kept; additional elements are selected via a fixed per‑index inclusion test; for kept elements, original indices are stored and full lengths are counted. This enables accurate omission info and internal gap markers later, while minimizing pointer chasing.
^[2] Priority order: Nodes are scored so previews surface representative structure and values first. Arrays can favor head/mid/tail coverage (default) or strictly the head; tail preference flips head/tail when configured. Object properties are ordered by key, and strings expand by grapheme with early characters prioritized over very deep expansions.
^[3] Choose top N nodes (binary search): Iteratively picks N so that the rendered preview fits within the byte budget, looping between “choose N” and a render attempt to converge quickly.
^[4] Render attempt: Serializes the currently included nodes using the selected template. Omission summaries and per-file section headers appear in display templates (pseudo/js); json remains strict. For arrays, display templates may insert internal gap markers between non‑contiguous kept items using original indices.
^[5] Diagram source: The Algorithm diagram is generated from docs/diagrams/algorithm.mmd. Regenerate the SVG with cargo make diagrams before releasing.
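
The "choose N, then render" loop from footnotes 3 and 4 can be sketched as an ordinary binary search. This is a simplified illustration under stated assumptions, not the actual algorithm: nodes are plain strings already in priority order, render_array is a stand-in renderer, and output size is assumed not to shrink as N grows.

```python
from typing import Callable, Sequence


def largest_fitting_preview(
    ranked: Sequence[str],
    render: Callable[[Sequence[str]], str],
    byte_budget: int,
) -> str:
    """Render the largest prefix of ranked whose output fits byte_budget."""
    lo, hi = 0, len(ranked)
    best = render(ranked[:0])  # minimal preview, returned even if it overshoots
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        attempt = render(ranked[:mid])
        if len(attempt.encode("utf-8")) <= byte_budget:
            lo, best = mid, attempt  # mid nodes fit; try more
        else:
            hi = mid - 1  # too large; try fewer
    return best


def render_array(nodes: Sequence[str]) -> str:
    """Stand-in renderer: a compact JSON-like array of the chosen nodes."""
    return "[" + ",".join(nodes) + "]"


ranked = ["1", "2", "3", "4", "5"]
print(largest_fitting_preview(ranked, render_array, 7))    # [1,2,3]
print(largest_fitting_preview(ranked, render_array, 100))  # [1,2,3,4,5]
```

Each probe costs one render, so the number of render attempts grows with the logarithm of the node count rather than linearly.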

License

MIT

Release history

0.16.1

Feb 4, 2026

0.16.0

Feb 1, 2026

0.15.0

Jan 18, 2026

0.14.0

Jan 15, 2026

0.13.1

Jan 10, 2026

0.13.0

Dec 24, 2025

0.12.0

Dec 23, 2025

0.11.5

Dec 20, 2025

0.11.4

Dec 18, 2025

0.11.3

Dec 18, 2025

0.11.2

Dec 16, 2025

0.11.1

Dec 15, 2025

0.11.0

Dec 11, 2025

0.10.1

Dec 4, 2025

0.10.0

Dec 1, 2025

0.9.0

Nov 29, 2025

0.8.0

Nov 25, 2025

0.7.29

Nov 25, 2025

0.7.28

Nov 24, 2025

0.7.27

Nov 24, 2025

0.7.26

Nov 24, 2025

0.7.25

Nov 24, 2025

This version

0.7.24

Nov 23, 2025

0.7.23

Nov 23, 2025

0.7.22

Nov 23, 2025

0.7.21

Nov 23, 2025

0.7.20

Nov 23, 2025

0.7.19

Nov 23, 2025

0.7.18

Nov 23, 2025

0.7.17

Nov 22, 2025

0.7.16

Nov 22, 2025

0.7.15

Nov 22, 2025

0.7.14

Nov 22, 2025

0.7.13

Nov 22, 2025

0.7.11

Nov 22, 2025

0.7.8

Nov 18, 2025

0.7.7

Nov 18, 2025

0.7.6

Nov 17, 2025

0.7.5

Nov 17, 2025

0.7.3

Nov 17, 2025

0.7.2

Nov 11, 2025

0.7.1

Nov 9, 2025

0.7.0

Nov 9, 2025

0.6.8

Nov 8, 2025

0.6.7

Nov 5, 2025

0.6.6

Nov 2, 2025

0.6.5

Nov 2, 2025

0.6.4

Nov 2, 2025

0.6.3

Nov 1, 2025

0.6.2

Nov 1, 2025

0.6.1

Oct 28, 2025

0.6.0

Oct 28, 2025

0.5.4

Oct 27, 2025

0.5.3

Oct 26, 2025

0.5.2

Oct 26, 2025

0.5.1

Oct 26, 2025

0.5.0

Oct 26, 2025

0.4.0

Oct 26, 2025

0.3.0

Oct 25, 2025

0.2.5

Oct 25, 2025

0.2.4

Oct 25, 2025

0.2.3

Oct 25, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

headson-0.7.24.tar.gz (2.1 MB)

Uploaded Nov 23, 2025 Source

Built Distributions

headson-0.7.24-cp310-abi3-win_amd64.whl (974.4 kB)

Uploaded Nov 23, 2025 CPython 3.10+Windows x86-64

headson-0.7.24-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)

Uploaded Nov 23, 2025 CPython 3.10+manylinux: glibc 2.17+ x86-64

headson-0.7.24-cp310-abi3-macosx_11_0_arm64.whl (997.3 kB)

Uploaded Nov 23, 2025 CPython 3.10+macOS 11.0+ ARM64

File details

Details for the file headson-0.7.24.tar.gz.

File metadata

Download URL: headson-0.7.24.tar.gz
Upload date: Nov 23, 2025
Size: 2.1 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: maturin/1.10.2

File hashes

Hashes for headson-0.7.24.tar.gz
Algorithm	Hash digest
SHA256	`9b3a26fcf0f45e1f83557652db5bfa396101038afebc38a3ed2a476d8a8e25c0`
MD5	`9f4a7c310824230f02fb3c2f4c9bb04e`
BLAKE2b-256	`618009a8e08e1a6b7d5df7b1115c750feb4e6ad4a3784ef87ac409decdd6e502`

File details

Details for the file headson-0.7.24-cp310-abi3-win_amd64.whl.

File metadata

Download URL: headson-0.7.24-cp310-abi3-win_amd64.whl
Upload date: Nov 23, 2025
Size: 974.4 kB
Tags: CPython 3.10+, Windows x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: maturin/1.10.2

File hashes

Hashes for headson-0.7.24-cp310-abi3-win_amd64.whl
Algorithm	Hash digest
SHA256	`114e76d96c8f1560d6aef14db46059669003246495a210cf38271af5af08efa9`
MD5	`f197157b01481ce76e2786044a4a72e6`
BLAKE2b-256	`1147b1c37d74c66441e401d407cc97f6ed964562a65cb8add7903ce52b7616b4`

File details

Details for the file headson-0.7.24-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: headson-0.7.24-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Nov 23, 2025
Size: 1.1 MB
Tags: CPython 3.10+, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: maturin/1.10.2

File hashes

Hashes for headson-0.7.24-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`5c94ac864ab5c941741fdba8cfd6044b44244203d40f9fb7aac031f703587303`
MD5	`8aab1402ea0265ddf9b9227f8fed691d`
BLAKE2b-256	`f787d7293b865fd930038fadf2fb956e01f78037c3f3d3b0e081d1831ab83f10`

File details

Details for the file headson-0.7.24-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: headson-0.7.24-cp310-abi3-macosx_11_0_arm64.whl
Upload date: Nov 23, 2025
Size: 997.3 kB
Tags: CPython 3.10+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: maturin/1.10.2

File hashes

Hashes for headson-0.7.24-cp310-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`7b5c81f847a423cfa78b7bca0baf8b486e9ff20b037bb4f663e05ae02c639224`
MD5	`5b3b1712f5fb2c215870d341e18cbb1b`
BLAKE2b-256	`2aaeff635eb7c0aec0b124b2cf4c2aa3f3c0677b4046ba150c8feff83f0fca2b`

headson 0.7.24
