Reduces JSON, YAML, and NDJSON volume by collapsing repeated structures while preserving the schema.

JSON's Razor — Cut the fat

Large structured data files are hard to read, not because the structure is complex, but because repetition obscures it. A list of 10,000 objects with identical shape tells you nothing more about the structure than a list of 1. JSON's Razor collapses that repetition to its minimum essential form: one representative example of each repeated structure, at every level of nesting.

The output is valid, parseable data in the same format as the input, not a summary or a schema definition. It just has far less volume.


Install

pip install json-razor

Usage

cat big.json | json-razor                    # stdin → stdout
json-razor big.json                          # file input → stdout
json-razor big.json -o small.json            # file input → file output
json-razor big.yaml                          # auto-detected as YAML
json-razor app.log --format ndjson           # NDJSON log file

Options

| Flag | Default | Description |
|------|---------|-------------|
| --keep N | 1 | Number of examples to keep per repeated structure |
| --depth N | unlimited | Stop collapsing below this nesting depth |
| --format | auto | Force format: json, yaml, or ndjson |
| --truncate N | 100 | Max string length before truncating |

How it works

Arrays — collapsed to one item. Mixed-type arrays keep one of each distinct type.

// input
[{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}, {"id": 3, "name": "carol"}]

// output
[{"id": 1, "name": "alice"}]
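The array rule can be sketched in a few lines of Python. This is an illustration only, not json-razor's actual implementation: `collapse_list` is a hypothetical helper, and its `keep` parameter is assumed to mirror the `--keep` flag.

```python
def collapse_list(items, keep=1):
    """Keep at most `keep` leading items of a list whose items share one shape.

    Hypothetical helper for illustration; not part of json-razor.
    """
    return items[:keep]


records = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},
]

print(collapse_list(records))          # [{'id': 1, 'name': 'alice'}]
print(collapse_list(records, keep=2))  # first two records
```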

Mixed types — one representative per JSON type (null, bool, number, string, array, object).

// input
[1, "hello", {"id": 1}, null, true, [1, 2, 3]]

// output
[1, "hello", {"id": 1}, null, true, [1]]
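One way to picture the mixed-type rule is a pass that remembers which JSON types it has already seen. The sketch below is an assumption about the behavior described here, not the tool's own code; `json_type` and `collapse_mixed` are hypothetical names.

```python
def json_type(value):
    """Return the JSON type name of a Python value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def collapse_mixed(items, keep=1):
    """Keep the first `keep` items of each JSON type, in original order."""
    counts = {}
    out = []
    for item in items:
        kind = json_type(item)
        if counts.get(kind, 0) < keep:
            counts[kind] = counts.get(kind, 0) + 1
            # Nested arrays are collapsed with the same rule.
            out.append(collapse_mixed(item, keep) if kind == "array" else item)
    return out


print(collapse_mixed([1, "hello", {"id": 1}, None, True, [1, 2, 3]]))
# [1, 'hello', {'id': 1}, None, True, [1]]
```

Checking `bool` before the numeric types matters in Python, since `True` would otherwise be counted as a number and dropped after the leading `1`.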

Nested structures — collapsed recursively at every level.
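Recursive collapsing can be sketched as a single function that walks objects and arrays. Everything here is hypothetical and for illustration: the names `razor` and `_kind`, the sample data, and especially the depth semantics (the top-level value is assumed to sit at depth 0, and arrays at or beyond `max_depth` are assumed to be left untouched).

```python
def _kind(value):
    """JSON type name; bool is tested first because it subclasses int."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array" if isinstance(value, list) else "object"


def razor(value, keep=1, max_depth=None, _depth=0):
    """Recursively keep `keep` representatives per JSON type in every array."""
    if isinstance(value, dict):
        # Objects keep all keys; only their values are collapsed.
        return {k: razor(v, keep, max_depth, _depth + 1) for k, v in value.items()}
    if isinstance(value, list):
        if max_depth is not None and _depth >= max_depth:
            return value  # assumed meaning of a depth limit: stop collapsing here
        counts, out = {}, []
        for item in value:
            kind = _kind(item)
            if counts.get(kind, 0) < keep:
                counts[kind] = counts.get(kind, 0) + 1
                out.append(razor(item, keep, max_depth, _depth + 1))
        return out
    return value  # scalars and null pass through unchanged


data = {
    "users": [
        {"id": 1, "tags": ["a", "b"]},
        {"id": 2, "tags": ["c"]},
    ],
    "total": 2,
}

print(razor(data))
# {'users': [{'id': 1, 'tags': ['a']}], 'total': 2}
print(razor(data, max_depth=2))
# {'users': [{'id': 1, 'tags': ['a', 'b']}], 'total': 2}
```

In this sketch, `null`, `[]`, and `{}` fall through unchanged, which matches the rule that empty values are always preserved.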

NDJSON — collapsed across lines; one representative line kept.
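For NDJSON, the lines can be treated as items of one implicit array. The sketch below assumes every line is a JSON object of the same shape and keeps the first `keep` of them; `collapse_ndjson` is a hypothetical name, the log lines are made-up sample data, and collapsing inside each kept line is left out for brevity.

```python
import json


def collapse_ndjson(text, keep=1):
    """Keep the first `keep` non-blank lines, re-serialized one object per line."""
    kept = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no record
        record = json.loads(line)  # raises ValueError on a malformed line
        if len(kept) < keep:
            kept.append(json.dumps(record))
    return "\n".join(kept)


log = (
    '{"level": "info", "msg": "start"}\n'
    '{"level": "info", "msg": "tick"}\n'
    '{"level": "warn", "msg": "slow"}\n'
)

print(collapse_ndjson(log))
# {"level": "info", "msg": "start"}
```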

Nulls and empty values — always preserved (null, [], {}).

Long strings — truncated to a configurable preview.
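String truncation might look like the sketch below. The `limit` default follows the documented `--truncate` default of 100; the `"..."` marker and the name `truncate` are assumptions, since the exact form of the preview is not described here.

```python
def truncate(value, limit=100, marker="..."):
    """Shorten strings longer than `limit`; leave every other value alone."""
    if isinstance(value, str) and len(value) > limit:
        return value[:limit] + marker  # marker format is an assumption
    return value


print(len(truncate("x" * 150)))  # 103: 100 kept characters plus the marker
print(truncate("short"))         # short
```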


Supported formats

| Format | Auto-detected from |
|--------|--------------------|
| JSON | .json |
| YAML | .yaml, .yml |
| NDJSON | .ndjson |
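Extension-based detection amounts to a lookup on the file suffix, with an explicit format taking priority. This is a minimal sketch under assumptions: `detect_format` and `EXTENSION_FORMATS` are hypothetical names, and returning `None` for an unrecognized extension is a placeholder, since the tool's actual fallback is not documented here.

```python
from pathlib import Path

# Mapping taken from the supported-formats table.
EXTENSION_FORMATS = {
    ".json": "json",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".ndjson": "ndjson",
}


def detect_format(path, forced=None):
    """Return the forced format if given, else look up the file extension."""
    if forced is not None:
        return forced  # mirrors passing --format explicitly
    return EXTENSION_FORMATS.get(Path(path).suffix.lower())


print(detect_format("big.yaml"))                  # yaml
print(detect_format("app.log"))                   # None
print(detect_format("app.log", forced="ndjson"))  # ndjson
```

A file such as `app.log` has no recognized extension, which is why the usage example passes `--format ndjson` for it.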
