A pure Python implementation of jq

purejq

jq, as a pure Python library. No C extension, no binary: if Python runs, purejq runs — Pyodide/WASM, sandboxes, Lambda, anywhere pip install is all you get.

Install:

pip install purejq

Use it from Python:

import purejq

purejq.first(".users[] | select(.age > 26) | .name", data)   # work on your dicts directly
prog = purejq.compile("group_by(.team) | map(length)")        # compile once, run many
prog.first(batch)

Or from the shell:

echo '{"a":[1,2,3]}' | purejq '.a | map(. * 2)'               # familiar CLI, same flags

Why purejq

  • Embedding jq in Python? purejq is 6–40x faster than the C bindings on the in-process benchmarks below. The jq PyPI package serializes your data to JSON text and back on every call; purejq evaluates directly on Python objects.
  • On big files, the CLI beats the C jq binary end to end on the workloads measured below. Large-file runs are dominated by JSON parsing, and CPython's C-backed parser is faster than jq's.
  • It's real jq: 751/781 cases (96.2%) of jq's own test suite pass — the suite is vendored in this repo and run in CI on every commit.

Where C jq still wins: raw filter throughput on already-parsed streams in shell pipelines. If you can install binaries and that's your workload, use jq.
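To make the first bullet concrete, here is a minimal sketch of why a serialize-and-reparse boundary costs more than evaluating on Python objects directly. It uses only the standard library. `direct_names`, `roundtrip_names`, and `best_of` are hypothetical helpers, not part of purejq or the jq bindings, and the JSON round trip only imitates the kind of work a text-based binding has to do.

```python
import json
import time

def direct_names(users):
    """Evaluate roughly `.[] | select(.age > 26) | .name` on Python objects."""
    return [u["name"] for u in users if u["age"] > 26]

def roundtrip_names(users):
    """Same filter, but crossing a JSON-text boundary in both directions,
    imitating a binding that accepts and returns JSON text."""
    text_in = json.dumps(users)            # Python objects -> JSON text
    parsed = json.loads(text_in)           # JSON text -> objects on the other side
    result = [u["name"] for u in parsed if u["age"] > 26]
    return json.loads(json.dumps(result))  # results serialized back and reparsed

def best_of(fn, arg, repeats=3):
    """Return the fastest wall-clock time of `repeats` runs, in seconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        times.append(time.perf_counter() - start)
    return min(times)

users = [{"name": f"user{i}", "age": 20 + i % 30} for i in range(100_000)]

# Both paths must agree on the answer; only the cost differs.
assert direct_names(users) == roundtrip_names(users)

print(f"direct:     {best_of(direct_names, users) * 1000:.1f} ms")
print(f"round trip: {best_of(roundtrip_names, users) * 1000:.1f} ms")
```

Absolute timings depend on the machine. The point is only that the round-trip path does strictly more work for the same answer.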

Benchmarks

Measured with tools/bench.py (M-series MacBook, CPython 3.13, jq 1.8.1, best of 3). Reproduce: python3 tools/bench.py 1000000.

Embedded in Python — 100k-object array, already parsed, in-process:

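| workload            | purejq | jq PyPI (C bindings) |
|---------------------|--------|----------------------|
| field-access stream | 9 ms   | 410 ms               |
| filter + count      | 56 ms  | 485 ms               |
| map + aggregate     | 18 ms  | 483 ms               |
| group_by            | 114 ms | 765 ms               |
| transform + sort    | 141 ms | 943 ms               |
| regex filter        | 130 ms | 789 ms               |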

Command line, end to end — 93 MB file (1M objects), parse + filter + output:

| workload       | purejq | jq 1.8 (C binary) |
|----------------|--------|-------------------|
| single lookup  | 0.5 s  | 1.6 s             |
| filter + count | 1.1 s  | 2.0 s             |
| group_by       | 2.3 s  | 4.0 s             |

purejq CLI measured with the optional orjson extra (pip install 'purejq[speed]'); with stdlib json alone it is ~25–35% slower and still ahead on these workloads.

Loading large JSON into Python: the 93 MB file parses in 0.73 s with stdlib json (128 MB/s) or 0.43 s with orjson (219 MB/s) — input loading is C-speed either way and scales linearly.
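The throughput figures above are input size divided by parse time. As a rough sketch of how such a number can be measured for stdlib json, the snippet below times `json.loads` on synthetic data and reports the fastest of three runs. `parse_throughput` is a hypothetical helper (it is not taken from tools/bench.py), the payload is far smaller than the 93 MB file, and 1 MB is assumed to mean 10^6 bytes.

```python
import json
import time

def parse_throughput(payload: bytes, repeats: int = 3):
    """Parse `payload` with stdlib json `repeats` times and return
    (best_seconds, megabytes_per_second) for the fastest run."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        json.loads(payload)
        best = min(best, time.perf_counter() - start)
    megabytes = len(payload) / 1_000_000   # assumption: 1 MB = 10^6 bytes
    return best, megabytes / best

# Synthetic input: a JSON array of small objects.
payload = json.dumps(
    [{"id": i, "name": f"user{i}", "team": i % 7} for i in range(200_000)]
).encode("utf-8")

seconds, mb_per_s = parse_throughput(payload)
print(f"{len(payload) / 1_000_000:.1f} MB in {seconds:.3f} s = {mb_per_s:.0f} MB/s")
```

Results will differ from the figures quoted above, since they depend on hardware, Python version, and the shape of the data.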

PyPy (100k objects, same code, no changes): filter + count 13 ms, map + aggregate 2 ms, group_by 33 ms, transform + sort 70 ms — roughly another 2–9x over CPython for heavy workloads.

How it's fast, in one line: programs compile once into Python closures with static binding and single-output fast paths — evaluation never re-walks the AST, and common shapes skip generator machinery entirely.
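The general technique can be illustrated with a toy compiler. This is a sketch of closure compilation with a single-output fast path, not purejq's actual implementation. The tuple-based mini-AST and the names `compile_single` and `compile_stream` are hypothetical.

```python
from typing import Any, Callable, Iterator

# Hypothetical mini-AST: tuples standing in for a parsed jq program.
#   ("identity",)            .
#   ("field", "name")        .name
#   ("iterate",)             .[]
#   ("pipe", left, right)    left | right

Single = Callable[[Any], Any]            # exactly one output per input
Stream = Callable[[Any], Iterator[Any]]  # zero or more outputs per input

def compile_single(node) -> "Single | None":
    """Return a plain function if `node` always yields exactly one output,
    else None. This is the single-output fast path: no generators involved."""
    kind = node[0]
    if kind == "identity":
        return lambda value: value
    if kind == "field":
        name = node[1]                   # bound once, at compile time
        return lambda value: None if value is None else value.get(name)
    if kind == "pipe":
        left, right = compile_single(node[1]), compile_single(node[2])
        if left is not None and right is not None:
            return lambda value: right(left(value))
    return None

def compile_stream(node) -> Stream:
    """Compile `node` into a closure that yields outputs lazily."""
    single = compile_single(node)
    if single is not None:
        return lambda value: iter((single(value),))
    kind = node[0]
    if kind == "iterate":
        return lambda value: iter(value.values() if isinstance(value, dict) else value)
    if kind == "pipe":
        left, right = compile_stream(node[1]), compile_stream(node[2])
        def run_pipe(value):
            for intermediate in left(value):
                yield from right(intermediate)
        return run_pipe
    raise ValueError(f"unknown node: {kind!r}")

# `.users | .[] | .name`, compiled once and then reused on many inputs.
program = compile_stream(
    ("pipe", ("field", "users"), ("pipe", ("iterate",), ("field", "name")))
)
data = {"users": [{"name": "ada"}, {"name": "grace"}]}
print(list(program(data)))  # ['ada', 'grace']
```

After compilation the tuples are never inspected again. Each call to `program` runs only the nested closures, and a chain such as `.a | .b` collapses into ordinary function calls with no generator in between.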

jq compatibility

751 of the 781 cases in jq's official test suite pass. Every remaining difference is listed in expected_failures.txt; they fall into three buckets:

  • the module system (import/include) is not implemented yet
  • integers are exact (arbitrary precision, like gojq) instead of rounding to doubles — deliberate
  • a few error-message wordings differ
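The integer point in the list above is easy to see in plain Python. The sketch below does not call purejq or jq. It only contrasts exact integer arithmetic with what happens when a value is first rounded to a double, which is how the list describes jq's behavior.

```python
import json

big = 2**53 + 1   # 9007199254740993, the smallest positive integer a double cannot represent

# Python integers are arbitrary precision, so arithmetic stays exact.
exact = big + 1
# A double-based evaluator would first round the value to the nearest double.
rounded = int(float(big)) + 1

print(exact)    # 9007199254740994
print(rounded)  # 9007199254740993

# The stdlib JSON parser also keeps large integer literals exact.
assert json.loads("9007199254740993") == big
assert float(big) != big
```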

Everything else is there: paths and all assignment operators, reduce/foreach, try/catch, label/break, ?// destructuring, string interpolation, @formats, regex builtins, streaming (tostream/fromstream), dates, and jq 1.8 additions.

CLI flags: -n -r -j -c -s -e -f --arg --argjson.

Outputs are lazy iterators: purejq.compile("repeat(. * 2)").run(1) happily yields forever.
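Because the output stream can be infinite, consumers should take only what they need. Here is a small sketch using `itertools.islice`. `repeat_double` is a hypothetical stand-in generator so the snippet runs without purejq installed, and the commented line shows where the call quoted above would go.

```python
from itertools import islice

def repeat_double(value):
    """Stand-in for jq's `repeat(. * 2)`: emit the input, then keep doubling."""
    while True:
        yield value
        value = value * 2

# With purejq installed, the call quoted above would slot in here instead:
#   outputs = purejq.compile("repeat(. * 2)").run(1)
outputs = repeat_double(1)

# islice pulls only what it needs, so the infinite stream is safe to consume.
first_five = list(islice(outputs, 5))
print(first_five)  # [1, 2, 4, 8, 16]
```

Calling `list()` on the stream directly would never return, so bound it with `islice` or break out of a `for` loop.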

Python compatibility

CPython 3.9–3.14 and PyPy, zero runtime dependencies, enforced by CI on every push.

Contributing & internals

See CONTRIBUTING.md — the conformance suite is the scoreboard, tools/bench.py is the speedometer.

License

MIT
