A pure Python implementation of jq
purejq
A pure Python implementation of jq, the command-line JSON processor — in the spirit of gojq (Go) and jaq (Rust).
No C extension. No binary. If Python runs, purejq runs — including
Pyodide/WASM, AWS Lambda layers you can't compile in, restricted sandboxes,
and anywhere pip install is all you get.
$ echo '{"users":[{"name":"alice","age":30},{"name":"bob","age":25}]}' \
| purejq '.users[] | select(.age > 26) | .name'
"alice"
Why another jq?
| | C jq | jq PyPI package | purejq |
|---|---|---|---|
| needs a compiled binary / wheel | yes | yes (C bindings) | no |
| runs on Pyodide / WASM | no | no | yes |
| embeds in Python (call functions, pass dicts) | no | partially | yes |
| arbitrary-precision integers | no | no | yes |
The existing jq PyPI package is excellent
when you can ship compiled wheels. purejq is for when you can't — or when you
want to read and hack the implementation in one afternoon.
Install
pip install purejq # nothing but Python
pip install 'purejq[speed]' # + orjson for faster JSON parsing in the CLI
(Not yet published to PyPI — install from source for now: pip install git+https://github.com/adam2go/purejq)
Usage
CLI (drop-in for common jq usage)
purejq '.foo[] | select(.bar > 2)' data.json
cat data.json | purejq -r '.items[].name' # raw output
purejq -n 'range(3) | . * 2' # null input
purejq -c --arg name alice '{user: $name}' # compact output, variables
Supported flags: -n -r -j -c -s -e -f --arg --argjson.
Python API
import purejq
# one-shot
purejq.first(".a.b", {"a": {"b": 42}}) # 42
purejq.all_outputs(".[] | . * 2", [1, 2, 3]) # [2, 4, 6]
# compile once, run on many inputs (the fast way)
prog = purejq.compile("[.[] | select(.score > 50)] | length")
for batch in batches:
    print(prog.first(batch))
# results are a lazy iterator — infinite streams are fine
prog = purejq.compile("repeat(. * 2)")
it = prog.run(1)
next(it), next(it), next(it) # 1, 2, 4 (jq's repeat emits its input first)
Conformance: measured, not claimed
purejq is tested against jq's own official test suite (vendored in tests/conformance/): 751 of 781 cases pass (96.2%).
Every remaining failure is listed with its reason in expected_failures.txt and falls in one of these known buckets:
- module system (import/include/modulemeta): not implemented yet
- number representation: Python integers are arbitrary-precision, so 13911860366432393 stays exact instead of rounding like a C double. This is the same deliberate difference gojq made; for AI/data pipelines exactness is usually what you want
- error-message wording in a handful of edge cases (e.g. Python's JSON parser phrases syntax errors differently)
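The number-representation difference can be reproduced with the standard library alone, without purejq. The following is a minimal sketch, assuming only Python's json module; forcing integers through float stands in for storing them in a C double and is not how any jq implementation is written.

```python
import json

text = '{"id": 13911860366432393}'

# json parses integer literals as Python int, which is arbitrary-precision.
exact = json.loads(text)["id"]

# parse_int=float pushes every integer literal through an IEEE 754 double,
# imitating an implementation that stores all numbers as C doubles.
as_double = json.loads(text, parse_int=float)["id"]

print(exact)                    # the literal, unchanged
print(int(as_double))           # a neighbouring integer: above 2**53 doubles skip odd values
print(exact == int(as_double))  # the two representations disagree
```

The literal is odd and larger than 2**53, so no double can hold it exactly; that is why it makes a good conformance probe.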
Implemented and conformance-tested: the full expression language (paths,
all assignment operators, reduce/foreach, try/catch, label/break,
destructuring with ?// alternatives, string interpolation, all @formats),
regex builtins (test/match/capture/scan/sub/gsub via Python re),
tostream/fromstream, date builtins, SQL-ish builtins, and jq 1.8 additions
(pick, abs, toboolean, trim, have_decnum, …).
Performance
Honest framing: a pure Python jq will not beat the C implementation on raw throughput. The design keeps it in usable territory:
- compile once, run many: programs compile to Python generator closures; evaluation never re-walks the AST
- fully lazy streams: first(f), limit, and infinite generators cost only what they consume
- C-speed JSON parsing: input parsing uses Python's C-backed json module (or orjson if installed via purejq[speed]), so the parse-heavy part of typical workloads is not written in Python at all
- PyPy as the escape hatch: purejq is tested on PyPy in CI; interpreter workloads typically run ~10x faster there
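To make the first two points concrete, here is a small sketch of compiling a tuple AST into generator closures. It is not purejq's code: compile_node, the node kinds, and select_gt are hypothetical names invented for illustration, and the env argument that purejq's closures take is left out.

```python
from itertools import count, islice

def compile_node(node):
    """Turn a tuple AST node into a closure: value -> iterator of outputs."""
    kind = node[0]
    if kind == "field":                      # like jq's .name
        name = node[1]
        def run(value):
            yield value[name]
        return run
    if kind == "iterate":                    # like jq's .[] (sequences only in this sketch)
        def run(value):
            yield from value
        return run
    if kind == "select_gt":                  # like jq's select(.name > bound)
        name, bound = node[1], node[2]
        def run(value):
            if value[name] > bound:
                yield value
        return run
    if kind == "pipe":                       # like jq's left | right
        left, right = compile_node(node[1]), compile_node(node[2])
        def run(value):
            for mid in left(value):
                yield from right(mid)
        return run
    raise ValueError(f"unknown node kind: {kind!r}")

# Hand-written AST for: .users[] | select(.age > 26) | .name
ast = ("pipe",
       ("pipe", ("field", "users"), ("iterate",)),
       ("pipe", ("select_gt", "age", 26), ("field", "name")))

program = compile_node(ast)  # the AST is walked here, once

data = {"users": [{"name": "alice", "age": 30}, {"name": "bob", "age": 25}]}
names = list(program(data))

def endless_users():
    """An infinite input stream; only consumed as far as the caller pulls."""
    for n in count():
        yield {"name": f"user{n}", "age": n}

# Laziness: taking two outputs from an infinite input terminates.
first_two = list(islice(program({"users": endless_users()}), 2))

print(names)      # ['alice']
print(first_two)  # ['user27', 'user28']
```

After compile_node returns, running the program only calls closures, so the per-input cost does not include any AST dispatch.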
Run the benchmark yourself (compares against the system jq if installed):
python3 tools/bench.py 100000
Reference numbers (M-series MacBook, CPython 3.13, 100k objects): field-access streams ~19 ms, map+aggregate ~46 ms, group_by ~560 ms.
Compatibility
CPython 3.9 – 3.14 and PyPy, enforced by the CI matrix on every push. Zero runtime dependencies.
Architecture
    source ──lexer──▶ tokens ──parser──▶ AST (tuples)
                                          │ compile (once)
                                          ▼
                    generator closures: f(value, env) → iterator
                                          │
                    path mode: g(value, path, env) → (path, value) pairs
                    (powers path(), del(), and all assignments)
- lexer.py / parser.py — jq grammar, including string interpolation
- compiler.py — closure compilation, environments, value & path modes
- ops.py — jq value semantics: total ordering, arithmetic, path read/write
- builtins.py — Python-native builtins (regex, sort, math, dates, formats)
- prelude.py — derived builtins defined in jq itself, mirroring jq's
builtin.jq
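The path mode in the diagram can be illustrated with a short sketch. This is an assumption-laden toy, not the contents of compiler.py or ops.py: field, iterate, pipe, get_path, set_path, and update are hypothetical helpers, and the env argument is again omitted. Each closure yields (path, value) pairs, and an update in the style of jq's |= is built on top of the collected paths.

```python
def field(name):
    def run(value, path):
        yield path + (name,), value[name]
    return run

def iterate():
    def run(value, path):
        items = value.items() if isinstance(value, dict) else enumerate(value)
        for key, item in items:
            yield path + (key,), item
    return run

def pipe(left, right):
    def run(value, path):
        for mid_path, mid in left(value, path):
            yield from right(mid, mid_path)
    return run

def get_path(value, path):
    """Follow a tuple of keys/indices down into value."""
    for key in path:
        value = value[key]
    return value

def set_path(value, path, new):
    """Return a copy of value with the element at path replaced (input untouched)."""
    if not path:
        return new
    head, rest = path[0], path[1:]
    copy = dict(value) if isinstance(value, dict) else list(value)
    copy[head] = set_path(value[head], rest, new)
    return copy

def update(path_expr, fn, value):
    """Rough analogue of jq's `path_expr |= fn`: collect paths, then rewrite each."""
    result = value
    for path, _ in list(path_expr(value, ())):
        result = set_path(result, path, fn(get_path(result, path)))
    return result

# Path expression corresponding to .users[].age
users_ages = pipe(pipe(field("users"), iterate()), field("age"))

doc = {"users": [{"name": "alice", "age": 30}, {"name": "bob", "age": 25}]}
paths = [path for path, _ in users_ages(doc, ())]
older = update(users_ages, lambda age: age + 1, doc)

print(paths)  # [('users', 0, 'age'), ('users', 1, 'age')]
print(older)  # ages become 31 and 26; doc itself is unchanged
```

Carrying the path alongside the value is what lets one evaluator serve both reads and writes: the same traversal that produces outputs also records where each output lives.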
Contributing
See CONTRIBUTING.md. The short version: make a conformance
number go up, and python3 tools/update_expected_failures.py is your
scoreboard.
License