jolt-py
A high-performance, pure-Python implementation of the JOLT JSON-to-JSON transformation library.
Features
| Transform | Operation name | Description |
|---|---|---|
| Shift | `shift` | Re-map fields from input paths to output paths |
| Default | `default` | Fill in missing or null fields |
| Remove | `remove` | Delete specified fields |
| Sort | `sort` | Sort all dict keys alphabetically |
| Cardinality | `cardinality` | Enforce ONE or MANY cardinality on fields |
| ModifyOverwrite | `modify-overwrite-beta` | Apply functions, always overwriting |
| ModifyDefault | `modify-default-beta` | Apply functions only to absent fields |
| Chainr | n/a | Chain multiple transforms sequentially |
Installation
pip install jolt-py
Quick Start
The canonical JOLT example — re-shape a nested rating object:
from pyjolt import Chainr

spec = [
    {
        "operation": "shift",
        "spec": {
            "rating": {
                "primary": {
                    "value": "Rating",
                    "max": "RatingRange"
                },
                "*": {
                    "value": "SecondaryRatings.&1.Value",
                    "max": "SecondaryRatings.&1.Range"
                }
            }
        }
    },
    {
        "operation": "default",
        "spec": {"Rating": 0}
    }
]

input_data = {
    "rating": {
        "primary": {"value": 3, "max": 5},
        "quality": {"value": 4, "max": 5},
        "sharpness": {"value": 2, "max": 10}
    }
}

result = Chainr.from_spec(spec).apply(input_data)
# {
#     "Rating": 3,
#     "RatingRange": 5,
#     "SecondaryRatings": {
#         "quality": {"Value": 4, "Range": 5},
#         "sharpness": {"Value": 2, "Range": 10}
#     }
# }
Real-World Examples
E-commerce order normalisation
Transform a raw checkout payload into an internal order schema: rename fields, convert price strings to floats, and strip sensitive data.
from pyjolt import Chainr

spec = [
    {
        "operation": "shift",
        "spec": {
            "orderId": "id",
            "customer": {
                "firstName": "customer.first",
                "lastName": "customer.last",
                "emailAddress": "customer.email",
            },
            "lineItems": {
                "*": {
                    "sku": "items[].sku",
                    "qty": "items[].quantity",
                    "unitPrice": "items[].price",
                }
            },
            "shippingMethod": "shipping.method",
        },
    },
    {
        # Convert price strings to floats inside each item
        "operation": "modify-overwrite-beta",
        "spec": {"items": {"*": {"price": "=toDouble"}}},
    },
    {
        # Only applies when the input has no shippingMethod
        "operation": "default",
        "spec": {"shipping": {"method": "standard"}},
    },
    {
        # couponCode is not mapped by the shift above, so it is already
        # absent at this point; the step is kept as an explicit safeguard
        "operation": "remove",
        "spec": {"couponCode": ""},
    },
]

raw_order = {
    "orderId": "ORD-9921",
    "customer": {
        "firstName": "Jane", "lastName": "Doe",
        "emailAddress": "jane.doe@example.com",
    },
    "lineItems": [
        {"sku": "ABC-1", "qty": 2, "unitPrice": "19.99"},
        {"sku": "XYZ-7", "qty": 1, "unitPrice": "5.49"},
    ],
    "shippingMethod": "express",
    "couponCode": None,
}

result = Chainr.from_spec(spec).apply(raw_order)
# {
#     "id": "ORD-9921",
#     "customer": {"first": "Jane", "last": "Doe", "email": "jane.doe@example.com"},
#     "items": [
#         {"sku": "ABC-1", "quantity": 2, "price": 19.99},
#         {"sku": "XYZ-7", "quantity": 1, "price": 5.49}
#     ],
#     "shipping": {"method": "express"}
# }
Note: the `items[].field` syntax builds an array of objects in which each wildcard iteration (`*` over `lineItems`) contributes one element. Multiple fields from the same iteration (`sku`, `qty`, `unitPrice`) all land in the same array slot automatically.
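The slot-sharing rule can be pictured with a small plain-Python sketch. It does not use pyjolt and is not how the library is implemented; `collect_rows` and its `field_map` argument are hypothetical names chosen for illustration. It only shows that each wildcard iteration fills exactly one output element.

```python
def collect_rows(rows, field_map):
    """Build one output dict per input row, renaming keys via field_map.

    Hypothetical helper: mimics the effect of mapping several fields to
    "items[].<name>" under a single "*" wildcard.
    """
    out = []
    for row in rows:              # one wildcard iteration per row
        slot = {}                 # every field of this row shares one slot
        for src, dst in field_map.items():
            if src in row:
                slot[dst] = row[src]
        out.append(slot)
    return out


line_items = [
    {"sku": "ABC-1", "qty": 2, "unitPrice": "19.99"},
    {"sku": "XYZ-7", "qty": 1, "unitPrice": "5.49"},
]
items = collect_rows(
    line_items, {"sku": "sku", "qty": "quantity", "unitPrice": "price"}
)
print(items)
```

Prices are still strings at this stage; in the pipeline above, the `modify-overwrite-beta` step converts them afterwards.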
API response normalisation
Flatten a paginated search response, rename fields, and default missing values:
spec = [
    {
        "operation": "shift",
        "spec": {
            "total_count": "meta.total",
            "items": {
                "*": {
                    "id": "repos[].id",
                    "full_name": "repos[].name",
                    "stargazers_count": "repos[].stars",
                    "language": "repos[].language",
                    "private": "repos[].private",
                }
            },
        },
    },
    {
        "operation": "default",
        "spec": {"repos": {"*": {"language": "unknown"}}},
    },
    {"operation": "sort"},
]
User profile flattening + PII scrub
Flatten a nested CMS user object into a flat CRM record and strip PII before export:
spec = [
    {
        "operation": "shift",
        "spec": {
            "userId": "crm.id",
            "profile": {
                "displayName": "crm.name",
                "address": {
                    "city": "crm.city",
                    "country": "crm.country",
                },
            },
            "account": {
                "plan": "crm.plan",
                "createdAt": "crm.joinDate",
                "tags": "crm.tags",
            },
            # profile.email, profile.phone, internal.* are intentionally
            # omitted from the spec and therefore dropped from the output
        },
    },
    {"operation": "default", "spec": {"crm": {"plan": "free", "tags": []}}},
    {"operation": "modify-overwrite-beta", "spec": {"crm": {"plan": "=toUpperCase"}}},
    {"operation": "cardinality", "spec": {"crm": {"tags": "MANY"}}},
]
IoT sensor normalisation
Three device types emit subtly different payloads — one pipeline normalises them into a uniform time-series schema:
spec = [
    {
        "operation": "shift",
        "spec": {
            "device_id": "deviceId",
            "type": "sensorType",
            "ts": "timestamp",
            "reading": {
                "celsius": "value",   # temperature devices
                "percent": "value",   # humidity devices
                "hpa": "value",       # pressure devices
                "unit": "unit",
            },
            "battery_pct": "batteryPercent",
        },
    },
    {
        "operation": "modify-overwrite-beta",
        "spec": {
            "value": "=toDouble",
            "batteryPercent": "=toInteger",
        },
    },
    {
        "operation": "default",
        "spec": {"batteryPercent": -1},  # sentinel for older firmware
    },
]
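No sample payloads are given for this pipeline, so the following hand-written function is only a rough plain-Python equivalent of what the spec appears to intend. The name `normalise_reading` and both sample payloads are assumptions made for illustration, and edge cases may differ from what pyjolt actually does.

```python
RENAMES = {"device_id": "deviceId", "type": "sensorType", "ts": "timestamp"}
VALUE_KEYS = ("celsius", "percent", "hpa")  # one of these per device type


def normalise_reading(payload):
    """Hand-written approximation of the shift / modify / default steps."""
    out = {dst: payload[src] for src, dst in RENAMES.items() if src in payload}
    reading = payload.get("reading", {})
    for key in VALUE_KEYS:
        if key in reading:
            out["value"] = float(reading[key])   # the "=toDouble" step
    if "unit" in reading:
        out["unit"] = reading["unit"]
    # "=toInteger" when present, otherwise the -1 sentinel from the default step
    out["batteryPercent"] = int(payload.get("battery_pct", -1))
    return out


# Hypothetical payloads for two of the device types
temperature = {
    "device_id": "t-01", "type": "temperature", "ts": 1700000000,
    "reading": {"celsius": "21.5", "unit": "C"}, "battery_pct": "87",
}
humidity = {
    "device_id": "h-02", "type": "humidity", "ts": 1700000060,
    "reading": {"percent": 40, "unit": "%"},
}
print(normalise_reading(temperature))
print(normalise_reading(humidity))
```

Both payloads end up with the same keys, which is the point of routing `celsius`, `percent`, and `hpa` to a single `value` field.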
Transform Reference
Shift
Re-map fields by specifying where each input field should go in the output.
from pyjolt.transforms import Shift
s = Shift({"user": {"name": "profile.fullName", "age": "profile.years"}})
s.apply({"user": {"name": "Alice", "age": 30}})
# → {"profile": {"fullName": "Alice", "years": 30}}
Spec tokens — input side:
| Token | Meaning |
|---|---|
| `*` | Match any key (combinable: `prefix_*_suffix`) |
| `a\|b` | Match key `a` OR `b` |
| `@` | Self-reference: use the current input node directly |
| `$` / `$N` | Emit the matched key name N levels up as the value |
| `#literal` | Emit the literal string `literal` as a constant value |
Spec tokens — output path:
| Token | Meaning |
|---|---|
| `literal` | Literal key name |
| `&` / `&N` | Key matched N levels up (`&0` = current, `&1` = parent, …) |
| `&(N,M)` | M-th wildcard capture group at N levels up |
| `@(N,path)` | Value found at N levels up following dot-separated `path` |
| `[]` suffix | Array-append: append value, or share a slot across fields |
Wildcard back-references:
# *-* matches "foo-bar"; &(0,1)="foo", &(0,2)="bar"
s = Shift({"*-*": "out.&(0,1).&(0,2)"})
s.apply({"foo-bar": 42}) # → {"out": {"foo": {"bar": 42}}}
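One way to picture how a wildcard key yields capture groups is to translate the pattern into a regular expression. The sketch below is an illustration only, not pyjolt's matcher; `wildcard_to_regex` and `match_key` are hypothetical names. Index 0 of the result holds the whole key, so positions 1 and 2 line up with `&(0,1)` and `&(0,2)`.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a key pattern such as "*-*" into a regex with one group per "*"."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.*?)".join(literal_parts) + "$")


def match_key(pattern, key):
    """Return [whole_key, capture_1, capture_2, ...], or None if no match."""
    m = wildcard_to_regex(pattern).match(key)
    return [key, *m.groups()] if m else None


captures = match_key("*-*", "foo-bar")
print(captures)                                        # ['foo-bar', 'foo', 'bar']
print(match_key("prefix_*_suffix", "prefix_x_suffix"))  # ['prefix_x_suffix', 'x']
print(match_key("*-*", "foobar"))                       # None
```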
Array of objects:
# Each "*" iteration creates one element; multiple fields share the same slot
s = Shift({"items": {"*": {"id": "out[].id", "name": "out[].name"}}})
s.apply({"items": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]})
# → {"out": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}
Array flatten (append scalars):
s = Shift({"a": "vals[]", "b": "vals[]"})
s.apply({"a": 1, "b": 2}) # → {"vals": [1, 2]}
Multiple output paths:
s = Shift({"id": ["primary.id", "backup.id"]})
s.apply({"id": 7}) # → {"primary": {"id": 7}, "backup": {"id": 7}}
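Writing one value to several dotted output paths amounts to walking each path and creating intermediate dicts on the way. The helper below is a minimal sketch of that idea under the assumption of plain dotted paths with no `[]` or `&` tokens; `set_path` is a hypothetical name, not a pyjolt function.

```python
def set_path(target, dotted_path, value):
    """Write value into nested dicts, creating intermediate dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    node = target
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return target


out = {}
for path in ["primary.id", "backup.id"]:
    set_path(out, path, 7)
print(out)  # {'primary': {'id': 7}, 'backup': {'id': 7}}
```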
Key-as-value ($ / $N):
# $ writes the matched key name as the value
s = Shift({"*": {"$": "keys[]"}})
s.apply({"foo": 1, "bar": 2}) # → {"keys": ["foo", "bar"]}
# $1 writes the key matched one level up
s = Shift({"sensors": {"*": {"value": "out[].v", "$1": "out[].section"}}})
s.apply({"sensors": {"temp": {"value": 22}}})
# → {"out": [{"v": 22, "section": "sensors"}]}
Constant-as-value (#literal):
# #literal writes the fixed string "literal" as the value
s = Shift({"*": {"#photo": "types[]"}})
s.apply({"a": 1, "b": 2}) # → {"types": ["photo", "photo"]}
# Combine with back-references in the output path
s = Shift({"*": {"#widget": "catalog.&1.kind"}})
s.apply({"foo": {}, "bar": {}})
# → {"catalog": {"foo": {"kind": "widget"}, "bar": {"kind": "widget"}}}
Default
Fill in absent or null fields.
from pyjolt.transforms import Default
d = Default({"status": "unknown", "meta": {"version": 1}})
d.apply({"name": "test"})
# → {"name": "test", "status": "unknown", "meta": {"version": 1}}
Apply a default to every element of an array:
Default({"items": {"*": {"active": True}}}).apply(
    {"items": [{"name": "x"}, {"name": "y", "active": False}]}
)
# → {"items": [{"name": "x", "active": True}, {"name": "y", "active": False}]}
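The fill-in rule (take the default when a key is absent or null, recurse into containers that already exist) can be sketched in a few lines of plain Python. This is an illustrative approximation rather than the library's code; `fill_defaults` is a hypothetical name, and the handling of `"*"` is limited to list elements here.

```python
import copy


def fill_defaults(data, spec):
    """Fill absent or None keys in place; "*" applies a sub-spec to list items."""
    if isinstance(data, list):
        star = spec.get("*")
        if isinstance(star, dict):
            for element in data:
                if isinstance(element, dict):
                    fill_defaults(element, star)
        return data
    for key, default in spec.items():
        current = data.get(key)
        if current is None:
            # Absent or null: take the default (wildcard specs are skipped)
            if not (isinstance(default, dict) and "*" in default):
                data[key] = copy.deepcopy(default)
        elif isinstance(default, dict) and isinstance(current, (dict, list)):
            fill_defaults(current, default)   # present container: recurse
    return data


print(fill_defaults({"name": "test"}, {"status": "unknown", "meta": {"version": 1}}))
print(fill_defaults(
    {"items": [{"name": "x"}, {"name": "y", "active": False}]},
    {"items": {"*": {"active": True}}},
))
```

Note that an explicit `False` is kept: only `None` and missing keys count as "not set" in this sketch.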
Remove
Delete specified fields.
from pyjolt.transforms import Remove
r = Remove({"password": "", "token": ""})
r.apply({"user": "alice", "password": "s3cr3t", "token": "xyz"})
# → {"user": "alice"}
Use "*" to remove all keys at a level:
Remove({"*": ""}).apply({"a": 1, "b": 2}) # → {}
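As a rough mental model, removal walks the spec and deletes every key it names, with `"*"` standing for all keys at that level. The sketch below is plain Python, not pyjolt; `remove_keys` is a hypothetical name, and descending into a nested spec instead of deleting the parent is an assumption about how nested specs behave.

```python
def remove_keys(data, spec):
    """Delete keys named in spec (in place); "*" targets every key at a level."""
    for key, sub in spec.items():
        if key == "*":
            targets = list(data.keys())
        else:
            targets = [key] if key in data else []
        for name in targets:
            if isinstance(sub, dict) and isinstance(data[name], dict):
                remove_keys(data[name], sub)   # nested spec: descend
            else:
                data.pop(name)
    return data


print(remove_keys({"user": "alice", "password": "s3cr3t", "token": "xyz"},
                  {"password": "", "token": ""}))   # {'user': 'alice'}
print(remove_keys({"a": 1, "b": 2}, {"*": ""}))      # {}
```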
Sort
Recursively sort all dict keys alphabetically.
from pyjolt.transforms import Sort
Sort().apply({"b": 2, "a": 1, "c": {"z": 26, "a": 1}})
# → {"a": 1, "b": 2, "c": {"a": 1, "z": 26}}
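Because Python dicts preserve insertion order (guaranteed since Python 3.7), a recursive key sort can be written as a rebuild of each dict in sorted-key order. The function below is a minimal sketch of that idea and not the library's implementation; `sort_keys` is a hypothetical name.

```python
def sort_keys(value):
    """Return a copy of value with every nested dict rebuilt in key order."""
    if isinstance(value, dict):
        return {key: sort_keys(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        return [sort_keys(item) for item in value]   # list order is untouched
    return value


sorted_doc = sort_keys({"b": 2, "a": 1, "c": {"z": 26, "a": 1}})
print(sorted_doc)  # {'a': 1, 'b': 2, 'c': {'a': 1, 'z': 26}}
```

Dict equality in Python ignores key order, so to check the effect of a sort compare `list(d)` or the serialised JSON rather than the dicts themselves.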
Cardinality
Ensure fields are a single value (ONE) or a list (MANY).
from pyjolt.transforms import Cardinality
c = Cardinality({"tags": "MANY", "primary": "ONE"})
c.apply({"tags": "python", "primary": ["first", "second"]})
# → {"tags": ["python"], "primary": "first"}
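The two modes can be summarised in a short plain-Python sketch: MANY wraps a scalar in a list, ONE keeps the first element of a list. `enforce_cardinality` is a hypothetical name, and mapping an empty list to `None` under ONE is an assumption made here, not documented behaviour.

```python
def enforce_cardinality(data, spec):
    """Apply "ONE" / "MANY" to top-level keys of data (in place)."""
    for key, mode in spec.items():
        if key not in data:
            continue
        value = data[key]
        if mode == "MANY" and not isinstance(value, list):
            data[key] = [value]
        elif mode == "ONE" and isinstance(value, list):
            data[key] = value[0] if value else None   # assumption for []
    return data


print(enforce_cardinality(
    {"tags": "python", "primary": ["first", "second"]},
    {"tags": "MANY", "primary": "ONE"},
))  # {'tags': ['python'], 'primary': 'first'}
```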
ModifyOverwrite / ModifyDefault
Apply built-in functions to field values.
from pyjolt.transforms import ModifyOverwrite, ModifyDefault
m = ModifyOverwrite({"score": "=toInteger", "label": "=toUpperCase"})
m.apply({"score": "42", "label": "hello"})
# → {"score": 42, "label": "HELLO"}
ModifyDefault only touches fields that are absent:
m = ModifyDefault({"count": 0, "active": True})
m.apply({"count": 5}) # → {"count": 5, "active": True}
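The difference between the two modes comes down to a single condition: write always, or write only when the key is missing. The sketch below illustrates that with literal values only (no functions); `modify` and its `overwrite` flag are hypothetical and do not mirror pyjolt's internals.

```python
def modify(data, spec, overwrite):
    """Return a copy of data with spec values written per the chosen mode."""
    out = dict(data)
    for key, value in spec.items():
        if overwrite or key not in out:
            out[key] = value
    return out


spec = {"count": 0, "active": True}
print(modify({"count": 5}, spec, overwrite=False))  # {'count': 5, 'active': True}
print(modify({"count": 5}, spec, overwrite=True))   # {'count': 0, 'active': True}
```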
Apply a function to every element of an array:
ModifyOverwrite({"prices": {"*": {"amount": "=toDouble"}}}).apply(
    {"prices": [{"amount": "9.99"}, {"amount": "4.49"}]}
)
# → {"prices": [{"amount": 9.99}, {"amount": 4.49}]}
Built-in functions:
| Function | Description |
|---|---|
| `=toInteger` / `=toLong` | Convert to int |
| `=toDouble` / `=toFloat` | Convert to float |
| `=toString` | Convert to str |
| `=toBoolean` | Convert to bool |
| `=trim` | Strip whitespace |
| `=toUpperCase` / `=toLowerCase` | Change case |
| `=abs` | Absolute value |
| `=min(N)` / `=max(N)` | Clamp to min/max |
| `=intSum(N)` / `=doubleSum(N)` / `=longSum(N)` / `=floatSum(N)` | Add N to value |
| `=sum` | Sum all elements of a numeric list |
| `=avg` | Average of a numeric list |
| `=sqrt` | Square root |
| `=not` | Boolean negation |
| `=size` | Length of string/list |
| `=concat(suffix)` | Append suffix to string value |
| `=join(sep)` | Join list with separator |
| `=split(sep)` | Split string by separator |
| `=leftPad(width,char)` / `=rightPad(width,char)` | Pad string to width |
| `=substring(start,end)` | Slice a string |
| `=startsWith(prefix)` / `=endsWith(suffix)` | Predicate on string |
| `=contains(item)` | True if item is in string or list |
| `=squashNulls` | Remove null entries from list |
| `=recursivelySquashNulls` | Recursively remove null entries |
| `=toList` | Wrap value in a list if not already one |
| `=firstElement` / `=lastElement` | First or last list element |
| `=elementAt(N)` | Nth list element |
| `=indexOf(item)` | Index of item in list/string |
| `=coalesce(fallback,…)` | First non-null from value + args |
| `=noop` | Identity (leave value unchanged) |
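To make the `=name(args)` notation concrete, here is a small sketch of how such an expression could be parsed and dispatched. It is not pyjolt's parser: `parse_call`, `apply_call`, and the `FUNCTIONS` table are hypothetical, only a handful of functions are covered, and arguments are treated as plain comma-separated strings without quoting.

```python
import re

# A few of the functions from the table, written as plain callables
FUNCTIONS = {
    "toInteger": lambda v: int(v),
    "toDouble": lambda v: float(v),
    "toUpperCase": lambda v: v.upper(),
    "trim": lambda v: v.strip(),
    "concat": lambda v, suffix: f"{v}{suffix}",
    "leftPad": lambda v, width, char: str(v).rjust(int(width), char),
}

_CALL = re.compile(r"^=(\w+)(?:\((.*)\))?$")


def parse_call(expr):
    """Split "=name(a,b)" into ("name", ["a", "b"]); arguments are optional."""
    m = _CALL.match(expr)
    if not m:
        raise ValueError(f"not a function expression: {expr!r}")
    name, raw_args = m.group(1), m.group(2)
    args = [a.strip() for a in raw_args.split(",")] if raw_args else []
    return name, args


def apply_call(expr, value):
    """Look up the named function and call it with the value plus arguments."""
    name, args = parse_call(expr)
    return FUNCTIONS[name](value, *args)


print(parse_call("=leftPad(5,0)"))        # ('leftPad', ['5', '0'])
print(apply_call("=toInteger", "42"))     # 42
print(apply_call("=leftPad(5,0)", "42"))  # 00042
```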
Chainr
Chain multiple transforms, applying them in order.
from pyjolt import Chainr
chain = Chainr.from_spec([
    {"operation": "shift",                 "spec": {"score": "score"}},
    {"operation": "modify-overwrite-beta", "spec": {"score": "=toDouble"}},
    {"operation": "default",               "spec": {"score": 0.0}},
])
chain.apply({"score": "3.14"}) # → {"score": 3.14}
chain.apply({}) # → {"score": 0.0}
Compose transform instances directly:
from pyjolt import Chainr
from pyjolt.transforms import Shift, Sort
chain = Chainr([Shift({"b": "b", "a": "a"}), Sort()])
chain.apply({"b": 2, "a": 1}) # → {"a": 1, "b": 2} (sorted)
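Conceptually, chaining is a left fold: the output of each step becomes the input of the next. The sketch below shows that idea with plain callables instead of transform objects; `Pipeline`, `to_double`, and `default_score` are hypothetical stand-ins and not part of pyjolt.

```python
from functools import reduce


class Pipeline:
    """Apply a sequence of callables in order, feeding each result onward."""

    def __init__(self, steps):
        self.steps = list(steps)

    def apply(self, data):
        return reduce(lambda acc, step: step(acc), self.steps, data)


def to_double(doc):
    """Stand-in for a modify step: convert "score" to float when present."""
    return {**doc, "score": float(doc["score"])} if "score" in doc else doc


def default_score(doc):
    """Stand-in for a default step: supply 0.0 when "score" is missing."""
    return {"score": 0.0, **doc}


pipeline = Pipeline([to_double, default_score])
print(pipeline.apply({"score": "3.14"}))  # {'score': 3.14}
print(pipeline.apply({}))                 # {'score': 0.0}
```

Order matters: placing `default_score` first would still work here, but in general a default step placed before a conversion step feeds its fallback value through the conversion as well.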
Contributing
Contributions are welcome — bug reports, documentation improvements, new features, and spec-compatibility fixes all help.
git clone https://github.com/sthitaprajnas/pyjolt.git
cd pyjolt
pip install -e ".[dev]"
pytest # run test suite
ruff check src/pyjolt # lint
mypy src/pyjolt # type-check
Please read CONTRIBUTING.md for the full workflow. For security issues, see SECURITY.md.
License
Copyright 2024 Sthitaprajna Sahoo and contributors.
Licensed under the Apache License, Version 2.0 — see LICENSE for the full text.
You are free to use, modify, and distribute this software under the terms of the Apache 2.0 license. Contributions submitted to the project are also licensed under Apache 2.0.