Native Rust implementations of synalinks JSON/schema operations.

synaops

Rust implementations of the JSON and JSON-schema operations used by synalinks, exposed to Python as a PyO3 extension module (synaops).

The goal is a drop-in, faster replacement for the equivalent pure-Python helpers. Input/output types are plain Python dict / list / scalars — the boundary is handled via pythonize, so callers do not need to know there is Rust underneath.

Parity with the Python reference is asserted for every op at every payload size (see bench/test_parity.py). Headline speedups on realistic payloads at 600 keys: ~485× on factorize_schema and ~280× on factorize_json. Across all payload sizes, masking ops run roughly 4–8× faster and simple key rewrites 2–4× faster. Full table below.

Build

pip install maturin
maturin develop --release   # builds and installs into the active venv

Python API

import synaops

JSON object operations

| Function | Signature | Description |
|---|---|---|
| `prefix_json` | `(json, prefix)` | Prepend prefix_ to every top-level key. |
| `suffix_json` | `(json, suffix)` | Append _suffix to every top-level key. |
| `concatenate_json` | `(json1, json2)` | Merge two objects; on key collision append _1, _2, … to disambiguate. |
| `factorize_json` | `(json)` | Group keys sharing a singular base into a single array under the plural key. |
| `out_mask_json` | `(json, mask=None, pattern=None, recursive=True)` | Drop keys whose base name is in mask or whose base name matches the regex pattern. Numerical suffixes are ignored when matching. |
| `in_mask_json` | `(json, mask=None, pattern=None, recursive=True)` | Keep only the keys whose base name is in mask or matches pattern. In recursive mode, arrays are preserved and their object items are filtered in place. |
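To make the key-rewrite rows concrete, here is a minimal pure-Python sketch of what "prepend prefix_ / append _suffix to every top-level key" means. The function names `prefix_json_sketch` and `suffix_json_sketch` are hypothetical and exist only for illustration; they are not part of synaops, and the real implementation may handle edge cases differently.

```python
def prefix_json_sketch(obj: dict, prefix: str) -> dict:
    """Prepend `<prefix>_` to every top-level key; nested keys are untouched."""
    return {f"{prefix}_{key}": value for key, value in obj.items()}


def suffix_json_sketch(obj: dict, suffix: str) -> dict:
    """Append `_<suffix>` to every top-level key; nested keys are untouched."""
    return {f"{key}_{suffix}": value for key, value in obj.items()}


record = {"question": "What is PyO3?", "meta": {"lang": "en"}}

prefixed = prefix_json_sketch(record, "user")
suffixed = suffix_json_sketch(record, "v2")

print(prefixed)  # keys become user_question, user_meta
print(suffixed)  # keys become question_v2, meta_v2
```

Only the top-level keys change; the nested `lang` key inside `meta` is left as-is, which matches the "top-level key" wording in the table.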

JSON schema operations

Operate on JSON-Schema-shaped dicts (properties, required, $defs, type, …).

| Function | Signature | Description |
|---|---|---|
| `prefix_schema` | `(schema, prefix)` | Prepend prefix_ to every property key and update title / required accordingly. |
| `suffix_schema` | `(schema, suffix)` | Append _suffix to every property key and update title / required accordingly. |
| `concatenate_schema` | `(schema1, schema2)` | Merge two schemas (properties, required, $defs); on key collision append numeric suffixes and regenerate titles. |
| `factorize_schema` | `(schema)` | Group similar singular-keyed properties into array-typed plural-keyed properties; folds heterogeneous items into anyOf. |
| `out_mask_schema` | `(schema, mask=None, pattern=None, recursive=True)` | Remove properties whose base name is in mask or matches pattern. With recursive=True, descends into nested object/array properties and $defs, then prunes $defs entries no longer referenced. |
| `in_mask_schema` | `(schema, mask=None, pattern=None, recursive=True)` | Keep only properties whose base name is in mask or matches pattern. Same recursive/$defs-pruning behavior as out_mask_schema. |
| `standardize_schema` | `(schema)` | Placeholder for schema normalization (currently identity). |
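The "prunes $defs entries no longer referenced" step of the mask operations can be pictured as a reachability walk over `$ref` pointers. The sketch below is an illustration only, not the synaops implementation: `collect_refs` and `prune_defs_sketch` are hypothetical names, references are assumed to have the form `#/$defs/Name`, and dropping an empty `$defs` key is an assumption.

```python
def collect_refs(node, found):
    """Recursively collect the definition name of every "$ref" under `node`."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                # Assumption: refs look like "#/$defs/Name".
                found.add(value.rsplit("/", 1)[-1])
            else:
                collect_refs(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_refs(item, found)
    return found


def prune_defs_sketch(schema: dict) -> dict:
    """Keep only the $defs entries reachable from the rest of the schema."""
    defs = schema.get("$defs", {})
    body = {k: v for k, v in schema.items() if k != "$defs"}
    reachable = set()
    pending = collect_refs(body, set())
    while pending:
        name = pending.pop()
        if name in reachable or name not in defs:
            continue
        reachable.add(name)
        # A kept definition may itself reference other definitions.
        pending |= collect_refs(defs[name], set()) - reachable
    pruned = dict(body)
    kept = {name: d for name, d in defs.items() if name in reachable}
    if kept:
        pruned["$defs"] = kept
    return pruned


# "Tag" is no longer referenced, e.g. because the property using it was masked out.
schema = {
    "type": "object",
    "properties": {"author": {"$ref": "#/$defs/Person"}},
    "$defs": {
        "Person": {"type": "object", "properties": {"address": {"$ref": "#/$defs/Address"}}},
        "Address": {"type": "object", "properties": {"city": {"type": "string"}}},
        "Tag": {"type": "object", "properties": {"label": {"type": "string"}}},
    },
}

pruned = prune_defs_sketch(schema)
print(sorted(pruned["$defs"]))
```

`Address` survives because it is reachable through `Person`, while `Tag` is dropped because nothing points at it any more.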

is_object, is_array, is_schema_equal, contains_schema are intentionally not ported. They are single-key lookups or dict comparisons whose cost is dominated by the PyO3 / dict-conversion boundary, so the pure-Python versions in synalinks are faster.

Matching semantics

Both *_mask_* families and factorize_* rely on the NLP helpers in nlp_utils.rs: they strip trailing numerical suffixes (answer_3 → answer) and normalize singular/plural forms (answers → answer) before comparing keys. The pattern argument is a regular expression matched via substring search against the base key (same semantics as Python's re.search).
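The matching rules above can be sketched in pure Python as follows. This is a simplified illustration under stated assumptions, not a port of nlp_utils.rs: the names `base_key_sketch`, `is_masked_sketch`, and `out_mask_sketch` are hypothetical, the singularization is a naive "drop one trailing s" rule, mask entries are assumed to be normalized the same way as keys, and only top-level keys are handled (no recursion).

```python
import re

_NUMERIC_SUFFIX = re.compile(r"_\d+$")


def base_key_sketch(key: str) -> str:
    """Strip a trailing numeric suffix, then naively singularize."""
    base = _NUMERIC_SUFFIX.sub("", key)
    # Assumption: drop one trailing "s" (the real helper is likely smarter).
    if len(base) > 1 and base.endswith("s") and not base.endswith("ss"):
        base = base[:-1]
    return base


def is_masked_sketch(key, mask=None, pattern=None) -> bool:
    """True if the key's base name is in `mask` or matches `pattern`."""
    base = base_key_sketch(key)
    if mask and base in {base_key_sketch(m) for m in mask}:
        return True
    # re.search matches anywhere in the base key, not only at the start.
    if pattern and re.search(pattern, base):
        return True
    return False


def out_mask_sketch(obj: dict, mask=None, pattern=None) -> dict:
    """Drop top-level keys selected by `mask` or `pattern`."""
    return {k: v for k, v in obj.items() if not is_masked_sketch(k, mask, pattern)}


data = {"answer": "a", "answer_3": "b", "answers": ["c"], "thinking": "...", "score": 1}

by_mask = out_mask_sketch(data, mask=["answer"])
by_anchor = out_mask_sketch(data, pattern="^think")
by_substring = out_mask_sketch(data, pattern="cor")

print(list(by_mask))       # answer, answer_3 and answers all share the base "answer"
print(list(by_anchor))     # "^think" anchors at the start of the base key
print(list(by_substring))  # "cor" matches inside "score"
```

The last call shows why the substring semantics matter: an unanchored pattern such as `cor` removes `score`, so patterns meant to match a whole key should be anchored with `^` and `$`.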

Benchmark

Measured on realistic payloads: nested objects, arrays of $ref-based objects, schema $defs. Three payload sizes (12, 96, 600 top-level keys). Parity with the Python reference is verified before each timing run (pytest bench/test_parity.py, 45/45 pass).

Speedup summary

Ratio py_median / rs_median per op. Higher is better; the dashed line in the chart marks parity (1×).

[Figure: speedup per operation]

| Operation | small (12) | medium (96) | large (600) |
|---|---:|---:|---:|
| factorize_schema | 8.78× | 64.8× | 485× |
| factorize_json | 9.73× | 46.2× | 282× |
| in_mask_json | 7.75× | 7.11× | 7.47× |
| out_mask_json | 4.23× | 4.12× | 4.29× |
| out_mask_json_pattern | 3.77× | 4.15× | 4.20× |
| in_mask_schema | 4.83× | 4.12× | 4.15× |
| out_mask_schema | 4.25× | 3.78× | 4.11× |
| prefix_schema | 3.63× | 3.62× | 3.89× |
| suffix_schema | 3.66× | 3.64× | 3.85× |
| concatenate_schema | 2.21× | 2.15× | 2.85× |
| suffix_json | 2.36× | 2.22× | 2.28× |
| concatenate_json | 2.25× | 2.18× | 2.27× |
| prefix_json | 2.26× | 2.25× | 2.26× |
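As a worked example of the py_median / rs_median ratio, the sketch below computes a speedup from two lists of timing samples. The helper name `speedup` and the sample values are made up purely for illustration; they are not measurements from the benchmark harness.

```python
from statistics import median


def speedup(py_samples, rs_samples):
    """Speedup as the ratio of median Python time to median Rust time."""
    return median(py_samples) / median(rs_samples)


# Hypothetical timings in microseconds, NOT real benchmark data.
py_times_us = [4.0, 2.0, 3.0]
rs_times_us = [1.0, 1.5, 0.5]

ratio = speedup(py_times_us, rs_times_us)
print(f"{ratio:.2f}x")  # values above 1 mean the Rust path is faster
```

Using medians rather than means keeps a single slow outlier run from skewing the ratio.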

The factorize_* speedup grows with payload size because the Python reference does repeated O(n) key scans per group, while the Rust path groups in a single pass. Simple key rewrites (prefix_*, suffix_*, concatenate_*) are dominated by the cost of the PyO3 dict-conversion boundary, which is why they cluster around 2–4×.
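The difference between repeated per-group scans and single-pass grouping can be illustrated with a small sketch. Neither function below is taken from synalinks or synaops: the names are hypothetical, grouping uses only the numeric-suffix rule, and the plural-key renaming done by factorize_* is left out.

```python
import re
from collections import defaultdict

_NUMERIC_SUFFIX = re.compile(r"_\d+$")


def base_of(key: str) -> str:
    """Strip a trailing numeric suffix such as `_2`."""
    return _NUMERIC_SUFFIX.sub("", key)


def group_repeated_scans(obj: dict) -> dict:
    """For each new base key, rescan every key: O(n) work per group."""
    groups = {}
    for key in obj:
        base = base_of(key)
        if base in groups:
            continue
        groups[base] = [obj[k] for k in obj if base_of(k) == base]
    return groups


def group_single_pass(obj: dict) -> dict:
    """Visit each key once and append its value to its group: O(n) total."""
    groups = defaultdict(list)
    for key, value in obj.items():
        groups[base_of(key)].append(value)
    return dict(groups)


payload = {"answer": 1, "note": "x", "answer_1": 2, "answer_2": 3}

scanned = group_repeated_scans(payload)
single = group_single_pass(payload)
print(single)
```

Both functions return the same groups, but the first does work proportional to (number of groups × number of keys), so its cost grows much faster when most keys form their own group.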

Before (Python) vs After (Rust) — absolute medians

Log scale, lower is better. Rows are the three payload sizes.

[Figure: absolute medians, Python vs Rust]

See bench/README.md for the harness, payload shapes, and how to regenerate these charts.

Development

cargo test              # run Rust unit tests
maturin develop         # install debug build into venv
maturin develop --release   # install optimized build into venv

# Parity + performance against the Python reference impl
pytest bench/test_parity.py -v
pytest bench/test_bench.py --benchmark-save=latest
python bench/plot.py    # writes bench/speedup.png

License

Apache-2.0
