Native Rust implementations of synalinks JSON/schema operations.

synaops

Rust implementations of the JSON and JSON-schema operations used by synalinks, exposed to Python as a PyO3 extension module (synaops).

The goal is a drop-in, faster replacement for the equivalent pure-Python helpers. Input/output types are plain Python dict / list / scalars — the boundary is handled via pythonize, so callers do not need to know there is Rust underneath.

Parity with the Python reference is asserted on every op and payload size (see bench/test_parity.py). Headline speedups on realistic payloads: ~485× on factorize_schema, ~280× on factorize_json at 600 keys, 4–8× on masking ops, 2–4× on simple key rewrites. Full table below.

Build

pip install maturin
maturin develop --release   # builds and installs into the active venv

Python API

import synaops

JSON object operations

| Function | Signature | Description |
| --- | --- | --- |
| prefix_json | (json, prefix) | Prepend prefix_ to every top-level key. |
| suffix_json | (json, suffix) | Append _suffix to every top-level key. |
| concatenate_json | (json1, json2) | Merge two objects; on key collision append _1, _2, … to disambiguate. |
| factorize_json | (json) | Group keys sharing a singular base into a single array under the plural key. |
| out_mask_json | (json, mask=None, pattern=None, recursive=True) | Drop keys whose base name is in mask or whose base name matches the regex pattern. Numerical suffixes are ignored when matching. |
| in_mask_json | (json, mask=None, pattern=None, recursive=True) | Keep only the keys whose base name is in mask or matches pattern. In recursive mode, arrays are preserved and their object items are filtered in place. |
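
To make the key-rewrite rows above concrete, here is a minimal pure-Python sketch of the described behavior. It does not call synaops; the *_sketch function names are hypothetical, and the collision rule (the first object keeps its key, the colliding key from the second object receives the next free numeric suffix) is an assumption, since the table only says that _1, _2, … are appended.

```python
def prefix_json_sketch(obj, prefix):
    """Prepend '<prefix>_' to every top-level key."""
    return {f"{prefix}_{key}": value for key, value in obj.items()}


def suffix_json_sketch(obj, suffix):
    """Append '_<suffix>' to every top-level key."""
    return {f"{key}_{suffix}": value for key, value in obj.items()}


def concatenate_json_sketch(first, second):
    """Merge two objects, renaming colliding keys from `second`.

    Assumption: `first` keeps its keys unchanged and a colliding key
    from `second` becomes key_1, key_2, ... (first free suffix).
    """
    merged = dict(first)
    for key, value in second.items():
        new_key, index = key, 0
        while new_key in merged:
            index += 1
            new_key = f"{key}_{index}"
        merged[new_key] = value
    return merged


if __name__ == "__main__":
    print(prefix_json_sketch({"answer": 42}, "draft"))
    print(concatenate_json_sketch({"answer": 1}, {"answer": 2}))
```

The real functions may order or number colliding keys differently; bench/test_parity.py remains the authority on exact output.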

JSON schema operations

Operate on JSON-Schema-shaped dicts (properties, required, $defs, type, …).

| Function | Signature | Description |
| --- | --- | --- |
| prefix_schema | (schema, prefix) | Prepend prefix_ to every property key and update title / required accordingly. |
| suffix_schema | (schema, suffix) | Append _suffix to every property key and update title / required accordingly. |
| concatenate_schema | (schema1, schema2) | Merge two schemas (properties, required, $defs); on key collision append numeric suffixes and regenerate titles. |
| factorize_schema | (schema) | Group similar singular-keyed properties into array-typed plural-keyed properties; folds heterogeneous items into anyOf. |
| out_mask_schema | (schema, mask=None, pattern=None, recursive=True) | Remove properties whose base name is in mask or matches pattern. With recursive=True, descends into nested object/array properties and $defs, then prunes $defs entries no longer referenced. |
| in_mask_schema | (schema, mask=None, pattern=None, recursive=True) | Keep only properties whose base name is in mask or matches pattern. Same recursive/$defs-pruning behavior as out_mask_schema. |
| standardize_schema | (schema) | Placeholder for schema normalization (currently identity). |
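
The $defs pruning mentioned for the mask operations can be pictured as a reachability pass over local references. The sketch below is pure Python and is not the synaops implementation: collect_refs and prune_defs are hypothetical names, and it assumes references always take the local form "#/$defs/<name>".

```python
def collect_refs(node, found):
    """Add every local '#/$defs/<name>' target reachable from node to found."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/$defs/"):
            found.add(ref[len("#/$defs/"):])
        for value in node.values():
            collect_refs(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_refs(item, found)


def prune_defs(schema):
    """Return a copy of schema whose $defs holds only referenced entries."""
    defs = schema.get("$defs", {})
    body = {key: value for key, value in schema.items() if key != "$defs"}

    # Definitions referenced directly from the schema body.
    live = set()
    collect_refs(body, live)

    # Follow references between definitions until nothing new appears.
    pending = list(live)
    while pending:
        name = pending.pop()
        nested = set()
        collect_refs(defs.get(name, {}), nested)
        for extra in nested - live:
            live.add(extra)
            pending.append(extra)

    kept = {name: value for name, value in defs.items() if name in live}
    if kept:
        body["$defs"] = kept
    return body
```

A definition that is only referenced by another unreferenced definition is dropped as well, because the walk starts from the schema body rather than from $defs itself.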

is_object, is_array, is_schema_equal, contains_schema are intentionally not ported. They are single-key lookups or dict comparisons whose cost is dominated by the PyO3 / dict-conversion boundary, so the pure-Python versions in synalinks are faster.

Matching semantics

Both *_mask_* families and factorize_* rely on the NLP helpers in nlp_utils.rs: they strip trailing numerical suffixes (answer_3 → answer) and normalize singular/plural forms (answers → answer) before comparing keys. The pattern argument is a regular expression matched via substring search against the base key (same semantics as Python's re.search).
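
The matching rules above can be approximated in a few lines of pure Python. This is a sketch under stated assumptions, not a port of nlp_utils.rs: base_key, is_selected, out_mask_sketch, and in_mask_sketch are hypothetical names, the singularization is a naive "drop one trailing s" rule, and only top-level keys are handled (no recursion).

```python
import re

_NUMERIC_SUFFIX = re.compile(r"_\d+$")


def base_key(key):
    """Strip a trailing _<digits> suffix, then naively singularize."""
    key = _NUMERIC_SUFFIX.sub("", key)
    # Assumption: a crude plural rule; the real helpers are likely smarter.
    if len(key) > 1 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def is_selected(key, mask=None, pattern=None):
    """True if the base key is in mask or matches pattern via re.search."""
    base = base_key(key)
    if mask and base in {base_key(name) for name in mask}:
        return True
    return pattern is not None and re.search(pattern, base) is not None


def out_mask_sketch(obj, mask=None, pattern=None):
    """Drop the selected top-level keys."""
    return {k: v for k, v in obj.items() if not is_selected(k, mask, pattern)}


def in_mask_sketch(obj, mask=None, pattern=None):
    """Keep only the selected top-level keys."""
    return {k: v for k, v in obj.items() if is_selected(k, mask, pattern)}
```

Because re.search is a substring search, a pattern such as "quest" matches anywhere in the base key; anchor it with ^ or $ to match a prefix or suffix only.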

Benchmark

Measured on realistic payloads: nested objects, arrays of $ref-based objects, schema $defs. Three payload sizes (12, 96, 600 top-level keys). Parity with the Python reference is verified before each timing run (pytest bench/test_parity.py, 45/45 pass).

Speedup summary

Ratio py_median / rs_median per op. Higher is better; dashed line is parity (1×).

[Chart: speedup per operation and payload size]

| Operation | small (12) | medium (96) | large (600) |
| --- | --- | --- | --- |
| factorize_schema | 8.78× | 64.8× | 485× |
| factorize_json | 9.73× | 46.2× | 282× |
| in_mask_json | 7.75× | 7.11× | 7.47× |
| out_mask_json | 4.23× | 4.12× | 4.29× |
| out_mask_json_pattern | 3.77× | 4.15× | 4.20× |
| in_mask_schema | 4.83× | 4.12× | 4.15× |
| out_mask_schema | 4.25× | 3.78× | 4.11× |
| prefix_schema | 3.63× | 3.62× | 3.89× |
| suffix_schema | 3.66× | 3.64× | 3.85× |
| concatenate_schema | 2.21× | 2.15× | 2.85× |
| suffix_json | 2.36× | 2.22× | 2.28× |
| concatenate_json | 2.25× | 2.18× | 2.27× |
| prefix_json | 2.26× | 2.25× | 2.26× |

factorize_* scales super-linearly because the Python reference does repeated O(n) key scans per group; the Rust path groups in a single pass. Simple key rewrites (prefix_*, suffix_*, concatenate_*) are bounded by the PyO3 dict-conversion boundary, which is why they cluster around 2–4×.
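
The single-pass grouping idea can be illustrated in pure Python. This sketch is not the Rust code: factorize_single_pass is a hypothetical name, base names are derived only by stripping a trailing _<digits> suffix, and the plural key is formed by appending "s", both of which are simplifying assumptions.

```python
import re

_NUMERIC_SUFFIX = re.compile(r"_\d+$")


def factorize_single_pass(obj):
    """Group keys by base name in one pass over the object."""
    groups = {}
    for key, value in obj.items():
        base = _NUMERIC_SUFFIX.sub("", key)
        # One dict lookup per key instead of rescanning all keys per group.
        groups.setdefault(base, []).append((key, value))

    result = {}
    for base, members in groups.items():
        if len(members) == 1:
            # A lone key is kept under its original name.
            key, value = members[0]
            result[key] = value
        else:
            # Assumption: naive pluralization by appending "s".
            result[base + "s"] = [value for _, value in members]
    return result
```

Each key is visited once and assigned to its group with a hash lookup, so the work grows linearly with the number of keys, whereas scanning every key again for each group grows with keys × groups.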

Before (Python) vs After (Rust) — absolute medians

Log scale, lower is better. Rows are the three payload sizes.

[Chart: before/after absolute medians per operation]

See bench/README.md for the harness, payload shapes, and how to regenerate these charts.

Development

cargo test              # run Rust unit tests
maturin develop         # install debug build into venv
maturin develop --release

# Parity + performance against the Python reference impl
pytest bench/test_parity.py -v
pytest bench/test_bench.py --benchmark-save=latest
python bench/plot.py    # writes bench/speedup.png

License

Apache-2.0
