Native Rust implementations of synalinks JSON/schema operations.

synaops

Rust implementations of the JSON and JSON-schema operations used by synalinks, exposed to Python as a PyO3 extension module (synaops).

The goal is a drop-in, faster replacement for the equivalent pure-Python helpers. Inputs and outputs are plain Python dicts, lists, and scalars; the boundary is handled via pythonize, so callers do not need to know there is Rust underneath.

Parity with the Python reference is asserted on every op and payload size (see bench/test_parity.py). Headline speedups on realistic payloads: ~470× on factorize_schema, ~290× on factorize_json at 600 keys, 4–8× on masking ops, 2–4× on simple key rewrites. Full table below.

Build

pip install maturin
maturin develop --release   # builds and installs into the active venv

Python API

import synaops

JSON object operations

| Function | Signature | Description |
| --- | --- | --- |
| prefix_json | (json, prefix) | Prepend prefix_ to every top-level key. |
| suffix_json | (json, suffix) | Append _suffix to every top-level key. |
| concatenate_json | (json1, json2) | Merge two objects; on key collision append _1, _2, … to disambiguate. |
| factorize_json | (json) | Group keys sharing a singular base into a single array under the plural key. Inverse of decompose_json. |
| decompose_json | (json) | Expand plural-keyed array properties into individual singular-keyed properties with numerical suffixes. Inverse of factorize_json. |
| out_mask_json | (json, mask=None, pattern=None, recursive=True) | Drop keys whose base name is in mask or whose base name matches the regex pattern. Numerical suffixes are ignored when matching. |
| in_mask_json | (json, mask=None, pattern=None, recursive=True) | Keep only the keys whose base name is in mask or matches pattern. In recursive mode, arrays are preserved and their object items are filtered in place. |
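
To make the key-rewrite semantics concrete, here is a minimal pure-Python sketch of what prefix_json and suffix_json are described as doing. The helper names prefix_json_sketch and suffix_json_sketch are hypothetical and are not part of synaops; the sketch only assumes the behavior stated in the descriptions above (top-level keys only, values untouched).

```python
def prefix_json_sketch(obj: dict, prefix: str) -> dict:
    """Return a copy of obj with `<prefix>_` prepended to each top-level key."""
    return {f"{prefix}_{key}": value for key, value in obj.items()}


def suffix_json_sketch(obj: dict, suffix: str) -> dict:
    """Return a copy of obj with `_<suffix>` appended to each top-level key."""
    return {f"{key}_{suffix}": value for key, value in obj.items()}


record = {"answer": 42, "meta": {"answer": "nested"}}

# Only top-level keys are renamed; the nested "answer" key is left alone.
prefixed = prefix_json_sketch(record, "draft")
suffixed = suffix_json_sketch(record, "v2")

print(prefixed)
print(suffixed)
```

If synaops matches its Python reference as the parity tests claim, synaops.prefix_json(record, "draft") should be comparable to prefixed here, but that has not been checked against the extension.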

JSON schema operations

Operate on JSON-Schema-shaped dicts (properties, required, $defs, type, …).

| Function | Signature | Description |
| --- | --- | --- |
| prefix_schema | (schema, prefix) | Prepend prefix_ to every property key and update title / required accordingly. |
| suffix_schema | (schema, suffix) | Append _suffix to every property key and update title / required accordingly. |
| concatenate_schema | (schema1, schema2) | Merge two schemas (properties, required, $defs); on key collision append numeric suffixes and regenerate titles. |
| factorize_schema | (schema) | Group similar singular-keyed properties into array-typed plural-keyed properties; folds heterogeneous items into anyOf. |
| decompose_schema | (schema) | Expand plural-keyed array properties into a single singular-keyed property carrying the items schema. |
| out_mask_schema | (schema, mask=None, pattern=None, recursive=True) | Remove properties whose base name is in mask or matches pattern. With recursive=True, descends into nested object/array properties and $defs, then prunes $defs entries no longer referenced. |
| in_mask_schema | (schema, mask=None, pattern=None, recursive=True) | Keep only properties whose base name is in mask or matches pattern. Same recursive/$defs-pruning behavior as out_mask_schema. |
| standardize_schema | (schema) | Placeholder for schema normalization (currently identity). |
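
The $defs-pruning step mentioned for the mask operations can be illustrated in plain Python. This is a sketch under stated assumptions, not the synaops implementation: the function names collect_refs and prune_defs are hypothetical, references are assumed to have the form "#/$defs/<name>", and a definition is kept if it is reachable from the schema body either directly or through another kept definition.

```python
def collect_refs(node, found):
    """Add every '#/$defs/<name>' target found under node to the set `found`."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/$defs/"):
            found.add(ref[len("#/$defs/"):])
        for value in node.values():
            collect_refs(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_refs(item, found)


def prune_defs(schema):
    """Return a copy of schema whose $defs keeps only reachable entries."""
    defs = schema.get("$defs", {})
    body = {key: value for key, value in schema.items() if key != "$defs"}

    # Definitions referenced directly from the schema body.
    live = set()
    collect_refs(body, live)

    # Follow references between definitions until nothing new turns up.
    pending = list(live)
    while pending:
        nested = set()
        collect_refs(defs.get(pending.pop(), {}), nested)
        for name in nested - live:
            live.add(name)
            pending.append(name)

    kept = {name: definition for name, definition in defs.items() if name in live}
    if kept:
        body["$defs"] = kept
    return body


schema = {
    "type": "object",
    "properties": {"author": {"$ref": "#/$defs/Person"}},
    "$defs": {
        "Person": {
            "type": "object",
            "properties": {"address": {"$ref": "#/$defs/Address"}},
        },
        "Address": {"type": "object", "properties": {"city": {"type": "string"}}},
        "Tag": {"type": "object", "properties": {"label": {"type": "string"}}},
    },
}

pruned = prune_defs(schema)
print(sorted(pruned["$defs"]))
```

Here Tag is dropped because nothing references it, while Address survives because Person (which is referenced) points at it.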

is_object, is_array, is_schema_equal, contains_schema are intentionally not ported. They are single-key lookups or dict comparisons whose cost is dominated by the PyO3 / dict-conversion boundary, so the pure-Python versions in synalinks are faster.

Matching semantics

Both *_mask_* families and factorize_* / decompose_* rely on the NLP helpers in nlp_utils.rs: they strip trailing numerical suffixes (answer_3 → answer) and normalize singular/plural forms (answers → answer) before comparing keys. The pattern argument is a regular expression matched via substring search against the base key (same semantics as Python's re.search).
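
The matching rules can be sketched in a few lines of Python. Everything here is illustrative: base_name, is_masked, and out_mask_sketch are hypothetical names, the singularization is a deliberately naive "drop a trailing s" rule standing in for whatever nlp_utils.rs actually does, normalizing the mask entries themselves is an assumption, and only top-level keys are handled (no recursive mode).

```python
import re

_NUMERIC_SUFFIX = re.compile(r"_\d+$")


def base_name(key: str) -> str:
    """Strip a trailing _<digits> suffix, then apply a naive singularization."""
    stripped = _NUMERIC_SUFFIX.sub("", key)
    # Assumption: a toy plural rule; the real helpers are likely more careful.
    if len(stripped) > 1 and stripped.endswith("s") and not stripped.endswith("ss"):
        stripped = stripped[:-1]
    return stripped


def is_masked(key, mask=None, pattern=None):
    """True if the key's base name is listed in mask or matches pattern."""
    base = base_name(key)
    if mask and base in {base_name(entry) for entry in mask}:
        return True
    # re.search finds the pattern anywhere in the base name (substring search).
    return bool(pattern and re.search(pattern, base))


def out_mask_sketch(obj, mask=None, pattern=None):
    """Drop top-level keys selected by mask or pattern."""
    return {k: v for k, v in obj.items() if not is_masked(k, mask, pattern)}


data = {"answer_3": 1, "answers": [1, 2], "question": "q", "thinking": "t"}

by_mask = out_mask_sketch(data, mask=["answer"])
by_pattern = out_mask_sketch(data, pattern="ink")

print(by_mask)
print(by_pattern)
```

With mask=["answer"], both answer_3 and answers are removed because they share the base name answer; with pattern="ink", only thinking is removed because the search is unanchored.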

Benchmark

Measured on realistic payloads: nested objects, arrays of $ref-based objects, schema $defs. Three payload sizes (12, 96, 600 top-level keys). Parity with the Python reference is verified before each timing run (pytest bench/test_parity.py, 45/45 pass).

Speedup summary

Ratio py_median / rs_median per op. Higher is better; dashed line is parity (1×).

[Figure: speedup per operation and payload size]

| Operation | small (12) | medium (96) | large (600) |
| --- | --- | --- | --- |
| factorize_schema | 9.43× | 68.0× | 472× |
| factorize_json | 10.3× | 48.9× | 291× |
| in_mask_json | 8.11× | 7.37× | 7.75× |
| out_mask_json_pattern | 4.16× | 4.27× | 4.49× |
| out_mask_json | 4.37× | 4.17× | 4.42× |
| in_mask_schema | 5.14× | 4.21× | 4.31× |
| out_mask_schema | 4.67× | 3.92× | 4.16× |
| prefix_schema | 3.91× | 3.79× | 4.16× |
| suffix_schema | 3.90× | 3.80× | 4.10× |
| decompose_schema | 2.65× | 2.44× | 3.33× |
| concatenate_schema | 2.25× | 2.08× | 2.89× |
| decompose_json | 2.74× | 2.48× | 2.56× |
| prefix_json | 2.41× | 2.41× | 2.46× |
| suffix_json | 2.63× | 2.40× | 2.42× |
| concatenate_json | 2.44× | 2.46× | 2.30× |
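
For readers who want to reproduce a single cell of the table by hand, the ratio is just the median Python time divided by the median Rust time. The sketch below uses made-up timing samples (py_times and rs_times are hypothetical values, not measurements from this project) to show the arithmetic.

```python
import statistics

# Hypothetical per-run timings in milliseconds; not real benchmark data.
py_times = [4.0, 5.0, 6.0]
rs_times = [1.0, 1.25, 1.5]

py_median = statistics.median(py_times)
rs_median = statistics.median(rs_times)

# A value above 1 means the Rust path was faster; exactly 1 is parity.
speedup = py_median / rs_median

print(f"{speedup:.2f}x")
```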

The factorize_* speedup grows roughly in proportion to the number of keys (about 10× at 12 keys, several hundred × at 600) because the Python reference does repeated O(n) key scans per group, while the Rust path groups in a single pass. Simple key rewrites (prefix_*, suffix_*, concatenate_*) are bounded by the PyO3 dict-conversion boundary, which is why they cluster around 2–4×.
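
The difference between the two grouping strategies can be shown with a small sketch. Neither function is taken from synalinks or synaops; group_by_rescan and group_single_pass are hypothetical names, and grouping is simplified to "strip a trailing _<digits> suffix" so the contrast in access pattern stays visible.

```python
import re
from collections import defaultdict

_NUMERIC_SUFFIX = re.compile(r"_\d+$")


def group_by_rescan(keys):
    """Rescan the whole key list once per distinct base name."""
    bases = []
    for key in keys:
        base = _NUMERIC_SUFFIX.sub("", key)
        if base not in bases:  # linear membership test on a list
            bases.append(base)
    # One full pass over keys for every base: cost grows with bases * keys.
    return {
        base: [k for k in keys if _NUMERIC_SUFFIX.sub("", k) == base]
        for base in bases
    }


def group_single_pass(keys):
    """Visit each key once and append it to its group via a hash lookup."""
    groups = defaultdict(list)
    for key in keys:
        groups[_NUMERIC_SUFFIX.sub("", key)].append(key)
    return dict(groups)


keys = ["answer", "answer_1", "source", "answer_2", "source_1"]

rescanned = group_by_rescan(keys)
single = group_single_pass(keys)

print(single)
```

Both functions return the same groups; the first does work proportional to the number of groups times the number of keys, the second does work proportional to the number of keys, which is consistent with the gap widening as payloads grow.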

Before (Python) vs After (Rust) — absolute medians

Log scale, lower is better. Rows are the three payload sizes.

[Figure: absolute median timings, Python vs Rust, per payload size]

See bench/README.md for the harness, payload shapes, and how to regenerate these charts.

Development

cargo test              # run Rust unit tests
maturin develop         # install debug build into venv
maturin develop --release

# Parity + performance against the Python reference impl
pytest bench/test_parity.py -v
pytest bench/test_bench.py --benchmark-save=latest
python bench/plot.py    # writes bench/speedup.png

License

Apache-2.0

Download files

Source distribution

synaops-0.1.0.tar.gz (450.1 kB)

Built distributions

| File | Size | Target |
| --- | --- | --- |
| synaops-0.1.0-cp310-abi3-win_amd64.whl | 824.7 kB | CPython 3.10+, Windows x86-64 |
| synaops-0.1.0-cp310-abi3-musllinux_1_2_x86_64.whl | 1.3 MB | CPython 3.10+, musllinux: musl 1.2+ x86-64 |
| synaops-0.1.0-cp310-abi3-musllinux_1_2_aarch64.whl | 1.2 MB | CPython 3.10+, musllinux: musl 1.2+ ARM64 |
| synaops-0.1.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | 1.1 MB | CPython 3.10+, manylinux: glibc 2.17+ x86-64 |
| synaops-0.1.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl | 1.0 MB | CPython 3.10+, manylinux: glibc 2.17+ ARM64 |
| synaops-0.1.0-cp310-abi3-macosx_11_0_arm64.whl | 914.5 kB | CPython 3.10+, macOS 11.0+ ARM64 |
| synaops-0.1.0-cp310-abi3-macosx_10_12_x86_64.whl | 955.1 kB | CPython 3.10+, macOS 10.12+ x86-64 |

