JSONSki_Python is the Python binding port for JSONSki

JSONSki

JSONSki_Python is the Python binding for JSONSki, a library written in C++ and available at https://github.com/AutomataLab/JSONSki

What is it?

JSONSki is a streaming JSONPath processor with fast-forward functionality. While streaming, it automatically fast-forwards over JSON substructures that are irrelevant to the query evaluation, without parsing them in detail. To make fast-forwarding efficient, JSONSki implements its fast-forward APIs with a highly bit-parallel solution that makes intensive use of the bitwise and SIMD operations prevalent on modern CPUs.
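To give a feel for what "bit-parallel" means here, the sketch below shows one well-known building block of this style of JSON processing: marking which characters lie inside string literals with a prefix XOR over a bitmap of quote positions, so that structural characters inside strings can be masked out with a single AND. This is an illustration only, not JSONSki's actual implementation. The function names are hypothetical, escaped quotes are ignored, and Python integers stand in for the 64-bit words and SIMD registers a native implementation would use. On x86 CPUs, a prefix XOR over a machine word can be computed with one carry-less multiplication by an all-ones value.

```python
def char_bitmap(text: str, ch: str) -> int:
    """Return an integer whose bit i is set when text[i] == ch."""
    bits = 0
    for i, c in enumerate(text):
        if c == ch:
            bits |= 1 << i
    return bits


def prefix_xor(bits: int, width: int) -> int:
    """Bit i of the result is the XOR of bits 0..i of the input."""
    shift = 1
    while shift < width:
        bits ^= bits << shift
        shift <<= 1
    return bits & ((1 << width) - 1)


def positions_outside_strings(text: str, ch: str) -> list:
    """Positions of ch that are not inside a string literal.

    Simplification: escaped quotes (\\") are not handled.
    """
    width = len(text)
    quotes = char_bitmap(text, '"')
    # Set from each opening quote up to, but not including, its closing quote.
    in_string = prefix_xor(quotes, width)
    candidates = char_bitmap(text, ch) & ~in_string
    return [i for i in range(width) if (candidates >> i) & 1]


sample = '{"a:b": [1, 2], "c": ":"}'
print(positions_outside_strings(sample, ":"))
```

Once the colons, commas, and brackets outside strings are available as bitmaps, skipping to the end of an irrelevant substructure becomes a matter of bit arithmetic rather than character-by-character parsing.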

NPM Package

The Node.js binding is available as an npm package at https://www.npmjs.com/package/jsonski

Installation

pip install JSONSki

Quick Start

// Node.js binding example (npm package "jsonski")
const JSki = require('jsonski');
console.log(JSki.JSONSkiParser("$.features[150].actor.login", "datasets/test.json"));
  • The binding exposes the following method:
JSki.JSONSkiParser(args1, args2)    // args1: String (JSONPath query), args2: String (path to the JSON file)

Requirements

Hardware requirements

  • CPUs: 64-bit ALU instructions, 256-bit SIMD instruction set, and the carry-less multiplication instruction (pclmulqdq)
  • Operating System: Linux, macOS (Intel chips only)
  • C++ Compiler: g++ (7.4.0 or higher)
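The CPU requirements above can be checked on Linux before installing. The following is a minimal sketch with hypothetical helper names. It assumes that the 256-bit SIMD requirement corresponds to the avx2 flag and that the carry-less multiplication requirement corresponds to the pclmulqdq flag, as both are reported in /proc/cpuinfo on x86-64 Linux. It does not cover macOS.

```python
from pathlib import Path

# Assumption: "256-bit SIMD" is taken to mean AVX2; the source does not name the flag.
REQUIRED_FLAGS = {"avx2", "pclmulqdq"}


def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Extract the CPU feature flags from the text of /proc/cpuinfo."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str) -> set:
    """Return the required flags that the CPU does not report."""
    return REQUIRED_FLAGS - parse_cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_flags(cpuinfo.read_text())
        print("missing CPU flags:", sorted(missing) or "none")
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```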

Software requirements

Before you start using Node-API, ensure that you have the following prerequisites:

Getting Started with Querying using JSONSki

JSONPath

JSONPath is the basic query language for JSON data. It refers to substructures of JSON data in much the same way that XPath queries do for XML data. For details of the JSONPath syntax, please refer to Stefan Goessner's article.

JSONSki Query Operators

Operator      Description
$             root object
.             child object
[]            child array
*             wildcard, all objects or array members
[index]       array index
[start:end]   array slice operator

Path Examples

Consider a geo-referenced tweet in JSON:

{
    "coordinates": [
        40.74118764, -73.9998279
    ],
    "user": {
        "id": 6253282
    },
    "place": {
        "name": "Manhattan",
        "bounding_box": {
            "type": "Polygon",
            "pos": [
                [-74.026675, 40.683935],
                ......
            ]
        }
    }
}
JSONPath                         Result
$.coordinates[*]                 all coordinates
$.place.name                     place name
$.place.bounding_box.pos[0]      first position of the bounding box in place
$.place.bounding_box.pos[0:2]    first two positions of the bounding box in place
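To make the meaning of these paths concrete, here is a small reference evaluator in plain Python that applies the operators listed above to an already-parsed document. It is a hypothetical helper for illustration, not part of JSONSki: it parses the whole document first and walks it in memory, which is the opposite of JSONSki's streaming approach. It supports only $, .name, [*], [index], and [start:end], with the slice end exclusive. The sample data keeps only the values that appear in the tweet above.

```python
import re

# One path step: .name, [*], [index], or [start:end]
STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\*)\]|\[(\d+)\]|\[(\d+):(\d+)\]")


def evaluate(query: str, data):
    """Return the list of values in data selected by a simple JSONPath query."""
    if not query.startswith("$"):
        raise ValueError("query must start with '$'")
    matches = [data]
    pos = 1
    while pos < len(query):
        m = STEP.match(query, pos)
        if m is None:
            raise ValueError(f"unsupported syntax at offset {pos}")
        key, star, index, start, end = m.groups()
        selected = []
        for node in matches:
            if key is not None:
                if isinstance(node, dict) and key in node:
                    selected.append(node[key])
            elif star is not None:
                if isinstance(node, list):
                    selected.extend(node)
                elif isinstance(node, dict):
                    selected.extend(node.values())
            elif index is not None:
                if isinstance(node, list) and int(index) < len(node):
                    selected.append(node[int(index)])
            elif isinstance(node, list):
                selected.extend(node[int(start):int(end)])
        matches = selected
        pos = m.end()
    return matches


tweet = {
    "coordinates": [40.74118764, -73.9998279],
    "user": {"id": 6253282},
    "place": {
        "name": "Manhattan",
        "bounding_box": {"pos": [[-74.026675, 40.683935]]},
    },
}

print(evaluate("$.coordinates[*]", tweet))
print(evaluate("$.place.name", tweet))
print(evaluate("$.place.bounding_box.pos[0]", tweet))
```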

Performance Comparison with JavaScript Parsing

Below is an example that uses the jsonski npm package.

const JSki = require('jsonski');
const fs = require('fs');

// Time JSONSki: it takes the file path and evaluates the query.
console.time('JSONSki');
console.log(JSki.JSONSkiParser("$[*].entities.urls[*].url", "dataset/twitter_sample_large_record.json"));
console.timeEnd('JSONSki');

// Time JavaScript's built-in parser on the same file (the file read is not timed).
const fileContents = fs.readFileSync('dataset/twitter_sample_large_record.json');
const str = fileContents.toString();
console.time('JSON.parse');
const json = JSON.parse(str);
console.timeEnd('JSON.parse');
  • Note: The code snippet above compares the running time of JSONSki_nodejs with that of JavaScript's built-in JSON.parse.
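For Python users, a comparable baseline can be measured with the standard json module. The sketch below times a full parse followed by a manual traversal equivalent to the query $[*].entities.urls[*].url. It measures only the baseline and does not call JSONSki, since the Python binding's function names are not documented in this README. The helper names are hypothetical, and the input is assumed to be a top-level array of objects.

```python
import json
import time


def baseline_urls(text: str) -> list:
    """Parse the whole document, then collect $[*].entities.urls[*].url."""
    records = json.loads(text)
    return [
        entry["url"]
        for record in records
        for entry in record.get("entities", {}).get("urls", [])
        if "url" in entry
    ]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Tiny in-memory stand-in; replace with the contents of a real dataset file.
sample = json.dumps([
    {"entities": {"urls": [{"url": "a"}, {"url": "b"}]}},
    {"entities": {"urls": []}},
    {"id": 1},
])
urls, seconds = timed(baseline_urls, sample)
print(urls, f"{seconds:.6f}s")
```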

Publication

[1] Lin Jiang and Zhijia Zhao. JSONSki: Streaming Semi-structured Data with Bit-Parallel Fast-Forwarding. In Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022.

@inproceedings{jsonski,
  title={JSONSki: Streaming Semi-structured Data with Bit-Parallel Fast-Forwarding},
  author={Lin Jiang and Zhijia Zhao},
  booktitle={Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)},
  year={2022}
}

Performance

[Figure: performance chart (image not available)]

Benchmarking

The performance of JSONSki_nodejs is compared with that of simdjson_nodejs and JavaScript's built-in parsing at https://github.com/AutomataLab/NPM-JSON-Parser-Benchmarking
