JSONSki_Python is the Python binding port for JSONSki
JSONSki
JSONSki_Python is the Python binding for JSONSki, which is written in C++ and available at https://github.com/AutomataLab/JSONSki
What is it?
JSONSki is a streaming JSONPath processor with fast-forward functionality. While streaming, it automatically fast-forwards over JSON substructures that are irrelevant to the query evaluation, without parsing them in detail. To make fast-forwarding efficient, JSONSki implements its fast-forward APIs with a highly bit-parallel solution that relies heavily on the bitwise and SIMD operations prevalent on modern CPUs.
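To give a feel for what "bit-parallel" means here, the following Python sketch shows one building block that is common in SIMD-style JSON processing: turning a bitmap of quote positions into an in-string mask with a prefix XOR, so that structural characters inside strings can be ignored. It is an illustration only, not JSONSki's code or API. The function names are hypothetical, escaped quotes are ignored, and Python integers stand in for 64-bit machine words. On real hardware the prefix XOR can be computed with a single carry-less multiplication by an all-ones word, which is where an instruction such as pclmulqdq comes in.

```python
MASK64 = (1 << 64) - 1  # Python ints are unbounded, so emulate a 64-bit word


def prefix_xor(bits: int) -> int:
    """Return a word whose bit i is the XOR of bits 0..i of the input."""
    bits &= MASK64
    shift = 1
    while shift < 64:
        bits ^= (bits << shift) & MASK64
        shift <<= 1
    return bits


def bitmap(chunk: bytes, char: bytes) -> int:
    """Return a word with bit i set iff chunk[i] equals char (first 64 bytes)."""
    target = char[0]
    word = 0
    for i, byte in enumerate(chunk[:64]):
        if byte == target:
            word |= 1 << i
    return word


def structural_colons(chunk: bytes) -> int:
    """Bitmap of ':' characters that lie outside string literals.

    Assumption for this sketch: the chunk starts outside a string and
    contains no escaped quotes.
    """
    quotes = bitmap(chunk, b'"')
    # Set from each opening quote up to, but not including, the closing quote.
    in_string = prefix_xor(quotes)
    return bitmap(chunk, b":") & ~in_string & MASK64


if __name__ == "__main__":
    chunk = b'{"a:b": 1, "c": "x:y"}'
    colons = structural_colons(chunk)
    print([i for i in range(len(chunk)) if colons >> i & 1])
```

A scan like this processes a whole word of input with a handful of bitwise operations instead of one branch per character, which is the general idea behind skipping substructures without parsing them in detail.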
NPM Package
The Node.js binding is available as an npm package: https://www.npmjs.com/package/jsonski
Installation
pip install JSONSki
Quick Start
const JSki = require('jsonski')
console.log(JSki.JSONSkiParser("$.features[150].actor.login", "datasets/test.json"));
- The snippet above uses the Node.js binding (require('jsonski')), which exposes the following method:
JSki.JSONSkiParser(args1, args2) // args1: String (query), args2: String (file location)
Requirements
Hardware requirements
- CPUs: 64-bit ALU instructions, 256-bit SIMD instruction set, and the carry-less multiplication instruction (pclmulqdq)
- Operating System: Linux, macOS (Intel chips only)
- C++ Compiler: g++ (7.4.0 or higher)
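On Linux, the CPU requirements above can be checked roughly by looking at the feature flags in /proc/cpuinfo. The sketch below is a hypothetical helper, not part of JSONSki. It assumes that the 256-bit SIMD instruction set corresponds to the avx2 flag, which the text above does not state explicitly, and it looks for the pclmulqdq flag by name.

```python
from pathlib import Path

# Assumption: "256-bit SIMD instruction set" is taken to mean AVX2 here.
REQUIRED_FLAGS = frozenset({"avx2", "pclmulqdq"})


def missing_cpu_flags(cpuinfo_text: str, required=REQUIRED_FLAGS) -> set:
    """Return the required flags that do not appear in /proc/cpuinfo text."""
    present = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present.update(value.split())
    return set(required) - present


def missing_on_this_machine(path: str = "/proc/cpuinfo") -> set:
    """Linux-only convenience wrapper that reads the flags from the given file."""
    return missing_cpu_flags(Path(path).read_text())
```

An empty result from missing_on_this_machine() suggests the flags are present. The check does not apply to macOS, which has no /proc/cpuinfo.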
Software requirements
Before you start, make sure you have the following prerequisites:
- Python (v3.9), see: Installing Python
- C++: g++ (v7.4.0 or above), see: Installing C++
Getting Started with Querying using JSONSki
JSONPath
JSONPath is the basic query language of JSON data. It refers to substructures of JSON data in a similar way as XPath queries are used for XML data. For the details of JSONPath syntax, please refer to Stefan Goessner's article.
JSONSki Query Operators
| Operator | Description |
|---|---|
| $ | root object |
| . | child object |
| [] | child array |
| * | wildcard, all objects or array members |
| [index] | array index |
| [start:end] | array slice operator |
Path Examples
Consider a geo-referenced tweet in JSON:
{
  "coordinates": [
    40.74118764, -73.9998279
  ],
  "user": {
    "id": 6253282
  },
  "place": {
    "name": "Manhattan",
    "bounding_box": {
      "type": "Polygon",
      "pos": [
        [-74.026675, 40.683935],
        ......
      ]
    }
  }
}
| JSONPath | Result |
|---|---|
| $.coordinates[*] | all coordinates |
| $.place.name | place name |
| $.place.bounding_box.pos[0] | first position of the bounding box in place |
| $.place.bounding_box.pos[0:2] | first two positions of the bounding box in place |
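To make the meaning of these paths concrete, here is a small reference evaluator in plain Python for the operators listed above. It is a sketch for illustration only: it parses the whole document with the standard json module, so it is neither JSONSki's streaming algorithm nor its API, and the evaluate function is a hypothetical name. The sample keeps only the one bounding-box position that is shown in full above, so the slice query returns a single position here.

```python
import json
import re

# One path step: .key, [*], [index], or [start:end] (end exclusive).
_STEP = re.compile(r"\.(\w+)|\[(\*)\]|\[(\d+)\]|\[(\d+):(\d+)\]")


def evaluate(query: str, document):
    """Return the list of values selected by a query in the subset above."""
    if not query.startswith("$"):
        raise ValueError("query must start with '$'")
    nodes, pos = [document], 1
    while pos < len(query):
        match = _STEP.match(query, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at offset {pos}")
        key, star, index, start, end = match.groups()
        selected = []
        for node in nodes:
            if key is not None:
                if isinstance(node, dict) and key in node:
                    selected.append(node[key])
            elif star is not None:
                if isinstance(node, list):
                    selected.extend(node)
                elif isinstance(node, dict):
                    selected.extend(node.values())
            elif index is not None:
                if isinstance(node, list) and int(index) < len(node):
                    selected.append(node[int(index)])
            elif isinstance(node, list):
                selected.extend(node[int(start):int(end)])
        nodes, pos = selected, match.end()
    return nodes


# The tweet from the example, without the elided positions.
tweet = json.loads("""
{
  "coordinates": [40.74118764, -73.9998279],
  "user": {"id": 6253282},
  "place": {
    "name": "Manhattan",
    "bounding_box": {"type": "Polygon", "pos": [[-74.026675, 40.683935]]}
  }
}
""")

if __name__ == "__main__":
    for path in ("$.coordinates[*]", "$.place.name",
                 "$.place.bounding_box.pos[0]", "$.place.bounding_box.pos[0:2]"):
        print(path, "->", evaluate(path, tweet))
```

A streaming processor answers the same queries without building the full in-memory object first, which is what allows it to skip the parts of the input a query never touches.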
Performance Comparison with JavaScript Parsing
Below is an example usage of the JSONSki npm package.
const JSki = require('jsonski');
const fs = require('fs');
// Time a JSONSki query evaluated directly on the file.
console.time('JSONSki Runtime');
console.log(JSki.JSONSkiParser("$[*].entities.urls[*].url", "dataset/twitter_sample_large_record.json"));
console.timeEnd('JSONSki Runtime');
// Time native JSON.parse on the contents of the same file.
const fileContents = fs.readFileSync('dataset/twitter_sample_large_record.json');
const str = fileContents.toString();
console.time('JavaScript Runtime');
const json = JSON.parse(str);
console.timeEnd('JavaScript Runtime');
- Note: The code snippet above benchmarks native JavaScript parsing (JSON.parse) against JSONSki_nodejs query evaluation.
Publication
[1] Lin Jiang and Zhijia Zhao. JSONSki: Streaming Semi-structured Data with Bit-Parallel Fast-Forwarding. In Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022.
@inproceedings{jsonski,
title={JSONSki: Streaming Semi-structured Data with Bit-Parallel Fast-Forwarding},
author={Lin Jiang and Zhijia Zhao},
booktitle={Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)},
year={2022}
}
Performance
Benchmarking
The performance of JSONSki_nodejs is compared with simdjson_nodejs and native JavaScript parsing at https://github.com/AutomataLab/NPM-JSON-Parser-Benchmarking