Cython-based wrapper for SIMDJSON

cysimdjson

cysimdjson is a fast JSON parsing library for Python, roughly 7-12 times faster than the standard Python JSON parser.
It provides Python bindings for simdjson, built with Cython.

The standard Python JSON parser (json.load() etc.) is relatively slow. If you need to parse large JSON files or a large number of small JSON files, it can become a significant bottleneck.

While there are other fast Python JSON parsers, such as pysimdjson, libpy_simdjson and orjson, they don't reach the raw speed provided by the brilliant SIMDJSON project. SIMDJSON is a C++ JSON parser based on SIMD instructions, reportedly the fastest JSON parser on the planet.

Usage

import cysimdjson

json_bytes = b'''
{
  "foo": [1,2,[3]]
}
'''

parser = cysimdjson.JSONParser()
json_element = parser.parse(json_bytes)

# Access using JSON Pointer
print(json_element.at_pointer("/foo/2/0"))

Note: The parser object can be reused across calls for maximum performance.
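To illustrate what the JSON Pointer "/foo/2/0" selects in the example above, here is a minimal sketch that uses only the standard library. It assumes at_pointer follows the usual JSON Pointer rules (RFC 6901); resolve_pointer is a hypothetical helper written for this illustration and is not part of cysimdjson.

```python
import json


def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against already-decoded JSON data."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape per RFC 6901: "~1" means "/", then "~0" means "~".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # arrays are addressed by index
        else:
            current = current[token]  # objects are addressed by key
    return current


json_bytes = b'{"foo": [1,2,[3]]}'
document = json.loads(json_bytes)
print(resolve_pointer(document, "/foo/2/0"))  # prints 3
```

The pointer walks "foo" (the outer array), then index 2 (the nested array [3]), then index 0 (the number 3).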

Pythonic drop-in API

parser = cysimdjson.JSONParser()
json_parsed = parser.loads(json_bytes)

# Access using dictionary-style key lookup
print(json_parsed['foo'])

json_parsed is a read-only, dictionary-like object that provides access to the JSON data.
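Because the result behaves like a read-only dictionary, code that only reads values can often switch between this parser and the standard json module. The sketch below shows one possible fallback pattern, assuming cysimdjson may not be installed everywhere; make_loads is a hypothetical helper name, and only JSONParser() and loads() from the examples above are used.

```python
import json

try:
    import cysimdjson  # optional accelerated backend
except ImportError:
    cysimdjson = None


def make_loads():
    """Return a loads(bytes) callable, backed by cysimdjson when available."""
    if cysimdjson is not None:
        parser = cysimdjson.JSONParser()  # one parser, reused for every call
        return parser.loads
    return json.loads  # the standard library parser also accepts bytes


loads = make_loads()
document = loads(b'{"foo": [1,2,[3]]}')
print(document["foo"])
```

With the fallback, document is a plain dict; with cysimdjson it is the library's own dictionary-like object, so code that mutates the result or depends on exact types would still need adjusting.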

Documentation

JSONParser.parse(json_bytes)

Parse the JSON document json_bytes, given as bytes.

JSONParser.parse_in_place(bytes)

Parse the JSON document given as bytes, assuming the buffer already includes the padding that SIMDJSON expects. This is the fastest parsing variant.

JSONParser.parse_string(string)

Parse the JSON document string, given as str.

JSONParser.load(path)

Load and parse the JSON file at path.

Installation

pip3 install cython
pip3 install cysimdjson

cysimdjson is distributed via PyPI: https://pypi.org/project/cysimdjson/

Performance

----------------------------------------------------------------
# 'jsonexamples/test.json' 2397 bytes
----------------------------------------------------------------
* cysimdjson parse          510291.81 EPS (  1.00)  1223.17 MB/s
* libpy_simdjson loads      374615.54 EPS (  1.36)   897.95 MB/s
* pysimdjson parse          362195.46 EPS (  1.41)   868.18 MB/s
* orjson loads              110615.70 EPS (  4.61)   265.15 MB/s
* python json loads          72096.80 EPS (  7.08)   172.82 MB/s
----------------------------------------------------------------

SIMDJSON: 543335.93 EPS, 1241.52 MB/s
----------------------------------------------------------------
# 'jsonexamples/twitter.json' 631515 bytes
----------------------------------------------------------------
* cysimdjson parse            2556.10 EPS (  1.00)  1614.22 MB/s
* libpy_simdjson loads        2444.53 EPS (  1.05)  1543.76 MB/s
* pysimdjson parse            2415.46 EPS (  1.06)  1525.40 MB/s
* orjson loads                 387.11 EPS (  6.60)   244.47 MB/s
* python json loads            278.63 EPS (  9.17)   175.96 MB/s
----------------------------------------------------------------

SIMDJSON: 2536.16 EPS,  1527.28 MB/s
----------------------------------------------------------------
# 'jsonexamples/canada.json' 2251051 bytes
----------------------------------------------------------------
* cysimdjson parse             284.67 EPS (  1.00)   640.81 MB/s
* pysimdjson parse             284.62 EPS (  1.00)   640.70 MB/s
* libpy_simdjson loads         277.13 EPS (  1.03)   623.84 MB/s
* orjson loads                  81.80 EPS (  3.48)   184.13 MB/s
* python json loads             22.52 EPS ( 12.64)    50.68 MB/s
----------------------------------------------------------------

SIMDJSON: 307.95 EPS, 661.08 MB/s
----------------------------------------------------------------
# 'jsonexamples/gsoc-2018.json' 3327831 bytes
----------------------------------------------------------------
* cysimdjson parse             775.61 EPS (  1.00)  2581.09 MB/s
* pysimdjson parse             743.67 EPS (  1.04)  2474.81 MB/s
* libpy_simdjson loads         654.15 EPS (  1.19)  2176.88 MB/s
* orjson loads                 166.67 EPS (  4.65)   554.66 MB/s
* python json loads            113.72 EPS (  6.82)   378.43 MB/s
----------------------------------------------------------------

SIMDJSON: 703.59 EPS, 2232.92 MB/s
----------------------------------------------------------------
# 'jsonexamples/verysmall.json' 7 bytes
----------------------------------------------------------------
* cysimdjson parse         3972376.53 EPS (  1.00)    27.81 MB/s
* orjson loads             3637369.63 EPS (  1.09)    25.46 MB/s
* libpy_simdjson loads     1774211.19 EPS (  2.24)    12.42 MB/s
* pysimdjson parse          977530.90 EPS (  4.06)     6.84 MB/s
* python json loads         527932.65 EPS (  7.52)     3.70 MB/s
----------------------------------------------------------------

SIMDJSON: 3799392.10 EPS

CPU: AMD EPYC 7452

More performance testing:

The tests are reproducible:

pip3 install orjson
pip3 install pysimdjson
pip3 install libpy_simdjson
python3 setup.py build_ext --inplace
PYTHONPATH=. python3 ./perftest/test_benchmark.py
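
The columns in the tables above can be related with simple arithmetic: the MB/s figure equals EPS times the file size divided by 10^6, and the number in parentheses is the fastest parser's EPS divided by that row's EPS. The sketch below is not the project's perftest/test_benchmark.py; benchmark and slowdown are hypothetical helpers, and reading EPS as parsed documents per second is an assumption.

```python
import json
import time


def benchmark(parse, payload, iterations=1000):
    """Time parse(payload) and return (documents per second, MB per second)."""
    start = time.perf_counter()
    for _ in range(iterations):
        parse(payload)
    elapsed = time.perf_counter() - start
    eps = iterations / elapsed
    mb_per_s = eps * len(payload) / 1e6  # MB taken as 10**6 bytes
    return eps, mb_per_s


def slowdown(fastest_eps, eps):
    """Relative factor shown in parentheses: 1.00 for the fastest parser."""
    return fastest_eps / eps


payload = b'{"foo": [1,2,[3]]}'
eps, mb_per_s = benchmark(json.loads, payload)
print(f"python json loads {eps:12.2f} EPS {mb_per_s:8.2f} MB/s")

# Check against the first table: 510291.81 EPS on a 2397-byte file.
print(round(510291.81 * 2397 / 1e6, 2))  # 1223.17 (MB/s)
print(round(slowdown(510291.81, 72096.80), 2))  # 7.08
```

Absolute numbers from such a loop depend heavily on the machine and the payload, so only the relative factors are comparable across runs.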

Manual build

python3 setup.py build_ext --inplace
