NumBin

An efficient binary serialization format for numerical data.

Install

pip install numbin

Usage

Work with pure NumPy data:

import numbin as nb
import numpy as np


arr = np.random.rand(5, 3)

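# in memory: serialize to bytes, then deserialize back to an array
binary = nb.dumps(arr)
print(nb.loads(binary))

# file: write the array to a binary file, then read it back
with open("num.bin", "wb") as f:
    nb.dump(arr, f)

with open("num.bin", "rb") as f:
    print(nb.load(f))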

Work with complex data (NumPy arrays mixed with ordinary Python objects):

import numpy as np
from numbin.msg_ext import NumBinMessage


arr = np.random.rand(5, 3)
nbm = NumBinMessage()
data = {"tensor": arr, "labels": ["dog", "cat"], "safe": True}

# in memory
binary = nbm.dumps(data)
print(nbm.loads(binary))

# file
with open("data.bin", "wb") as f:
    nbm.dump(data, f)

with open("data.bin", "rb") as f:
    print(nbm.load(f))

Benchmark

The benchmark code can be found in bench.py.

Tested on an Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz with Python 3.11.0.

>>> benchmark for numpy array
========================================================================================================================
pickle_serde    size:         1 times: min(9.952e-06)   mid(1.0284e-05) max(0.00012502) 95%(1.0083e-05) Std.(1.2433e-06)
numbin_serde    size:         1 times: min(2.727e-06)   mid(2.824e-06)  max(5.563e-05)  95%(2.7679e-06) Std.(6.9532e-07)
numpy_save_load size:         1 times: min(0.00013145)  mid(0.00013528) max(0.0021547)  95%(0.00013252) Std.(2.1891e-05)
========================================================================================================================
pickle_serde    size:      1024 times: min(1.1219e-05)  mid(1.1626e-05) max(6.7387e-05) 95%(1.1352e-05) Std.(1.0844e-06)
numbin_serde    size:      1024 times: min(3.326e-06)   mid(3.444e-06)  max(4.3707e-05) 95%(3.371e-06)  Std.(7.5806e-07)
numpy_save_load size:      1024 times: min(0.00013785)  mid(0.00014344) max(0.00026173) 95%(0.00013913) Std.(6.8812e-06)
========================================================================================================================
pickle_serde    size:     65536 times: min(4.8282e-05)  mid(4.9399e-05) max(0.00029774) 95%(4.8512e-05) Std.(2.9386e-06)
numbin_serde    size:     65536 times: min(3.8543e-05)  mid(3.9357e-05) max(9.9388e-05) 95%(3.8681e-05) Std.(3.1692e-06)
numpy_save_load size:     65536 times: min(0.00021826)  mid(0.00022547) max(0.00074907) 95%(0.00021916) Std.(2.0406e-05)
========================================================================================================================
pickle_serde    size:   3145728 times: min(0.014393)    mid(0.017675)   max(0.030137)   95%(0.01456)    Std.(0.0036429)
numbin_serde    size:   3145728 times: min(0.020834)    mid(0.023769)   max(0.037167)   95%(0.02309)    Std.(0.0022647)
numpy_save_load size:   3145728 times: min(0.036523)    mid(0.038008)   max(0.29076)    95%(0.036596)   Std.(0.095368)
========================================================================================================================
pickle_serde    size: 201326592 times: min(1.5232)      mid(1.5588)     max(1.7401)     95%(1.5234)     Std.(0.09501)
numbin_serde    size: 201326592 times: min(1.5286)      mid(1.5301)     max(1.5348)     95%(1.5286)     Std.(0.0024368)
numpy_save_load size: 201326592 times: min(2.2289)      mid(2.3567)     max(2.7892)     95%(2.229)      Std.(0.21158)
>>> benchmark for normal data mixed with numpy array
========================================================================================================================
pickle_serde    size: <unknown> times: min(1.4121e-05)  mid(1.4568e-05) max(7.9592e-05) 95%(1.4355e-05) Std.(1.7201e-06)
nbmsg_serde     size: <unknown> times: min(9.188e-06)   mid(9.518e-06)  max(0.011783)   95%(9.331e-06)  Std.(3.5517e-05)
========================================================================================================================
pickle_serde    size: <unknown> times: min(5.4409e-05)  mid(5.4909e-05) max(0.00016216) 95%(5.4595e-05) Std.(4.3362e-06)
nbmsg_serde     size: <unknown> times: min(0.00010215)  mid(0.00010288) max(0.00023232) 95%(0.0001024)  Std.(6.7374e-06)
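
Summary statistics like the min, mid, max and Std. columns above could be collected with a small timing harness. The sketch below is not the project's bench.py; `time_serde` and its `repeat` parameter are hypothetical names, and `pickle_serde` only borrows its name from the row label in the output. It uses the standard library alone and round-trips a plain list of floats, so a NumPy array and a `nb.loads(nb.dumps(...))` round trip would need to be swapped in to resemble the runs above. The 95% column is left out because the source does not say how it is computed.

```python
import pickle
import statistics
import time


def pickle_serde(data):
    """Round-trip `data` through pickle (serialize, then deserialize)."""
    return pickle.loads(pickle.dumps(data))


def time_serde(serde, data, repeat=100):
    """Call `serde(data)` `repeat` times and summarize the durations in seconds.

    Hypothetical helper for illustration; `repeat` must be at least 2
    because the sample standard deviation needs two data points.
    """
    durations = []
    for _ in range(repeat):
        start = time.perf_counter()
        serde(data)
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "mid": statistics.median(durations),
        "max": max(durations),
        "std": statistics.stdev(durations),
    }


if __name__ == "__main__":
    # Stand-in payload: 1024 floats in a plain list rather than a NumPy array.
    payload = [float(i) for i in range(1024)]
    stats = time_serde(pickle_serde, payload)
    print(" ".join(f"{name}({value:.5g})" for name, value in stats.items()))
```

Passing the round-trip function as an argument keeps the timing loop identical for every serializer, so differences in the numbers come from the serializers rather than from the harness.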

Download files

Source Distribution

numbin-0.4.0.tar.gz (8.4 kB)

Uploaded Source

Built Distribution

numbin-0.4.0-py3-none-any.whl (8.9 kB)

Uploaded Python 3

File details

Details for the file numbin-0.4.0.tar.gz.

File metadata

  • Download URL: numbin-0.4.0.tar.gz
  • Upload date:
  • Size: 8.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.2

File hashes

Hashes for numbin-0.4.0.tar.gz
Algorithm Hash digest
SHA256 f8149b492e9290141537082c30b51596424a36709b077e77f8dbf78d4b9911fd
MD5 628d7df1418203839beb5cfecd61a9f1
BLAKE2b-256 f98d0d740f09f516c3d208ecbbc5409f09959d5314b63d31fbbe0c5505a60075

File details

Details for the file numbin-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: numbin-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 8.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.2

File hashes

Hashes for numbin-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 19fb12a9235b9eab3099192532702072392c830c5c5114405c26ac0f938fae33
MD5 50cb8eb829c0ddbed5c359926790c303
BLAKE2b-256 5a369edd23ac857bae5c2fc270058521f83cefa5c79b1cff0998fed067528291
