A high-performance array storage and manipulation library
NumPack
NumPack is a high-performance array storage and manipulation library designed to handle large NumPy arrays efficiently. Built with Rust and exposed to Python through PyO3, NumPack provides an interface for storing, loading, and modifying large numerical arrays, with its largest gains over NumPy's NPZ and NPY formats in data modification and random access.
Features
- High Performance: Optimized for both reading and writing large numerical arrays
- Memory Mapping Support: Efficient memory usage through memory mapping capabilities
- Selective Loading: Load only the arrays you need, when you need them
- In-place Operations: Support for in-place array modifications without full file rewrite
- Parallel I/O: Utilizes parallel processing for improved performance
- Multiple Data Types: Supports various numerical data types, including:
  - Boolean
  - Unsigned integers (8-bit to 64-bit)
  - Signed integers (8-bit to 64-bit)
  - Floating point (32-bit and 64-bit)
Installation
pip install numpack
Requirements
- Python >= 3.9
- NumPy
Usage
Basic Operations
import numpy as np
from numpack import NumPack
# Create a NumPack instance
npk = NumPack("data_directory")
# Save arrays
arrays = {
    'array1': np.random.rand(1000, 100).astype(np.float32),
    'array2': np.random.rand(500, 200).astype(np.float32)
}
npk.save(arrays)
# Load arrays
# Normal mode
loaded = npk.load(mmap_mode=False)
# Memory mapping mode for large arrays
lazy_loaded = npk.load(mmap_mode=True)
# Access specific arrays
array1 = loaded['array1']
array2 = loaded['array2']
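To illustrate what memory-mapped loading offers in general, the sketch below reads a single row from a file without loading the rest of it. It uses only Python's standard library and does not call NumPack or reflect NumPack's on-disk format: the raw row-major float32 layout, the constant `ROW_LEN`, and the helpers `write_rows` and `read_row_mmap` are assumptions made for this example.

```python
import mmap
import os
import struct
import tempfile

ROW_LEN = 4                   # assumed number of float32 values per row
ROW_BYTES = ROW_LEN * 4       # a float32 occupies 4 bytes
ROW_FORMAT = f"<{ROW_LEN}f"   # one little-endian float32 row


def write_rows(path, rows):
    """Write rows back to back as raw float32 values (hypothetical layout)."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(ROW_FORMAT, *row))


def read_row_mmap(path, index):
    """Read one row through a memory map; only the touched bytes are copied."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = index * ROW_BYTES
            return list(struct.unpack(ROW_FORMAT, mm[start:start + ROW_BYTES]))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rows.bin")
    write_rows(path, [[0.0, 1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0, 7.0],
                      [8.0, 9.0, 10.0, 11.0]])
    print(read_row_mmap(path, 1))  # [4.0, 5.0, 6.0, 7.0]
```

Because every row has a fixed size, the byte offset of a row is a simple multiplication, which is what makes this kind of lazy access cheap.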
Advanced Operations
# Replace specific rows
replacement = np.random.rand(10, 100).astype(np.float32)
npk.replace({'array1': replacement}, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# Append new arrays
new_arrays = {
    'array3': np.random.rand(200, 100).astype(np.float32)
}
npk.append(new_arrays)
# Drop arrays or specific rows
npk.drop('array1') # Drop entire array
npk.drop('array2', [0, 1, 2]) # Drop specific rows
# Get metadata
shapes = npk.get_shape() # Get shapes of all arrays
members = npk.get_member_list() # Get list of array names
mtime = npk.get_modify_time('array1') # Get modification time
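As a rough picture of why replacing rows does not have to rewrite a whole file, the sketch below overwrites selected rows of a fixed-width binary file by seeking to their offsets. This is a generic technique shown with the standard library only. It is not NumPack's implementation, and the file layout and the names `write_table`, `read_table`, and `replace_rows_in_place` are hypothetical.

```python
import os
import struct
import tempfile

ROW_LEN = 3                   # assumed number of float32 values per row
ROW_BYTES = ROW_LEN * 4       # a float32 occupies 4 bytes
ROW_FORMAT = f"<{ROW_LEN}f"   # one little-endian float32 row


def write_table(path, rows):
    """Write all rows back to back as raw float32 values."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(ROW_FORMAT, *row))


def read_table(path):
    """Read the whole file back into a list of rows."""
    with open(path, "rb") as f:
        data = f.read()
    count = len(data) // ROW_BYTES
    return [list(struct.unpack_from(ROW_FORMAT, data, i * ROW_BYTES))
            for i in range(count)]


def replace_rows_in_place(path, new_rows, indices):
    """Overwrite only the given rows; all other bytes stay untouched."""
    with open(path, "r+b") as f:
        for index, row in zip(indices, new_rows):
            f.seek(index * ROW_BYTES)
            f.write(struct.pack(ROW_FORMAT, *row))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "table.bin")
    write_table(path, [[float(i)] * ROW_LEN for i in range(5)])
    replace_rows_in_place(path, [[10.0] * ROW_LEN, [30.0] * ROW_LEN], [1, 3])
    print(read_table(path)[1])  # [10.0, 10.0, 10.0]
```

The amount written is proportional to the number of replaced rows rather than to the file size, which is consistent with the benchmark pattern below where small replacements show the largest gains.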
Performance
NumPack offers significant performance improvements compared to traditional NumPy storage methods, especially in data modification operations and random access. Below are detailed benchmark results:
Benchmark Results
The following benchmarks were performed on a MacBook Pro (M1, 2020, 32GB Memory) with arrays of size 1M x 10 and 500K x 5 (float32).
Storage Operations
| Operation | NumPack | NumPy NPZ | NumPy NPY |
|---|---|---|---|
| Save | 0.014s (0.93x NPZ, 0.57x NPY) | 0.013s | 0.008s |
| Full Load | 0.008s (1.75x NPZ, 1.00x NPY) | 0.014s | 0.008s |
| Selective Load | 0.005s (2.00x NPZ, -) | 0.010s | - |
| Mmap Load | 0.006s (2.17x NPZ, 0.00x NPY) | 0.013s | 0.000s |
Data Modification Operations
| Operation | NumPack | NumPy NPZ | NumPy NPY |
|---|---|---|---|
| Single Row Replace | 0.000s (23.00x NPZ, 14.00x NPY) | 0.023s | 0.014s |
| Continuous Rows (10K) | 0.001s (23.00x NPZ, 12.00x NPY) | 0.023s | 0.012s |
| Random Rows (10K) | 0.015s (1.53x NPZ, 0.87x NPY) | 0.023s | 0.013s |
| Large Data Replace (500K) | 0.019s (1.16x NPZ, 0.79x NPY) | 0.022s | 0.015s |
Drop Operations
| Operation | NumPack | NumPy NPZ | NumPy NPY |
|---|---|---|---|
| Drop Array | 0.001s (24.00x NPZ, 1.00x NPY) | 0.024s | 0.001s |
| Drop Rows (500K) | 0.036s (1.36x NPZ, 0.86x NPY) | 0.049s | 0.031s |
Append Operations
| Operation | NumPack | NumPy NPZ |
|---|---|---|
| Append | 0.003s (5.33x NPZ) | 0.016s |
Random Access Performance (10K indices)
| Operation | NumPack | NumPy NPZ | NumPy NPY |
|---|---|---|---|
| Random Access | 0.008s (1.88x NPZ, 1.13x NPY) | 0.015s | 0.009s |
File Size Comparison
| Format | Size | Ratio |
|---|---|---|
| NumPack | 47.68 MB | 1.0x |
| NPZ | 47.68 MB | 1.0x |
| NPY | 47.68 MB | 1.0x |
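The identical sizes can be checked against the raw payload of the benchmark arrays (1M x 10 and 500K x 5, float32). The short calculation below is a sketch: the helper `raw_size_mib` is a name chosen here, and reading the table's "MB" as 2^20 bytes is an inference from the numbers rather than something the benchmark states.

```python
def raw_size_mib(shapes, itemsize):
    """Return the uncompressed payload size in MiB for 2-D array shapes."""
    total_elements = sum(rows * cols for rows, cols in shapes)
    return total_elements * itemsize / (1024 ** 2)


# Benchmark arrays from the text: 1M x 10 and 500K x 5, float32 (4 bytes each)
benchmark_shapes = [(1_000_000, 10), (500_000, 5)]
size = raw_size_mib(benchmark_shapes, itemsize=4)
print(f"{size:.2f} MiB")  # 47.68 MiB
```

The result matches the reported 47.68 MB, which suggests that all three formats store the data essentially uncompressed and that header overhead is negligible at this scale.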
Key Performance Highlights
- Data Modification:
  - Single row replacement: NumPack is 23x faster than NPZ and 14x faster than NPY
  - Continuous rows: NumPack is 23x faster than NPZ and 12x faster than NPY
  - Random rows: NumPack is 1.53x faster than NPZ but slower than NPY (0.87x)
  - Large data replacement: NumPack is 1.16x faster than NPZ but slower than NPY (0.79x)
- Drop Operations:
  - Drop array: NumPack is 24x faster than NPZ and comparable to NPY
  - Drop rows: NumPack is 1.36x faster than NPZ but slower than NPY (0.86x)
  - NumPack provides efficient in-place row deletion without full file rewrite
- Loading Performance:
  - Full load: NumPack is 1.75x faster than NPZ and comparable to NPY
  - Memory-mapped load: NumPack is 2.17x faster than NPZ but slower than NPY
  - Selective load: NumPack is 2.00x faster than NPZ
- Random Access:
  - NumPack is 1.88x faster than NPZ and 1.13x faster than NPY for random index access
- Storage Efficiency:
  - All three formats produce files of identical size (47.68 MB)
  - NumPack maintains high performance while keeping file sizes competitive
Note: All benchmarks were performed with float32 arrays. Performance may vary depending on data types, array sizes, and system configuration. Each ratio compares the other format's time to NumPack's time: values greater than 1.0x mean NumPack is faster than that format, and values less than 1.0x mean it is slower.
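The ratios in the tables appear to be the other format's time divided by NumPack's time. The sketch below recomputes a few of them from the rounded timings listed above. The function name `speedup` is chosen for this example, and the handling of a zero timing is an assumption: rows where NumPack is listed as 0.000s cannot be reproduced from the rounded values.

```python
def speedup(baseline_seconds, numpack_seconds):
    """Return baseline / NumPack time, or None if NumPack's time rounds to zero."""
    if numpack_seconds == 0:
        return None  # ratio cannot be recovered from a timing shown as 0.000s
    return baseline_seconds / numpack_seconds


# (operation, NumPack seconds, NPZ seconds), copied from the tables above
timings = [
    ("Save", 0.014, 0.013),
    ("Full Load", 0.008, 0.014),
    ("Append", 0.003, 0.016),
    ("Drop Rows (500K)", 0.036, 0.049),
    ("Single Row Replace", 0.000, 0.023),
]

for name, numpack_s, npz_s in timings:
    ratio = speedup(npz_s, numpack_s)
    label = "n/a" if ratio is None else f"{ratio:.2f}x"
    print(f"{name}: {label} vs NPZ")
```

For example, the full load ratio is 0.014 / 0.008 = 1.75, and the save ratio is 0.013 / 0.014, or about 0.93, meaning NumPack is slightly slower than NPZ for saving.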
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.
Copyright 2024 NumPack Contributors
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.