A high-performance array storage and manipulation library
NumPack
A high-performance NumPy array storage library combining Rust's speed with Python's simplicity. Optimized for frequent read/write operations on large arrays, with built-in SIMD-accelerated vector similarity search.
Highlights
| Feature | Performance |
|---|---|
| Row Replacement | 344x faster than NPY |
| Data Append | 338x faster than NPY |
| Lazy Loading | 51x faster than NPY mmap |
| Full Load | 1.64x faster than NPY |
| Batch Mode | 21x speedup |
| Writable Batch | 92x speedup |
Core Capabilities:
- Zero-copy mmap operations with minimal memory footprint
- SIMD-accelerated Vector Engine (AVX2, AVX-512, NEON, SVE)
- Batch & Writable Batch modes for high-frequency modifications
- Supports NumPy's boolean and numeric dtypes: bool, int8-64, uint8-64, float16/32/64, complex64/128
Installation
pip install numpack
Requirements: Python ≥ 3.9, NumPy ≥ 1.26.0
Build from Source
# Prerequisites: Rust >= 1.70.0 (rustup.rs), C/C++ compiler
git clone https://github.com/BirchKwok/NumPack.git
cd NumPack
pip install "maturin>=1.0,<2.0"  # quoted so the shell does not treat > and < as redirections
maturin develop # or: maturin build --release
Quick Start
import numpy as np
from numpack import NumPack
with NumPack("data.npk") as npk:
    # Save
    npk.save({'embeddings': np.random.rand(10000, 128).astype(np.float32)})
    # Load (normal or lazy)
    data = npk.load("embeddings")
    lazy = npk.load("embeddings", lazy=True)
    # Modify (new_rows and more_rows are example data matching the 128 columns)
    new_rows = np.random.rand(3, 128).astype(np.float32)
    more_rows = np.random.rand(100, 128).astype(np.float32)
    npk.replace({'embeddings': new_rows}, indices=[0, 1, 2])
    npk.append({'embeddings': more_rows})
    npk.drop('embeddings', [0, 1, 2])  # drop rows
    # Random access
    subset = npk.getitem('embeddings', [100, 200, 300])
Batch Modes
# Both examples assume `npk` is an open NumPack instance holding an array named 'data'
# Batch Mode - cached writes (21x speedup)
with npk.batch_mode():
    for i in range(1000):
        arr = npk.load('data')
        arr[:10] *= 2.0
        npk.save({'data': arr})
# Writable Batch Mode - direct mmap (92x speedup)
with npk.writable_batch_mode() as wb:
    arr = wb.load('data')
    arr[:10] *= 2.0  # Auto-persisted
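The "auto-persisted" behaviour of a writable memory map can be illustrated with the Python standard library alone. This is a minimal sketch, not NumPack's implementation: the file layout (raw float32 values with no header) and the helper name `scale_first_values` are assumptions made for illustration.

```python
import array
import mmap
import os
import tempfile


def scale_first_values(path, count, factor):
    """Multiply the first `count` float32 values of a raw file in place."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as mm:
            # View the mapped bytes as float32; writes land in the mapped pages.
            view = memoryview(mm).cast("f")
            try:
                for i in range(count):
                    view[i] *= factor
            finally:
                view.release()  # must be released before the mapping closes
            mm.flush()


# Demo: write 8 float32 values, then modify the first 3 without reading the file.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "raw_f32.bin")
with open(demo_path, "wb") as f:
    array.array("f", [1.0] * 8).tofile(f)

scale_first_values(demo_path, 3, 2.0)

# Read the file back to confirm the change reached storage.
with open(demo_path, "rb") as f:
    result = array.array("f")
    result.fromfile(f, 8)
print(result.tolist())
```

No explicit save call is needed because the modified pages belong to the file mapping itself, which is the idea the writable batch mode comment refers to.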
Vector Engine
SIMD-accelerated similarity search (AVX2, AVX-512, NEON, SVE).
from numpack.vector_engine import VectorEngine, StreamingVectorEngine
# In-memory search
engine = VectorEngine()
indices, scores = engine.top_k_search(query, candidates, 'cosine', k=10)
# Multi-query batch (30-50% faster)
all_indices, all_scores = engine.multi_query_top_k(queries, candidates, 'cosine', k=10)
# Streaming from file (for large datasets)
streaming = StreamingVectorEngine()
indices, scores = streaming.streaming_top_k_from_file(
    query, 'vectors.npk', 'embeddings', 'cosine', k=10
)
Supported Metrics: cosine, dot, l2, l2sq, hamming, jaccard, kl, js
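As a reference for what a cosine top-k search computes, here is a plain-Python sketch. It is not the Vector Engine's code and says nothing about NumPack's return types or tie-breaking; the function names `cosine_similarity` and `top_k_cosine`, and the choice to score zero vectors as 0.0, are assumptions for illustration.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k_cosine(query, candidates, k):
    """Return (indices, scores) of the k most similar candidates, best first."""
    scored = ((cosine_similarity(query, c), i) for i, c in enumerate(candidates))
    best = heapq.nlargest(k, scored)
    indices = [i for _, i in best]
    scores = [s for s, _ in best]
    return indices, scores


query = [1.0, 0.0]
candidates = [
    [0.0, 1.0],   # orthogonal to the query
    [2.0, 0.0],   # same direction, different length
    [1.0, 1.0],   # 45 degrees away
    [-1.0, 0.0],  # opposite direction
]
indices, scores = top_k_cosine(query, candidates, k=2)
print(indices, scores)
```

Candidate 1 ranks first even though it is twice as long as the query, because cosine similarity ignores vector length and compares direction only.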
Format Conversion
Convert between NumPack and other formats (PyTorch, Arrow, Parquet, SafeTensors).
from numpack.io import from_tensor, to_tensor, from_table, to_table
# Memory <-> .npk (zero-copy when possible)
from_tensor(tensor, 'output.npk', array_name='embeddings') # tensor -> .npk
tensor = to_tensor('input.npk', array_name='embeddings') # .npk -> tensor
from_table(table, 'output.npk') # PyArrow Table -> .npk
table = to_table('input.npk') # .npk -> PyArrow Table
# File <-> File (streaming for large files)
from numpack.io import from_pt, to_pt
from_pt('model.pt', 'output.npk') # .pt -> .npk
to_pt('input.npk', 'output.pt') # .npk -> .pt
Supported formats: PyTorch (.pt), Feather, Parquet, SafeTensors, NumPy (.npy), HDF5, Zarr, CSV
Pack & Unpack
Portable .npkg format for easy migration and sharing.
from numpack import pack, unpack, get_package_info
# Pack NumPack directory into a single .npkg file
pack('data.npk') # -> data.npkg (with Zstd compression)
pack('data.npk', 'backup/data.npkg') # Custom output path
# Unpack .npkg back to NumPack directory
unpack('data.npkg') # -> data.npk
unpack('data.npkg', 'restored/') # Custom restore path
# View package info without extracting
info = get_package_info('data.npkg')
print(f"Files: {info['file_count']}, Compression: {info['compression_ratio']:.1%}")
Benchmarks
Tested on macOS Apple Silicon, 1M rows × 10 columns, Float32 (38.1MB)
| Operation | NumPack | NPY | Advantage |
|---|---|---|---|
| Full Load | 4.00ms | 6.56ms | 1.64x |
| Lazy Load | 0.002ms | 0.102ms | 51x |
| Replace 100 rows | 0.040ms | 13.74ms | 344x |
| Append 100 rows | 0.054ms | 18.26ms | 338x |
| Random Access (100) | 0.004ms | 0.002ms | ~equal |
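The Advantage column is the NPY time divided by the NumPack time. The short check below recomputes it from the timings in the table above; the dictionary and function names are illustrative only.

```python
# Timings in milliseconds, copied from the table above: (NumPack, NPY).
timings_ms = {
    "Full Load": (4.00, 6.56),
    "Lazy Load": (0.002, 0.102),
    "Replace 100 rows": (0.040, 13.74),
    "Append 100 rows": (0.054, 18.26),
}


def speedup(numpack_ms, baseline_ms):
    """How many times faster NumPack is than the baseline."""
    return baseline_ms / numpack_ms


ratios = {op: speedup(*pair) for op, pair in timings_ms.items()}
for op, ratio in ratios.items():
    print(f"{op}: {ratio:.2f}x")
```

The replace ratio works out to 343.5, which the table rounds up to 344x.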
Multi-Format Comparison
Core Operations (1M × 10, Float32, ~38.1MB):
| Operation | NumPack | NPY | Zarr | HDF5 | Parquet | Arrow |
|---|---|---|---|---|---|---|
| Save | 11.94ms | 6.48ms | 70.91ms | 58.07ms | 142.11ms | 16.85ms |
| Full Load | 4.00ms | 6.56ms | 32.86ms | 53.99ms | 16.49ms | 12.39ms |
| Lazy Load | 0.002ms | 0.102ms | 0.374ms | 0.082ms | N/A | N/A |
| Replace 100 | 0.040ms | 13.74ms | 7.61ms | 0.29ms | 162.48ms | 26.93ms |
| Append 100 | 0.054ms | 18.26ms | 9.05ms | 0.39ms | 173.45ms | 42.46ms |
Random Access Performance:
| Batch Size | NumPack | NPY (mmap) | Zarr | HDF5 | Parquet | Arrow |
|---|---|---|---|---|---|---|
| 100 rows | 0.004ms | 0.002ms | 2.66ms | 0.66ms | 16.25ms | 12.43ms |
| 1K rows | 0.025ms | 0.021ms | 2.86ms | 5.02ms | 16.48ms | 12.61ms |
| 10K rows | 0.118ms | 0.112ms | 16.63ms | 505.71ms | 17.45ms | 12.81ms |
Batch Mode Performance (100 consecutive operations):
| Mode | Time | Speedup |
|---|---|---|
| Normal | 414ms | - |
| Batch Mode | 20.1ms | 21x |
| Writable Batch | 4.5ms | 92x |
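One way to picture why cached writes help is a write-back cache: saves made inside the batch block touch only memory, and storage is written once when the block exits. The toy class below is a hypothetical sketch of that idea, not NumPack's implementation; `WriteBackStore` and its counter are invented for illustration.

```python
from contextlib import contextmanager


class WriteBackStore:
    """Toy key-value store that counts how often the slow backend is written."""

    def __init__(self):
        self.backend = {}        # stands in for files on disk
        self.backend_writes = 0  # number of simulated disk writes
        self._cache = None       # active only inside batch_mode()

    def save(self, name, values):
        if self._cache is not None:
            self._cache[name] = list(values)  # cached write, no I/O
        else:
            self.backend[name] = list(values)
            self.backend_writes += 1

    def load(self, name):
        if self._cache is not None and name in self._cache:
            return list(self._cache[name])
        return list(self.backend[name])

    @contextmanager
    def batch_mode(self):
        self._cache = {}
        try:
            yield self
        finally:
            # Flush each modified array once, however many times it was saved.
            for name, values in self._cache.items():
                self.backend[name] = values
                self.backend_writes += 1
            self._cache = None


store = WriteBackStore()
store.save("data", [1.0, 2.0, 3.0])  # one backend write

with store.batch_mode():
    for _ in range(100):
        arr = store.load("data")
        arr[0] *= 2.0
        store.save("data", arr)  # cached, no backend write

print(store.backend_writes)  # initial save plus a single flush
```

A hundred load/modify/save cycles cost one backend write instead of a hundred, which is the kind of saving the Batch Mode row measures.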
File Size:
| Format | Size | Compression |
|---|---|---|
| NumPack | 38.15MB | - |
| NPY | 38.15MB | - |
| NPZ | 34.25MB | ✓ |
| Zarr | 34.13MB | ✓ |
| HDF5 | 38.18MB | - |
| Parquet | 44.09MB | ✓ |
| Arrow | 38.16MB | - |
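The uncompressed sizes follow directly from the benchmark array's shape. The arithmetic below shows that the roughly 38.15 figure matches the raw payload when "MB" is read as mebibytes (2^20 bytes); that reading is an inference from the numbers, not something the benchmark states.

```python
rows, cols = 1_000_000, 10
bytes_per_float32 = 4

raw_bytes = rows * cols * bytes_per_float32
raw_mib = raw_bytes / 2**20   # mebibytes (1 MiB = 1,048,576 bytes)
raw_mb = raw_bytes / 10**6    # decimal megabytes

print(f"{raw_bytes} bytes = {raw_mib:.2f} MiB = {raw_mb:.2f} MB")
```

Formats reporting about 38.15 are therefore storing the data essentially uncompressed, with only a small header or metadata overhead.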
When to Use NumPack
| Use Case | Recommendation |
|---|---|
| Frequent modifications | ✅ NumPack (344x faster) |
| ML/DL pipelines | ✅ NumPack (zero-copy random access, no full load) |
| Vector similarity search | ✅ NumPack (SIMD) |
| Write-once, read-many | ✅ NumPack (1.64x faster read) |
| Extreme compression | ✅ NumPack .npkg (better ratio, streaming, high I/O) |
| RAG/Embedding storage | ✅ NumPack (fast retrieval + SIMD search) |
| Feature store | ✅ NumPack (real-time updates + low latency) |
| Memory-constrained environments | ✅ NumPack (mmap + lazy loading) |
| Multi-process data sharing | ✅ NumPack (zero-copy mmap) |
| Incremental data pipelines | ✅ NumPack (338x faster append) |
| Real-time feature updates | ✅ NumPack (ms-level replace) |
Documentation
See docs/ for detailed guides and unified_benchmark.py for benchmark code.
Contributing
Contributions welcome! Please submit a Pull Request.
License
Apache License 2.0 - see LICENSE for details.