A variable-length unsigned integer array
Project description
VarUIntArray
A NumPy subclass for working with variable-length unsigned integers that don't fit standard machine word sizes.
Overview
VarUIntArray extends numpy.ndarray to handle arbitrary bit-width unsigned integers (e.g., 3-bit, 10-bit, 12-bit) while correctly managing padding bits when using NumPy's universal functions (ufuncs). This is particularly useful when working with:
- Custom binary formats with non-standard word sizes
- Packed bit arrays where words don't align to 8, 16, 32, or 64 bits
- Data structures that require precise bit-width control
Key Features
- Arbitrary Word Sizes: Support for any word size from 1 to 64 bits
- Automatic Padding Management: Correctly handles padding bits in bitwise operations
- NumPy Integration: Works seamlessly with NumPy ufuncs and array operations
- Pack/Unpack Operations: Convert between bit arrays and packed integer arrays
Installation
This module can be installed from PyPI:
pip install varuintarray
Quick Start
Create a VarUIntArray with 10-bit words
>>> from varuintarray import VarUIntArray
>>> arr = VarUIntArray([1, 2, 1023], word_size=10)
>>> arr
VarUIntArray([ 1, 2, 1023], dtype='>u2', word_size=10)
Bitwise operations respect word_size
>>> inverted = arr.invert()
>>> inverted
VarUIntArray([1022, 1021, 0], dtype='>u2', word_size=10)
Unpack to individual bits
>>> bits = arr.unpackbits()
>>> bits.shape
(3, 10)
Pack bits back into words
>>> packed = VarUIntArray.packbits(bits)
>>> packed
VarUIntArray([ 1, 2, 1023], dtype='>u2', word_size=10)
Core Concepts
Word Size vs Machine Size
Standard computers work with word sizes of 8, 16, 32, or 64 bits. When you need a 10-bit word, it must be stored in a 16-bit container, leaving 6 padding bits unused. VarUIntArray automatically:
- Selects the appropriate machine word size (8, 16, 32, or 64 bits)
- Tracks the actual word size you care about
- Ensures padding bits are handled correctly in operations
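The container selection described above can be sketched in plain Python. The helper name `container_bits` is hypothetical and not part of the varuintarray API; it only illustrates the rule of picking the smallest standard width that fits the requested word size.

```python
# Hypothetical helper (not part of varuintarray): pick the smallest
# standard machine width that can hold a given word size.
STANDARD_WIDTHS = (8, 16, 32, 64)


def container_bits(word_size: int) -> int:
    """Return the smallest standard width (in bits) that fits word_size."""
    if not 1 <= word_size <= 64:
        raise ValueError("word_size must be between 1 and 64")
    return next(width for width in STANDARD_WIDTHS if width >= word_size)


for word_size in (3, 10, 12, 33):
    width = container_bits(word_size)
    padding = width - word_size
    print(f"{word_size}-bit word -> {width}-bit container, {padding} padding bits")
```

For the 10-bit case this reports a 16-bit container with 6 padding bits, matching the description above.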
Padding Bit Handling
The most important feature is correct handling of padding bits during bitwise operations. For example:
# 3-bit word stored in 8-bit container
>>> arr = VarUIntArray([5], word_size=3) # Binary: 101
# Standard NumPy invert would give 11111010 (250)
# VarUIntArray.invert() gives 010 (2) - correct for 3-bit word
>>> inverted = arr.invert()
>>> int(inverted[0])
2
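The same effect can be reproduced with plain NumPy by masking after the inversion. This is a minimal sketch of the idea, assuming a simple AND mask; it is not necessarily how VarUIntArray implements it internally.

```python
import numpy as np

word_size = 3
mask = (1 << word_size) - 1  # 0b111 keeps only the significant bits

raw = np.array([5], dtype=np.uint8)  # binary 00000101
plain = np.invert(raw)               # flips all 8 bits, padding included
masked = plain & np.uint8(mask)      # clears the 5 padding bits again

print(int(plain[0]))   # 250
print(int(masked[0]))  # 2
```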
API Reference
VarUIntArray Class
Constructor
VarUIntArray(input_array, word_size)
Parameters:
- input_array: Array-like data to convert
- word_size: Number of significant bits per word (1-64)
Methods
- invert(): Bitwise invert respecting word_size
- unpackbits(): Unpack to individual bits (adds one dimension)
- packbits(data): Class method to pack bit array into VarUIntArray
- to_dict(): Serialize to a dictionary
- from_dict(data): Static method to deserialize from a dictionary
- to_json(): Serialize to a JSON string
- from_json(string): Class method to deserialize from a JSON string
Attributes
word_size: Number of significant bits per word
Functions
unpackbits(array)
Unpack a VarUIntArray into individual bits, excluding padding.
>>> arr = VarUIntArray([5, 3], word_size=3)
>>> unpackbits(arr)
array([[1, 0, 1],
[0, 1, 1]], dtype=uint8)
Parameters:
array: VarUIntArray to unpack
Returns: ndarray with shape (*original_shape, word_size)
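To make the output shape concrete, here is a rough NumPy-only sketch of unpacking that skips padding bits by shifting and masking. The function name `unpack_words` is hypothetical, and the most-significant-bit-first order is inferred from the example output above.

```python
import numpy as np


def unpack_words(values: np.ndarray, word_size: int) -> np.ndarray:
    """Hypothetical helper: split each word into word_size bits, MSB first."""
    # Shift amounts from the most significant bit down to bit 0, cast to
    # the input dtype so NumPy does not promote the result unexpectedly.
    shifts = np.arange(word_size)[::-1].astype(values.dtype)
    bits = (values[..., None] >> shifts) & 1
    return bits.astype(np.uint8)


words = np.array([5, 3], dtype=np.uint8)
bits = unpack_words(words, word_size=3)
print(bits.shape)     # (2, 3)
print(bits.tolist())  # [[1, 0, 1], [0, 1, 1]]
```

Because only `word_size` shift amounts are generated, the padding bits of the container never appear in the result.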
packbits(array)
Pack a bit array into a VarUIntArray.
>>> bits = np.array([[1, 0, 1], [0, 1, 1]], dtype=np.uint8)
>>> packbits(bits)
VarUIntArray([5, 3], dtype=uint8, word_size=3)
Parameters:
array: ndarray of uint8 containing 0s and 1s, where the last dimension contains bits for each word
Returns: VarUIntArray with one fewer dimension
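The reverse direction can be sketched with plain NumPy by weighting each bit with its place value. The name `pack_bits` is hypothetical; the sketch always returns uint64 and leaves out the step of choosing a smaller container, which the library handles itself.

```python
import numpy as np


def pack_bits(bits: np.ndarray) -> np.ndarray:
    """Hypothetical helper: combine MSB-first bits on the last axis into words."""
    word_size = bits.shape[-1]
    # Place value of each bit position, most significant first.
    weights = np.uint64(1) << np.arange(word_size, dtype=np.uint64)[::-1]
    return (bits.astype(np.uint64) * weights).sum(axis=-1, dtype=np.uint64)


bits = np.array([[1, 0, 1], [0, 1, 1]], dtype=np.uint8)
print(pack_bits(bits).tolist())  # [5, 3]
```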
VarUIntArray.to_dict()
Serialize a VarUIntArray to a JSON-compatible dictionary.
>>> arr = VarUIntArray([1, 2, 3], word_size=10)
>>> arr.to_dict()
{'word_size': 10, 'values': [1, 2, 3]}
VarUIntArray.from_dict(data)
Deserialize a VarUIntArray from a dictionary.
# From dictionary
>>> VarUIntArray.from_dict({'values': [1, 2, 3], 'word_size': 10})
VarUIntArray([1, 2, 3], dtype='>u2', word_size=10)
VarUIntArray.to_json()
Serialize VarUIntArray to a JSON string.
>>> arr = VarUIntArray([1, 2, 3], word_size=10)
>>> arr.to_json()
'{"word_size": 10, "values": [1, 2, 3]}'
VarUIntArray.from_json(string)
Deserialize a VarUIntArray from a JSON string.
>>> json_str = '{"word_size": 10, "values": [1, 2, 3]}'
>>> VarUIntArray.from_json(json_str)
VarUIntArray([1, 2, 3], dtype='>u2', word_size=10)
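The dictionary layout shown above is simple enough to produce or consume without the library, for example from another tool in a pipeline. The sketch below uses only the standard `json` module and handles flat lists; both function names are hypothetical stand-ins, and the range check is an assumption rather than documented library behavior.

```python
import json


def to_json(values: list[int], word_size: int) -> str:
    """Hypothetical stand-in that writes the documented dictionary layout."""
    return json.dumps({"word_size": word_size, "values": values})


def from_json(text: str) -> tuple[list[int], int]:
    """Parse the layout and check that every value fits in word_size bits."""
    payload = json.loads(text)
    word_size = payload["word_size"]
    values = payload["values"]
    limit = 1 << word_size
    if any(not 0 <= value < limit for value in values):
        raise ValueError(f"value does not fit in {word_size} bits")
    return values, word_size


text = to_json([1, 2, 3], word_size=10)
print(text)             # {"word_size": 10, "values": [1, 2, 3]}
print(from_json(text))  # ([1, 2, 3], 10)
```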
Use Cases
Custom Binary Protocols
Working with network protocols or file formats that use non-standard bit widths:
# 12-bit color values (common in some image formats)
>>> colors = VarUIntArray([4095, 2048, 0], word_size=12)
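Values like these often arrive packed back to back in a byte stream. The following sketch shows one way to split such a stream into integers before handing them to the constructor. The helper name `split_packed` and the assumed layout (big-endian, most significant bit first, no per-word padding) are illustrative assumptions; a real format may differ.

```python
def split_packed(data: bytes, word_size: int) -> list[int]:
    """Hypothetical helper: read back-to-back big-endian words from bytes."""
    total_bits = len(data) * 8
    count = total_bits // word_size
    # Treat the buffer as one big integer, then drop trailing bits that
    # do not form a complete word.
    stream = int.from_bytes(data, "big") >> (total_bits - count * word_size)
    mask = (1 << word_size) - 1
    return [(stream >> (word_size * (count - 1 - i))) & mask for i in range(count)]


raw = bytes.fromhex("fff800000001")  # four 12-bit words in six bytes
print(split_packed(raw, word_size=12))  # [4095, 2048, 0, 1]
```

The resulting list can then be passed as `input_array` together with `word_size=12`.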
Bit Manipulation
Performing bitwise operations on packed data:
>>> data = VarUIntArray([0b1010, 0b0101], word_size=4)
>>> mask = VarUIntArray([0b1100, 0b0011], word_size=4)
>>> result = data & mask # Bitwise AND
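For reference, the expected bit patterns can be worked out with plain Python integers. This sketch also contrasts AND, which cannot set padding bits, with a left shift, which can and therefore needs an explicit mask. It illustrates the arithmetic only and makes no claim about which operations VarUIntArray masks internally.

```python
WORD_SIZE = 4
MASK = (1 << WORD_SIZE) - 1  # 0b1111

data = [0b1010, 0b0101]
mask = [0b1100, 0b0011]

# AND of two in-range values can never set a bit above the word size.
anded = [d & m for d, m in zip(data, mask)]
print([format(v, "04b") for v in anded])  # ['1000', '0001']

# A left shift can spill into padding, so it needs an explicit mask.
shifted = [(d << 1) & MASK for d in data]
print([format(v, "04b") for v in shifted])  # ['0100', '1010']
```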
Implementation Details
Memory Layout
- VarUIntArray uses big-endian byte order ('>' dtype prefix) for consistency.
- Data is stored in the smallest standard NumPy unsigned integer type that can hold the specified word_size.
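The effect of the byte-order prefix is easiest to see in the raw bytes. This short sketch uses a plain NumPy array with the same '>u2' dtype that appears in the reprs above, and a little-endian copy for contrast.

```python
import numpy as np

values = np.array([1, 2, 1023], dtype=">u2")  # big-endian 16-bit container

print(values.dtype.itemsize)                 # 2 (bytes per word)
print(values.tobytes().hex())                # 0001000203ff
print(values.astype("<u2").tobytes().hex())  # 01000200ff03
```

With big-endian storage the most significant byte of each word comes first, so the byte dump reads in the same order as the binary representation of the values.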
Limitations
- Maximum word size: 64 bits
- Only unsigned integers are supported
- The axis parameter is not supported for np.unpackbits on VarUIntArray
Examples
Complete Workflow
>>> import numpy as np
>>> from varuintarray import VarUIntArray
# Create some 5-bit values
>>> data = VarUIntArray([31, 16, 0, 15], word_size=5)
# Unpack to bits
>>> bits = data.unpackbits()
>>> bits
array([[1, 1, 1, 1, 1],
[1, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 1, 1, 1, 1]], dtype=uint8)
# Flip specific bits
>>> bits[:, 0] = 1 - bits[:, 0] # Flip first bit
# Pack back
>>> result = VarUIntArray.packbits(bits)
>>> result
VarUIntArray([15, 0, 16, 31], dtype=uint8, word_size=5)
# Bitwise operations
>>> result.invert()
VarUIntArray([16, 31, 15, 0], dtype=uint8, word_size=5)
Serialization
>>> from varuintarray import VarUIntArray
>>> import json
# Serialize dict
>>> arr = VarUIntArray([100, 200, 300], word_size=12)
>>> serialized = arr.to_dict()
>>> serialized
{'word_size': 12, 'values': [100, 200, 300]}
# Deserialize dict
>>> VarUIntArray.from_dict(serialized)
VarUIntArray([100, 200, 300], dtype='>u2', word_size=12)
# Serialize JSON
>>> serialized = arr.to_json()
>>> serialized
'{"word_size": 12, "values": [100, 200, 300]}'
# Deserialize JSON
>>> VarUIntArray.from_json(serialized)
VarUIntArray([100, 200, 300], dtype='>u2', word_size=12)
License
varuintarray is licensed under the MIT License. See the LICENSE file for details.
File details
Details for the file varuintarray-1.0.5.tar.gz.
File metadata
- Download URL: varuintarray-1.0.5.tar.gz
- Upload date:
- Size: 46.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.10.8 {"installer":{"name":"uv","version":"0.10.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6965110d835945b35ceac8f350bd99f120ddd9af1c616b9d3be6d33987e6ea46 |
| MD5 | 27180a17c09ebbaf6beec1c7deb1314e |
| BLAKE2b-256 | afa901944c9357844931ae979a80946b6391e08a5498395f423b7a323bd0349d |
File details
Details for the file varuintarray-1.0.5-py3-none-any.whl.
File metadata
- Download URL: varuintarray-1.0.5-py3-none-any.whl
- Upload date:
- Size: 10.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.10.8 {"installer":{"name":"uv","version":"0.10.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eed179a98363e396116ddd083acca20c2b3e3b1f9a00dc5acd209c6bf84cd943 |
| MD5 | 8216e62d7d2682a6cf72c0c3c2cc8689 |
| BLAKE2b-256 | 0848ad5b147be66cc2680fa5f19d2abce1fe68623276284c6ca801c530943633 |