PyArrow extension array of unsigned integers with arbitrary fixed bit width N in [1, 64].

fletchr-uintn

A PyArrow extension type for unsigned integers of arbitrary fixed bit width N ∈ [1, 64]. The bit width lives in the Arrow type rather than as sidecar schema metadata, so mismatched widths fail loudly on concat, the width survives slice / cast / IPC / Parquet round-trips, and any column-level operation that wants to know "how many bits does this hold" reads it off column.type.bits.

Why?

PyArrow's built-in uint8 / uint16 / uint32 / uint64 cover only the four power-of-two widths native to most CPUs. Protocol and binary formats routinely use other widths (10, 12, 14, 24, 48), and both of the usual workarounds fall short: over-allocating (uint16 for a 12-bit field) loses the width constraint on bitwise ops, and passing the width out-of-band in schema metadata loses it on the next slice. fletchr.uintn(bits=N) puts the width in the type and ships bit-width-safe kernels that keep padding bits zero across every operation.
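To make the over-allocation problem concrete, here is a minimal pure-Python sketch of what happens when a 12-bit field is inverted inside a 16-bit container. The helper names are hypothetical and exist only for illustration; they are not part of fletchr-uintn.

```python
CONTAINER_BITS = 16  # over-allocated storage width (a uint16)
FIELD_BITS = 12      # the width the format actually defines


def invert_in_container(value: int) -> int:
    """Bitwise NOT as a plain 16-bit unsigned integer would compute it."""
    return ~value & ((1 << CONTAINER_BITS) - 1)


def invert_in_field(value: int) -> int:
    """Bitwise NOT constrained to the declared 12-bit width."""
    return ~value & ((1 << FIELD_BITS) - 1)


leaked = invert_in_container(0)  # 65535: the four padding bits are now set
masked = invert_in_field(0)      # 4095: the largest valid 12-bit value
```

The container-wide result is not a valid 12-bit value, which is the failure a width-aware type is meant to rule out.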

Features

  • Storage in the smallest native uint8 / uint16 / uint32 / uint64 container that fits N; padding bits above N are kept zero across construction and every operation.
  • Lossless round-trip through Arrow IPC, Arrow Flight, and Parquet. Readers that don't have the extension registered see the raw uintN storage transparently — no exotic types in the wire format.
  • Full Arrow null support via the standard validity bitmap.
  • Bit-width-safe bitwise operators (~, &, |, ^, shifts, popcount, bit-reversal) — padding bits never leak.
  • Cross-language wire format pinned in SPEC.md so Arrow readers in Java, C++, Go, R, JavaScript, etc. can implement compatible deserializers.
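The container choice and the padding rule in the list above can be sketched in a few lines of pure Python. This is an illustration of the idea under stated assumptions, not the library's kernels: `container_bits`, `shift_left`, and `reverse_bits` are hypothetical names.

```python
def container_bits(bits: int) -> int:
    """Smallest native unsigned width (8, 16, 32 or 64) that holds `bits` bits."""
    if not 1 <= bits <= 64:
        raise ValueError("bits must be in [1, 64]")
    for width in (8, 16, 32, 64):
        if bits <= width:
            return width


def shift_left(value: int, amount: int, bits: int) -> int:
    """Left shift that discards anything pushed past the declared width."""
    return (value << amount) & ((1 << bits) - 1)


def reverse_bits(value: int, bits: int) -> int:
    """Reverse the low `bits` bits; the result never exceeds the declared width."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


container_bits(12)        # 16
container_bits(24)        # 32
shift_left(4095, 1, 12)   # 4094: the carried-out bit is dropped, not stored in padding
reverse_bits(1, 12)       # 2048: bit 0 moves to bit 11, not bit 15
```

Masking or looping over the declared width rather than the container width is what keeps the padding bits zero after each operation.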

Install

uv add fletchr-uintn        # or: pip install fletchr-uintn

Requires Python 3.9+, NumPy 2.0+, and PyArrow 17+.

Quickstart

import pyarrow as pa
import pyarrow.parquet as pq
from fletchr_uintn import uintn_array

# 12-bit values — fit in a uint16 container, but the type knows it's 12 bits.
a = uintn_array([0, 1, 4095, None, 100], bits=12)
a.type            # UIntNType(bits=12)
a.to_pylist()     # [0, 1, 4095, None, 100]

# Bitwise ops respect the declared width: ~0 is 4095, not 65535.
(~a).to_pylist()  # [4095, 4094, 0, None, 3995]

# Composes as a column inside any pa.Table; round-trips through Parquet.
pq.write_table(pa.table({"x": a}), "out.parquet")
back = pq.read_table("out.parquet").column("x")
assert back.type.bits == 12

# Mismatched bit widths fail at the Arrow type system, not silently:
pa.concat_arrays([a, uintn_array([0, 1], bits=10)])  # raises ArrowInvalid

Public API

from fletchr_uintn import (
    UIntNType,        # the pa.ExtensionType
    UIntNArray,       # the pa.ExtensionArray (with bitwise methods)
    uintn_array,      # validated factory; dispatches on input type
    pack_bits,        # inverse of UIntNArray.unpack_bits
)

The extension type registers itself on import, so any pa.Table deserialized after import fletchr_uintn will surface UIntNArray columns instead of raw uintN storage.

License

MIT.
