
NumPy structured array utilities — joining, flattening, field views, enum mapping, and position arrays


NumPy Utils

NumPy structured array utilities — dtype construction, field views, joining, enum mapping, position arrays, and byte/string conversion for C++ interop.

Installation

pip install "vcti-nputils>=1.4.0"

In pyproject.toml dependencies:

dependencies = [
    "vcti-nputils>=1.4.0",
]

Quick Start

import numpy as np
from vcti.nputils import (
    as_ndarray,
    check_overflow,
    decode_field,
    drop_fields,
    encode_field,
    fields_view,
    flatten_dtype,
    join_struct_arrays,
    merge_adjacent_fields,
    name_array,
    position_array,
    rename_fields,
    structured_dtype,
    with_encoding,
)

# Join structured arrays horizontally
dt1 = np.dtype([('id', 'i4'), ('value', 'f8')])
dt2 = np.dtype([('name', 'U10')])
arr1 = np.array([(1, 1.5), (2, 2.5)], dtype=dt1)
arr2 = np.array([('Alice',), ('Bob',)], dtype=dt2)
joined = join_struct_arrays([arr1, arr2])
# dtype: [('id', 'i4'), ('value', 'f8'), ('name', 'U10')]

# Create a zero-copy view with selected fields
view = fields_view(joined, ['id', 'name'])

# Drop fields from a structured array (zero-copy)
clean = drop_fields(joined, ['value'])

# Build a structured dtype from a scalar dtype + names
coord_dt = structured_dtype('f8', ['x', 'y', 'z'])
# dtype([('x', '<f8'), ('y', '<f8'), ('z', '<f8')])

# Rename fields in a dtype
new_dt = rename_fields(dt1, {'id': 'node_id', 'value': 'temperature'})

# Flatten array fields into individual columns (default naming)
dt = np.dtype([('id', 'i4'), ('coords', 'f8', (3,))])
_, cols = flatten_dtype(dt)
# cols: ['id', 'coord_0', 'coord_1', 'coord_2']

# Flatten with explicit per-field names
_, cols = flatten_dtype(dt, field_names={'coords': ['x', 'y', 'z']})
# cols: ['id', 'x', 'y', 'z']

# Flatten with a custom format string
_, cols = flatten_dtype(dt, fmt="{name}[{dim}]")
# cols: ['id', 'coord[0]', 'coord[1]', 'coord[2]']

# Merge adjacent 'S' fields into one (pure dtype view). Multiple merges
# can be specified at once; same-field overlap and name collisions are
# validated before anything is returned.
dt = np.dtype([
    ('first', 'S4'), ('last', 'S6'),
    ('city', 'S8'), ('state', 'S2'),
    ('age', 'i4'),
])
merged = merge_adjacent_fields(dt, {
    'name':    ['first', 'last'],
    'address': ['city', 'state'],
})
# dtype([('name', 'S10'), ('address', 'S10'), ('age', '<i4')])

# Map numeric enum values to names
enum_dict = {1: 'ACTIVE', 2: 'INACTIVE', 3: 'PENDING'}
names = name_array(np.array([1, 2, 1, 3]), enum_dict)

# Convert counts to cumulative offsets
offsets = position_array(np.array([3, 2, 4, 1]))
# array([0, 3, 5, 9, 10])

# Safely coerce inputs to ndarray
arr = as_ndarray([1, 2, 3], dtype=np.float64)
empty = as_ndarray(None)  # array([], dtype=float64)

# Byte <-> string conversion for C++/pybind11 interop
dt = np.dtype([('name', 'S10'), ('name_length', 'i4')])
sa = np.zeros(2, dtype=dt)
encode_field(sa, 'name', ['Alice', 'Bob'], length_field='name_length')
decoded = decode_field(sa, 'name')
overflow = check_overflow(sa, 'name', 'name_length')

# Attach encoding to a dtype so decode_field/encode_field use it automatically
name_dt = with_encoding(np.dtype('S32'), 'latin-1')

Module layout

Each category lives in its own module. All public functions are re-exported from vcti.nputils.

| Module | Functions |
| --- | --- |
| dtype_utils | structured_dtype, flatten_dtype (+ flatten_record_dtype alias), merge_adjacent_fields, rename_fields |
| view_utils | fields_view, drop_fields |
| join_utils | join_struct_arrays |
| mapping_utils | name_array |
| offset_utils | position_array |
| coerce_utils | as_ndarray |
| byte_utils | string_from_bytes, bytes_from_string, decode_column, encode_column, decode_field, encode_field, check_overflow, get_encoding, with_encoding |

Functions

Dtype construction & transformation

| Function | Purpose |
| --- | --- |
| structured_dtype(dtype, names) | Build a structured dtype from a scalar or subdtype plus field names |
| flatten_dtype(dt, *, field_names, fmt, strip_plural) | Expand array fields into scalars with flexible naming |
| flatten_record_dtype(dt, ...) | Legacy alias for flatten_dtype |
| merge_adjacent_fields(dt, merges) | Merge one or more groups of adjacent 'S' fields into a single field each (pure dtype view) |
| rename_fields(dt, mapping) | Return a new dtype with fields renamed |
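
To make "expanding array fields" and "pure dtype view" concrete, here is a minimal sketch in plain NumPy. It is not the library's implementation: flat_column_names is a hypothetical helper written for illustration, it skips the strip_plural handling (so it yields 'coords_0' where the Quick Start shows 'coord_0'), and the merged dtype is written out by hand rather than produced by merge_adjacent_fields.

```python
import numpy as np

def flat_column_names(dtype, fmt="{name}_{dim}"):
    """Hypothetical helper: one column name per scalar element of each field."""
    cols = []
    for name in dtype.names:
        shape = dtype[name].shape          # () for a scalar field
        if not shape:
            cols.append(name)
        else:
            count = int(np.prod(shape))    # flat index over all sub-array elements
            cols.extend(fmt.format(name=name, dim=i) for i in range(count))
    return cols

dt = np.dtype([('id', 'i4'), ('coords', 'f8', (3,))])
cols = flat_column_names(dt)

# Two adjacent byte fields occupy one contiguous run of bytes, so a dtype
# with a single wider 'S' field of the same total size can view the same
# buffer without copying anything.
packed = np.dtype([('first', 'S4'), ('last', 'S6'), ('age', 'i4')])
merged = np.dtype([('name', 'S10'), ('age', 'i4')])
people = np.array([(b'Anna', b'Miller', 30)], dtype=packed)
as_merged = people.view(merged)
```

Both dtypes have the same item size (14 bytes), which is why the view is legal. Note that a short value in the first field would leave its null padding in the middle of the merged bytes.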

Zero-copy views

| Function | Purpose |
| --- | --- |
| fields_view(sa, fields) | View containing only the selected fields |
| drop_fields(sa, exclude) | View containing all fields except those excluded |
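
As an illustration of what "zero-copy" means here, the sketch below uses NumPy's own multi-field indexing, which returns a view on NumPy 1.16 and later. Whether fields_view and drop_fields are built on this mechanism is an assumption; the example only shows the behaviour a caller can expect from a view.

```python
import numpy as np

sa = np.zeros(3, dtype=[('id', 'i4'), ('value', 'f8'), ('flag', 'u1')])

# Indexing with a list of field names returns a view (NumPy >= 1.16).
sub = sa[['id', 'flag']]

# Writing through the view changes the original array.
sub['id'] = [10, 20, 30]

# The view keeps the original field offsets, so its item size still
# matches the parent record even though 'value' is no longer visible.
same_layout = sub.dtype.itemsize == sa.dtype.itemsize
```

Because the view shares memory with its parent, call .copy() on it when an independent array is needed.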

Joining

| Function | Purpose |
| --- | --- |
| join_struct_arrays(arrays) | Join structured arrays horizontally by combining fields |
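
A horizontal join can be sketched in plain NumPy as "build the combined dtype, allocate, copy field by field". The function join_fields below is a hypothetical stand-in written for illustration; it is not join_struct_arrays and may differ from it in validation and copying strategy.

```python
import numpy as np

def join_fields(arrays):
    """Hypothetical sketch: combine the fields of equal-length structured arrays."""
    length = len(arrays[0])
    if any(len(a) != length for a in arrays):
        raise ValueError("all arrays must have the same length")

    # Concatenate the field descriptions; np.dtype raises ValueError
    # if the same field name appears twice.
    descr = []
    for a in arrays:
        descr.extend((name, a.dtype[name]) for name in a.dtype.names)

    out = np.empty(length, dtype=descr)
    for a in arrays:
        for name in a.dtype.names:
            out[name] = a[name]
    return out

arr1 = np.array([(1, 1.5), (2, 2.5)], dtype=[('id', 'i4'), ('value', 'f8')])
arr2 = np.array([('Alice',), ('Bob',)], dtype=[('name', 'U10')])
joined = join_fields([arr1, arr2])
```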

Mapping, offsets, coercion

| Function | Purpose |
| --- | --- |
| name_array(nparray, enum_dict, default) | Map numeric values to string names |
| position_array(counts, dtype) | Convert count array to cumulative offset array |
| as_ndarray(value, dtype) | Coerce None, list, or ndarray to ndarray |
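
The offset and enum mappings are easy to picture with plain NumPy. The sketch below reproduces the Quick Start offsets with a cumulative sum and shows why offsets are useful for slicing a flat buffer. The 'UNKNOWN' fallback is an assumption made for illustration; the default value that name_array actually uses is not stated here.

```python
import numpy as np

# Counts per group -> start/end offsets into a flat buffer.
counts = np.array([3, 2, 4, 1])
offsets = np.concatenate(([0], np.cumsum(counts)))

# Group i occupies flat[offsets[i]:offsets[i + 1]].
flat = np.arange(offsets[-1])
third_group = flat[offsets[2]:offsets[3]]

# Enum codes -> names, with a fallback for codes missing from the dict
# ('UNKNOWN' is a placeholder chosen for this sketch).
enum_dict = {1: 'ACTIVE', 2: 'INACTIVE', 3: 'PENDING'}
codes = np.array([1, 2, 1, 3, 7])
names = np.array([enum_dict.get(int(c), 'UNKNOWN') for c in codes])
```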

Byte / string conversion (pybind11 interop)

| Function | Purpose |
| --- | --- |
| string_from_bytes(value, encoding) | Decode a single bytes value, stripping null padding |
| bytes_from_string(value, length, encoding) | Encode to fixed-length bytes (pad or truncate) |
| decode_column(byte_array, encoding) | Vectorized decode of a byte column to strings |
| encode_column(strings, length, encoding) | Vectorized encode to (bytes, lengths) |
| decode_field(sa, field_name, *, encoding) | Decode a byte field in a structured array |
| encode_field(sa, field_name, strings, *, length_field, encoding) | Encode strings into a byte field, optionally populating a paired length field |
| check_overflow(sa, field_name, length_field) | Detect rows where the original encoded byte length exceeded the field |
| get_encoding(dtype, default) | Read encoding from dtype.metadata['encoding'] |
| with_encoding(dtype, encoding) | Attach encoding to a scalar dtype via metadata |
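
The pad, truncate, and overflow behaviour described above can be sketched with plain NumPy. This is an illustration of the idea under stated assumptions, not the library's code: the loop stands in for encode_field, the comparison stands in for check_overflow, and the metadata lines only show the NumPy mechanism that get_encoding is described as reading.

```python
import numpy as np

WIDTH = 10
sa = np.zeros(2, dtype=[('name', f'S{WIDTH}'), ('name_length', 'i4')])

for i, text in enumerate(['Bob', 'Alexandria-Marie']):
    raw = text.encode('utf-8')
    sa['name'][i] = raw              # NumPy null-pads short values and truncates long ones
    sa['name_length'][i] = len(raw)  # keep the original byte length alongside

# Reading an 'S' field strips trailing null bytes, so decoding is direct.
decoded = [b.decode('utf-8') for b in sa['name']]

# A stored length larger than the field width means the value was cut off.
overflow = sa['name_length'] > WIDTH

# NumPy dtypes can carry arbitrary metadata; an encoding can ride along there.
name_dt = np.dtype('S32', metadata={'encoding': 'latin-1'})
encoding = name_dt.metadata['encoding']
```

Truncating at a byte boundary can split a multi-byte UTF-8 character, so a paired length field is a cheap way to detect rows that need attention before decoding.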

Dependencies


