Presentation layer for NumPy structured arrays — column filtering, dtype flattening, enum mapping, array slicing, and adapters

Array Display

Presentation layer for NumPy arrays — structured, 1-D, or 2-D.

ArrayDisplay wraps a NumPy array reference (no copy) and provides seven capabilities for controlling how the data is presented. Plain 1-D and 2-D arrays are reinterpreted as structured via zero-copy np.ndarray.view() under the hood.

  1. Column filtering — hide columns by name, regex, or predicate
  2. Dtype flattening — expand vector/matrix fields into scalar columns with custom component names
  3. Column grouping — bundle related columns under a shared label for two-level header rendering
  4. Enum mapping — lazily map integer ID columns to human-readable names
  5. Row index — designate a row identifier (column name, group name, or external ndarray)
  6. Array slicing — create lightweight index arrays for bounded access
  7. Adapters — render as text table or pandas DataFrame on bounded slices

Installation

pip install "vcti-array-display>=3.1.0"

With pandas support (optional):

pip install "vcti-array-display[pandas]>=3.1.0"

Input Shapes

ArrayDisplay accepts three input shapes. Structured arrays are used as-is; plain 1-D and 2-D arrays are reinterpreted as structured via np.ndarray.view() (zero-copy for C-contiguous input).

import numpy as np
from vcti.arraydisplay import ArrayDisplay

# Structured array — fields already named
dt = np.dtype([('id', 'i4'), ('value', 'f8')])
arr = np.array([(1, 10.0), (2, 20.0)], dtype=dt)
view = ArrayDisplay(arr)                # view.view_columns == ['id', 'value']

# Plain 1-D — single column, default label "value"
view = ArrayDisplay(np.array([1.5, 2.0, 3.5]))
# view.view_columns == ['value']
view = ArrayDisplay(np.array([1.5, 2.0, 3.5]), field_names=['temperature'])

# Plain 2-D — auto-named col_0, col_1, ...
arr2 = np.random.rand(1000, 3)
view = ArrayDisplay(arr2)               # ['col_0', 'col_1', 'col_2']
view = ArrayDisplay(arr2, field_names=['x', 'y', 'z'])           # explicit
view = ArrayDisplay(arr2, field_name_prefix='comp')              # ['comp_0', ...]

Non-contiguous 2-D arrays raise ValueError — call np.ascontiguousarray() first if you accept the copy cost. ndim > 2 is rejected explicitly; reshape first.
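
The zero-copy reinterpretation described above can be sketched with plain NumPy. This only illustrates the np.ndarray.view() technique and is not the library's actual code; the helper name as_structured and its error messages are hypothetical.

```python
import numpy as np


def as_structured(arr2d, names):
    """Reinterpret a C-contiguous 2-D array as a 1-D structured array without copying.

    Hypothetical helper for illustration; not part of vcti-array-display.
    """
    if arr2d.ndim != 2:
        raise ValueError("expected a 2-D array; reshape first")
    if not arr2d.flags["C_CONTIGUOUS"]:
        raise ValueError("array is not C-contiguous; call np.ascontiguousarray() first")
    if len(names) != arr2d.shape[1]:
        raise ValueError("need exactly one field name per column")
    # One field per column, all sharing the source dtype, so the bytes of each
    # row line up exactly with one structured record.
    row_dtype = np.dtype([(name, arr2d.dtype) for name in names])
    return arr2d.view(row_dtype).reshape(arr2d.shape[0])


plain = np.arange(6, dtype="f8").reshape(2, 3)
structured = as_structured(plain, ["x", "y", "z"])

print(structured.dtype.names)               # field names taken from `names`
print(np.shares_memory(plain, structured))  # True when no copy was made
```

A transposed or strided array fails the contiguity check in this sketch, which mirrors the ValueError behaviour described above.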


1. Column Filtering

Hide unwanted columns using any combination of exact names, regex patterns, and callable predicates.

import re
from vcti.arraydisplay import ArrayDisplay, FILLER_COLUMNS, LENGTH_COLUMNS, VOID_COLUMNS

dt = np.dtype([
    ('id', 'i4'), ('f0', 'V4'), ('label', 'U20'),
    ('label_len', 'i4'), ('value', 'f8'), ('f3', 'V8'),
])
arr = np.zeros(10, dtype=dt)

# Pre-built patterns for C++ interop noise
view = ArrayDisplay(arr, exclude_columns=[FILLER_COLUMNS, LENGTH_COLUMNS])
view.view_columns  # ['id', 'label', 'value']

# Or exclude by dtype — catches all void padding regardless of name
view = ArrayDisplay(arr, exclude_columns=[VOID_COLUMNS, LENGTH_COLUMNS])

# Mix exact names, regex, and callables
view = ArrayDisplay(arr, exclude_columns=[
    FILLER_COLUMNS,                        # regex: ^f\d+$
    LENGTH_COLUMNS,                        # regex: _len$
    "debug_flag",                          # exact name
    re.compile(r"^tmp_"),                  # custom regex
    lambda name, dtype: dtype.kind == 'V', # by dtype
])

No magic defaults — ArrayDisplay(arr) shows all columns. Pre-built patterns FILLER_COLUMNS, LENGTH_COLUMNS, and VOID_COLUMNS are opt-in. Callables receive (name, dtype) for filtering by type, shape, or both.
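
To make the rule types concrete, here is a minimal sketch of how exact names, compiled regexes, and (name, dtype) predicates could be combined into one exclusion pass. The function visible_columns is hypothetical, and the two regexes are copied from the comments in the example above; the library's real matching logic may differ.

```python
import re

import numpy as np


def visible_columns(dtype, exclude):
    """Return field names of `dtype` that no rule in `exclude` matches.

    Hypothetical helper: a rule is an exact name (str), a compiled regex,
    or a callable taking (name, field_dtype).
    """
    def is_excluded(name, field_dtype):
        for rule in exclude:
            if isinstance(rule, str):
                if rule == name:
                    return True
            elif isinstance(rule, re.Pattern):
                if rule.search(name):
                    return True
            elif callable(rule) and rule(name, field_dtype):
                return True
        return False

    return [name for name in dtype.names if not is_excluded(name, dtype[name])]


dt = np.dtype([
    ("id", "i4"), ("f0", "V4"), ("label", "U20"),
    ("label_len", "i4"), ("value", "f8"), ("f3", "V8"),
])

by_pattern = visible_columns(dt, [re.compile(r"^f\d+$"), re.compile(r"_len$")])
by_dtype = visible_columns(dt, [lambda name, d: d.kind == "V", re.compile(r"_len$")])
print(by_pattern)
print(by_dtype)
```

With an empty rule list every field stays visible, matching the "no magic defaults" behaviour.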

2. Dtype Flattening

Expand vector and matrix fields into individual scalar columns, with optional user-defined component names.

dt = np.dtype([('id', 'i4'), ('position', 'f8', (3,))])
arr = np.zeros(5, dtype=dt)

# Default numeric suffixes (flatten_dtype=True is the default)
view = ArrayDisplay(arr)
view.view_columns  # ['id', 'position_0', 'position_1', 'position_2']

# Custom component names
view = ArrayDisplay(arr, component_names={'position': ['x', 'y', 'z']})
view.view_columns  # ['id', 'position_x', 'position_y', 'position_z']

# Component grouping is tracked for consumers (e.g., header spanning)
view.field_components
# {'position': ['position_x', 'position_y', 'position_z']}

# Rename after construction
view.set_component_names('position', ['lat', 'lon', 'alt'])

# Opt out of flattening to preserve the original dtype
raw_view = ArrayDisplay(arr, flatten_dtype=False)
raw_view.view_columns  # ['id', 'position']  (single column, not useful for display)

Note: Default flattened names (e.g., position_0) are generated by vcti-nputils and may truncate long field names. Use component_names to ensure predictable, readable column names regardless of the defaults.
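
As a rough illustration of what flattening involves, the sketch below builds a scalar-only dtype from a dtype with sub-array fields and applies it as a zero-copy view. It is written against plain NumPy, not vcti-nputils; the function flattened_dtype and its default suffix scheme (index tuples joined with underscores) are assumptions, and it presumes a packed dtype without alignment padding.

```python
import numpy as np


def flattened_dtype(dtype, component_names=None):
    """Expand sub-array fields into one scalar field per component.

    Hypothetical helper; assumes `dtype` is packed (no alignment padding)
    so the result has the same itemsize as the input.
    """
    component_names = component_names or {}
    fields = []
    for name in dtype.names:
        base, shape = dtype[name].base, dtype[name].shape
        if not shape:  # scalar field: keep as-is
            fields.append((name, base))
            continue
        # Default suffixes follow C order: 0, 1, 2 or 0_0, 0_1, ...
        default = ["_".join(str(i) for i in idx) for idx in np.ndindex(*shape)]
        suffixes = component_names.get(name, default)
        if len(suffixes) != len(default):
            raise ValueError(f"{name!r} needs {len(default)} component names")
        fields.extend((f"{name}_{suffix}", base) for suffix in suffixes)
    return np.dtype(fields)


dt = np.dtype([("id", "i4"), ("position", "f8", (3,))])
arr = np.zeros(2, dtype=dt)
arr["position"][1] = [1.0, 2.0, 3.0]

flat = arr.view(flattened_dtype(dt, {"position": ["x", "y", "z"]}))
print(flat.dtype.names)
print(flat["position_y"])
```

Because the flattened dtype has the same byte layout, the view shares memory with the original array.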

3. Column Grouping

Bundle existing columns under a shared label. Rendered as a level-1 header in to_table() spanning the group's member columns. Purely visual — does not by itself affect to_dataframe(). Multiple groups may coexist.

dt = np.dtype([
    ('element_idx', 'i4'), ('face_num', 'i4'),
    ('stress', 'f8'),
])
arr = np.array([(1, 0, 34.2), (1, 1, 45.3), (2, 0, 67.8)], dtype=dt)

view = ArrayDisplay(arr)
view.add_column_group('loc', ['element_idx', 'face_num'])

print(view.to_table())
#          loc           |
# element_idx | face_num | stress
# ------------+----------+-------
#           1 |        0 |   34.2
#           1 |        1 |   45.3
#           2 |        0 |   67.8

Group names must not collide with dtype field names or existing group names. remove_column_group(name) removes a group; if it happens to be the current row index, the index is cleared as well.
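
The collision rules can be pictured with a small registry. This is a sketch under stated assumptions, not the library's implementation: the class ColumnGroups, its method names, and the exception types are all hypothetical.

```python
class ColumnGroups:
    """Hypothetical registry that maps a group label to its member columns."""

    def __init__(self, field_names):
        self._fields = list(field_names)
        self._groups = {}

    def add(self, name, columns):
        if name in self._fields:
            raise ValueError(f"group name {name!r} collides with a field name")
        if name in self._groups:
            raise ValueError(f"group {name!r} already exists")
        unknown = [col for col in columns if col not in self._fields]
        if unknown:
            raise KeyError(f"unknown columns: {unknown}")
        self._groups[name] = list(columns)

    def remove(self, name):
        self._groups.pop(name)

    def top_level_labels(self):
        """One level-1 label per field; empty string for ungrouped fields."""
        labels = []
        for field in self._fields:
            owner = next(
                (group for group, cols in self._groups.items() if field in cols), ""
            )
            labels.append(owner)
        return labels


groups = ColumnGroups(["element_idx", "face_num", "stress"])
groups.add("loc", ["element_idx", "face_num"])
labels = groups.top_level_labels()
print(labels)
```

A renderer could then merge adjacent equal labels into a single spanning header cell.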

4. Enum Mapping

Map integer ID columns to human-readable names. The mapping is stored as a recipe and resolved lazily — only when presenting a bounded slice.

dt = np.dtype([('id', 'i4'), ('element_type', 'i4'), ('value', 'f8')])
arr = np.array([(1, 1, 10.5), (2, 2, 20.3), (3, 1, 15.0)], dtype=dt)
view = ArrayDisplay(arr)

view.add_name_columns({
    'element_type_name': ('element_type', {1: 'QUAD', 2: 'HEX'}),
})
view.view_columns  # ['id', 'element_type_name', 'value']

# The mapping is NOT materialized yet — it's a recipe:
view.name_columns
# {'element_type_name': ('element_type', {1: 'QUAD', 2: 'HEX'})}
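
The idea of resolving a recipe only on a bounded slice can be sketched as follows. The helper resolve_names and the "?" placeholder for unmapped IDs are assumptions made for illustration; the source does not say how the library treats unmapped values.

```python
import numpy as np


def resolve_names(ids, mapping, default="?"):
    """Map integer IDs to names; `default` for unmapped IDs is an assumption."""
    return [mapping.get(int(i), default) for i in ids]


dt = np.dtype([("id", "i4"), ("element_type", "i4"), ("value", "f8")])
arr = np.array([(1, 1, 10.5), (2, 2, 20.3), (3, 1, 15.0)], dtype=dt)

# The recipe is just data: (source column, ID-to-name mapping).
recipe = ("element_type", {1: "QUAD", 2: "HEX"})
source, mapping = recipe

# Only the two displayed rows are resolved; the rest of the column is untouched.
window = arr[:2]
names = resolve_names(window[source], mapping)
print(names)
```

The cost of resolution scales with the size of the window, not with the length of the full array.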

5. Row Index

Designate what each row represents. Becomes the pandas Index / MultiIndex in to_dataframe() and renders left-most in to_table(). One row index is active at a time.

set_index(source) accepts three forms of source:

Single column (dtype field)

dt = np.dtype([('node_id', 'i4'), ('value', 'f8')])
arr = np.array([(1, 10.0), (2, 20.0), (3, 30.0)], dtype=dt)

view = ArrayDisplay(arr, index='node_id')
# to_table: "node_id" left-most, single-level header
# to_dataframe: pandas Index named "node_id"

Column group name (multi-column via a group)

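dt = np.dtype([('element_idx', 'i4'), ('face_num', 'i4'), ('stress', 'f8')])
arr = np.array([(1, 0, 34.2), (1, 1, 45.3), (2, 0, 67.8)], dtype=dt)

view = ArrayDisplay(arr)
view.add_column_group('face_id', ['element_idx', 'face_num'])
view.set_index('face_id')

# to_table: "face_id" level-1 over (element_idx, face_num), then data
# to_dataframe: pandas MultiIndex on (element_idx, face_num)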

External ndarray (shared across multiple ArrayDisplays)

face_id = np.array(
    [(1, 0), (1, 1), (2, 0)],
    dtype=[('element_idx', 'i4'), ('face_num', 'i4')],
)
stress_arr = np.array([(34.2, 12.1), (45.3, 8.7), (67.8, 22.4)],
                      dtype=[('xx', 'f8'), ('yy', 'f8')])
strain_arr = np.array([(0.01, 0.02), (0.03, 0.01), (0.02, 0.01)],
                      dtype=[('xx', 'f8'), ('yy', 'f8')])

# One identifier, two displays — no duplication of index data
stress = ArrayDisplay(stress_arr, index=face_id)
strain = ArrayDisplay(strain_arr, index=face_id)

Notes:

  • Raw values are used — enum name mappings (add_name_columns) don't affect the pandas index; this keeps .loc lookups stable
  • A level-1 header appears only when the index is a group name (the group's label is shown); single-column dtype-field index and external ndarray index produce single-level output
  • Plain 1-D external index uses "id" as the column label

6. Array Slicing

Create lightweight index arrays for bounded access. The index array is tiny regardless of the underlying array size.

idx = view.to_slice(head=20)                       # first 20 rows
idx = view.to_slice(tail=10)                       # last 10 rows
idx = view.to_slice(head=10, tail=5)               # first 10 + last 5
idx = view.to_slice(mask=arr['value'] > 50)        # boolean filter
idx = view.to_slice(indices=np.array([0, 100, 999]))  # explicit

view.array[idx]                                    # sliced data
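
For intuition, an index-array builder along these lines can be written in a few lines of NumPy. The function slice_indices is a hypothetical stand-in for to_slice(); in particular, de-duplicating overlapping head and tail ranges and returning all rows when no argument is given are assumptions of this sketch.

```python
import numpy as np


def slice_indices(n_rows, head=None, tail=None, mask=None, indices=None):
    """Build an integer index array for a bounded view of `n_rows` rows."""
    if mask is not None:
        return np.flatnonzero(mask)  # positions where the mask is True
    if indices is not None:
        return np.asarray(indices, dtype=np.intp)
    parts = []
    if head:
        parts.append(np.arange(min(head, n_rows)))
    if tail:
        parts.append(np.arange(max(n_rows - tail, 0), n_rows))
    if not parts:
        return np.arange(n_rows)
    # np.unique sorts and removes rows shared by overlapping head and tail.
    return np.unique(np.concatenate(parts))


idx = slice_indices(1000, head=10, tail=5)
values = np.array([10.0, 60.0, 70.0])
hot = slice_indices(len(values), mask=values > 50)
print(idx.size, hot)
```

The head/tail result holds H + T integers at most, which is why it stays small even when the underlying array has millions of rows.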

7. Adapters

Text table (pure numpy — no extra dependencies)

print(view.to_table(head=10, tail=5))
# id | element_type_name | value
# ---+-------------------+------
#  1 | QUAD              |  10.5
#  2 | HEX               |  20.3
# ...
# [1000 rows x 3 columns]
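
A bounded text renderer in the same spirit can be sketched with NumPy and string formatting alone. The function render_table below is hypothetical and much simpler than to_table(): it shows only the first max_rows rows, has no tail, grouping, or enum support, and its alignment rules (numbers right, everything else left) are an assumption.

```python
import numpy as np


def render_table(arr, max_rows=5):
    """Render the first `max_rows` records of a structured array as text."""
    names = arr.dtype.names
    window = arr[:max_rows]  # only this slice is ever converted to strings
    columns = [[str(v) for v in window[name].tolist()] for name in names]
    widths = [
        max([len(name)] + [len(cell) for cell in col])
        for name, col in zip(names, columns)
    ]
    numeric = [arr.dtype[name].kind in "biuf" for name in names]

    lines = [
        " | ".join(name.ljust(w) for name, w in zip(names, widths)),
        "-+-".join("-" * w for w in widths),
    ]
    for row in zip(*columns):
        cells = [
            cell.rjust(w) if is_num else cell.ljust(w)
            for cell, w, is_num in zip(row, widths, numeric)
        ]
        lines.append(" | ".join(cells))
    lines.append(f"[{len(arr)} rows x {len(names)} columns]")
    return "\n".join(lines)


dt = np.dtype([("id", "i4"), ("value", "f8")])
arr = np.array([(1, 10.5), (2, 20.25), (3, 15.0)], dtype=dt)
table = render_table(arr, max_rows=2)
print(table)
```

The footer reports the full row count while the body touches only the displayed window.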

pandas DataFrame (optional dependency)

df = view.to_dataframe()                   # full array
df = view.to_dataframe(head=100)           # first 100 rows
df = view.to_dataframe(resolve_names=False)  # raw ID columns

Enum names are materialized using pd.Categorical for memory efficiency.
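
The memory benefit comes from storing small integer codes plus one table of distinct names instead of a string per row. The NumPy-only analogy below illustrates that encoding; it is not how the library or pandas is implemented internally, and the savings depend on how the strings would otherwise be stored.

```python
import numpy as np

# 5000 repeated enum names stored as fixed-width unicode strings.
names = np.array(["QUAD", "HEX", "QUAD", "QUAD", "HEX"] * 1000)

# Categorical-style encoding: unique categories plus one small code per row.
categories, codes = np.unique(names, return_inverse=True)
codes = codes.astype(np.int8)  # enough for up to 128 categories

# Decoding is a single fancy-index lookup into the category table.
restored = categories[codes]

print(names.nbytes, codes.nbytes + categories.nbytes)
```

The encoded form grows by one byte per row here, while the string form grows by the full string width per row.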

Jupyter notebooks

ArrayDisplay provides _repr_html_() — displays a bounded row window with enum names resolved automatically.


Performance

Designed for large CAE arrays (millions of rows, GBs of data):

  • No array copy: wraps a reference to the original numpy array
  • Lazy enum resolution: resolved only on the displayed slice
  • Index arrays: to_slice() returns kilobytes regardless of array size
  • Bounded presentation: to_table() and to_dataframe(head=N) never touch the full array
  • pd.Categorical: ~100x less memory than string columns for enum values

API Summary

Cost column:

  • Cheap — metadata only; no iteration over rows.
  • Moderate — O(displayed rows × cols) or O(N) one-off; safe for bounded slices.
  • Heavy — O(N × C) with allocation; avoid on large arrays unless you need it.

| Method | Pillar | Cost | Description |
|---|---|---|---|
| set_view_columns(...) | Filtering | Cheap | Configure visible columns |
| include_view_columns(cols) | Filtering | Cheap | Add columns to the view |
| exclude_view_columns(cols) | Filtering | Cheap | Remove columns from the view |
| replace_view_columns(mapping) | Filtering | Cheap | Rename columns in the view |
| set_array(arr, ...) | Flattening | Cheap | Set array; vector fields auto-flatten via zero-copy view |
| set_component_names(field, names) | Flattening | Cheap | Rename flattened components |
| add_column_group(name, cols) | Grouping | Cheap | Register a visual column group |
| remove_column_group(name) | Grouping | Cheap | Remove a group |
| add_name_columns(mappings) | Enum mapping | Cheap | Register lazy enum recipes |
| set_index(source) | Row index | Cheap | Set row identifier: column name, group name, or ndarray |
| clear_index() | Row index | Cheap | Remove the row identifier |
| to_slice(head=H, tail=T) | Slicing | Cheap | O(H + T) index array |
| to_slice(mask=M) | Slicing | Moderate | O(N) boolean scan |
| to_table(head=H, tail=T) | Adapter | Moderate | O((H + T) × cols) |
| to_dataframe(head=H) | Adapter | Moderate | O(H × cols) |
| to_dataframe() (no bounds) | Adapter | Heavy | O(N × C) pandas materialization |
| copy() | Misc | Cheap | Shared array, config deep-copied |

See docs/performance.md for full complexity analysis.

Note on to_dataframe(): Without head/tail/indices bounds, this method copies the full array into pandas — pandas cannot share memory across heterogeneous dtypes, so expect ~2× memory usage during conversion. Use the bounded form view.to_dataframe(head=100) for inspection and analysis; reserve the unbounded form for genuine full-array export (parquet / CSV / etc).

Examples

See examples/full_pipeline.py for a complete end-to-end script demonstrating all seven pillars.

Dependencies

  • numpy (>=1.24) — required
  • vcti-nputils (>=1.0.0) — required
  • pandas (>=2.0) — optional, for to_dataframe() and Jupyter display
