Presentation layer for NumPy structured arrays — column filtering, dtype flattening, enum mapping, array slicing, and adapters

Array Display

Presentation layer for NumPy arrays — structured, 1-D, or 2-D.

ArrayDisplay wraps a NumPy array reference (no copy) and provides five capabilities for controlling how the data is presented. Plain 1-D and 2-D arrays are reinterpreted as structured via zero-copy np.ndarray.view() under the hood.

  1. Column filtering — hide columns by name, regex, or predicate
  2. Dtype flattening — expand vector/matrix fields into scalar columns with custom component names
  3. Enum mapping — lazily map integer ID columns to human-readable names
  4. Array slicing — create lightweight index arrays for bounded access
  5. Adapters — render as text table or pandas DataFrame on bounded slices

Installation

pip install "vcti-array-display>=2.0.0"

With pandas support (optional):

pip install "vcti-array-display[pandas]>=2.0.0"

1. Column Filtering

Hide unwanted columns using any combination of exact names, regex patterns, and callable predicates.

import re
import numpy as np
from vcti.arraydisplay import ArrayDisplay, FILLER_COLUMNS, LENGTH_COLUMNS, VOID_COLUMNS

dt = np.dtype([
    ('id', 'i4'), ('f0', 'V4'), ('label', 'U20'),
    ('label_len', 'i4'), ('value', 'f8'), ('f3', 'V8'),
])
arr = np.zeros(10, dtype=dt)

# Pre-built patterns for C++ interop noise
view = ArrayDisplay(arr, exclude_columns=[FILLER_COLUMNS, LENGTH_COLUMNS])
view.view_columns  # ['id', 'label', 'value']

# Or exclude by dtype — catches all void padding regardless of name
view = ArrayDisplay(arr, exclude_columns=[VOID_COLUMNS, LENGTH_COLUMNS])

# Mix exact names, regex, and callables
view = ArrayDisplay(arr, exclude_columns=[
    FILLER_COLUMNS,                        # regex: ^f\d+$
    LENGTH_COLUMNS,                        # regex: _len$
    "debug_flag",                          # exact name
    re.compile(r"^tmp_"),                  # custom regex
    lambda name, dtype: dtype.kind == 'V', # by dtype
])

No magic defaults — ArrayDisplay(arr) shows all columns. Pre-built patterns FILLER_COLUMNS, LENGTH_COLUMNS, and VOID_COLUMNS are opt-in. Callables receive (name, dtype) for filtering by type, shape, or both.
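
The three exclusion forms can be read as a single predicate applied to each field. The sketch below uses plain NumPy and `re`; it is not the library's implementation. `is_excluded` and `visible_columns` are hypothetical helpers, and matching regexes with `search` is an assumption inferred from the anchored `^f\d+$` and `_len$` patterns shown above.

```python
import re
import numpy as np


def is_excluded(name, field_dtype, rules):
    """Return True if any rule (str, compiled regex, or callable) matches."""
    for rule in rules:
        if isinstance(rule, str):
            if name == rule:                 # exact name
                return True
        elif isinstance(rule, re.Pattern):
            if rule.search(name):            # regex (assumed: search, not fullmatch)
                return True
        elif callable(rule):
            if rule(name, field_dtype):      # predicate receives (name, dtype)
                return True
    return False


def visible_columns(dtype, rules):
    """List the field names of a structured dtype that no rule excludes."""
    return [
        name for name in dtype.names
        if not is_excluded(name, dtype.fields[name][0], rules)
    ]


dt = np.dtype([
    ('id', 'i4'), ('f0', 'V4'), ('label', 'U20'),
    ('label_len', 'i4'), ('value', 'f8'), ('f3', 'V8'),
])

by_name = visible_columns(dt, [re.compile(r"^f\d+$"), re.compile(r"_len$")])
by_kind = visible_columns(dt, [lambda name, dtype: dtype.kind == 'V', "label_len"])
print(by_name)  # ['id', 'label', 'value']
print(by_kind)  # ['id', 'label', 'value']
```

Both rule sets hide the same columns here; the dtype-based predicate keeps working if the padding fields are renamed.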

2. Dtype Flattening

Expand vector and matrix fields into individual scalar columns, with optional user-defined component names.

dt = np.dtype([('id', 'i4'), ('position', 'f8', (3,))])
arr = np.zeros(5, dtype=dt)

# Default numeric suffixes
view = ArrayDisplay(arr, flatten_dtype=True)
view.view_columns  # ['id', 'position_0', 'position_1', 'position_2']

# Custom component names
view = ArrayDisplay(arr, flatten_dtype=True,
                    component_names={'position': ['x', 'y', 'z']})
view.view_columns  # ['id', 'position_x', 'position_y', 'position_z']

# Component grouping is tracked for consumers (e.g., header spanning)
view.field_components
# {'position': ['position_x', 'position_y', 'position_z']}

# Rename after construction
view.set_component_names('position', ['lat', 'lon', 'alt'])

Note: Default flattened names (e.g., position_0) are generated by vcti-nputils and may truncate long field names. Use component_names to ensure predictable, readable column names regardless of the defaults.
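
To illustrate how a vector field can become scalar columns without copying, here is a minimal sketch in plain NumPy. It is not how vcti-nputils does it: `flat_dtype` is a hypothetical helper, it assumes a packed (unaligned) structured dtype, and the `field_suffix` naming simply mirrors the examples above.

```python
import numpy as np


def flat_dtype(dtype, component_names=None):
    """Build a scalar-only dtype with the same memory layout as `dtype`.

    Assumes a packed (unaligned) structured dtype, so field order alone
    determines the layout.
    """
    component_names = component_names or {}
    fields = []
    for name in dtype.names:
        field_dtype = dtype.fields[name][0]
        if field_dtype.subdtype is None:          # already a scalar field
            fields.append((name, field_dtype))
            continue
        base, shape = field_dtype.subdtype        # e.g. (float64, (3,))
        count = int(np.prod(shape))
        suffixes = component_names.get(name, [str(i) for i in range(count)])
        if len(suffixes) != count:
            raise ValueError(f"{name!r} needs {count} component names")
        fields.extend((f"{name}_{suffix}", base) for suffix in suffixes)
    return np.dtype(fields)


dt = np.dtype([('id', 'i4'), ('position', 'f8', (3,))])
arr = np.zeros(5, dtype=dt)
arr['position'] = np.arange(15.0).reshape(5, 3)

# Same bytes, reinterpreted with one scalar field per component.
flat = arr.view(flat_dtype(dt, {'position': ['x', 'y', 'z']}))
print(flat.dtype.names)              # ('id', 'position_x', 'position_y', 'position_z')
print(flat['position_y'].tolist())   # [1.0, 4.0, 7.0, 10.0, 13.0]
print(np.shares_memory(arr, flat))   # True
```

Because both dtypes have the same item size, `view()` only changes how the existing buffer is interpreted.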

3. Enum Mapping

Map integer ID columns to human-readable names. The mapping is stored as a recipe and resolved lazily — only when presenting a bounded slice.

dt = np.dtype([('id', 'i4'), ('element_type', 'i4'), ('value', 'f8')])
arr = np.array([(1, 1, 10.5), (2, 2, 20.3), (3, 1, 15.0)], dtype=dt)
view = ArrayDisplay(arr)

view.add_name_columns({
    'element_type_name': ('element_type', {1: 'QUAD', 2: 'HEX'}),
})
view.view_columns  # ['id', 'element_type_name', 'value']

# The mapping is NOT materialized yet — it's a recipe:
view.name_columns
# {'element_type_name': ('element_type', {1: 'QUAD', 2: 'HEX'})}
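
The idea of a lazily resolved recipe can be sketched in plain NumPy as follows. This is an illustration only: `resolve_names` is a hypothetical helper, and the `<unknown>` placeholder for unmapped IDs is an assumption, since the behavior for missing IDs is not described here.

```python
import numpy as np


def resolve_names(ids, mapping, missing="<unknown>"):
    """Map an array of integer IDs to names; unmapped IDs get `missing`."""
    return np.array([mapping.get(int(i), missing) for i in ids], dtype=object)


dt = np.dtype([('id', 'i4'), ('element_type', 'i4'), ('value', 'f8')])
arr = np.array(
    [(1, 1, 10.5), (2, 2, 20.3), (3, 1, 15.0), (4, 7, 0.0)], dtype=dt
)

# The recipe is just data: (source column, {id: name}). Nothing is computed yet.
recipe = ('element_type', {1: 'QUAD', 2: 'HEX'})

# Resolution happens only for the rows about to be displayed.
idx = np.array([0, 1, 3])
source, mapping = recipe
names = resolve_names(arr[source][idx], mapping)
print(names.tolist())  # ['QUAD', 'HEX', '<unknown>']
```

The cost of the lookup scales with the number of displayed rows, not with the length of the array.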

4. Array Slicing

Create lightweight index arrays for bounded access. For head and tail requests the index array holds only the requested rows, so it stays tiny regardless of the underlying array size.

idx = view.to_slice(head=20)                       # first 20 rows
idx = view.to_slice(tail=10)                       # last 10 rows
idx = view.to_slice(head=10, tail=5)               # first 10 + last 5
idx = view.to_slice(mask=arr['value'] > 50)        # boolean filter
idx = view.to_slice(indices=np.array([0, 100, 999]))  # explicit

view.array[idx]                                    # sliced data
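
As a rough picture of what a head/tail index array looks like, the sketch below builds one with plain NumPy. `head_tail_indices` is a hypothetical helper, and returning every row when head and tail overlap is an assumption, not documented behavior of to_slice().

```python
import numpy as np


def head_tail_indices(n_rows, head=0, tail=0):
    """Indices of the first `head` and last `tail` rows, without duplicates."""
    if head + tail >= n_rows:
        return np.arange(n_rows)
    return np.concatenate([np.arange(head), np.arange(n_rows - tail, n_rows)])


values = np.linspace(0.0, 99.9, 1_000_000)

idx = head_tail_indices(len(values), head=10, tail=5)
print(idx.size, values.nbytes)   # 15 8000000
print(idx[0], idx[-1])           # 0 999999

# A boolean mask converts to an index array with one O(N) scan.
mask_idx = np.flatnonzero(values > 99.89)
selected = values[mask_idx]
```

Fifteen integers describe the window into eight megabytes of data; the mask form costs one pass over the column and returns as many indices as there are matching rows.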

5. Adapters

Text table (pure numpy — no extra dependencies)

print(view.to_table(head=10, tail=5))
# id | element_type_name | value
# ---+-------------------+------
#  1 | QUAD              |  10.5
#  2 | HEX               |  20.3
# ...
# [1000 rows x 3 columns]
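
For readers curious how a bounded text table can be produced with NumPy alone, here is a small sketch. `render_table` is a hypothetical function written for this illustration; the alignment rules and footer format are assumptions modeled on the sample output above, not the library's code.

```python
import numpy as np


def render_table(arr, columns, idx):
    """Format rows `idx` of structured array `arr` as an aligned text table."""
    rows = [[str(arr[col][i]) for col in columns] for i in idx]
    widths = [
        max([len(col)] + [len(row[j]) for row in rows])
        for j, col in enumerate(columns)
    ]
    # Right-align numeric columns, left-align everything else.
    numeric = [arr.dtype[col].kind in "iuf" for col in columns]

    def fmt(cells):
        padded = (
            cell.rjust(w) if num else cell.ljust(w)
            for cell, w, num in zip(cells, widths, numeric)
        )
        return " | ".join(padded).rstrip()

    lines = [fmt(columns), "-+-".join("-" * w for w in widths)]
    lines += [fmt(row) for row in rows]
    lines.append(f"[{len(arr)} rows x {len(columns)} columns]")
    return "\n".join(lines)


dt = np.dtype([('id', 'i4'), ('kind', 'U4'), ('value', 'f8')])
arr = np.array([(1, 'QUAD', 10.5), (2, 'HEX', 20.3), (3, 'QUAD', 15.0)], dtype=dt)

# Only rows 0 and 1 are converted to strings; row 2 is never read.
table = render_table(arr, ['id', 'kind', 'value'], idx=[0, 1])
print(table)
```

String conversion is the expensive step, so restricting it to the index array keeps the work proportional to the displayed rows.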

pandas DataFrame (optional dependency)

df = view.to_dataframe()                   # full array
df = view.to_dataframe(head=100)           # first 100 rows
df = view.to_dataframe(resolve_names=False)  # raw ID columns

Enum names are materialized using pd.Categorical for memory efficiency.

Jupyter notebooks

ArrayDisplay provides _repr_html_() — displays a bounded row window with enum names resolved automatically.
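
Jupyter looks for a `_repr_html_()` method on the last expression of a cell and renders the returned HTML. The sketch below shows the general shape of a bounded implementation; `BoundedPreview` and its `max_rows` parameter are hypothetical and not part of this package.

```python
import html
import numpy as np


class BoundedPreview:
    """Hypothetical wrapper that shows at most `max_rows` rows in a notebook."""

    def __init__(self, array, max_rows=10):
        self.array = array          # reference only, no copy
        self.max_rows = max_rows

    def _repr_html_(self):
        names = self.array.dtype.names
        shown = self.array[: self.max_rows]      # basic slice: a view, not a copy
        header = "".join(f"<th>{html.escape(n)}</th>" for n in names)
        body = "".join(
            "<tr>"
            + "".join(f"<td>{html.escape(str(row[n]))}</td>" for n in names)
            + "</tr>"
            for row in shown
        )
        note = f"<p>{len(shown)} of {len(self.array)} rows</p>"
        return f"<table><tr>{header}</tr>{body}</table>{note}"


arr = np.zeros(1000, dtype=[('id', 'i4'), ('value', 'f8')])
markup = BoundedPreview(arr, max_rows=5)._repr_html_()
print(markup.count("<tr>"))   # 6 (one header row plus five data rows)
```

Only the first `max_rows` rows are formatted, so displaying the object in a notebook stays cheap even for very large arrays.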


Performance

Designed for large CAE arrays (millions of rows, GBs of data):

  • No array copy: wraps a reference to the original NumPy array
  • Lazy enum resolution: names are resolved only on the displayed slice
  • Index arrays: to_slice() returns kilobytes regardless of array size
  • Bounded presentation: to_table() and to_dataframe(head=N) never touch the full array
  • pd.Categorical: ~100x less memory than string columns for enum values

API Summary

Cost column:

  • Cheap — metadata only; no iteration over rows.
  • Moderate — O(displayed rows × cols) or O(N) one-off; safe for bounded slices.
  • Heavy — O(N × C) with allocation; avoid on large arrays unless you need it.

Method                            | Pillar       | Cost     | Description
----------------------------------+--------------+----------+--------------------------------------------------
set_view_columns(...)             | Filtering    | Cheap    | Configure visible columns
include_view_columns(cols)        | Filtering    | Cheap    | Add columns to the view
exclude_view_columns(cols)        | Filtering    | Cheap    | Remove columns from the view
replace_view_columns(mapping)     | Filtering    | Cheap    | Rename columns in the view
set_array(arr, ...)               | Flattening   | Cheap    | Set array; flatten_dtype=True uses zero-copy view
set_component_names(field, names) | Flattening   | Cheap    | Rename flattened components
add_name_columns(mappings)        | Enum mapping | Cheap    | Register lazy enum recipes
to_slice(head=H, tail=T)          | Slicing      | Cheap    | O(H + T) index array
to_slice(mask=M)                  | Slicing      | Moderate | O(N) boolean scan
to_table(head=H, tail=T)          | Adapter      | Moderate | O((H + T) × cols)
to_dataframe(head=H)              | Adapter      | Moderate | O(H × cols)
to_dataframe() (no bounds)        | Adapter      | Heavy    | O(N × C) pandas materialization
copy()                            | Misc         | Cheap    | Shared array, config deep-copied

See docs/performance.md for full complexity analysis.

Note on to_dataframe(): Without head/tail/indices bounds, this method copies the full array into pandas — pandas cannot share memory across heterogeneous dtypes, so expect ~2× memory usage during conversion. Use the bounded form view.to_dataframe(head=100) for inspection and analysis; reserve the unbounded form for genuine full-array export (parquet / CSV / etc).

Plain 1-D and 2-D Arrays

Plain arrays are accepted too — they're reinterpreted as structured via np.ndarray.view() (zero-copy for C-contiguous input).

# 1-D — single column named "value" by default
view = ArrayDisplay(np.array([1.5, 2.0, 3.5]))
view.view_columns  # ['value']

# 1-D with explicit name
view = ArrayDisplay(np.array([1.5, 2.0, 3.5]), field_names=['temperature'])

# 2-D — auto-named col_0, col_1, ...
arr = np.random.rand(1000, 3)
view = ArrayDisplay(arr)
view.view_columns  # ['col_0', 'col_1', 'col_2']

# 2-D with explicit names (the common case)
view = ArrayDisplay(arr, field_names=['x', 'y', 'z'])

# 2-D with custom prefix
view = ArrayDisplay(arr, field_name_prefix='comp')
view.view_columns  # ['comp_0', 'comp_1', 'comp_2']

Non-contiguous 2-D arrays raise ValueError — call np.ascontiguousarray() first if you accept the copy cost. ndim > 2 is rejected explicitly; reshape first.
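
The reinterpretation described above can be reproduced with plain NumPy. This sketch shows the underlying `view()` mechanics only; the field names `x`, `y`, `z` are chosen for illustration, and the error shown comes from NumPy itself (1.24 or later assumed, matching the stated dependency), not from ArrayDisplay.

```python
import numpy as np

arr = np.arange(12, dtype=np.float64).reshape(4, 3)    # C-contiguous

# One f8 field per column; each 24-byte row becomes one structured record.
row_dtype = np.dtype([('x', 'f8'), ('y', 'f8'), ('z', 'f8')])
structured = arr.view(row_dtype).reshape(-1)           # shape (4,), still a view

print(structured['y'].tolist())             # [1.0, 4.0, 7.0, 10.0]
print(np.shares_memory(arr, structured))    # True

# Every second column of a wider array: the last axis is no longer contiguous.
strided = np.arange(24, dtype=np.float64).reshape(4, 6)[:, ::2]
try:
    strided.view(row_dtype)
    raised = False
except ValueError:
    raised = True
print(raised)                               # True

# Copying into C order first makes the view possible again.
fixed = np.ascontiguousarray(strided).view(row_dtype).reshape(-1)
print(fixed['x'].tolist())                  # [0.0, 6.0, 12.0, 18.0]
```

The copy made by `np.ascontiguousarray()` is the only allocation in this example; everything else reuses the original buffer.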

Examples

See examples/full_pipeline.py for a complete end-to-end script demonstrating all five pillars.

Dependencies

  • numpy (>=1.24) — required
  • vcti-nputils (>=1.0.0) — required
  • pandas (>=2.0) — optional, for to_dataframe() and Jupyter display
