A small library providing utilities for better type hinting of NumPy arrays.

NumTypes

Type hints for NumPy arrays without the hassle! 🎉

🎯 Motivation

If you've ever tried to add type hints to code using NumPy, you've probably noticed that the built-in type annotations for arrays are not very useful:

import numpy as np

def translate_box(corners: np.ndarray) -> np.ndarray:
    # What are the dimensions? What's the data type? 🤷
    return corners + ?

The NumPy typing module provides an alias NDArray, but it doesn't support shape information. Only data types can be specified, like this:

...
from numpy.typing import NDArray

def translate_box(corners: NDArray[np.float32]) -> NDArray[np.float32]:
    # But that's not very helpful, is it? We still don't know how the corners are represented.
    return corners + ?

You can be more specific if you use ndarray directly, but now it becomes very verbose:

...

def translate_box(corners: np.ndarray[tuple[int, int], np.dtype[np.float32]]) -> np.ndarray[tuple[int, int], np.dtype[np.float32]]:
    # Still, you don't know if a row represents a point, or if the columns represent points. Is it 2D or 3D?
    return corners + ?

If you want to specify the exact shape, then it's even worse:

...
from typing import Literal

def translate_box(
    corners: np.ndarray[tuple[Literal[8], Literal[3]], np.dtype[np.float32]]
) -> np.ndarray[tuple[Literal[8], Literal[3]], np.dtype[np.float32]]:
    translation = np.array([1.0, -1.0, 0.0], dtype=corners.dtype)
    return corners + translation[np.newaxis, :]  # NumPy would also lose the type information here

NumTypes alleviates this issue by providing more concise syntax that:

  • documents the array shapes and data types at the desired specificity,
  • tells your type checker what to expect, and
  • helps it out whenever it gets confused by NumPy.

This library doesn't do any magic; it just provides sensible type aliases and leverages existing typing features (like type guards) in Python to give you better type hints. In fact, it's such a thin wrapper around NumPy's existing types that you could implement it yourself! But doing that for every project is tedious, so this library does it for you.

✨ Features

  • Precise shape typing - Specify exact dimensions or use wildcards for flexibility
  • dtype support - Full support for NumPy data types
  • Runtime shape validation - Verify array shapes at runtime with shape_of(), while helping your type checker in the process
  • Convenient & concise aliases - Vector, Matrix, IntArray, BoolArray, and more
  • Pyright compatible - Tested and working with Pyright/Pylance
  • Minimal overhead - Runtime validation is optional, lightweight and can easily be disabled

📦 Installation

pip install numtypes

🚀 Quick Start

The above use case can be simplified with NumTypes like this:

import numpy as np
from numtypes import FloatArray, Dims, D, shape_of

def translate_box(corners: FloatArray[Dims[D[8], D[3]]]) -> FloatArray[Dims[D[8], D[3]]]:
    translation = np.array([1.0, -1.0, 0.0], dtype=corners.dtype)
    result = corners + translation[np.newaxis, :]

    # Simple syntax for validating the shape and helping the type checker.
    assert shape_of(result, matches=(8, 3))

    return result  # The type checker now knows this is a FloatArray with shape (8, 3)

Nevertheless, in most cases you will likely use more flexible shapes, like Dim1, Dim2, etc. This allows you to specify the dimensionality without worrying about the exact size:

from numtypes import UByteArray, Dim1, Dim2

def flatten_image(image: UByteArray[Dim2]) -> UByteArray[Dim1]:
    return image.reshape(-1)  # This actually works with NumPy alone, since it is able to figure out the type.

📖 Usage

Basic Array Types

The most common pattern is using Dim1, Dim2, Dim3, etc. for arrays with a known number of dimensions:

from numtypes import Array, Dim1, Dim2, Dim3
import numpy as np

# Some NumPy functions infer type info properly
zeros_1d: Array[Dim1] = np.zeros((5,))
zeros_2d: Array[Dim2] = np.zeros((5, 5))
tensor: Array[Dim3] = np.ones((10, 20, 30))

# The most helpful part is knowing what arrays represent and how they are shaped.
def some_function(array_1: Array[Dim2], array_2: Array[Dim3], *arrays: Array[Dim1]) -> Array[Dim2]:
    ...

Specifying Exact Shapes

When you need to be more specific about dimensions:

from numtypes import Array, FloatArray, IntArray, Dims, D, N

# Exact shape specification
corners: FloatArray[Dims[D[8], D[3]]]  # 8 corners × 3 coordinates
embeddings: FloatArray[Dims[D[1000], D[384]]]  # 1000 embeddings × 384 dimensions

# Using N for flexible dimensions (equivalent to -1)
batch: IntArray[Dims[N, D[224], D[224], D[3]]]  # Any batch size × 224×224 RGB images
sequence: Array[Dims[N, D[768]]]  # Any sequence length × 768 features

# In the following example, it's immediately clear how the vectors are represented.
def operation_on_3d_vectors(vectors: Array[Dims[N, D[3]]]) -> Array[Dims[N, D[3]]]:
    ...

Creating Typed Arrays

NumTypes provides helper functions to create arrays with explicit type information:

import numpy as np
from numtypes import array, array_1d, array_2d, Float, Double, Int

# Create with exact shape
arr = array([1, 2, 3], shape=(3,))  # Array[Dims[D[3]]]
mat = array([[1, 2], [3, 4]], shape=(2, 2))  # Array[Dims[D[2], D[2]]]

# Convenience functions for common cases
vec = array_1d([1, 2, 3])  # Vector (alias for Array[Dim1])
mat = array_2d([[1, 2], [3, 4]])  # Matrix (alias for Array[Dim2])

# Specify data types
float_arr = array([1.0, 2.0], shape=(2,), dtype=np.float32)  # Array[Dims[D[2]], Float]
double_arr = array_1d([1.0, 2.0], dtype=np.float64)  # Array[Dim1, Double]
int_arr = array([1, 2], shape=(2,), dtype=np.int32)  # Array[Dims[D[2]], Int]

Runtime Shape Validation

Use shape_of() to validate shapes at runtime while providing type information to your type checker:

from numtypes import shape_of, Array, UnknownShape, AnyShape

def process_batch(images: Array) -> Array:
    # Validate the shape
    assert shape_of(images, matches=(32, 224, 224, 3))
    
    # Type checker now knows images has shape (32, 224, 224, 3)
    normalized = images / 255.0
    
    # Validate the output shape if needed
    assert shape_of(normalized, matches=(32, 224, 224, 3))
    return normalized

# Flexible shape validation
assert shape_of(data, matches=(-1, 128))  # Any number of rows, exactly 128 columns
assert shape_of(sequence, matches=(100, -1, -1))  # 100 sequences of any shape

# Partial validation with ellipsis
assert shape_of(batch, matches=(32, ...))  # First dimension must be 32
# Note: After ellipsis validation, the shape is no longer known to the type checker.
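
The wildcard rules in the comments above (-1 for any size, a trailing ellipsis for any remaining dimensions) can be sketched in plain Python. The helper matches_shape below is hypothetical and only illustrates the matching semantics as described here. It is not NumTypes' implementation, and it assumes the ellipsis appears only as the last entry of the pattern.

```python
def matches_shape(shape: tuple[int, ...], pattern: tuple[object, ...]) -> bool:
    """Check `shape` against `pattern`, where -1 matches any size and a
    trailing Ellipsis matches any number of remaining dimensions."""
    if pattern and pattern[-1] is Ellipsis:
        prefix = pattern[:-1]
        if len(shape) < len(prefix):
            return False
        # Only the leading dimensions are compared.
        shape = shape[: len(prefix)]
        pattern = prefix
    elif len(shape) != len(pattern):
        return False
    return all(want == -1 or want == got for want, got in zip(pattern, shape))


any_rows_ok = matches_shape((7, 128), (-1, 128))
leading_dim_ok = matches_shape((32, 224, 224, 3), (32, ...))
wrong_rank = matches_shape((5,), (-1, 128))
```

Without an ellipsis the number of dimensions has to match exactly, which is how a pattern such as (-1, 128) can tell a type checker that the array is 2D.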

Type Aliases for Common Use Cases

NumTypes provides convenient type aliases for common array types and shapes:

import numpy as np
from numtypes import Array, Vector, Matrix, IntArray, BoolArray, FloatArray, IndexArray, Long
from numtypes import D, Dim1, Dim2, Dims, UnknownShape

# Data type aliases
mask: BoolArray[Dim2]  # 2D boolean array
indices: IntArray[Dim1]  # 1D integer array  
scores: FloatArray[Dims[D[100]]]  # Exactly 100 float scores (float32)
labels: Array[Dims[D[1000]], Long]  # 1000 int64 labels

# Shape aliases
embedding: Vector[D[384]]  # 1D array of 384 elements
rotation: Matrix[D[3], D[3]]  # 3×3 matrix

# Special arrays
sorted_indices: IndexArray[UnknownShape] = np.argsort(scores)  # From argsort

You can still use any NumPy data type directly, but these aliases help with readability in common cases.

Working with NumPy Operations

⚠️ Important: Many NumPy operations currently lose type information. For example, adding two arrays or computing the mean will result in an array with an unknown shape. In general, you don't always need to know the exact shape of an array after every operation. As such, it makes the most sense to use shape_of() to validate the shape of the result only when you need it, e.g. when passing an array as an argument to a function, or when returning it from a function.

Here's how to handle common cases:

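import numpy as np
from numtypes import Array, Dim1, Dim2, Dim3, Matrix, Vector, array_2d, shape_of

# These provide type info correctly.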
zeros: Array[Dim2] = np.zeros((5, 5))
ones: Array[Dim3] = np.ones((3, 4, 5))

# Operations that preserve type info
negated: Array[Dim2] = -zeros  # Still an Array[Dim2]

# Operations that lose type info - use shape_of to recover it
data: Array[Dim2] = array_2d([[1.0, 2.0], [3.0, 4.0]])
mean_per_row = data.mean(axis=1)  # Type checker doesn't know the shape
assert shape_of(mean_per_row, matches=(-1,))  # Now it knows it's 1D

# Alternative: use type annotations with `# type: ignore` comments or type casts.
row_means: Array[Dim1] = data.mean(axis=1)  # type: ignore

# For complex operations, validate intermediate results
def matrix_operation(a: Matrix, b: Matrix) -> Vector:
    assert shape_of(a, matches=(10, 20))
    assert shape_of(b, matches=(20, 30))
    
    result = a @ b  # Matrix multiplication
    assert shape_of(result, matches=(10, 30))
    
    flattened = result.reshape(-1)
    assert shape_of(flattened, matches=(300,))
    
    return flattened
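
The type-cast alternative mentioned in the comments above can be sketched with typing.cast. To keep the example self-contained it uses a plain NumPy type rather than a NumTypes alias; Array1D is a hypothetical name introduced here for illustration.

```python
from typing import cast

import numpy as np

# Hypothetical plain-NumPy alias for a 1D float64 array.
Array1D = np.ndarray[tuple[int], np.dtype[np.float64]]

data = np.array([[1.0, 2.0], [3.0, 4.0]])

# cast() only informs the type checker; it performs no runtime check.
mean_per_row = cast(Array1D, data.mean(axis=1))
```

Unlike shape_of(), a cast never verifies anything at runtime, so it is best reserved for cases where the shape is obvious from the preceding operation.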

Debugging Shape Mismatches

NumTypes supports configuration for debugging shape validation failures:

from numtypes import config, shape_of
import ipdb

# Configure a debugger to be called on shape mismatch, e.g. ipdb
config.configure(debugger=ipdb.set_trace)

# Configure logging for shape mismatches
config.configure(logger=lambda msg: print(f"Shape mismatch: {msg}"))

array = ...  # Some NumPy array

# Now shape_of will trigger these on failure
assert shape_of(array, matches=(10, 10))  # Will call debugger/logger if shape doesn't match
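
For intuition, the callback pattern behind such a configuration could look roughly like the sketch below. Everything in it (ShapeCheckHooks, hooks, check_shape) is hypothetical and written only to illustrate the idea of mismatch hooks; it makes no claim about how NumTypes implements config.configure or shape_of internally.

```python
from typing import Callable, Optional


class ShapeCheckHooks:
    """Hypothetical holder for callbacks that run when a shape check fails."""

    def __init__(self) -> None:
        self.logger: Optional[Callable[[str], None]] = None
        self.debugger: Optional[Callable[[], None]] = None

    def configure(
        self,
        *,
        logger: Optional[Callable[[str], None]] = None,
        debugger: Optional[Callable[[], None]] = None,
    ) -> None:
        # Only overwrite the hooks that were explicitly passed.
        if logger is not None:
            self.logger = logger
        if debugger is not None:
            self.debugger = debugger


hooks = ShapeCheckHooks()


def check_shape(actual: tuple[int, ...], expected: tuple[int, ...]) -> bool:
    """Compare two shapes and call the configured hooks on mismatch."""
    if actual == expected:
        return True
    if hooks.logger is not None:
        hooks.logger(f"expected {expected}, got {actual}")
    if hooks.debugger is not None:
        hooks.debugger()
    return False


messages: list[str] = []
hooks.configure(logger=messages.append)
ok = check_shape((3, 4), (10, 10))
```

Because the check returns False instead of raising, it still composes with assert: the hooks fire first, then the assertion fails.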

Removing Runtime Validation

Because the idiomatic usage of shape_of() leverages the built-in assert statement, you can easily remove runtime validation by enabling optimization for your Python interpreter. This is done by running Python with the -O flag:

python -O your_script.py

This won't affect the type checking, but will remove whatever overhead the runtime validations introduce.
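
The effect can be observed without leaving the interpreter. The sketch below uses only the standard library: compile() with optimize=1 strips assert statements in the same way the -O flag does. The check function is a hypothetical stand-in for a validation call.

```python
calls: list[str] = []


def check() -> bool:
    """Stand-in for a runtime validation; records that it was executed."""
    calls.append("checked")
    return True


source = "assert check()"

# optimize=0 keeps assert statements, like running plain `python`.
exec(compile(source, "<demo>", "exec", optimize=0), {"check": check})

# optimize=1 removes assert statements, like running `python -O`.
exec(compile(source, "<demo>", "exec", optimize=1), {"check": check})

# Only the unoptimized run executed the check.
times_checked = len(calls)
```

Note that the whole expression inside the assert is skipped, so a validation call written this way should not be relied on for side effects.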

⚙️ Type Checker Compatibility

Note: This library is currently only tested with Pyright (including Pylance in VS Code). Compatibility with mypy and other type checkers is not guaranteed.

Other Limitations

  • Python Compatibility: This library requires Python 3.13 or later because it relies on recent typing features and syntax.
  • Structured Arrays: This library does not support structured arrays (i.e. arrays with named fields).
  • Type Coverage: The convenience aliases probably don't cover every use case. You can always define your own types using the provided Array, Dims, and DimX classes though. If a type feels too common to be missing, please open an issue or PR!
  • Suboptimal Syntax: The current syntax is not perfect; it is a trade-off between conciseness and type safety. The ideal type annotations would look like Array[8, 3, 2, Float], but Python's typing syntax does not currently support this. One could make Array[8, 3, 2, Float] work at runtime using __class_getitem__, but type checkers would not understand it.

🤝 Contributing

TODO: Contributing guide coming soon! For now, feel free to open issues and PRs.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
