Zero-boilerplate converter from raw data (images, text, categories) to numeric NumPy arrays.

FastNum

Zero-boilerplate conversion from raw data to numeric NumPy arrays.

FastNum detects the kind of data you hand it — an image path, a sentence, a category list, or a batch of any of these — and returns the right numeric representation without a single line of configuration.

Why FastNum?

Most ML preprocessing pipelines repeat the same four patterns:

Input	Desired output
Image file	Float32 pixel array normalised to `[0, 1]`
Text sentence	Integer token-ID sequence
Flat category list	One-hot matrix
Batch of sentences	Padded token-ID matrix

FastNum collapses all four into one call: `fn.to_num(data)`.
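For comparison, here is a minimal sketch of what three of these patterns can look like when written by hand with plain NumPy. The helper names (`normalise_pixels`, `one_hot`, `pad_batch`) are hypothetical and are not part of FastNum. The alphabetical class ordering is an assumption, so FastNum may order one-hot columns differently.

```python
import numpy as np


def normalise_pixels(raw: np.ndarray) -> np.ndarray:
    """Scale uint8 pixel values (0-255) to float32 values in [0, 1]."""
    return raw.astype(np.float32) / 255.0


def one_hot(labels: list[str]) -> np.ndarray:
    """Build an (N, num_classes) float32 one-hot matrix.

    Assumption: classes are ordered alphabetically.
    """
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    matrix = np.zeros((len(labels), len(classes)), dtype=np.float32)
    # Set exactly one cell per row: the column of that row's class.
    matrix[np.arange(len(labels)), [index[name] for name in labels]] = 1.0
    return matrix


def pad_batch(sequences: list[list[int]], pad_id: int = 0) -> np.ndarray:
    """Right-pad integer sequences to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    matrix = np.full((len(sequences), max_len), pad_id, dtype=np.int32)
    for row, seq in enumerate(sequences):
        matrix[row, :len(seq)] = seq
    return matrix


pixels = normalise_pixels(np.array([[0, 255], [51, 102]], dtype=np.uint8))
hot = one_hot(["dog", "cat", "dog", "bird"])
padded = pad_batch([[1, 2], [3, 4, 5]])
```

Each helper is short, but a pipeline that needs all of them also has to decide which one applies to a given input. That choice is the part a single entry point takes over.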

Installation

pip install fastnum

Or from source:

git clone https://github.com/your-username/fastnum.git
cd fastnum
pip install -e ".[dev]"

Requirements: Python ≥ 3.9, numpy ≥ 1.24, opencv-python ≥ 4.8.

Quick start

from fastnum import FastNum

fn = FastNum()

# --- Image -----------------------------------------------------------
pixels = fn.to_num("photo.jpg")          # (H, W, 3) float32, values in [0, 1]
pixels = fn.to_num("photo.jpg", image_size=(224, 224))  # resize on the fly

# Batch of images (all resized to the same shape for stacking)
batch = fn.to_num(["a.jpg", "b.jpg"], image_size=(224, 224))  # (2, 224, 224, 3)

# --- Plain text ------------------------------------------------------
tokens = fn.to_num("the cat sat on the mat")   # int32 array of token IDs
print(fn.decode(tokens))                        # → "the cat sat on the mat"

# --- Category list ---------------------------------------------------
labels = ["dog", "cat", "dog", "bird"]
one_hot = fn.to_num(labels)
# array([[0., 1., 0.],
#        [1., 0., 0.],
#        [0., 1., 0.],
#        [0., 0., 1.]], dtype=float32)

# --- Sentence batch --------------------------------------------------
matrix = fn.to_num(["hello world", "foo bar baz"])
# int32 matrix (2, 3), shorter rows are right-padded with pad_token_id

# --- Raw NumPy array -------------------------------------------------
import numpy as np
fn.to_num(np.array([1, 2, 3]))              # cast to float32, no-op otherwise

API reference

`FastNum(pad_token_id=0)`

Parameter	Type	Default	Description
`pad_token_id`	`int`	`0`	ID reserved for the `[PAD]` token. The special token is inserted into the vocabulary at construction time so real words are always assigned different IDs.

`to_num(data, image_size=None) → np.ndarray`

Parameter	Type	Description
`data`	`str | list[str] | np.ndarray`	Input to convert.
`image_size`	`tuple[int, int] | None`	Target `(H, W)` for image resizing.

Return type depends on input:

Input	dtype	Shape
Image path / list of paths	`float32`	`(H, W, C)` / `(N, H, W, C)`
Sentence	`int32`	`(T,)`
Category list	`float32`	`(N, num_classes)`
Sentence batch	`int32`	`(N, max_len)`
`np.ndarray`	`float32`	same as input
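The README does not say how FastNum tells these input kinds apart. The sketch below shows one plausible heuristic, purely as an illustration. The function `guess_kind`, the extension set, and the rule "any item with whitespace means a sentence batch" are all assumptions, not FastNum's documented behaviour.

```python
from pathlib import Path

# Hypothetical set of extensions treated as images; not taken from FastNum.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def looks_like_image(text: str) -> bool:
    """Treat a string as an image path if it ends in a known extension."""
    return Path(text).suffix.lower() in IMAGE_SUFFIXES


def guess_kind(data) -> str:
    """Return a label for the kind of input, using simple heuristics."""
    if isinstance(data, str):
        return "image" if looks_like_image(data) else "sentence"
    if isinstance(data, (list, tuple)) and data and all(isinstance(x, str) for x in data):
        if all(looks_like_image(x) for x in data):
            return "image_batch"
        # Assumption: any item containing whitespace marks the list as sentences.
        if any(len(x.split()) > 1 for x in data):
            return "sentence_batch"
        return "category_list"
    # Duck-typed check for array-like objects such as np.ndarray.
    if hasattr(data, "dtype") and hasattr(data, "shape"):
        return "array"
    raise TypeError(f"unsupported input type: {type(data).__name__}")
```

Under this heuristic a list of single-word sentences would be classified as a category list, so it is worth checking how FastNum treats such ambiguous inputs before relying on automatic detection.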

`decode(token_ids) → str`

Converts a token-ID sequence back to whitespace-separated text. Padding tokens are silently dropped.

`vocab_size → int`

Number of entries currently in the vocabulary, including `[PAD]`.
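The round trip between encoding, `decode`, and `vocab_size` can be sketched with a small stand-in class. `TinyVocab` and its `encode` method are hypothetical names used only for illustration. The sketch assumes whitespace tokenisation and the default pad ID of 0, and it is not FastNum's actual implementation.

```python
class TinyVocab:
    """Hypothetical stand-in for a whitespace tokeniser with a [PAD] slot."""

    def __init__(self, pad_token_id: int = 0) -> None:
        self.pad_token_id = pad_token_id
        self.vocab = {"[PAD]": pad_token_id}
        self.inverse_vocab = {pad_token_id: "[PAD]"}

    def encode(self, sentence: str) -> list[int]:
        """Map each whitespace-separated word to an ID, growing the vocabulary."""
        ids = []
        for word in sentence.split():
            if word not in self.vocab:
                new_id = len(self.vocab)
                self.vocab[word] = new_id
                self.inverse_vocab[new_id] = word
            ids.append(self.vocab[word])
        return ids

    def decode(self, token_ids) -> str:
        """Join known words with spaces, dropping padding tokens."""
        words = (self.inverse_vocab[i] for i in token_ids if i != self.pad_token_id)
        return " ".join(words)

    @property
    def vocab_size(self) -> int:
        """Number of vocabulary entries, including [PAD]."""
        return len(self.vocab)


tok = TinyVocab()
ids = tok.encode("the cat sat on the mat")
padded = ids + [tok.pad_token_id] * 2  # simulate a right-padded row
text = tok.decode(padded)
```

In this sketch the repeated word "the" reuses its ID, so the vocabulary holds five words plus `[PAD]`.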

The `[PAD]` token and collision safety

FastNum reserves `pad_token_id` inside the vocabulary at construction time:

self.vocab        = {"[PAD]": pad_token_id}
self.inverse_vocab = {pad_token_id: "[PAD]"}

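Because `[PAD]` occupies a slot before any text is tokenised, `_get_or_add` assigns each new word an ID equal to `len(self.vocab)`. With the default `pad_token_id=0`, the first word receives ID 1 and every later ID is larger, so no word can be assigned the pad ID. This argument depends on the default: for a positive `pad_token_id` of k, a counter based only on `len(self.vocab)` reaches k when the k-th distinct word is added, so a non-zero pad ID stays collision-free only if ID assignment skips the reserved value. When the two ID sets are disjoint:

A padded cell in a token matrix will never decode to a real word.
`decode()` does not need a special-case filter beyond `i != self.pad_token_id`.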
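The ID-assignment rule above can be checked with a small standalone sketch. `build_vocab`, `words_sharing_pad_id`, and the `skip_reserved` flag are hypothetical names introduced for illustration and are not FastNum's code. The sketch shows that a counter based on `len(vocab)` is collision-free for the default pad ID of 0, and it shows one possible way to keep a non-zero pad ID reserved.

```python
def build_vocab(words, pad_token_id=0, skip_reserved=False):
    """Assign IDs with a len(vocab) counter, optionally skipping the pad ID."""
    vocab = {"[PAD]": pad_token_id}
    for word in words:
        if word in vocab:
            continue
        new_id = len(vocab)
        # Assumed fix for positive pad IDs: shift IDs at or above the pad ID by one.
        if skip_reserved and pad_token_id > 0 and new_id >= pad_token_id:
            new_id += 1
        vocab[word] = new_id
    return vocab


def words_sharing_pad_id(vocab, pad_token_id):
    """List real words whose ID equals the pad ID."""
    return [w for w, i in vocab.items() if w != "[PAD]" and i == pad_token_id]


words = ["the", "cat", "sat"]

default = build_vocab(words)                               # pad ID 0
naive = build_vocab(words, pad_token_id=2)                 # plain counter
safe = build_vocab(words, pad_token_id=2, skip_reserved=True)

print(words_sharing_pad_id(default, 0))  # []
print(words_sharing_pad_id(naive, 2))    # ['cat']
print(words_sharing_pad_id(safe, 2))     # []
```

With a pad ID of 2, the plain counter gives the second distinct word the ID 2, which is why the disjointness guarantee is easiest to rely on with the default of 0.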

Development

# Run tests with coverage
pytest

# Lint
ruff check fastnum

# Type-check
mypy fastnum

License

MIT © your-username

Release history

0.1.2: May 22, 2026
0.1.1: May 22, 2026
0.1.0 (this version): May 22, 2026
