Unified image to tensor utility with streaming TFRecord support

img2tensor

A unified, high-performance utility to convert images into training-ready tensors for NumPy, PyTorch, and TensorFlow, or stream them directly into TFRecords.

img2tensor handles the standard deep learning data ingestion "papercuts": BGR/RGB swaps, memory layouts (NHWC vs NCHW), and dtype scaling in a single, lightweight function.


🚀 Installation

pip install img2tensor


📖 Usage

1. Single Image (In-Memory)

Returns a 3D tensor ($C, H, W$ for PyTorch).

import img2tensor

# Returns: torch.Tensor of shape (3, 224, 224) for a 224x224 RGB image
tensor = img2tensor.get_tensor("cat.jpg", tensor_type="pytorch")

2. Batch Loading (In-Memory)

Returns a 4D tensor ($N, H, W, C$ for NumPy/TF).

# Returns: np.ndarray of shape (32, 224, 224, 3) for 32 RGB images of 224x224
batch = img2tensor.get_tensor(list_of_paths, n_jobs=8)
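
As a rough illustration of what n_jobs controls, the sketch below decodes a list of paths on a thread pool while keeping the results in input order. It is not the library's implementation: decode_parallel and fake_decode are hypothetical names, and the stand-in decoder only transforms the path string.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def decode_parallel(
    paths: Sequence[str],
    decode: Callable[[str], T],
    n_jobs: int = 4,
) -> List[T]:
    """Run `decode` on every path using `n_jobs` threads (hypothetical helper)."""
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # pool.map yields results in the order of `paths`,
        # regardless of which thread finishes first.
        return list(pool.map(decode, paths))


def fake_decode(path: str) -> str:
    """Stand-in for a real image decoder (assumption for illustration)."""
    return path.upper()


decoded = decode_parallel(["a.jpg", "b.jpg", "c.jpg"], fake_decode, n_jobs=2)
```

Threads tend to suit this kind of work because image decoding is dominated by file I/O and native decoder code.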

3. Production Pipeline (TFRecord)

Writes to disk using a chunked streaming approach to save RAM.

img2tensor.get_tensor(
    img_paths=large_list_of_paths,
    output_format="tfrecord",
    tfrecord_path="dataset.tfrecord",
    n_jobs=12,
)
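
The chunked streaming idea can be sketched without TensorFlow: decode one chunk of paths, hand each record to a writer, then drop the chunk before loading the next. The names iter_chunks and stream_to_sink and the chunk_size parameter are assumptions made for this example, not part of img2tensor; in a real pipeline the write callable could be something like the write method of a tf.io.TFRecordWriter.

```python
from typing import Any, Callable, Iterator, List, Sequence


def iter_chunks(items: Sequence[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `chunk_size` elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield list(items[start:start + chunk_size])


def stream_to_sink(
    paths: Sequence[str],
    decode: Callable[[str], Any],
    write: Callable[[Any], None],
    chunk_size: int = 256,
) -> int:
    """Decode and write chunk by chunk; return the number of records written."""
    written = 0
    for chunk in iter_chunks(paths, chunk_size):
        # Only the records of the current chunk are held in memory.
        records = [decode(path) for path in chunk]
        for record in records:
            write(record)
            written += 1
    return written


# Usage with stand-ins: `len` plays the decoder, a list plays the file.
sink: List[int] = []
count = stream_to_sink(
    [f"img_{i}.jpg" for i in range(5)], decode=len, write=sink.append, chunk_size=2
)
```

Peak memory then scales with chunk_size rather than with the size of the whole dataset.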


🛠 API Reference: get_tensor()

Inputs

| Parameter | Type | Default | Description |
|---|---|---|---|
| img_paths | str, Path, or list | | A single image path or a list of image paths. |
| tensor_type | str | "numpy" | Target framework: "numpy", "pytorch", or "tensorflow". |
| dtype | str | "float32" | Target type: "float32", "float16", or "uint8". Floats are auto-scaled (1/255). |
| image_layer | str | "PIL" | Backend decoder: "PIL" or "CV2". |
| n_jobs | int | 4 | Number of threads for parallel decoding. |
| output_format | str | "tensor" | "tensor" (returns object) or "tfrecord" (writes to disk). |
| tfrecord_path | str or Path | None | Output file path used when output_format is "tfrecord". |
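
To make the dtype and image_layer rows concrete, here is a minimal NumPy sketch of the two conversions they imply: scaling uint8 pixels by 1/255 for float targets, and reversing the channel axis for arrays decoded by OpenCV, which returns BGR. The function scale_pixels and its bgr flag are hypothetical and only approximate what get_tensor might do internally.

```python
import numpy as np


def scale_pixels(pixels: np.ndarray, dtype: str = "float32", bgr: bool = False) -> np.ndarray:
    """Convert uint8 pixels (..., C) to the target dtype (hypothetical helper)."""
    if pixels.dtype != np.uint8:
        raise ValueError(f"expected uint8 input, got {pixels.dtype}")
    if bgr:
        # Reverse the last axis: BGR -> RGB.
        pixels = pixels[..., ::-1]
    if dtype == "uint8":
        # No scaling for integer output; just make the memory contiguous.
        return np.ascontiguousarray(pixels)
    if dtype not in ("float32", "float16"):
        raise ValueError(f"unsupported dtype: {dtype}")
    # Float targets are scaled into [0, 1] by dividing by 255.
    return pixels.astype(dtype) / np.dtype(dtype).type(255)


px = np.array([[[0, 128, 255]]], dtype=np.uint8)  # one pixel, shape (1, 1, 3)
as_float = scale_pixels(px)
as_rgb = scale_pixels(px, dtype="uint8", bgr=True)
```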

Outputs

  • Single Path Input: Returns a 3D Tensor ($H, W, C$ for NumPy/TF; $C, H, W$ for PyTorch).
  • List Input: Returns a 4D Tensor ($N, H, W, C$ for NumPy/TF; $N, C, H, W$ for PyTorch).
  • TFRecord Mode: Returns a dict with file metadata (path, sample count, dtype).
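
Stacking a list of images into one 4D tensor only works when every image has the same shape, which is why get_tensor is documented to raise a ValueError that names the offending file. The check might look like the sketch below; check_uniform_shape is a hypothetical helper, and the error message format is an assumption.

```python
from typing import Dict, Optional, Tuple


def check_uniform_shape(shapes: Dict[str, Tuple[int, ...]]) -> Tuple[int, ...]:
    """Return the common shape, or raise ValueError naming the first mismatch."""
    expected: Optional[Tuple[int, ...]] = None
    for path, shape in shapes.items():
        if expected is None:
            # The first image defines the shape every other image must match.
            expected = shape
        elif shape != expected:
            raise ValueError(
                f"{path}: shape {shape} does not match expected shape {expected}"
            )
    if expected is None:
        raise ValueError("no images given")
    return expected


common = check_uniform_shape({"a.jpg": (224, 224, 3), "b.jpg": (224, 224, 3)})
```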

🧠 Design Philosophy

NCHW vs NHWC

One of the most frequent bugs in computer vision pipelines is passing the wrong channel layout. img2tensor picks the layout from the tensor_type you request:

  • PyTorch: Returns $N \times C \times H \times W$ (and ensures memory is .contiguous()).
  • NumPy/TF: Returns $N \times H \times W \times C$.
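
The layout change itself is a single transpose. The NumPy sketch below converts a channels-last batch to channels-first and forces a contiguous copy, which is the NumPy counterpart of calling .contiguous() on a PyTorch tensor. The helper name to_channels_first is made up for this example.

```python
import numpy as np


def to_channels_first(batch_nhwc: np.ndarray) -> np.ndarray:
    """Convert an (N, H, W, C) batch to a contiguous (N, C, H, W) batch."""
    if batch_nhwc.ndim != 4:
        raise ValueError(f"expected a 4D batch, got {batch_nhwc.ndim}D")
    # transpose only changes strides; ascontiguousarray copies the data
    # so that it is laid out in memory in N, C, H, W order.
    return np.ascontiguousarray(batch_nhwc.transpose(0, 3, 1, 2))


batch = np.arange(2 * 4 * 5 * 3, dtype=np.float32).reshape(2, 4, 5, 3)
nchw = to_channels_first(batch)
```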

What it Does NOT Do (By Design)

  • No Resizing: Silent resizing is dangerous because it introduces artifacts. If your images have inconsistent sizes, get_tensor raises a ValueError that identifies the offending file.
  • No Augmentation: This is a pure loader. Use specialized libraries like albumentations or torchvision for data manipulation.
  • No Heavy Dependencies: Frameworks are imported lazily, so the library won't crash if TensorFlow is not installed and you only use NumPy or PyTorch.
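
A lazy import defers loading a framework until a code path actually needs it. The sketch below shows one common way to do this with importlib; lazy_import and its error message are illustrative assumptions, not the library's code.

```python
import importlib
from types import ModuleType


def lazy_import(module_name: str, needed_for: str) -> ModuleType:
    """Import `module_name` on first use, with a readable error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for {needed_for}; "
            "install it or choose a different tensor_type"
        ) from exc


# Only the branch that is taken triggers an import, so a NumPy-only user
# never touches TensorFlow. The stdlib json module stands in for a framework.
mod = lazy_import("json", needed_for="this demo")
```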

📄 License

MIT

Download files

Source Distribution: img2tensor-0.1.1.tar.gz (4.6 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7
  • Publisher: publish.yml on sourabhyadav999/img2tensor
  • SHA256: 54023ff9d09387e20bc4ee88948f9305260b7a10aadc0a8cf288081f6ee2a283
  • MD5: 9d794ee1ade56bb070e5d07cfb01a250
  • BLAKE2b-256: fcd171d1ab96da1dfc400857ba4a28e7b85d7a31a930ae5b024672550b9e7a56

Built Distribution: img2tensor-0.1.1-py3-none-any.whl (4.7 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7
  • Publisher: publish.yml on sourabhyadav999/img2tensor
  • SHA256: 1f795b5ce7e285a4cf54755691e55c433dc242dfbf122ac1aae85194c91aa29a
  • MD5: b660deb90031d763bcc6a8a8c4102f00
  • BLAKE2b-256: b8b3b6aa9be78297b70815058ff228d85cc794215270f2b65251bf006bad1edd
