Load image datasets as NumPy arrays

image-dataset-loader: Load image datasets as NumPy arrays

Available on PyPI under the MIT license.

Installation

pip install image-dataset-loader

Overview

Suppose you have an image dataset in a directory which looks like this:

data/
  train/
    cats/
      cat0001.jpg
      cat0002.jpg
      ...
    dogs/
      dog0001.jpg
      dog0002.jpg
      ...
  test/
    cats/
      cat0001.jpg
      cat0002.jpg
      ...
    dogs/
      dog0001.jpg
      dog0002.jpg
      ...

You can use the image_dataset_loader.load function to load this dataset as NumPy arrays:

import image_dataset_loader

(x_train, y_train), (x_test, y_test) = image_dataset_loader.load('path/to/data', ['train', 'test'])

The shape of the x_* arrays will be (instances, rows, cols, channels) for color images and (instances, rows, cols) for grayscale images. Also, the shape of the y_* arrays will be (instances,).

All images in the dataset must have the same shape. Also, all data subsets (i.e., train and test in this example) must contain the same set of classes. Class names will be sorted alphabetically. So, in this example, cats and dogs will be represented by 0 and 1, respectively.
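The label assignment described above can be illustrated with a short sketch. This is not the library's own code; it only mirrors the documented rule that class names are sorted alphabetically and then numbered from 0. The names `class_names` and `label_of` are chosen for this illustration.

```python
# Class directory names as they might be listed on disk, in arbitrary order.
class_names = ['dogs', 'cats']

# Sort alphabetically, then number the classes starting from 0.
label_of = {name: index for index, name in enumerate(sorted(class_names))}

print(label_of)  # {'cats': 0, 'dogs': 1}
```

Because the mapping depends only on the sorted class names, it is the same for every data subset, provided each subset contains the same set of classes.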

You can also load a single data subset. For example:

(x_train, y_train), = image_dataset_loader.load('path/to/data', ['train'])

Note that the comma after (x_train, y_train) is required, because the function always returns a tuple of tuples.
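The effect of the trailing comma can be seen without any image data. The sketch below uses a hypothetical stand-in, `fake_load`, which only imitates the documented return shape (a tuple with one (x, y) pair per requested subset); it is not part of the library.

```python
def fake_load(set_names):
    """Hypothetical stand-in: returns one (x, y) pair per requested subset."""
    return tuple(('x_' + name, 'y_' + name) for name in set_names)

# With the trailing comma, the outer one-element tuple is unpacked first,
# and then the inner (x, y) pair.
(x_train, y_train), = fake_load(['train'])

# Without the comma, Python tries to unpack the outer tuple (one element)
# into two names, which fails.
try:
    x_bad, y_bad = fake_load(['train'])
    unpack_failed = False
except ValueError:
    unpack_failed = True

# Indexing the first element is an equivalent alternative.
x_first, y_first = fake_load(['train'])[0]
```

Either the trailing comma or explicit indexing with `[0]` works; which one to use is a matter of taste.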

API

load(dataset_path, set_names,
     shuffle=True, seed=None,
     x_dtype='uint8', y_dtype='uint32')
  • dataset_path: Path to the dataset directory.
  • set_names: List of the data subsets (subdirectories of the dataset directory).
  • shuffle: Whether to shuffle the samples. If False, instances will be sorted by class name and then by file name.
  • seed: Random seed used for shuffling (see the docs).
  • x_dtype: NumPy data type for the X arrays (see the docs).
  • y_dtype: NumPy data type for the Y arrays (see the docs).
  • Returns a tuple of (x, y) tuples corresponding to set_names.
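As a rough sketch of how the keyword arguments listed above might be combined, the two helpers below wrap `load` with the documented `shuffle` and `seed` parameters. The helper names are hypothetical, and the subset names `'train'` and `'test'` assume the directory layout from the Overview. A fixed seed is assumed here to make the shuffled order repeatable between runs. The import sits inside each function so that the helpers can be defined even where the package is not installed.

```python
def load_in_sorted_order(dataset_path):
    """Load both subsets without shuffling.

    Per the parameter description, instances are then sorted by class
    name and, within each class, by file name.
    """
    import image_dataset_loader  # requires: pip install image-dataset-loader
    return image_dataset_loader.load(
        dataset_path, ['train', 'test'], shuffle=False)


def load_with_fixed_seed(dataset_path, seed=0):
    """Load both subsets shuffled, passing an explicit seed."""
    import image_dataset_loader  # requires: pip install image-dataset-loader
    return image_dataset_loader.load(
        dataset_path, ['train', 'test'], shuffle=True, seed=seed)
```

A call such as `(x_train, y_train), (x_test, y_test) = load_in_sorted_order('path/to/data')` would then unpack the result in the same way as the earlier examples.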
