image-dataset-loader: Load image datasets as NumPy arrays
Installation
pip install image-dataset-loader
Overview
Suppose you have an image dataset in a directory which looks like this:
data/
    train/
        cats/
            cat0001.jpg
            cat0002.jpg
            ...
        dogs/
            dog0001.jpg
            dog0002.jpg
            ...
    test/
        cats/
            cat0001.jpg
            cat0002.jpg
            ...
        dogs/
            dog0001.jpg
            dog0002.jpg
            ...
You can use the image_dataset_loader.load function to load this dataset as NumPy arrays:
import image_dataset_loader
(x_train, y_train), (x_test, y_test) = image_dataset_loader.load('path/to/data', ['train', 'test'])
The shape of the x_* arrays will be (instances, rows, cols, channels) for color images and (instances, rows, cols) for grayscale images. The shape of the y_* arrays will be (instances,).
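To make the shapes concrete, here is a minimal sketch that uses placeholder arrays in place of real loaded data. The sizes (100 images of 32x32 pixels) are invented for illustration and do not come from the library.

```python
import numpy as np

# Placeholder arrays standing in for what load() would return.
# The sizes (100 images, 32x32 pixels) are assumptions for illustration.
x_color = np.zeros((100, 32, 32, 3), dtype="uint8")  # (instances, rows, cols, channels)
x_gray = np.zeros((100, 32, 32), dtype="uint8")      # (instances, rows, cols)
y = np.zeros((100,), dtype="uint32")                 # (instances,)

# The first axis indexes instances, so x[i] is one image and y[i] is its label.
first_image, first_label = x_color[0], y[0]
print(first_image.shape)  # (32, 32, 3)
```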
All images in the dataset must have the same shape. Also, all data subsets (i.e., train and test in this example) must contain the same set of classes.
Class names will be sorted alphabetically. So, in this example, cats and dogs will be represented by 0 and 1, respectively.
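The two rules above (identical class sets in every subset, labels assigned in sorted order) can be sketched as follows. The class_labels helper is hypothetical and is not part of the library; it also assumes that Python's default string ordering matches the library's alphabetical sort.

```python
import os
import tempfile

def class_labels(dataset_path, set_names):
    """Hypothetical helper: map class names to integer labels."""
    class_sets = []
    for set_name in set_names:
        set_dir = os.path.join(dataset_path, set_name)
        # Each subdirectory of a data subset is treated as one class.
        classes = sorted(
            d for d in os.listdir(set_dir)
            if os.path.isdir(os.path.join(set_dir, d))
        )
        class_sets.append(classes)
    # Every subset must contain exactly the same classes.
    if any(c != class_sets[0] for c in class_sets[1:]):
        raise ValueError("all data subsets must contain the same classes")
    # Labels follow the sorted order of the class names.
    return {name: label for label, name in enumerate(class_sets[0])}

# Build an empty directory skeleton matching the example layout.
with tempfile.TemporaryDirectory() as root:
    for set_name in ("train", "test"):
        for class_name in ("dogs", "cats"):
            os.makedirs(os.path.join(root, set_name, class_name))
    labels = class_labels(root, ["train", "test"])

print(labels)  # {'cats': 0, 'dogs': 1}
```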
You can also load a single data subset. For example:
(x_train, y_train), = image_dataset_loader.load('path/to/data', ['train'])
Note that the comma after (x_train, y_train) is required, because the function always returns a tuple of tuples.
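The unpacking rule is plain Python and can be tried without any images. In this sketch, fake_load is a made-up stand-in that only mimics the tuple-of-tuples return shape. Strings take the place of arrays.

```python
def fake_load(set_names):
    """Stand-in for load(): returns one (x, y) pair per requested subset."""
    return tuple((f"x_{name}", f"y_{name}") for name in set_names)

result = fake_load(["train"])
print(result)  # (('x_train', 'y_train'),)

# With the trailing comma, the outer one-element tuple is unpacked first.
(x_train, y_train), = fake_load(["train"])
print(x_train, y_train)  # x_train y_train

# Indexing the result is an equivalent alternative.
x_train, y_train = fake_load(["train"])[0]
```

Without the comma, Python would try to unpack the outer one-element tuple into two names and raise a ValueError.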
API
load(dataset_path, set_names, shuffle=True, seed=None, x_dtype='uint8', y_dtype='uint32')
- dataset_path: Path to the dataset directory.
- set_names: List of the data subsets (subdirectories of the dataset directory).
- shuffle: Whether to shuffle the samples. If false, instances will be sorted by class name and then by file name.
- seed: Random seed used for shuffling (see the docs).
- x_dtype: NumPy data type for the X arrays (see the docs).
- y_dtype: NumPy data type for the Y arrays (see the docs).
- Returns a tuple of (x, y) tuples corresponding to set_names.
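As an illustration of the ordering described for shuffle=False, the sketch below sorts samples by class name and then by file name. The ordered_samples helper and its dictionary input are hypothetical; the library's internals may differ.

```python
def ordered_samples(files_by_class):
    """Hypothetical helper: order samples by class name, then file name."""
    return [
        (class_name, file_name)
        for class_name in sorted(files_by_class)
        for file_name in sorted(files_by_class[class_name])
    ]

# Deliberately unsorted input to show the effect of both sorts.
files = {
    "dogs": ["dog0002.jpg", "dog0001.jpg"],
    "cats": ["cat0002.jpg", "cat0001.jpg"],
}
order = ordered_samples(files)
print(order[0], order[-1])  # ('cats', 'cat0001.jpg') ('dogs', 'dog0002.jpg')
```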