A tiny Python library for augmenting an image dataset intended for an ML classification system

Scope

The scope of this library is to augment the dataset for an image classification ML system.

Setup

Versions

The library is compatible with Python 3.6 and later.

Virtualenv

We suggest isolating your installation in a Python virtual environment:

python3 -m venv .imgaug
...
source .imgaug/bin/activate

Installation

Update the pip package manager:

pip install pip --upgrade
...
pip install -r requirements.txt

Tests

The library is covered by fast, isolated unit tests and doctests (the latter to keep the documentation reliable):

python -m unittest discover -s imgaug -p '*'

APIs

The library is composed of several collaborators, each with a specific responsibility. Each class tries to expose a minimal public API in the form of a __call__ or __iter__ method (the latter when generators are used).
The classes are designed to work with one image at a time. If you need to transform and augment multiple images, avoid creating multiple instances of a class: just change the argument passed to __call__. The exception is Persister, which needs a new instance (or a modified instance attribute) for each image.
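
The configure-once, call-many pattern described above can be sketched with a toy callable class. Suffixer is a hypothetical stand-in used only for illustration; it is not one of the library's classes.

```python
class Suffixer:
    """Hypothetical stand-in for a collaborator: configured once, called many times."""

    def __init__(self, suffix):
        # Configuration lives on the instance and is shared by every call.
        self.suffix = suffix

    def __call__(self, path):
        # The per-image input is passed at call time, not at construction time.
        return f'{path}{self.suffix}'


suffixer = Suffixer('.bak')
# One instance handles every path; only the call argument changes.
results = [suffixer(path) for path in ('a.png', 'b.png')]
print(results)
```

Keeping the per-image input out of the constructor is what makes a single instance reusable across a whole dataset.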

Labeller

The target label is extracted from the image file name by looking for meaningful information in it (the extraction logic is customisable).

lbl = Labeller(digits=10)

lbl('resources/bag.png')
'bag'

lbl('resources/109-602-3906-001-c-suit-veletta-albino.jpg')
'1096023906'
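
One way such an extraction could work is sketched below. The function extract_label and its rules are assumptions inferred from the two examples above, not the Labeller implementation: it collects the digits in the file name and uses the first `digits` of them as the label, falling back to the bare file name when there are too few.

```python
import os
import re


def extract_label(path, digits=10):
    """Hypothetical label extraction, inferred from the two examples above."""
    # Drop the directory and the extension: 'resources/bag.png' -> 'bag'.
    stem = os.path.splitext(os.path.basename(path))[0]
    # Collect every digit in the name, ignoring separators such as '-'.
    numbers = ''.join(re.findall(r'\d', stem))
    # Assumption: a numeric code wins when at least `digits` digits are present.
    if len(numbers) >= digits:
        return numbers[:digits]
    return stem


print(extract_label('resources/bag.png'))
print(extract_label('resources/109-602-3906-001-c-suit-veletta-albino.jpg'))
```

Under these assumptions the sketch reproduces both labels shown above ('bag' and '1096023906'); the real class may apply different rules.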

Normalizer

The images are normalized by:

  • resizing them to the specified max size (defaults to 256 pixels)
  • optionally applying a square, transparent/background canvas and centering the image on it, thus avoiding any deformation

norm = Normalizer(size=128, canvas=True)
img = norm('resources/bag.png')
img.shape
(128, 128, 4)
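
The geometry behind the two steps above can be sketched without any imaging library. The helper fit_and_center is hypothetical, and it assumes the longest side is always scaled to exactly `size` (the library may treat small images differently).

```python
def fit_and_center(width, height, size=256):
    """Return the resized dimensions and the paste offset on a size x size canvas."""
    # Scale so that the longest side equals `size`, preserving the aspect ratio.
    scale = size / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Center the resized image on the square canvas.
    offset = ((size - new_w) // 2, (size - new_h) // 2)
    return (new_w, new_h), offset


print(fit_and_center(400, 200, size=128))  # ((128, 64), (0, 32))
```

Because both sides are scaled by the same factor, the image keeps its proportions; the canvas only adds padding around it.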

Augmenter

The number of images is increased by two orders of magnitude (depending on the cutoff float attribute) by applying different transformations to the original image.
Transformations are applied lazily through generators, which keeps memory consumption low.

aug = Augmenter(cutoff=.5)
aug('resources/bag.png')
<generator object Augmenter.__call__ at 0x125354480>
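
The generator approach can be illustrated with a small, library-free sketch that works on a nested list of pixels. The function names are hypothetical, and the meaning given to cutoff here (the probability of skipping a transformation) is only a guess at how the real attribute behaves.

```python
import random


def flip_horizontal(image):
    # Reverse every row of a row-major pixel grid.
    return [row[::-1] for row in image]


def flip_vertical(image):
    # Reverse the order of the rows.
    return image[::-1]


def rotate_90(image):
    # Rotate clockwise: reverse the rows, then transpose.
    return [list(row) for row in zip(*image[::-1])]


def augment(image, cutoff=0.5, seed=0):
    """Lazily yield transformed copies of `image`, one at a time."""
    rng = random.Random(seed)
    for transform in (flip_horizontal, flip_vertical, rotate_90):
        # Assumption: `cutoff` is the probability of skipping a transformation.
        if rng.random() < cutoff:
            continue
        yield transform(image)


image = [[1, 2], [3, 4]]
variants = augment(image, cutoff=0.0)
# Nothing is computed until the generator is consumed.
first = next(variants)
print(first)
```

Only one transformed image exists in memory at a time, so the caller can persist each variant and discard it before requesting the next.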

Persister

Images are persisted after normalization and augmentation by an action function that you provide: it receives the name of the file (the original basename suffixed with an index) and a BytesIO object containing the image data.
The persister accepts a file path and, optionally, a stream-like object (in case the file has not been saved to disk yet).
The persister supports iteration, yielding the image label and the return value of the action function (typically the saved path). This makes it easy to generate the CSV files required by cloud platforms (e.g. Google Vision APIs).

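def persist(name, stream):
    # Write the in-memory image data to the (already existing) temp/ folder.
    filename = f'temp/{name}'
    with open(filename, 'wb') as f:
        f.write(stream.getvalue())
    # The returned path is yielded back when iterating the persister.
    return filename

pers = Persister('resources/skirt.jpg', action=persist)
for label, filename in pers:
    print(label, filename)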
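
The CSV generation mentioned above could look like the following sketch. The helper write_labels_csv, the sample pairs, and the path-first column order are all assumptions; check the format required by your target platform.

```python
import csv
import io


def write_labels_csv(pairs, stream):
    """Write one `path,label` row per (label, path) pair."""
    writer = csv.writer(stream)
    for label, path in pairs:
        # Assumption: the target platform expects the image path first.
        writer.writerow([path, label])


# Stand-in for iterating a Persister; these pairs are made-up sample data.
pairs = [('skirt', 'temp/skirt_0.png'), ('skirt', 'temp/skirt_1.png')]
buffer = io.StringIO()
write_labels_csv(pairs, buffer)
print(buffer.getvalue())
```

Since the function takes any iterable of pairs, a persister instance could in principle be passed directly in place of the sample list.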

Zipper

If you need an archive containing every normalised augmentation, stored in a subfolder named after its recognised label, you can rely on the Zipper interface: it scans the specified folder for PNG or JPG images and creates a ZIP file in the current path.

zipper = Zipper('./resources/', normalizer=image.Normalizer(16), augmenter=image.Augmenter(.05))
zipper()
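
The archive layout described above (one subfolder per label) can be sketched with the standard zipfile module. The helper zip_by_label and the sample entries are hypothetical; this is not the Zipper implementation.

```python
import io
import zipfile


def zip_by_label(items, target):
    """Write (label, name, data) triples into `target`, one subfolder per label."""
    with zipfile.ZipFile(target, 'w') as archive:
        for label, name, data in items:
            # Entries are stored as '<label>/<name>' inside the archive.
            archive.writestr(f'{label}/{name}', data)


# Made-up sample entries; real data would be the encoded image bytes.
items = [('bag', 'bag_0.png', b'\x89PNG'), ('skirt', 'skirt_0.png', b'\x89PNG')]
buffer = io.BytesIO()
zip_by_label(items, buffer)

with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
print(names)
```

Passing a file path instead of the BytesIO buffer would write the archive to disk.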

Download files

Source Distribution

image_augmenter-0.5.5.tar.gz (8.9 kB)

Uploaded Source

Built Distribution

image_augmenter-0.5.5-py3-none-any.whl (11.9 kB)

Uploaded Python 3

File details

Details for the file image_augmenter-0.5.5.tar.gz.

File metadata

  • Download URL: image_augmenter-0.5.5.tar.gz
  • Size: 8.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for image_augmenter-0.5.5.tar.gz
Algorithm    Hash digest
SHA256       95e8f7ca6d05ba6902ab999ca7be341461c659cb0b6b4a7bc8fb14b68ccd1085
MD5          1af36b6b0efb4861da44ee5e304faa67
BLAKE2b-256  cb558ef384f0830f1c16fa33951ce1a82751c07cb613cd6f2eff4ce7ecb9ad17

File details

Details for the file image_augmenter-0.5.5-py3-none-any.whl.

File metadata

  • Download URL: image_augmenter-0.5.5-py3-none-any.whl
  • Size: 11.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.4

File hashes

Hashes for image_augmenter-0.5.5-py3-none-any.whl
Algorithm    Hash digest
SHA256       00b4740a450cd93bd2e5e74897b2f7268c575488b0015cc1b749771864414f53
MD5          5ebe95eeee52b3a5fa3ccbeaf7698567
BLAKE2b-256  5e84b4972ee99f3672c4b3c4edb80f5acb2541cfc74383d69d2c213b745358da
