
Lightweight package meant to simplify data processing for Deep Learning


.. |Build-Status| image:: https://travis-ci.com/evoneutron/melon.svg?branch=master
   :target: https://travis-ci.com/evoneutron/melon

Melon

| Melon is a lightweight package meant to simplify data processing for Deep Learning.

| It removes the need for boilerplate code to pre-process data prior to model training, testing and inference.
| It aims to standardize data serialization and manipulation approaches.
|
| The default formats align with the requirements of frameworks such as TensorFlow and PyTorch.
| The tool also provides various levels of customization depending on the use case.

Installation

Install and update using pip_:

.. code-block:: text

    $ pip install melon

Supported in Python >=3.4.0

.. _pip: https://pip.pypa.io/en/stable/quickstart/

Examples

Images

| With default options_:

.. code-block:: python

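    import tensorflow as tf  # TensorFlow 1.x API (tf.Session)
    from melon import ImageReader

    def train():
        source_dir = "resources/images"
        reader = ImageReader(source_dir)
        # Read all images (X) and their labels (Y) from source_dir
        X, Y = reader.read()
        ...
        with tf.Session() as s:
            # X_placeholder and Y_placeholder are the model's input placeholders
            s.run(..., feed_dict={X_placeholder: X, Y_placeholder: Y})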

| The ``source_dir`` directory should contain the images to be read. See ``tests/resources/images/sample`` for a sample directory, which includes an optional ``labels.txt`` file described in Labeling_.


| Since the number of images may be too large to fit into memory, the tool supports batch processing.

.. code-block:: python

    from melon import ImageReader

    def train():
        source_dir = "resources/images"
        options = {"batch_size": 32}
        reader = ImageReader(source_dir, options)
        # Each read() returns the next batch until no images are left
        while reader.has_next():
            X, Y = reader.read()
            ...

| This reads images in batches of 32 until all images have been read. If ``batch_size`` is not specified, ``reader.read()`` reads all images at once.
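
| The ``has_next()`` / ``read()`` loop above can be illustrated with a small stand-in. ``ListBatchReader`` is a hypothetical class written for this sketch, not part of melon, and the smaller final batch is this sketch's behavior rather than something the melon documentation confirms.

```python
class ListBatchReader:
    """Hypothetical stand-in that mimics the has_next()/read() loop shown above."""

    def __init__(self, items, batch_size=None):
        self._items = list(items)
        # Without a batch_size, a single read() returns everything
        self._batch_size = batch_size or len(self._items)
        self._pos = 0

    def has_next(self):
        # True while there are items that have not been returned yet
        return self._pos < len(self._items)

    def read(self):
        # Return the next slice and advance the cursor by its length
        batch = self._items[self._pos:self._pos + self._batch_size]
        self._pos += len(batch)
        return batch


reader = ListBatchReader(range(70), batch_size=32)
batch_sizes = []
while reader.has_next():
    batch_sizes.append(len(reader.read()))

# Reading without a batch size returns all items in one call
everything = ListBatchReader(range(5)).read()
```

| With 70 items and a batch size of 32, this stand-in produces two full batches followed by a remainder of 6.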


.. _Custom options:

| With custom options_:

.. code-block:: python

    from melon import ImageReader

    def train():
        source_dir = "resources/images"
        # Override the defaults: channels-last layout, no normalization
        options = {"data_format": "channels_last", "normalize": False}
        reader = ImageReader(source_dir, options)
        ...

| This changes the data format to channels-last (each sample is Height x Width x Channel) and disables normalization. See options_ for available options.
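
| To make the two layouts concrete, the following sketch reorders a single sample from channels-first to channels-last using plain nested lists. ``channels_first_to_last`` is a hypothetical helper for illustration only; it is not how melon performs the conversion.

```python
def channels_first_to_last(image):
    """Convert a nested list shaped [C][H][W] into one shaped [H][W][C]."""
    channels = len(image)
    height = len(image[0])
    width = len(image[0][0])
    return [
        [[image[c][h][w] for c in range(channels)] for w in range(width)]
        for h in range(height)
    ]


# A 3-channel sample, 2 pixels high and 4 pixels wide, stored channels-first.
# Each value encodes its own position as c * 100 + h * 10 + w.
chw = [[[c * 100 + h * 10 + w for w in range(4)] for h in range(2)] for c in range(3)]
hwc = channels_first_to_last(chw)
shape = (len(hwc), len(hwc[0]), len(hwc[0][0]))
```

| Here ``shape`` is Height x Width x Channel, and each pixel holds its values for all channels. With NumPy arrays the same reordering is ``array.transpose(1, 2, 0)``.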

.. _options:

Options

Images

width
    Width of the output (pixels). default: ``255``

height
    Height of the output (pixels). default: ``255``

batch_size
    Batch size of each read. default: All images in a directory

data_format
    Format of the image data

        | ``channels_first`` - `Channel x Height x Width` (default)
        | ``channels_last`` - `Height x Width x Channel`

label_format
    Format of the label data

        | ``one_hot`` - as a matrix, with one one-hot vector per image (default)
        | ``label`` - as a vector, with a single label per image


normalize
    Normalize data. default: ``True``

num_threads
    Number of threads for parallel processing. default: Number of cores of the machine
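
| The documented defaults can be pictured as a dictionary that user-supplied options override. The sketch below is an assumption about how such merging could work, not melon's implementation: ``DEFAULT_OPTIONS`` and ``resolve_options`` are hypothetical names, ``None`` stands in for "all images in a directory", and rejecting unknown keys is a choice made for this example.

```python
import os

# Defaults as listed above; batch_size=None stands for "all images in a directory"
DEFAULT_OPTIONS = {
    "width": 255,
    "height": 255,
    "batch_size": None,
    "data_format": "channels_first",
    "label_format": "one_hot",
    "normalize": True,
    "num_threads": os.cpu_count(),
}


def resolve_options(user_options=None):
    """Overlay user options on the defaults (hypothetical helper)."""
    user_options = user_options or {}
    unknown = set(user_options) - set(DEFAULT_OPTIONS)
    if unknown:
        # Assumption of this sketch: unknown keys are treated as errors
        raise ValueError("Unknown options: " + ", ".join(sorted(unknown)))
    return {**DEFAULT_OPTIONS, **user_options}


resolved = resolve_options({"data_format": "channels_last", "normalize": False})
```

| Options that are not passed keep their default values, so ``resolved`` still carries a width and height of 255 and the one-hot label format.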

.. _Labeling:

Labeling

| In supervised learning each image needs to be mapped to a label.
| While the tool supports reading images without labels (e.g. for inference), it also provides a way to label them.


Generating labels file

| To generate a labels file, use the following command:

.. code-block:: text

    $ melon generate
    > Source dir:

| After you provide the source directory, the tool generates a labels file in that directory with blank labels.
| The final step is to add a label to each row in the generated file.
|
| For reference, see ``tests/resources/images/sample/labels.txt``:

.. code-block:: text

    #legend
    pedestrian:0
    cat:1
    parrot:2
    car:3
    apple tree:4

    #map
    img275.jpg:1
    img324.jpg:2
    img551.jpg:3
    img928.jpg:1
    img999.png:0
    img736.png:4

| The ``#legend`` section is optional, but the ``#map`` section is required to map each image to a label.
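
| The file layout above can be read with a few lines of Python. ``parse_labels`` is a hypothetical helper written for this sketch, not melon's own parser, and it assumes a completed file in which every ``#map`` row already has a numeric label.

```python
def parse_labels(text):
    """Parse labels-file text into (legend, mapping) dictionaries."""
    legend, mapping = {}, {}
    section = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Section headers such as "#legend" and "#map"
            section = line[1:].strip().lower()
            continue
        # Split on the last colon so names like "apple tree" stay intact
        name, _, value = line.rpartition(":")
        if section == "legend":
            legend[int(value)] = name
        elif section == "map":
            mapping[name] = int(value)
    if not mapping:
        raise ValueError("a #map section with at least one row is required")
    return legend, mapping


sample = """#legend
pedestrian:0
cat:1
parrot:2
car:3
apple tree:4

#map
img275.jpg:1
img324.jpg:2
img551.jpg:3
img928.jpg:1
img999.png:0
img736.png:4
"""

legend, mapping = parse_labels(sample)
```

| The legend turns numeric labels back into readable names, for example ``legend[mapping["img736.png"]]`` gives the class name of that image.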


Format of the labels

| The label format can be specified in `Custom options`_. It defaults to the one-hot format.
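
| The difference between the ``label`` and ``one_hot`` formats can be sketched as follows. ``to_one_hot`` is a hypothetical helper for illustration, and inferring the number of classes from the largest label is an assumption of this sketch; melon may determine it differently.

```python
def to_one_hot(labels, num_classes=None):
    """Turn a vector of integer labels into a matrix of one-hot rows."""
    if num_classes is None:
        # Assumption: classes are numbered 0..max(labels)
        num_classes = max(labels) + 1
    return [[1 if i == label else 0 for i in range(num_classes)] for label in labels]


# Labels from the sample #map section, in file order ("label" format)
labels = [1, 2, 3, 1, 0, 4]
# The same labels as one row per image ("one_hot" format)
one_hot = to_one_hot(labels)
```

| Each row of ``one_hot`` contains a single 1 at the index of the image's label, so the matrix has one row per image and one column per class.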

Roadmap

  • Support for video data

  • Support for textual data
