
Lightweight package meant to simplify data processing for Deep Learning



Melon is a lightweight package meant to simplify data processing for Deep Learning.
It removes the need for boilerplate code to pre-process the data prior to (model) training, testing and inference.
It aims at standardizing data serialization and manipulation approaches.

The default formats align with the requirements of frameworks such as TensorFlow, PyTorch and Keras.
The tool also provides various levels of customization depending on the use case.


Install and update using pip:

$ pip install melon

Supported in Python >= 3.4.0



With default options:

from melon import ImageReader
import tensorflow as tf

def train():
    source_dir = "resources/images"
    reader = ImageReader(source_dir)
    X, Y =
    with tf.Session() as s:
        s.run(..., feed_dict={X_placeholder: X, Y_placeholder: Y})

The source_dir directory should contain the images to be read. See the sample directory for reference.
The sample directory also contains an optional labels.txt file, which is described in Labeling.

Since the number of images may be too large to fit into memory, the tool supports batch processing.

from melon import ImageReader

def train():
    source_dir = "resources/images"
    options = { "batch_size": 32 }
    reader = ImageReader(source_dir, options)
    while reader.has_next():
        X, Y =

This reads images in batches of 32 until all images have been read. If batch_size is not specified, all images are read at once.
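
The batching behavior described above can be illustrated without melon itself. The sketch below is a hypothetical stand-in, not melon's implementation: BatchCursor, next_batch and batch_sizes are names invented for this example, and only has_next and batch_size come from the text above.

```python
class BatchCursor:
    """Hypothetical stand-in that walks a list of items in fixed-size batches."""

    def __init__(self, items, batch_size=None):
        self._items = list(items)
        # With no batch size given, a single batch covers everything.
        self._batch_size = batch_size or max(len(self._items), 1)
        self._position = 0

    def has_next(self):
        # True while at least one item has not been returned yet.
        return self._position < len(self._items)

    def next_batch(self):
        # Return the next slice and advance; the last batch may be shorter.
        start = self._position
        self._position = min(start + self._batch_size, len(self._items))
        return self._items[start:self._position]


def batch_sizes(total, batch_size=None):
    """Return the size of every batch produced for `total` items."""
    cursor = BatchCursor(range(total), batch_size)
    sizes = []
    while cursor.has_next():
        sizes.append(len(cursor.next_batch()))
    return sizes
```

With 70 items and a batch size of 32, this loop produces batches of 32, 32 and 6 items, so the final batch is simply whatever remains.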

With custom options:

from melon import ImageReader

def train():
    source_dir = "resources/images"
    options = { "data_format": "channels_last", "normalize": False }
    reader = ImageReader(source_dir, options)

This changes the data format to channels-last (each sample will be Height x Width x Channel) and disables normalization. See Options for the available options.
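
To make the two layouts concrete, the following sketch converts a tiny image between them using plain nested lists. It is an illustration only: the helper names are invented here, and dividing by 255 is an assumption about what normalization means for 8-bit images, since the text does not say how melon normalizes.

```python
def to_channels_first(image_hwc):
    """Convert Height x Width x Channel nested lists to Channel x Height x Width."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [
        [[image_hwc[h][w][c] for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


def scale_to_unit_range(image, max_value=255):
    """Recursively divide every pixel by max_value (assumed normalization)."""
    if isinstance(image, list):
        return [scale_to_unit_range(part, max_value) for part in image]
    return image / max_value


# A 1 x 2 image with 3 channels (two RGB pixels) in channels-last layout.
pixels_hwc = [[[255, 0, 0], [0, 255, 51]]]
pixels_chw = to_channels_first(pixels_hwc)
```

Here pixels_chw holds three 1 x 2 planes, one per channel, which is the channels_first layout.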


Options

Width of the output (pixels). default: 255
Height of the output (pixels). default: 255
batch_size - batch size of each read. default: all images in a directory
data_format - format of the image data:
    channels_first - Channel x Height x Width (default)
    channels_last - Height x Width x Channel
Format of the label data:
    one_hot - as a matrix, with a one-hot vector per image (default)
    label - as a vector, with a single label per image
normalize - normalize the data. default: True
num_threads - number of threads for parallel processing. default: number of cores on the machine
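
As a rough illustration of what a num_threads setting with a core-count default can look like, the sketch below runs a per-file function on a thread pool from the standard library. It does not show how melon schedules its work; process_in_parallel and fake_load are hypothetical names, and fake_load only stands in for real image decoding.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def process_in_parallel(paths, load_one, num_threads=None):
    """Apply load_one to every path on a thread pool, preserving input order."""
    # Fall back to the machine's core count when no thread count is given.
    workers = num_threads or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(load_one, paths))


def fake_load(path):
    """Hypothetical placeholder for decoding an image file."""
    return path.upper()
```

Because pool.map returns results in input order, the outputs stay aligned with the file list regardless of which thread finishes first.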

Labeling

In supervised learning each image needs to be mapped to a label.
While the tool supports reading images without labels (e.g. for inference) it also provides a way to label them.

Generating labels file

To generate a labels file, use the following command:
$ melon generate
> Source dir:

After you provide the source directory, the tool generates a labels file in that directory with blank labels.
The final step is to add a label to each row in the generated file.

For reference, see the sample labels:
apple tree:4

The #legend section is optional, but the #map section is required because it maps a label to each image.
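
A file with these two sections could be read along the following lines. This is a sketch under stated assumptions, not melon's parser: it assumes that #legend and #map appear as header lines, that every row has the form key:value (the style of the apple tree:4 row), and that #map rows use the image file name as the key. parse_labels and img1.jpg are made up for the example.

```python
def parse_labels(text):
    """Parse labels text into (legend, mapping) dictionaries.

    Assumed layout: '#legend' and '#map' header lines, each followed by
    'key:value' rows. Values are kept as strings.
    """
    sections = {"#legend": {}, "#map": {}}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line in sections:
            current = line
            continue
        if current is None:
            raise ValueError("row found before any section header: " + line)
        # Split on the last colon so that keys may contain spaces.
        key, _, value = line.rpartition(":")
        sections[current][key.strip()] = value.strip()
    if not sections["#map"]:
        raise ValueError("the #map section is required")
    return sections["#legend"], sections["#map"]


# Hypothetical file content; only the 'apple tree:4' row comes from the text.
sample = """#legend
apple tree:4

#map
img1.jpg:4
"""
legend, mapping = parse_labels(sample)
```

Raising an error when #map is empty mirrors the requirement stated above, while a missing #legend simply yields an empty dictionary.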

Format of the labels

The output format of the labels can be specified through custom options. It defaults to the one-hot format.
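
The difference between the two label formats can be shown with a few lines of plain Python. The function below is a generic one-hot encoder written for this example, not code taken from melon; inferring the number of classes from the largest label is an assumption made here.

```python
def to_one_hot(labels, num_classes=None):
    """Turn a vector of integer labels into a matrix with one row per image."""
    if num_classes is None:
        # Assumption: classes are numbered from 0 up to the largest label seen.
        num_classes = max(labels) + 1 if labels else 0
    return [
        [1 if index == label else 0 for index in range(num_classes)]
        for label in labels
    ]
```

For the label vector [0, 2, 1], the one-hot form is the matrix [[1, 0, 0], [0, 0, 1], [0, 1, 0]]: one row per image, with a single 1 in the column of its class.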

Roadmap

  • Support for video data (Q1 2019)
  • Support for reading from AWS S3 (Q2 2019)
