Skip to main content

data pipeline and convienence library targeted at accelerating the development of imaging projects and research

Project description



The imagepypelines package consists of high level tools which simplify the construction of complex image processing, computer vision, and machine learning frameworks. During our time in the undergrad Imaging Science program at the Rochester Institute of Technology, we found ourselves writing and rewriting code for things as simple as data type casting and displaying imagery when debugging, causing more trouble than mathematical or logical bugs themselves! Our hope is that the plug-and-play, easily-customizable nature of imagepypelines will allow all data-driven scientists to construct complex frameworks quickly for prototyping applications, and serve as a valuable educational tool for those interested in learning traditionally tough subject matter in a friendly environment!

To achieve this goal, our development team always adheres to the following 5 core principles:

  1. Legos are fun
  2. Coding should be fun
  3. Therefore coding should be like playing with Legos
  4. Imagery is fun, so that will always be our focus
  5. We must suffer, lest our users suffer


(make sure you see the dependencies section)

Python compatibility: 3.4-3.6 (Python 2.7 backwards) 64bit

via pip:

pip install imagepypelines --user

from source:

git clone
cd imagepypelines
python install


for full functionality, imagepypelines requires opencv and tensorflow to be installed on your machine

pip install tensorflow-gpu --user

(or for cpu only)

pip install tensorflow --user

we strongly recommend that you build opencv from source. However unofficial bindings for opencv can be installed with

pip install opencv-python --user

(while we haven't encountered many problems with these unofficial bindings, we do not guarantee support)


Full documentation for imagepypelines, including examples and tutorials, can be found on our website:

Licensing / Credit

imagepypelines is licensed under the MIT permissive software license. You may use this code for commercial or research use so long as it conforms to the terms of the license included in this repo as well as the licenses of imagepypelines dependencies.

Please credit us if you use imagepypelines in your research

What Makes Us Unique?

The Pipeline

imagepypelines's most powerful feature is a high level interface to create data processing pipelines which apply a sequence of algorithms to input data automatically.

In our experience as imaging scientists, processing pipelines in both corporate or academic settings are not always easy to adapt for new purposes and are therefore too often relegated to proof-of-concept applications only. Many custom pipelines may also not provide step-by-step error checking, which can make debugging a challenge. xkcd

(source: XKCD)

The Pipeline object of imagepypelines allows for quick construction and prototyping, ensures end-to-end compatibility through each layer of a workflow, and leverages helpful in-house debugging utilities for use in image-centric or high-dimensional data routines.

The Block

Pipelines in imagepypelines are constructed of processing blocks which apply an algorithm to a sequence of data passed into it.


Each block takes in a list of data and returns a list of data, passing it onto the next block or out of the pipeline. This system ensures that blocks are compatible with algorithms that process data in batches or individually. Blocks also support label handling, and thus are compatible with supervised machine learning systems or other algorithms that require training

Broadly speaking, each box can be thought of as a black box which simply applies an operation to input data block

a datum can be anything: an image array, a filename, a label -- pretty much an pythonic type.

Blocks can also output more or less datums than they take in and are thus capable of being used for culling or injecting data into the pipeline.

Hang on? are all blocks compatible with one another?

not entirely, each block has predefined acceptable inputs and outputs. However the Pipeline object will validate the pipeline integrity before any data is processed

Building a pipeline

building a pipeline is super easy

Image Display Pipeline

import imagepypelines as ip

pipeline = ip.Pipeline(name='image display')
pipeline.add( ip.ImageLoader() ) # each one of these elements are 'blocks'
pipeline.add( ip.Resizer() )
pipeline.add( ip.BlockViewer() )

# now let's display some example data!
pipeline.process( ip.standard_image_filenames() )

We just made a processing pipeline that can read in images, resize them and display them! but we can do much more complicated operations.

Lowpass Filter Pipeline

import imagepypelines as ip

load = ip.ImageLoader()
resize = ip.Resizer(512,512)
fft = ip.FFT()
lowpass = ip.Lowpass(cut_off=32)
ifft = ip.IFFT()
display = ip.BlockViewer(pause_time=1)

pipeline = ip.Pipeline(blocks=[load,resize,fft,lowpass,ifft,display])

# process a set of images (using imagepypelines' example data)
filenames = ip.standard_image_filenames()

Machine Learning Applications

One of the more powerful applications of imagepypelines is it's ease of use in machine learning and feature engineering applications. we can easily tailor a pipeline to perform image classification

this classifier is available as a builtin Pipeline with fully tweakable hyperparameters as ip.SimpleImageClassifier

import imagepypelines as ip

features = ip.PretrainedNetwork() # image feature block
pca = ip.PCA(256) # principle component analysis block
neural_network = ip.MultilayerPerceptron(neurons=512, num_hidden=2) # neural network block

classifier = ip.Pipeline([features,pca,neural_network])

# loading example data
cifar10 = ip.Cifar10()
train_data, train_labels = cifar10.get_train()
test_data, ground_truth = cifar10.get_test()

classifier.train(train_data,train_labels) # train the classifier
predictions = classifier.process(test_data) # test the classifier

# print the accuracy
accuracy = ip.accuracy(predictions,ground_truth) * 100
print('pipeline classification accuracy is {}%!'.format(accuracy))

We just trained a full neural network classifier!

Processing Blocks built into imagepypelines

more are being added with every commit!

I/O operations

  • Image Display
  • Camera Capture
  • Image Loader
  • Image Writing

Machine Learning

  • Linear Support Vector Machine
  • Rbf Support Vector Machine
  • Poly Support Vector Machine
  • Sigmoid Support Vector Machine
  • trainable neural networks
  • 8 Pretrained Neural Networks (for feature extraction)
  • Principle Component Analysis

Image Processing

  • colorspace conversion
  • fast fourier transform
  • frequency filtering
  • Otsu Image Segmentation
  • ORB keypoint and description
  • Image resizing

Designing your own processing blocks

There are two ways to create a block

1) quick block creation

for operations that can be completed in a single function that accepts one datum, you can create a block with a single line.

import imagepypelines as ip

# create the function we use to process images
def normalize_image(img):
	return img / img.max()

# set up the block to work with grayscale and color imagery
io_map = {ip.ArrayType([None,None]):ip.ArrayType([None,None]),

block = ip.quick_block(normalize_image, io_map)

2) object inheritance

this is covered in more detail on our tutorial pages. this example will not cover training or label handling

import imagepypelines as ip

class NormalizeBlock(ip.SimpleBlock):
	"""normalize block between 0 and max_count, inclusive"""
	def __init__(self,max_count=1):
		self.max_count = max_count
		# set up the block to work with grayscale and color imagery
		io_map = {ip.ArrayType([None,None]):ip.ArrayType([None,None]),


	def process(self,img):
		"""overload the processing function for this block"""
		return img.astype(np.float32) / img.max() * self.max_count

Imaging Science Convenience Functions

In addition to the Pipeline, imagepypelines also contains convenience utilities to accelerate the development of imaging science and computer vision tasks

Getting Standard Test Imagery

imagepypelines contains helper functions to quickly retrieve imagery that are frequently used as benchmarks in the Imaging Science community

import imagepypelines as ip
lenna = ip.lenna()
linear_gradient = ip.linear()

A full list of standard images can be retrieved with ip.list_standard_images()

for those of you in the Imaging Science program at RIT, there are a couple easter eggs for ya ;)

import imagepypelines as ip
ip.quick_image_view( ip.carlenna() )
ip.quick_image_view( ip.roger() )
ip.quick_image_view( ip.pig() )

Viewing Imagery

Viewing imagery can be an surprisingly finicky process that differs machine to machine or operating over X11. imagepypelines contains helper functions and objects for this purpose

quick image viewer:

when you want to quickly display an image without any bells and whistles, you can use the quick_image_view function

import imagepypelines as ip
lenna = ip.lenna()

# Now lets display Lenna
ip.quick_image_view( lenna )

# display lenna normalized to 255
ip.quick_image_view(lenna, normalize_and_bin=True)

Robust Image Viewer:

When you want a tool that can display multiple images at once, resize images when desired and an optional frame_counter, you can use the Viewer object

import imagepypelines as ip
import time

# lets build our Viewer and have it auto-resize images to 512x512
viewer = ip.Viewer('Window Title Here', size=(512,512))
# let's enable the frame counter, so we know what image we are on

# get all standard images
standard_images = ip.standard_image_gen()

# now let's display all images sequentially!
for img in standard_images:
	viewer.view( img )

Normalizing and binning an image

forgetting to do this gets ya more often than you might think when displaying an image

import imagepypelines as ip
import numpy as np
random_pattern = np.random.rand(512,512).astype(np.float32)

display_safe = ip.normalize_and_bin(random_pattern)

Array Summarization

when debugging an image pipeline, printing out an image can be counter productive. Imaging scientists frequently default to printing out the shape or size of the data. imagepypelines contains a helper class to quickly summarize an image in a formatted string

import imagepypelines as ip
lenna = ip.lenna()

summary = ip.Summarizer(lenna)

produces the following

'[ARRAY SUMMARY | shape: (512, 512, 3) | size: 786432 | max: 255 | min: 3 | mean: 128.228 | dtype: uint8]'

Image Coordinates

helper functions to get image coordinates quickly, useful if your applications involve a mix of color and grayscale images. Mostly useful to clean up code and avoid silly mistakes

import imagepypelines as ip
lenna = ip.lenna()

# center pixel in the image
center_row, center_col = ip.centroid(lenna)

# number of rows and columns
rows, cols = ip.frame_size(lenna)

# shape and dtype
rows, cols, bands, dtype = ip.dimensions(lenna)


Many imaging tasks are time sensitive or computationally intensive. imagepypelines includes simple tools to time your process or function

Timer Objects

imagepypelines also includes a separate timer for timing things inside a function or code block

absolute timing:
from imagepypelines.util import Timer
import time

t = Timer()
print( t.time(),"seconds" ) # or t.time_ms() for milliseconds
lap timing:
from imagepypelines.util import Timer
import time

t = Timer()
for i in range(10):
	print( t.lap(),"seconds" ) # or t.lap_ms() for milliseconds
perform operation for N seconds:
from imagepypelines.util import Timer
import time

def do_something():

# set the countdown
N = 10 #seconds
t = Timer()
t.countdown = N
while t.countdown:

timing Decorator

let's say we have a function that we think may be slowing down our pipeline. We can add @function_timer on the line above the function and see it automatically print how long the function took to run

from imagepypelines.util import function_timer
from imagepypelines.util import function_timer_ms
import time

# add the decorator here
def we_can_time_in_seconds():

# we can also time the function in milliseconds using '@function_timer_ms'
def or_in_milliseconds():


prints the following when the above code is run

(  function_timer  )[    INFO    ] ran function 'we_can_time_in_seconds' in 1.001sec
(  function_timer  )[    INFO    ] ran function 'or_in_milliseconds' in 1000.118ms

Development Tools in imagepypelines

This section is for developers of imagepypelines or people who want imagepypelines closely integrated with their projects


Are you a scientist??? If so, then you probably use millions of print statements to debug your code. (don't worry, we are all guilty of it)

imagepypelines encourages code traceability through the use of an object known as a Printer. Printers are objects that simply print out what's happening in a manner that's easy to read, color coded, and traceable to the object that is performing the current action. Printers are extremely low overhead and will not affect the speed of your code more than a print statement.

The functionality is similar to python's logging module

making printers

printers can be created or retrieved using the get_printer function

import imagepypelines as ip
printer = ip.get_printer('name your printer here')

printer levels

printer messages can be filtered be priority so that only desired messages can be seen. In imagepypelines, printer levels are also color coded so they can be read easily in a console

import imagepypelines as ip

example_printer = ip.get_printer('example!')
example_printer.debug('message') # prints 'message' at level 10 - blue text'message') # prints 'message' at level 20 - white text
example_printer.warning('message') # prints 'message' at level 30 - yellow text
example_printer.comment('message') # prints 'message' at level 30 - green text
example_printer.error('message') # prints 'message' at level 40 - red text
example_printer.critical('message') # prints 'message' at level 50 - bold red text

Any level that is less than the current GLOBAL_LOG_LEVEL will NOT be printed. This makes it easy to filter out statements which may be erroneous or too numerous to make sense of.

this value can be set with the set_global_printout_level function

import imagepypelines as ip
ip.set_global_printout_level('warning') # debug and info statements will not print now

local printer levels can be set with Printer.set_log_level

import imagepypelines as ip
printer = ip.get_printer('Example Printer')
printer.set_log_level('error') # only error and critical functions will print

(this system is exactly the same as log_levels in python's logging module )

disable or enabling certain printers

Sometimes you may only want to see printouts from a specific class or function. you can do this with the whitelist_printer, blacklist_printer, or disable_all_printers functions

default printer

there's a default printer in imagepypelines which is accessible through functions in the main module

ip.debug('debug message') # level=10 --> (    imagepypelines    )[    DEBUG    ] debug message'info message') # level=20 --> (    imagepypelines    )[    INFO    ] debug message
ip.warning('warning message') # level=30 --> (    imagepypelines    )[    WARNING    ] warning message
ip.error('error message') # level=40 --> (    imagepypelines    )[    ERROR    ] error message
ip.critical('critical message') # level=50 --> (    imagepypelines    )[    CRITICAL    ] critical message
ip.comment('comment message') # level=30 --> (    imagepypelines    )[    COMMENT    ] comment message

class printers

a good strategy to encourage traceability is to create a printer object as a class instance attribute

import imagepypelines as ip

class ExampleClass(object):
	def __init__(self,*args,**kwargs):
		name_of_class = self.__class__.__name__
		self.printer = ip.get_printer(name_of_class)"object instantiated!")


	def do_something(self):
		self.printer.warning("did something!")


produces the following

(   ExampleClass   )[    INFO    ] object instantiated!
(   ExampleClass   )[   WARNING  ] did something!

This way it's easy track what stage of the pipeline your code is in, because each object will have it's own printer and be distinguishable in the terminal!

development decorators

imagepypelines contains four decorators that are made for use by developers in the backend


made to decorate functions or classes that are deprecated

import imagepypelines as ip

@ip.util.deprecated("'old_function' has been renamed to 'new_function'. references will be removed in a future version!")
def old_function():
	pass # real code will do something


produces the following

(    imagepypelines    )[   WARNING  ] DEPRECIATION WARNING: 'old_function' has been renamed to 'new_function'. references will be removed in a future version!


made to decorate functions or classes that are experimental and may not be fully tested yet

import imagepypelines as ip

@ip.util.experimental() # you can include a custom message here if you want
def new_feature():


produces the following

(    imagepypelines    )[   WARNING  ] EXPERIMENTAL WARNING: 'new_feature' is an experimental feature


This is a decorator made for unit tests which require a human to verify functionality. (for example: functions that display images)

WARNING: unlike most decorators, this will not return the output of the wrapped function, but instead True or False. This is because it is meant for Unit Tests, NOT actual use in a pipeline

import imagepypelines as ip
def unit_test_for_quick_image_view():
	ip.quick_image_view( ip.lenna() )

print('test succeeded ': unit_test_for_quick_image_view())

lenna will display

produces the following the terminal for the user to answer

did the test for 'unit_test_for_quick_image_view' succeed? Yes? No?


Decorator to print out the arguments a function is running with. Unlike other decorators described here, we encourage you to use this decorator frequently in your code during development to avoid silly mistakes

import imagepypelines as ip
def func_with_lots_of_args(a, b, c=3, d=4):
func_with_lots_of_args(1, b=2, c='not 3')

produces the following in the terminal

(func_with_lots_of_args)[    INFO    ] running 'func_with_lots_of_args' with the following args:
        positional    | a : 1
        keyword       | b : 2
        keyword       | c : not 3
        default       | d : 4

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

imagepypelines-0.1.2a0.tar.gz (3.6 MB view hashes)

Uploaded source

Built Distributions

imagepypelines-0.1.2a0-py3.6.egg (3.7 MB view hashes)

Uploaded 3 6

imagepypelines-0.1.2a0-py3-none-any.whl (3.6 MB view hashes)

Uploaded py3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page