Skip to main content

A powerful parallel pipelining tool for image processing

Project description

Olympict

coveragestatus

Olympict

Based on olympipe, this project will make image processing pipelines easy to use using the basic multiprocessing module. This module uses type checking to ensure your data process validity from the start.

Basic image processing pipeline

Loading images from a folder and resize them to a new folder

from olympict import ImagePipeline

p0 = ImagePipeline.load_folder("./examples") # path containing the images
p1 = p0.resize((150, 250)) # new width, new height
p2 = p1.save_to_folder("./resized_examples") # path to save the images
p2.wait_for_completion() # the code blocks here until all images are processed

print("Finished resizing")

Loading images from a folder and overwrite them with a new size

from olympict import ImagePipeline

p0 = ImagePipeline.load_folder("./examples") # path containing the images
p1 = p0.resize((150, 250))
p2 = p1.save() # overwrite the images
p2.wait_for_completion()

Loading images from a folder and resize them keeping the aspect ratio using a padding color

from olympict import ImagePipeline
blue = (255, 0, 0) # Colors are BGR to match opencv format

p0 = ImagePipeline.load_folder("./examples")
p1 = p0.resize((150, 250), pad_color=blue)
p2 = p1.save() # overwrite the images
p2.wait_for_completion()

Load image to make a specific operation

from olympict import ImagePipeline, Img

def operation(image: Img) -> Img:
    # set the green channel as a mean of blue and red channels
    img[:, :, 1] = (img[:, :, 0] + img[:, :, 2]) / 2
    return img

p0 = ImagePipeline.load_folder("./examples")
p1 = p0.task_img(operation)
p2 = p1.save() # overwrite the images
p2.wait_for_completion()

Check ongoing operation

from olympict import ImagePipeline, Img

def operation(image: Img) -> Img:
    # set the green channel as a mean of blue and red channels
    img[:, :, 1] = (img[:, :, 0] + img[:, :, 2]) / 2
    return img

p0 = ImagePipeline.load_folder("./examples").debug_window("Raw image")
p1 = p0.task_img(operation).debug_window("Processed image")
p2 = p1.save() # overwrite the images
p2.wait_for_completion()

Load a video and process each individual frame

from olympict import VideoPipeline

p0 = VideoPipeline.load_folder("./examples") # will load .mp4 and .mkv files

p1 = p0.to_sequence() # split each video frame into a basic image

p2 = p1.resize((100, 3), (255, 255, 255)) # resize each image with white padding

p3 = p2.save_to_folder("./sequence") # save images individually

p3.wait_for_completion()

img_paths = glob("./sequence/*.png") # count images

print("Number of images:", len(img_paths))

Complex example with preview windows

import os
from random import randint
import re
import time
from olympict import ImagePipeline
from olympict.files.o_image import OlympImage


def img_simple_order(path: str) -> int:
    number_pattern = r"\d+"
    res = re.findall(number_pattern, os.path.basename(path))

    return int(res[0])


if __name__ == "__main__":

    def wait(x: OlympImage):
        time.sleep(0.1)
        print(x.path)
        return x

    def generator():
        for i in range(96):
            img = np.zeros((256, 256, 3), np.uint8)
            img[i, :, :] = (255, 255, 255)

            o = OlympImage()
            o.path = f'/tmp/{i}.png'
            o.img = img
            yield o
        return

    p = (
        ImagePipeline(generator())
        .task(wait)
        .debug_window("start it")
        .task_img(lambda x: x[::-1, :, :])
        .debug_window("flip it")
        .keep_each_frame_in(1, 3)
        .debug_window("stuttered")
        .draw_bboxes(
            lambda x: [
                (
                    (
                        randint(0, 50),
                        randint(0, 50),
                        randint(100, 200),
                        randint(100, 200),
                        "change",
                        0.5,
                    ),
                    (randint(0, 255), 25, 245),
                )
            ]
        )
        .debug_window("bboxed")
    )

    p.wait_for_completion()

Basic usage

Each pipeline starts from an interator as a source of packets (a list, tuple, or any complex iterator). This pipeline will then be extended by adding basic .task(<function>). The pipeline process join the main process when using the .wait_for_results() or .wait_for_completion() functions.

from olympict import Pipeline

def times_2(x: int) -> int:
    return x * 2

p = Pipeline(range(10))

p1 = p.task(times_2) # Multiply each packet by 2
# or 
p1 = p.task(lambda x: x * 2) # using a lambda function

res = p1.wait_for_results()

print(res) # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Filtering

You can choose which packets to .filter(<keep_function>) by passing them a function returning True or False when applied to this packet.

from olympict import Pipeline

p = Pipeline(range(20))
p1 = p.filter(lambda x: x % 2 == 0) # Keep pair numbers
p2 = p1.batch(2) # Group in arrays of 2 elements

res = p2.wait_for_results()

print(res) # [[0, 2], [4, 6], [8, 10], [12, 14], [16, 18]]

In line formalization

You can chain declarations to have a more readable pipeline.

from olympict import Pipeline

res = Pipeline(range(20)).filter(lambda x: x % 2 == 0).batch(2).wait_for_results()

print(res) # [[0, 2], [4, 6], [8, 10], [12, 14], [16, 18]]

Debugging

Interpolate .debug() function anywhere in the pipe to print packets as they arrive in the pipe.

from olympict import Pipeline

p = Pipeline(range(20))
p1 = p.filter(lambda x: x % 2 == 0).debug() # Keep pair numbers
p2 = p1.batch(2).debug() # Group in arrays of 2 elements

p2.wait_for_completion()

Pipeline forking

For the time being, you have to adapt the code a little bit if you wish to get several outputs for a same pipeline. [This section might be updated soon]

from olympict import Pipeline

p1 = Pipeline(range(10))
p2 = p1.filter(lambda x: x % 2 == 0)
p3 = p1.filter(lambda x: x % 2 == 1)

q2 = p2.prepare_output_buffer()
q3 = p3.prepare_output_buffer()

res3 = p3.wait_for_results(q3)
res2 = p2.wait_for_results(q2)

print(res3) # [1, 3, 5, 7, 9]
print(res2) # [0, 2, 4, 6, 8]

Real time processing (for sound, video...)

Use the .temporal_batch(<seconds_float>) pipe to aggregate packets received at this point each <seconds_float> seconds.

import time
from olympict import Pipeline

def delay(x: int) -> int:
    time.sleep(0.1)
    return x

p = Pipeline(range(20)).task(delay) # Wait 0.1 s for each queue element
p1 = p.filter(lambda x: x % 2 == 0) # Keep pair numbers
p2 = p1.temporal_batch(1.0) # Group in arrays of 2 elements

res = p2.wait_for_results()

print(res) # [[0, 2, 4, 6, 8], [10, 12, 14, 16, 18], []]

Using classes in a pipeline

You can add a stateful class instance to a pipeline. The method used will be typecheked as well to ensure data coherence. You just have to use the .class_task(<Class>, <Class.method>, ...) method where Class.method is the actual method you will use to process each packet.

item_count  = 5

class StockPile:
    def __init__(self, mul:int):
        self.mul = mul
        self.last = 0
        
    def pile(self, num: int) -> int:
        out = self.last
        self.last = num * self.mul
        return out
        

p1 = Pipeline(range(item_count))

p2 = p1.class_task(StockPile, StockPile.pile, [3])

res = p2.wait_for_results()

print(res) # [0, 0, 3, 6, 9]

This project is still an early version, feedback is very helpful.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

olympict-1.0.0.tar.gz (13.3 kB view hashes)

Uploaded Source

Built Distribution

olympict-1.0.0-py3-none-any.whl (14.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page