A Python utility for multiprocessing pipelines

⚡️ Introduction

multipipe is a Python utility that allows you to create pipelines of functions to execute on any given iterable (e.g., lists, generators) by leveraging multiprocessing. multipipe is built on top of multiprocess.
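To make the idea concrete, here is a minimal sketch of the same concept using only the standard library's `multiprocessing` module. It is not multipipe's implementation: `apply_all`, `run_pipeline`, `square`, and `negate` are hypothetical names, and the sketch assumes the functions are applied to each item in list order.

```python
from functools import partial
from multiprocessing import Pool


def square(x):
    # Hypothetical stage used only for illustration.
    return x * x


def negate(x):
    # Hypothetical stage used only for illustration.
    return -x


def apply_all(functions, item):
    """Pass one item through every function, in list order (an assumption)."""
    for function in functions:
        item = function(item)
    return item


def run_pipeline(functions, items, processes=None):
    """Apply the whole chain to each item, spreading items across workers."""
    step = partial(apply_all, functions)
    if processes == 1:
        # Single worker: skip the pool and its start-up cost.
        return [step(item) for item in items]
    with Pool(processes) as pool:
        # imap yields results in the same order as the input items.
        return list(pool.imap(step, items))


# Usage from a script (the guard matters on platforms that spawn workers):
#   if __name__ == "__main__":
#       print(run_pipeline([square, negate], range(4), processes=2))
```

The stage functions live at module level so that worker processes can import them by name, which the standard `multiprocessing` module requires.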

🔌 Requirements

python>=3.8

💾 Installation

pip install multipipe

💡 Examples

Basic usage

from multipipe import Multipipe

def add(x):
    return x + 1

def mul(x):
    return x * 2

pipe = Multipipe([ add, mul ])
pipe(range(10))

Output:

[ 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 ]

Using partials

When a function needs more than one argument, you can use functools.partial to bind the extra arguments before adding it to the pipeline.

from multipipe import Multipipe
from functools import partial

def add(x, y):
    return x + y

def mul(x, y):
    return x * y

pipe = Multipipe([ partial(add, y=1), partial(mul, y=2) ])
pipe(range(10))

Output:

[ 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 ]
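As a quick illustration of what `functools.partial` does on its own, independent of multipipe: it fixes some arguments ahead of time and returns a callable that takes the remaining ones. The `power`, `square`, and `cube` names below are hypothetical and exist only for this sketch.

```python
from functools import partial


def power(base, exponent):
    return base ** exponent


# Bind the keyword argument up front; each result takes a single value.
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(4))        # 16
print(cube(2))          # 8
print(square.keywords)  # {'exponent': 2}
```

Each partial behaves like a one-argument function, which matches how the examples above feed one item at a time through the pipeline.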

Complex IO pipeline

In this example, we lazily read data from a JSONL file, run a pipeline of functions over it lazily, and write the results to a new JSONL file. In practice, this lets you process huge files without loading their entire content into memory at once.

from multipipe import Multipipe
from unified_io import read_jsonl, write_jsonl

# Create a pipeline of functions
pipe = Multipipe([ ... ])

# Read a JSONL file line by line as a generator, i.e., lazily
in_data = read_jsonl("path/to/input/file.jsonl", generator=True)

# This is still a generator.
# The pipeline will be executed lazily.
out_data = pipe(in_data, generator=True)

# Write a JSONL file from the generator executing the pipeline
write_jsonl(out_data, "path/to/output/file.jsonl")
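The lazy read, transform, and write pattern can be sketched with the standard library alone. This is not how `unified_io` or multipipe are implemented; `iter_jsonl`, `dump_jsonl`, and `add_length` are hypothetical helpers, and the transformation runs in a single process via `map` to keep the sketch short. At no point is the whole file held in memory: each record is read, transformed, and written before the next one is touched.

```python
import json
import os
import tempfile


def iter_jsonl(path):
    """Yield one parsed record per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


def dump_jsonl(records, path):
    """Write records one per line as they arrive; return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
            count += 1
    return count


def add_length(record):
    # Hypothetical per-record transformation.
    return {**record, "length": len(record["text"])}


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.jsonl")
    dst = os.path.join(tmp, "out.jsonl")
    with open(src, "w", encoding="utf-8") as handle:
        handle.write('{"text": "ab"}\n{"text": "abcd"}\n')
    # map() is lazy, so records flow through one at a time.
    written = dump_jsonl(map(add_length, iter_jsonl(src)), dst)
    results = list(iter_jsonl(dst))

print(written)  # 2
```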

🎁 Feature Requests

Would you like to see other features implemented? Please open a feature request.

🤘 Want to contribute?

Would you like to contribute? Please drop me an email.

📄 License

multipipe is open-source software licensed under the MIT license.
