A Python utility for multiprocessing pipelines

⚡️ Introduction

multipipe is a Python utility for building pipelines of functions and running them over any iterable (e.g., lists, generators) with multiprocessing. It is built on top of multiprocess.

🔌 Requirements

python>=3.8

💾 Installation

pip install multipipe

💡 Examples

Basic usage

from multipipe import Multipipe

def add(x):
    return x + 1

def mul(x):
    return x * 2

pipe = Multipipe([ add, mul ])
pipe(range(10))

Output:

[ 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 ]
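
To make the composition explicit, here is a minimal single-process sketch of what a pipeline of `add` followed by `mul` computes. It assumes the functions are applied to each item in list order. `run_sequentially` is a hypothetical helper written for illustration, not part of multipipe, and it does no multiprocessing.

```python
from functools import reduce


def add(x):
    return x + 1


def mul(x):
    return x * 2


def run_sequentially(functions, iterable):
    """Apply each function, in list order, to every item (hypothetical helper)."""
    for item in iterable:
        # Feed the item through the functions one after another.
        yield reduce(lambda value, fn: fn(value), functions, item)


result = list(run_sequentially([add, mul], range(10)))
print(result)  # [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```

Because the helper is a generator, it also works on inputs that are themselves generators.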

Using partials

To pass extra arguments to the functions in a pipeline, bind them with functools.partial.

from multipipe import Multipipe
from functools import partial

def add(x, y):
    return x + y

def mul(x, y):
    return x * y

pipe = Multipipe([ partial(add, y=1), partial(mul, y=2) ])
pipe(range(10))

Output:

[ 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 ]
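
As a quick illustration of what `partial` does here, the sketch below uses only the standard library. It binds `y` ahead of time so that the resulting callable takes a single argument, which is the shape a pipeline step needs. The name `add_one` is introduced only for this example.

```python
from functools import partial


def add(x, y):
    return x + y


# Bind y=1 now; the resulting callable only needs x.
add_one = partial(add, y=1)

print(add_one.func is add)  # True: the partial wraps the original function
print(add_one.keywords)     # {'y': 1}: the bound keyword arguments
print(add_one(3))           # 4
```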

Complex IO pipeline

In this example, we lazily read data from a JSONL file, run a pipeline of functions lazily, and write the results to a new JSONL file. In practice, this lets you process huge files without loading their entire content into memory at once.

from multipipe import Multipipe
from unified_io import read_jsonl, write_jsonl

# Create a pipeline of functions
pipe = Multipipe([ ... ])

# Read a JSONL file line by line as a generator, i.e., lazily
in_data = read_jsonl("path/to/input/file.jsonl", generator=True)

# This is still a generator.
# The pipeline will be executed lazily.
out_data = pipe(in_data, generator=True)

# Write a JSONL file from the generator executing the pipeline
write_jsonl(out_data, "path/to/output/file.jsonl")
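
The lazy read, process, write pattern can also be sketched with the standard library alone. The helpers below (`read_jsonl_lazy`, `write_jsonl_lazy`, `add_length`) are hypothetical stand-ins written for illustration. They are not the unified_io or multipipe implementations, and they run in a single process.

```python
import json
import os
import tempfile


def read_jsonl_lazy(path):
    """Yield one parsed JSON object per non-empty line (hypothetical stand-in)."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def write_jsonl_lazy(records, path):
    """Write one JSON object per line, pulling records one at a time."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def add_length(record):
    """Example per-record step: attach the length of the text field."""
    return {**record, "length": len(record["text"])}


# Demo on a small temporary file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "input.jsonl")
    dst = os.path.join(tmp, "output.jsonl")
    write_jsonl_lazy([{"text": "hello"}, {"text": "hi"}], src)

    # map() is lazy, so each record is read, transformed, and written in turn.
    write_jsonl_lazy(map(add_length, read_jsonl_lazy(src)), dst)

    results = list(read_jsonl_lazy(dst))

print(results)  # [{'text': 'hello', 'length': 5}, {'text': 'hi', 'length': 2}]
```

At no point is the whole input held in memory: only the record currently being processed is alive.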

🎁 Feature Requests

Would you like to see other features implemented? Please open a feature request.

🤘 Want to contribute?

Would you like to contribute? Please drop me an e-mail.

📄 License

multipipe is open-source software licensed under the MIT license.
