lined

Building simple pipelines, simply.

And lightly, too: no dependencies, just pure built-in Python.

A really simple example:

>>> from lined import Pipeline
>>> p = Pipeline(sum, str)
>>> p([2, 3])
'5'

A still quite simple example:

>>> def first(a, b=1):
...     return a * b
>>>
>>> def last(c) -> float:
...     return c + 10
>>>
>>> f = Pipeline(first, last)
>>>
>>> assert f(2) == 12
>>> assert f(2, 10) == 30

Let's check out the signature of f:

>>> from inspect import signature
>>>
>>> assert str(signature(f)) == '(a, b=1) -> float'
>>> assert signature(f).parameters == signature(first).parameters
>>> assert signature(f).return_annotation == signature(last).return_annotation == float
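
To see how a composed callable can report a signature like this, here is a minimal sketch. It is not lined's actual implementation: `MiniPipeline` is a hypothetical stand-in, and its internals are assumptions made for illustration. It takes its parameters from the first function and its return annotation from the last, and stores the result in `__signature__`, which `inspect.signature` consults.

```python
from inspect import Signature, signature


class MiniPipeline:
    """Hypothetical stand-in for a pipeline; not lined's code."""

    def __init__(self, *funcs):
        self.funcs = funcs
        first_sig = signature(funcs[0])
        last_sig = signature(funcs[-1])
        # Inputs come from the first function, the output type from the last.
        self.__signature__ = Signature(
            parameters=list(first_sig.parameters.values()),
            return_annotation=last_sig.return_annotation,
        )

    def __call__(self, *args, **kwargs):
        first, *rest = self.funcs
        out = first(*args, **kwargs)  # only the first step sees all arguments
        for func in rest:
            out = func(out)  # every later step gets the previous output
        return out


def first(a, b=1):
    return a * b


def last(c) -> float:
    return c + 10


f = MiniPipeline(first, last)
assert f(2, 10) == 30
assert str(signature(f)) == '(a, b=1) -> float'
```

Note that this sketch only works when `inspect.signature` can introspect the first and last functions, which is not the case for every builtin.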

Edge case: a single function

>>> same_as_first = Pipeline(first)
>>> assert same_as_first(42) == first(42)

More?

string and dot digraph representations

Pipeline's string representation (__repr__) and how it deals with callables that don't have a __name__ (hint: it makes one up):

from lined.base import Pipeline
from functools import partial
import numpy as np  # only needed for the np.log step in this example

pipe = Pipeline(sum, np.log, str, print, partial(map, str), name='some_name')
pipe
Pipeline(sum, log, str, print, unnamed_func_001, name='some_name')
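
The naming behavior above can be approximated as follows. This is a sketch under assumptions, not lined's code: `func_names` is a hypothetical helper that uses a callable's `__name__` when it has one and otherwise makes up a numbered name in the `unnamed_func_001` style shown in the output.

```python
from functools import partial
from itertools import count


def func_names(funcs):
    """Return one display name per callable (hypothetical helper)."""
    counter = count(1)  # numbers the callables that lack a __name__
    names = []
    for func in funcs:
        name = getattr(func, "__name__", None)
        if name is None:
            # functools.partial objects, for example, carry no __name__
            name = f"unnamed_func_{next(counter):03d}"
        names.append(name)
    return names


names = func_names([sum, str, print, partial(map, str)])
print(names)
```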

If you have graphviz installed, you can also do this:

pipe.dot_digraph()


And if you don't, but have some other dot language interpreter, you can just get the body (and fiddle with it):

print('\n'.join(pipe.dot_digraph_body()))
rankdir="LR"
sum [shape="box"]
log [shape="box"]
str [shape="box"]
print [shape="box"]
unnamed_func_001 [shape="box"]
sum -> log
log -> str
str -> print
print -> unnamed_func_001
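
The body above has a simple shape: a layout line, one box per step, then one edge per pair of consecutive steps. As a rough illustration, it could be generated like this. `dot_body_lines` is a hypothetical helper written for this example, not part of lined.

```python
def dot_body_lines(names, rankdir="LR"):
    """Yield dot-language lines for a left-to-right chain of boxes."""
    yield f'rankdir="{rankdir}"'
    for name in names:
        yield f'{name} [shape="box"]'  # one node per step
    for source, target in zip(names, names[1:]):
        yield f"{source} -> {target}"  # one edge per consecutive pair


body = list(dot_body_lines(["sum", "log", "str"]))
print("\n".join(body))
```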

Optionally, a pipeline can have an input_name and/or an output_name. These will be used in the string representation and the dot digraph.

pipe = Pipeline(sum, np.log, str, print, partial(map, str), input_name='x', output_name='y')
str(pipe)
"Pipeline(sum, log, str, print, unnamed_func_001, name='some_name')"
pipe.dot_digraph()


Tools

iterize and iterate

from lined import Pipeline

pipe = Pipeline(lambda x: x * 2, 
                lambda x: f"hello {x}")
pipe(1)
'hello 2'

But what if you wanted to use the pipeline on a "stream" of data? The following wouldn't work:

try:
    pipe(iter([1,2,3]))
except TypeError as e:
    print(f"{type(e).__name__}: {e}")
TypeError: unsupported operand type(s) for *: 'list_iterator' and 'int'

Remember that error: You'll surely encounter it at some point.

The solution is (often) iterize, which transforms a function meant to be applied to a single object into a function meant to be applied to an array, or any iterable, of such objects. (If you use numpy, for example, you might be familiar with the related concept of "vectorization", or array programming.)

from lined import Pipeline, iterize
from typing import Iterable

pipe = Pipeline(iterize(lambda x: x * 2), 
                iterize(lambda x: f"hello {x}"))
iterable = pipe([1, 2, 3])
assert isinstance(iterable, Iterable)  # see that the result is an iterable
list(iterable)  # consume the iterable and gather its items
['hello 2', 'hello 4', 'hello 6']
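
One way such a wrapper could work is to bind the function to the builtin `map`. The sketch below assumes that approach; `iterize_sketch` is a hypothetical name, and lined's own iterize may be implemented differently.

```python
from functools import partial


def iterize_sketch(func):
    """Turn a one-item function into one that maps lazily over an iterable."""
    return partial(map, func)


double_each = iterize_sketch(lambda x: x * 2)
# Works on iterators too; nothing is computed until the result is consumed.
doubled = double_each(iter([1, 2, 3]))
```

Because `map` is lazy, the wrapped function returns an iterator, which is why the result has to be consumed (with `list`, for example) before any values appear.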

Now say that, instead of just computing the string, the last step actually printed it (a "callback" function, whose result is less important than its effect, like storing something).

from lined import Pipeline, iterize

pipe = Pipeline(iterize(lambda x: x * 2), 
                iterize(lambda x: print(f"hello {x}")),
               )

for _ in pipe([1, 2, 3]):
    pass
hello 2
hello 4
hello 6

It could be a bit awkward to have to "consume" the iterable to have it take effect.

Just calling

pipe([1, 2, 3])

to get those prints seems more natural.

This is where you can use iterate. It basically "launches" that consuming loop for you.

from lined import Pipeline, iterize, iterate

pipe = Pipeline(iterize(lambda x: x * 2), 
                iterize(lambda x: print(f"hello {x}")),
                iterate
               )

pipe([1, 2, 3])
hello 2
hello 4
hello 6
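
For intuition, a consuming step like this can be written in a few lines. The following is a sketch, assuming the step only needs to exhaust its input and discard the items; `iterate_sketch` is a hypothetical name, not lined's implementation.

```python
from collections import deque


def iterate_sketch(iterable):
    """Consume an iterable for its side effects, discarding every item."""
    deque(iterable, maxlen=0)  # a zero-length deque keeps nothing


seen = []
# map is lazy: seen.append runs only once iterate_sketch consumes the iterator.
iterate_sketch(map(seen.append, [2, 4, 6]))
assert seen == [2, 4, 6]
```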
