Simple Smart Pipe Operator

Simple Smart Pipe

SSPipe is a Python productivity tool for rapid data manipulation.

It helps you break up any complicated expression into a sequence of simple transformations, increasing human-readability and decreasing the need for matching parentheses!

If you're familiar with the | operator of Unix, the %>% operator of R's magrittr, or the DataFrame.pipe method of the pandas library, sspipe provides the same functionality for any object in Python.

Installation and Usage

Install sspipe using pip:

pip install --upgrade sspipe

Then import it in your scripts.

from sspipe import p, px

The library's entire functionality is exposed through two objects: p (a wrapper for functions to be called on the piped object) and px (a placeholder for the piped object).

Examples

Each example gives a description, the Python expression using p and px, and the equivalent plain Python code.

Simple function call
    "hello world!" | p(print)
  Equivalent Python code:
    X = "hello world!"
    print(X)

Function call with extra args
    "hello" | p(print, "world", end='!')
  Equivalent Python code:
    X = "hello"
    print(X, "world", end='!')

Explicitly positioning piped argument with px placeholder
    "world" | p(print, "hello", px, "!")
  Equivalent Python code:
    X = "world"
    print("hello", X, "!")

Chaining pipes
    5 | px + 2 | px ** 5 + px | p(print)
  Equivalent Python code:
    X = 5
    X = X + 2
    X = X ** 5 + X
    print(X)

Tailored behavior for builtin map and filter
    (
      range(5)
      | p(filter, px % 2 == 0)
      | p(map, px + 10)
      | p(list) | p(print)
    )
  Equivalent Python code:
    X = range(5)
    X = filter((lambda x: x % 2 == 0), X)
    X = map((lambda x: x + 10), X)
    print(list(X))

NumPy expressions
    range(10) | np.sin(px)+1 | p(plt.plot)
  Equivalent Python code:
    X = range(10)
    X = np.sin(X) + 1
    plt.plot(X)

Pandas support
    people_df | px.loc[px.age > 10, 'name']
  Equivalent Python code:
    X = people_df
    X.loc[X.age > 10, 'name']

Assignment
    people_df['name'] |= px.str.upper()
  Equivalent Python code:
    X = people_df['name']
    X = X.str.upper()
    people_df['name'] = X

Builtin Data Structures
    2 | p({px-1: p([px, p((px+1, 4))])})
  Equivalent Python code:
    X = 2
    X = {X-1: [X, (X+1, 4)]}

Introduction

Suppose we want to generate a dict that maps the names of the 5 biggest files in the current directory to their sizes in bytes, like below:

{'README.md': 3732, 'setup.py': 1642, '.gitignore': 1203, 'LICENSE': 1068, 'deploy.sh': 89}

One approach is to use os.listdir() to list the files and directories in the current working directory, filter those that are files, map each to a (name, size) pair, sort them by size, take the first 5 items, make a dict, and print it.

Writing a whole script as a single expression without intermediate variables is not good practice, but as a deliberately exaggerated example for demonstration purposes, here it is in one expression:

import os

print(
    dict(
        sorted(
            map(
                lambda x: [x, os.path.getsize(x)],
                filter(os.path.isfile, os.listdir('.'))
            ), key=lambda x: x[1], reverse=True
        )[:5]
    )
)
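For contrast, a more conventional version with intermediate variables might look like the sketch below. The function name biggest_files and its directory and count parameters are illustrative assumptions, not part of sspipe or of the original example.

```python
import os


def biggest_files(directory=".", count=5):
    """Map the names of the `count` largest files in `directory` to their sizes."""
    # Keep only regular files. Joining with `directory` makes the check work
    # even when `directory` is not the current working directory.
    names = [
        name
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    ]
    # Pair each file name with its size in bytes.
    sizes = [
        (name, os.path.getsize(os.path.join(directory, name))) for name in names
    ]
    # Sort largest first, then keep the top `count` entries.
    largest = sorted(sizes, key=lambda pair: pair[1], reverse=True)[:count]
    return dict(largest)
```

Calling print(biggest_files()) in a project directory would print a dict shaped like the one shown earlier. Every step is named here, at the cost of several temporary variables.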

Using sspipe's p operator, the same single expression can be written in a more human-readable flow of sequential transformations:

import os
from sspipe import p

(
    os.listdir('.')
    | p(filter, os.path.isfile)
    | p(map, lambda x: [x, os.path.getsize(x)])
    | p(sorted, key=lambda x: x[1], reverse=True)[:5]
    | p(dict)
    | p(print)
)

As you see, the expression is decomposed into a sequence starting with the initial data, os.listdir('.'), followed by multiple | p(...) stages.

Each | p(...) stage describes a transformation that is applied to the left-hand side of |.

The first argument of p() is the function that is applied to the data. For example, x | p(f1) | p(f2) | p(f3) is equivalent to f3(f2(f1(x))).

The remaining arguments of p() are passed to the transforming function of that stage. For example, x | p(f1, y) | p(f2, k=z) is equivalent to f2(f1(x, y), k=z).
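The left-to-right evaluation described above can be modelled without sspipe as a fold over a list of stages. The helpers stage and run below are hypothetical names invented for this sketch. They only mimic how the piped value becomes the first positional argument of each stage and say nothing about how sspipe itself is implemented.

```python
from functools import reduce


def stage(func, *args, **kwargs):
    """Hypothetical stand-in for p(func, *args, **kwargs)."""
    # The piped value is passed as the first positional argument of func.
    return lambda value: func(value, *args, **kwargs)


def run(value, *stages):
    """Feed value through each stage from left to right."""
    return reduce(lambda current, step: step(current), stages, value)


# Models: "a,b,c" | p(str.split, ",") | p(sorted, reverse=True)
piped = run("a,b,c", stage(str.split, ","), stage(sorted, reverse=True))

# The nested-call form, f2(f1(x, y), k=z), that the pipeline stands for.
nested = sorted(str.split("a,b,c", ","), reverse=True)
```

Both piped and nested hold the same list, which is the equivalence the two examples above state.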

Advanced Guide

The px helper

TODO: explain.

  • px is implemented by: px = p(lambda x: x)
  • px is similar to, but not the same as, magrittr's dot(.) placeholder
    • x | p(f, px+1, y, px+2) is equivalent to f(x+1, y, x+2)
  • A+1 | p(f, px, px[2](px.y)) is equivalent to f(A+1, (A+1)[2]((A+1).y))
  • px can be used to prevent adding parentheses
    • x+1 | px * 2 | np.log(px)+3 is equivalent to: np.log((x+1) * 2) + 3
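One way to picture a placeholder like px is as an object that records operations instead of performing them, then replays them once a value is piped in. The Deferred class below is a hypothetical, stdlib-only sketch of that idea. It supports only +, * and **, and it is not sspipe's actual implementation.

```python
import operator


def _defer(op):
    """Build an operator method that records op instead of running it."""
    def method(self, other):
        def evaluate(value):
            # Another placeholder on the right is resolved against the same value.
            right = other._func(value) if isinstance(other, Deferred) else other
            return op(self._func(value), right)
        return Deferred(evaluate)
    return method


class Deferred:
    """Hypothetical placeholder: wraps a one-argument function."""

    def __init__(self, func=lambda value: value):
        self._func = func

    __add__ = _defer(operator.add)
    __mul__ = _defer(operator.mul)
    __pow__ = _defer(operator.pow)

    def __ror__(self, value):
        # `value | placeholder_expression` replays the recorded operations.
        return self._func(value)


placeholder = Deferred()

# | binds more loosely than +, * and **, so no parentheses are needed:
# this parses as (5 | (placeholder + 2)) | ((placeholder ** 5) + placeholder).
chained = 5 | placeholder + 2 | placeholder ** 5 + placeholder
doubled = 3 + 1 | placeholder * 2
```

Here chained is (5 + 2) ** 5 + (5 + 2) and doubled is (3 + 1) * 2. Operator precedence is what lets such expressions be written without extra parentheses.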

Integration with NumPy, Pandas, PyTorch

TODO: explain.

  • p and px are compatible with NumPy, Pandas, PyTorch.
  • [1,2] | p(pd.Series) | px[px ** 2 < np.log(px) + 1] is equivalent to x=pd.Series([1, 2]); x[x**2 < np.log(x)+1]

Compatibility with JulienPalard/Pipe

This library is inspired by, and depends on, the intelligent and concise work of JulienPalard/Pipe. If you want a single pipe.py script or a lightweight library that implements core functionality and logic of SSPipe, Pipe is perfect.

SSPipe focuses on making pipes easier to use by integrating with popular libraries, introducing the px concept, and overriding Python operators to make pipes a first-class citizen.

Every existing pipe implemented by JulienPalard/Pipe library is accessible through p.<original_name> and is compatible with SSPipe. SSPipe does not implement any specific pipe function and delegates implementation and naming of pipe functions to JulienPalard/Pipe.

For example, JulienPalard/Pipe's example for solving "Find the sum of all the even-valued terms in Fibonacci which do not exceed four million." can be re-written using sspipe:

def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

from sspipe import p, px

euler2 = (fib() | p.where(lambda x: x % 2 == 0)
                | p.take_while(lambda x: x < 4000000)
                | p.add())

You can also pass px shorthands to JulienPalard/Pipe API:

euler2 = (fib() | p.where(px % 2 == 0)
                | p.take_while(px < 4000000)
                | p.add())
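As a cross-check, the same computation can be written with only the standard library. The sketch below assumes nothing about sspipe or Pipe, and the name euler2_plain is chosen here purely for illustration.

```python
from itertools import takewhile


def fib():
    """Yield the Fibonacci sequence 0, 1, 1, 2, 3, 5, ... indefinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Same order as the pipeline: keep even terms, stop at four million, add up.
even_terms = (x for x in fib() if x % 2 == 0)
euler2_plain = sum(takewhile(lambda x: x < 4000000, even_terms))
```

euler2_plain evaluates to 4613732, which is the value the piped versions are expected to produce.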

Internals

TODO: explain.

  • p is a class that overrides the __ror__ (|) operator to apply the wrapped function to the left-hand operand.
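The __ror__ mechanism mentioned above can be illustrated with a few lines of plain Python. MiniPipe is a hypothetical, heavily simplified stand-in written for this sketch. It is not sspipe's source code and omits px, slicing, and the library integrations.

```python
class MiniPipe:
    """Hypothetical, simplified stand-in for p."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __ror__(self, left):
        # For `left | self`, Python first tries type(left).__or__; when that
        # is missing or returns NotImplemented, it calls self.__ror__(left).
        return self.func(left, *self.args, **self.kwargs)


shouted = "hello" | MiniPipe(str.upper)
count = [3, 1, 2] | MiniPipe(sorted, reverse=True) | MiniPipe(len)
```

A left operand whose own __or__ accepts arbitrary objects would never reach this fallback, and the sketch makes no attempt to handle that case.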
