Simple Stupid Pipe

SSPipe is a Python productivity tool for rapid data manipulation.

It helps you break up any complicated expression into a sequence of simple transformations, increasing human-readability and decreasing the need for matching parentheses!

If you're familiar with the | operator of Unix, the %>% operator of R's magrittr, or the DataFrame.pipe method of the pandas library, sspipe provides the same functionality for any object in Python.

Installation and Usage

Install sspipe using pip:

pip install --upgrade sspipe

Then import it in your scripts.

from sspipe import p

Although a few other helper objects are provided, the whole functionality of this library is exposed by the p object you imported in the script above.

Introduction

Suppose we want to generate a dict that maps the names of the 5 biggest files in the current directory to their sizes in bytes, like below:

{'README.md': 3732, 'setup.py': 1642, '.gitignore': 1203, 'LICENSE': 1068, 'deploy.sh': 89}

One approach is to use os.listdir() to list the files and directories in the current working directory, keep only the entries that are files, map each to a tuple of (name, size), sort them by size in descending order, take the first 5 items, make a dict, and print it.

Writing a whole script as a single expression without intermediary variables is not good practice. As a deliberately exaggerated example for demonstration purposes, here it is as a single expression:

import os

print(
    dict(
        sorted(
            map(
                lambda x: (x, os.path.getsize(x)),
                filter(os.path.isfile, os.listdir('.'))
            ), key=lambda x: x[1], reverse=True
        )[:5]
    )
)
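For comparison, the conventional alternative with intermediary variables might look like the sketch below. The function name biggest_files and its parameters directory and count are illustrative choices, not part of sspipe or of the original example.

```python
import os


def biggest_files(directory='.', count=5):
    """Map the names of the `count` largest files in `directory` to their sizes."""
    # Build full paths so the checks work for any directory, not only the cwd.
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    # Keep only regular files, skipping sub-directories.
    files = [path for path in paths if os.path.isfile(path)]
    # Pair each file name with its size in bytes.
    sizes = [(os.path.basename(path), os.path.getsize(path)) for path in files]
    # Largest first, then keep the top `count` entries.
    largest = sorted(sizes, key=lambda pair: pair[1], reverse=True)[:count]
    return dict(largest)
```

Calling print(biggest_files()) produces the same kind of dict as the single expression above. Each step is readable, but every intermediate result needs a name.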

Using sspipe's p operator, the same single expression can be written in a more human-readable flow of sequential transformations:

import os
from sspipe import p

(
    os.listdir('.')
    | p(filter, os.path.isfile)
    | p(map, lambda x: (x, os.path.getsize(x)))
    | p(sorted, key=lambda x: x[1], reverse=True)[:5]
    | p(dict)
    | p(print)
)

As you can see, the expression is decomposed into a sequence that starts with the initial data, os.listdir('.'), followed by multiple | p(...) stages.

Each | p(...) stage describes a transformation that is applied to the left-hand side of |.

The first argument of p() is the function that is applied to the data. For example, x | p(f1) | p(f2) | p(f3) is equivalent to f3(f2(f1(x))).

The remaining arguments of p() are passed to the transforming function of each stage. For example, x | p(f1, y) | p(f2, k=z) is equivalent to f2(f1(y, x), k=z).
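The argument order in that equivalence resembles what functools.partial from the standard library does: the stored arguments come first and the incoming value is appended. The sketch below is only an analogy built on functools.partial, not sspipe's implementation; f1, f2, x, y and z are placeholder names.

```python
from functools import partial


def f1(a, b):
    # Record the argument order so it can be inspected afterwards.
    return ('f1', a, b)


def f2(value, k=None):
    return ('f2', value, k)


x, y, z = 'x', 'y', 'z'

# Analogue of the stage p(f1, y): y is stored, the piped value x arrives last.
stage1 = partial(f1, y)
# Analogue of the stage p(f2, k=z): the keyword argument is stored as-is.
stage2 = partial(f2, k=z)

piped = stage2(stage1(x))
nested = f2(f1(y, x), k=z)
assert piped == nested
```

This is also why p(filter, os.path.isfile) in the introduction ends up calling filter(os.path.isfile, ...) on the piped list.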

Advanced Guide

The px helper

TODO: explain.

  • px is implemented by: px = p(lambda x: x)
  • px is similar to, but not the same as, magrittr's dot (.) placeholder
    • x | p(f, px+1, y, px+2) is equivalent to f(x+1, y, x+2)
  • A+1 | f(px, px[2](px.y)) is equivalent to f(A+1, (A+1)[2]((A+1).y))
  • px can be used to avoid extra parentheses
    • x+1 | px * 2 | np.log(px)+3 is equivalent to: np.log((x+1) * 2) + 3
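To make the placeholder idea concrete, here is a minimal sketch of a deferred-expression object. The class name Deferred and its methods are hypothetical and written only for this illustration; this is not sspipe's actual code, and it covers only +, * and indexing.

```python
import operator


class Deferred:
    """Hypothetical stand-in for px: records operations and replays them later."""

    def __init__(self, func=lambda value: value):
        # With the default identity function, the object stands for "the piped value".
        self._func = func

    def __ror__(self, value):
        # `value | deferred` evaluates the recorded expression on `value`.
        return self._func(value)

    def _binary(self, op, other):
        def run(value):
            # Resolve nested placeholders against the same piped value.
            right = other._func(value) if isinstance(other, Deferred) else other
            return op(self._func(value), right)
        return Deferred(run)

    def __add__(self, other):
        return self._binary(operator.add, other)

    def __mul__(self, other):
        return self._binary(operator.mul, other)

    def __getitem__(self, key):
        return self._binary(operator.getitem, key)


px = Deferred()

# + and * bind tighter than |, so this parses as ((4 + 1) | (px * 2)) | (px + 3).
assert (4 + 1 | px * 2 | px + 3) == 13
# Indexing is recorded too: take element 1 of the piped list, then double it.
assert ([10, 20, 30] | px[1] * 2) == 40
```

The parentheses can be dropped because Python gives | lower precedence than the arithmetic operators, so each stage is fully built before the pipe is applied.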

Integration with NumPy, pandas, PyTorch

TODO: explain.

  • p and px are compatible with NumPy, pandas, and PyTorch.
  • [1,2] | p(pd.Series) | px[px ** 2 < np.log(px) + 1] is equivalent to x=pd.Series([1, 2]); x[x**2 < np.log(x)+1]

Integration with PyToolz

TODO: explain.

PyToolz provides a set of utility functions for iterators, functions, and dictionaries. For each utility function f() provided by PyToolz, p.f() is the piped version of that utility.

  • {'x': 1, 'y': 7} | p.valmap(px+1) equals {'x': 2, 'y': 8}
  • range(5) | p.map(px**2) | p(list) equals [0, 1, 4, 9, 16]
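For readers unfamiliar with these utilities, the two results above can be reproduced in plain Python. The valmap function below is a stand-in written for this sketch so that it does not depend on toolz being installed; it is assumed to mirror what toolz's valmap does for a plain dict.

```python
def valmap(func, mapping):
    """Stand-in for toolz's valmap: apply func to every value and keep the keys."""
    return {key: func(value) for key, value in mapping.items()}


# Counterpart of: {'x': 1, 'y': 7} | p.valmap(px+1)
incremented = valmap(lambda value: value + 1, {'x': 1, 'y': 7})
assert incremented == {'x': 2, 'y': 8}

# Counterpart of: range(5) | p.map(px**2) | p(list)
squares = list(map(lambda value: value ** 2, range(5)))
assert squares == [0, 1, 4, 9, 16]
```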

Internals

TODO: explain.

  • p is a class that overrides the __ror__ method (the reflected | operator) so that the wrapped function is applied to the left-hand operand.
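The mechanism can be sketched in a few lines. MiniPipe is a hypothetical, heavily simplified stand-in written for this illustration; it is not sspipe's source code, and it assumes the argument order given in the introduction, where the piped value is appended after the stored positional arguments.

```python
class MiniPipe:
    """Hypothetical, simplified stand-in for p."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __ror__(self, operand):
        # For `operand | pipe`, Python first tries the left operand's __or__.
        # When that is missing or returns NotImplemented, it falls back to
        # pipe.__ror__(operand), which is where the function gets applied.
        return self.func(*self.args, operand, **self.kwargs)


result = (
    range(10)
    | MiniPipe(filter, lambda n: n % 2 == 0)   # keep even numbers
    | MiniPipe(map, lambda n: n * n)           # square them
    | MiniPipe(sorted, reverse=True)           # largest first
    | MiniPipe(list)
)
assert result == [64, 36, 16, 4, 0]
```

Because the dispatch relies on the left operand not handling | itself, the right-hand object gets control for ordinary values such as lists, ranges and iterators.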
