A framework for data piping in python

pipda

Inspired by siuba, dfply, plydata and dplython, but with simple yet powerful APIs to mimic the dplyr and tidyr packages in python

Installation

pip install -U pipda

Usage

Verbs

A verb is pipeable (it can be called as data >> verb(...))
A verb is dispatched on the type of its first argument
A verb evaluates its other arguments against the first one (the data)
A verb passes its context down to its arguments unless they specify their own

import pandas as pd
from pipda import (
    register_verb,
    register_func,
    register_operator,
    evaluate_expr,
    Operator,
    Symbolic,
    Context
)

f = Symbolic()

df = pd.DataFrame({
    'x': [0, 1, 2, 3],
    'y': ['zero', 'one', 'two', 'three']
})

df

#      x    y
# 0    0    zero
# 1    1    one
# 2    2    two
# 3    3    three

@register_verb(pd.DataFrame)
def head(data, n=5):
    return data.head(n)

df >> head(2)
#      x    y
# 0    0    zero
# 1    1    one

@register_verb(pd.DataFrame, context=Context.EVAL)
def mutate(data, **kwargs):
    data = data.copy()
    for key, val in kwargs.items():
        data[key] = val
    return data

df >> mutate(z=1)
#    x      y  z
# 0  0   zero  1
# 1  1    one  1
# 2  2    two  1
# 3  3  three  1

df >> mutate(z=f.x)
#    x      y  z
# 0  0   zero  0
# 1  1    one  1
# 2  2    two  2
# 3  3  three  3
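
The dispatch-by-type behavior listed above can be pictured with the standard library's functools.singledispatch. This is only a sketch of the idea, not pipda's implementation: the list and dict implementations are hypothetical stand-ins for DataFrame-like types, and no piping is involved.

```python
from functools import singledispatch


@singledispatch
def head(data, n=5):
    # Fallback for types with no registered implementation.
    raise NotImplementedError(
        f"head() is not registered for {type(data).__name__}"
    )


@head.register(list)
def _head_list(data, n=5):
    # Lists: keep the first n items.
    return data[:n]


@head.register(dict)
def _head_dict(data, n=5):
    # Dicts of columns: keep the first n values of every column.
    return {key: values[:n] for key, values in data.items()}


print(head([3, 1, 4, 1, 5], 2))
# [3, 1]
print(head({"x": [0, 1, 2], "y": ["zero", "one", "two"]}, 2))
# {'x': [0, 1], 'y': ['zero', 'one']}
```

The implementation is chosen from the type of the first argument alone, which is why a verb can be registered for several data types under one name.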

Functions used as verb arguments

# a verb can be used as an argument passed to another verb
# dep=True makes the `data` argument invisible at call time
@register_verb(pd.DataFrame, context=Context.EVAL, dep=True)
def if_else(data, cond, true, false):
    cond.loc[cond.isin([True]), ] = true
    cond.loc[cond.isin([False]), ] = false
    return cond

# The function is then also a singledispatch generic function

df >> mutate(z=if_else(f.x>1, 20, 10))
#    x      y   z
# 0  0   zero  10
# 1  1    one  10
# 2  2    two  20
# 3  3  three  20

# function without data argument
@register_func
def length(strings):
    return [len(s) for s in strings]

df >> mutate(z=length(f.y))

#    x     y    z
# 0  0  zero    4
# 1  1   one    3
# 2  2   two    3
# 3  3 three    5

Context

The context defines how a reference (f.A, f['A'], f.A.B) is evaluated

@register_verb(pd.DataFrame, context=Context.SELECT)
def select(df, *columns):
    return df[list(columns)]

df >> select(f.x, f.y)
#    x     y
# 0  0  zero
# 1  1   one
# 2  2   two
# 3  3 three
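
To make the difference between contexts concrete, here is a toy sketch that uses no pipda code. Ctx, Ref and evaluate are hypothetical names invented for illustration; the assumption, based on the mutate and select examples above, is that a selecting context turns a reference into a column name while an evaluating context turns it into the column's data.

```python
from enum import Enum


class Ctx(Enum):
    # Hypothetical stand-in for pipda's Context.
    SELECT = "select"
    EVAL = "eval"


class Ref:
    """Toy reference that remembers only a column name."""

    def __init__(self, name):
        self.name = name


def evaluate(ref, data, ctx):
    # SELECT: the reference stands for the column name.
    if ctx is Ctx.SELECT:
        return ref.name
    # EVAL: the reference stands for the column's data.
    return data[ref.name]


data = {"x": [0, 1, 2, 3], "y": ["zero", "one", "two", "three"]}

print(evaluate(Ref("x"), data, Ctx.SELECT))
# x
print(evaluate(Ref("x"), data, Ctx.EVAL))
# [0, 1, 2, 3]
```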

How it works

data %>% verb(arg1, ..., key1=kwarg1, ...)

The above is a typical dplyr/tidyr data piping syntax.

The counterpart python syntax we expect is:

data >> verb(arg1, ..., key1=kwarg1, ...)

To implement that, we need to defer the execution of the verb by turning it into a Verb object, which holds all the information the function needs to be executed later. The Verb object is not executed until the data is piped in. This is possible thanks to the executing package, which lets us determine the AST node where the function is called, and so whether the function is being called in piping mode.
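
The deferral itself can be sketched in a few lines of plain Python. This is a simplified illustration, not pipda's code: register_verb_sketch is a hypothetical decorator, the Verb class here is a toy, and the AST inspection done with the executing package is skipped entirely, so a call always returns a deferred object.

```python
class Verb:
    """Toy deferred call: holds a function and its arguments."""

    def __init__(self, func, args, kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # Python falls back to this for `data >> verb_obj` when `data`
        # does not itself handle `>>` with a Verb operand.
        return self.func(data, *self.args, **self.kwargs)


def register_verb_sketch(func):
    """Hypothetical decorator: calling the verb only records the call."""

    def wrapper(*args, **kwargs):
        return Verb(func, args, kwargs)

    return wrapper


@register_verb_sketch
def head(data, n=5):
    return data[:n]


deferred = head(2)                  # nothing is executed yet
result = [0, 1, 2, 3] >> deferred   # the data arrives, the verb runs

print(type(deferred).__name__, result)
# Verb [0, 1]
```

Because this toy version always defers, head(data, 2) cannot be called directly; telling the two call styles apart is the part that needs the call-site inspection described above.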

If an argument refers to a column of the data and that column is involved in a later computation, then it also needs to be deferred. For example, with dplyr in R:

data %>% mutate(z=a)

tries to add a column named z with the data from column a.

In python, we want to do the same with:

data >> mutate(z=f.a)

where f.a is a Reference object that carries the column information without fetching the data, even though python evaluates it immediately.

Here the trick is f. Like other packages, we introduce the Symbolic object, which connects the parts of the argument and makes the whole argument an Expression object. This object holds the execution information, which we can use later when piping is detected.
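
A stripped-down version of this idea, written without pipda, might look like the sketch below. The class names mirror the ones mentioned above, but the bodies are assumptions made for illustration: only + and > are supported, Call is a hypothetical helper, and the data is a plain dict of scalars rather than a DataFrame.

```python
import operator


class Expression:
    """Something that can be evaluated against data later."""

    def __add__(self, other):
        return Call(operator.add, self, other)

    def __gt__(self, other):
        return Call(operator.gt, self, other)


class Reference(Expression):
    """A deferred lookup such as f.a: stores the name, not the data."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, data):
        return data[self.name]


class Call(Expression):
    """Hypothetical node for a deferred operation on its operands."""

    def __init__(self, func, *operands):
        self.func = func
        self.operands = operands

    def evaluate(self, data):
        # Evaluate nested expressions first; pass constants through.
        values = [
            op.evaluate(data) if isinstance(op, Expression) else op
            for op in self.operands
        ]
        return self.func(*values)


class Symbolic:
    """Entry point: attribute or item access creates a Reference."""

    def __getattr__(self, name):
        return Reference(name)

    def __getitem__(self, name):
        return Reference(name)


f = Symbolic()
expr = f.a + 1          # builds an expression tree, touches no data
print(type(expr).__name__)
# Call
print(expr.evaluate({"a": 41}))
# 42
```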

Documentation

https://pwwang.github.io/pipda/
