
Data Pipeline for your Python projects


Python Data Pipeline

Process any type of data in your projects easily and control how it flows!

Supported Python versions: 3.5, 3.6, 3.7


Installation

Install smart-pipeline with:

pip install -U smart-pipeline

Usage

The smart_pipeline package provides a Pipeline class:

# Import Pipeline class
from smart_pipeline import Pipeline

# Create an instance
pl = Pipeline()

The Pipeline class has three types of pipes: item, data, and stat.

An item pipe transforms each item in the dataset individually; the number of items stays the same:

data = [1,2,3,4,5]

# Define an item function
def addOne(item):
    return item + 1

# Add the function to the pipeline
pl.addItemPipe(addOne)
# Pass the data through pipeline
res = pl(data)

# res = [2,3,4,5,6]
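
To make the behavior concrete, here is a minimal plain-Python sketch of what an item pipe does to a dataset. It does not use smart_pipeline and is not the library's implementation; `apply_item_pipe` and `add_one` are hypothetical names chosen for illustration.

```python
def add_one(item):
    """Item function: transform a single item."""
    return item + 1


def apply_item_pipe(func, data):
    """Apply func to every item; the output has as many items as the input."""
    return [func(item) for item in data]


item_result = apply_item_pipe(add_one, [1, 2, 3, 4, 5])
print(item_result)
```

In other words, an item pipe behaves like a map over the dataset.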

A data pipe is a filter: it keeps only the items for which the function returns True:

data = [1,2,3,4,5]

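def onlyOdd(item):
    # Keep the item only when it is odd
    return item % 2 != 0

# Start from a fresh pipeline so pipes added in earlier examples do not apply
pl = Pipeline()
pl.addDataPipe(onlyOdd)
res = pl(data)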

# res = [1,3,5]
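
The filtering step can be sketched in plain Python as follows. This is an illustration of the idea only, not the library's code; `apply_data_pipe` and `only_odd` are hypothetical names.

```python
def only_odd(item):
    """Predicate: True for items that should stay in the dataset."""
    return item % 2 != 0


def apply_data_pipe(predicate, data):
    """Keep only the items for which predicate returns a truthy value."""
    return [item for item in data if predicate(item)]


data_result = apply_data_pipe(only_odd, [1, 2, 3, 4, 5])
print(data_result)
```

Unlike an item pipe, this step can change how many items reach the rest of the pipeline.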

A stat pipe reduces over the data: an accumulator function is called for each item with the accumulated value, and a second function receives the final result:

data = [1,2,3,4,5]

# Function called once per item with the accumulated stats; it must return them
def countNumberStat(stats, item):
    stats["total"] += 1
    if item%2==0:
        stats["even"] += 1
    else:
        stats["odd"] += 1
    return stats

# Function to be called at the end with accumulated stats
def printNumberStat(stats):
    print(stats["total"], "items were processed in total.")
    print(stats["even"], "of them are even.")
    print(stats["odd"], "of them are odd")

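# Start from a fresh pipeline so pipes added in earlier examples do not apply
pl = Pipeline()
# Make sure to pass the initial state as the third argument
pl.addStatPipe(countNumberStat, printNumberStat, {"total": 0, "even": 0, "odd": 0})
pl(data)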

# Output:
# 5 items were processed in total.
# 2 of them are even.
# 3 of them are odd
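
The reduce-then-report pattern above can be sketched with `functools.reduce` from the standard library. This is a plain-Python illustration under the assumption that the stat pipe works like a left fold followed by a final callback; it is not the library's implementation, and `apply_stat_pipe`, `count_number_stat`, and `report` are hypothetical names.

```python
from functools import reduce


def count_number_stat(stats, item):
    """Accumulator: update the running stats with one item and return them."""
    stats["total"] += 1
    if item % 2 == 0:
        stats["even"] += 1
    else:
        stats["odd"] += 1
    return stats


def report(stats):
    """Final callback: receives the fully accumulated stats."""
    print(stats["total"], "items were processed in total.")


def apply_stat_pipe(step, finish, initial, data):
    """Fold step over data starting from a copy of initial, then call finish."""
    stats = reduce(step, data, dict(initial))  # copy so initial is not mutated
    finish(stats)
    return stats


initial_state = {"total": 0, "even": 0, "odd": 0}
stat_result = apply_stat_pipe(count_number_stat, report, initial_state, [1, 2, 3, 4, 5])
```

Copying the initial state is a design choice in this sketch: it keeps the starting dictionary reusable across runs. Whether smart_pipeline copies or mutates the state you pass in is not stated in this document.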

Check out some examples


If this library solved some of your problems, please consider starring the project 😉

And feel free to create pull requests!

