Data Pipeline for your Python projects

Python Data Pipeline

Easily process any type of data in your projects and control how it flows!

Supported Python versions: 3.5, 3.6, 3.7


Installation

Install smart-pipeline with:

pip install -U smart-pipeline

Usage

The 'smart_pipeline' package provides a Pipeline class:

# Import Pipeline class
from smart_pipeline import Pipeline

# Create an instance
pl = Pipeline()

The Pipeline class has three types of pipes: item, data, and stat.

An item pipe transforms each item in the dataset individually; the number of items stays the same:

data = [1,2,3,4,5]

# Define an item function
def addOne(item):
    return item + 1

# Add the function to the pipeline
pl.addItemPipe(addOne)
# Pass the data through pipeline
res = pl(data)

# res = [2,3,4,5,6]
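
For intuition, the effect of an item pipe on a list can be reproduced in plain Python with a list comprehension. This is only a sketch of the behavior shown in the example, not how smart_pipeline is implemented; the helper name apply_item_pipe is hypothetical and not part of the package.

```python
def add_one(item):
    return item + 1


def apply_item_pipe(func, data):
    # Hypothetical helper (not part of smart_pipeline):
    # apply func to every item, keeping the item count unchanged.
    return [func(item) for item in data]


item_result = apply_item_pipe(add_one, [1, 2, 3, 4, 5])
print(item_result)  # [2, 3, 4, 5, 6]
```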

A data pipe is a filter: only the items for which the function returns True are kept:

data = [1,2,3,4,5]

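# Define a filter function
def onlyOdd(item):
    return item % 2 != 0

# Start from a fresh pipeline so earlier pipes do not affect the result
pl = Pipeline()
pl.addDataPipe(onlyOdd)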
res = pl(data)

# res = [1,3,5]
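
The filtering shown in the example corresponds to keeping the items for which a predicate is true. A minimal plain-Python sketch of that idea follows; apply_data_pipe is a hypothetical helper used only for illustration and does not reflect smart_pipeline's internals.

```python
def only_odd(item):
    return item % 2 != 0


def apply_data_pipe(predicate, data):
    # Hypothetical helper (not part of smart_pipeline):
    # keep only the items for which predicate returns a truthy value.
    return [item for item in data if predicate(item)]


data_result = apply_data_pipe(only_odd, [1, 2, 3, 4, 5])
print(data_result)  # [1, 3, 5]
```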

A stat pipe reduces over the data: the function receives the accumulated stats together with each item and returns the updated stats:

data = [1,2,3,4,5]

# Function called for each item in the dataset; it updates and returns the stats
def countNumberStat(stats, item):
    stats["total"] += 1
    if item%2==0:
        stats["even"] += 1
    else:
        stats["odd"] += 1
    return stats

# Function to be called at the end with accumulated stats
def printNumberStat(stats):
    print(stats["total"], "items were processed in total.")
    print(stats["even"], "of them are even.")
    print(stats["odd"], "of them are odd")

# Start from a fresh pipeline so earlier pipes do not affect the counts
pl = Pipeline()
# Make sure to pass the initial state as the 3rd argument
pl.addStatPipe(countNumberStat, printNumberStat, {"total": 0, "even": 0, "odd": 0})
pl(data)

# Output:
# 5 items were processed in total.
# 2 of them are even.
# 3 of them are odd
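
The stat pipe follows the familiar reduce (fold) pattern: a step function is applied to an accumulator and each item in turn, and a final callback receives the result. A rough plain-Python equivalent using functools.reduce is sketched here; apply_stat_pipe and its argument names are hypothetical, and this is not a description of how smart_pipeline works internally.

```python
from functools import reduce


def count_number_stat(stats, item):
    stats["total"] += 1
    if item % 2 == 0:
        stats["even"] += 1
    else:
        stats["odd"] += 1
    return stats


def print_number_stat(stats):
    print(stats["total"], "items were processed in total.")
    print(stats["even"], "of them are even.")
    print(stats["odd"], "of them are odd")


def apply_stat_pipe(step, finish, initial, data):
    # Hypothetical helper (not part of smart_pipeline).
    # Copy the initial state so the caller's dict is not mutated.
    stats = reduce(step, data, dict(initial))
    # Call the final callback once with the accumulated stats.
    finish(stats)
    return stats


initial_state = {"total": 0, "even": 0, "odd": 0}
final_stats = apply_stat_pipe(
    count_number_stat, print_number_stat, initial_state, [1, 2, 3, 4, 5]
)
```

Copying the initial state is a design choice of this sketch: it lets the same initial dict be reused for several runs without the counts carrying over.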

Check out some examples


If this library solved some of your problems, please consider starring the project 😉

And feel free to create pull requests!

Download files

Files for smart-pipeline, version 1.0.0

Filename                                 Size     File type   Python version
smart_pipeline-1.0.0-py3-none-any.whl    7.4 kB   Wheel       py3
smart_pipeline-1.0.0.tar.gz              3.1 kB   Source      None
