Package for defining processing pipelines.

Project description

pyProcessingPipeline

The pyProcessingPipeline is a package for creating persistent and transparent data processing pipelines, using a MySQL database as the persistence layer.

It stores every intermediate result, allowing everyone with access to the processing database to recreate your ProcessingRuns and validate your results.

Copyright (c) 2021-2023 THM Giessen (LSE, workgroup Prof. Stefan Bernhard)

Authors: Christian Teichert, Alexander Mair, Matthias Haub, Urs Hackstein, Stefan Bernhard

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License Version 3 as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see http://www.gnu.org/licenses/.


Getting Started

Installing via PyPI

Simply run

pip install pyProcessingPipeline

Installing from source

If you want the most recent version (e.g. with unfinished features), you can install from source. To do so, clone this repository via

git clone https://gitlab.com/agbernhard.lse.thm/agb_public/pyProcessingPipeline

and then simply run pip install:

cd pyProcessingPipeline
pip install .
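
If you plan to modify the source, an editable install keeps the installed package in sync with your working copy:

pip install -e .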

Creating your first pipeline

Defining your first processing pipeline is as easy as creating a ProcessingRun and adding as many processing steps as you want:

from pyProcessingPipeline import ProcessingRun
import pyProcessingPipeline.steps as prs

# Create a ProcessingRun, which groups all steps and handles the processing
processing_run = ProcessingRun(
    name="TestProcessingRun", description="Just a test :)", persist_results=True
)

# You may now add as many steps as you want.
# Steps are executed in the order in which they were added.
processing_run.add_step(
    prs.misc.Cut(global_lower_bound=10, global_upper_bound=90)
)
processing_run.add_step(
    prs.filters.butterworth.LowpassButter(
        cutoff_frequency=1.5,
        filter_order=3,
        sampling_frequency=125,
    )
)
processing_run.add_step(
    prs.preprocessing.normalization.NormalizeFundamentalFrequency()
)
...
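
# For this example, assume `sample_data` is a one-dimensional numpy
# array containing a timeseries sampled at 125 Hz (matching the
# sampling_frequency of the Butterworth filter above); replace it
# with your own data.
import numpy as np

sample_data = np.sin(2 * np.pi * np.arange(0, 10, 1 / 125))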

# To execute all steps, simply call the run-function on
# the list of timeseries you want to process.
processing_run.run([sample_data])

# The results will then be available in the run results:
processing_run.results

For creating and storing persistent runs, see the documentation.

Building Documentation

To build the documentation, you will need to install the optional documentation dependencies. Switch to the root folder and run:

$ pip install '.[docs]'

Now you can build the documentation by calling

make docs

Running Tests

First, install the optional test dependencies

$ pip install '.[test]'

and run

make unittest
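
If you prefer to run the tests without make, Python's built-in test runner should be equivalent (an assumption; the Makefile target may pass additional options):

$ python -m unittest discover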

Checking Code

To check if your code satisfies the style requirements, you can install the optional dev dependencies

$ pip install '.[dev]'

and call

make check

This will run black for auto-formatting, flake8 for style checking, and mypy for static type analysis.
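
The same tools can also be invoked individually (a sketch of the standard invocations; the Makefile may pass additional arguments):

$ black .
$ flake8 .
$ mypy .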

Download files

Download the file for your platform.

Source Distribution

pyprocessingpipeline-0.3.1.tar.gz (50.2 kB)

Built Distribution

pyProcessingPipeline-0.3.1-py3-none-any.whl (66.0 kB)

File details

Details for the file pyprocessingpipeline-0.3.1.tar.gz.

File metadata

  • Download URL: pyprocessingpipeline-0.3.1.tar.gz
  • Upload date:
  • Size: 50.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for pyprocessingpipeline-0.3.1.tar.gz
Algorithm    Hash digest
SHA256       ff865a3acb6d0b874d0a163bb46cf0953fea2b98e22e0840e451b3774e85ab87
MD5          c2b4872ba95847095094c45cf25bc807
BLAKE2b-256  c7d85e44b5e7e7ba7423af98ac1a8e7dad4036cb5def44a700fdaed4684d9f10

See the pip documentation for more details on using hashes.
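
For example, a downloaded file can be checked against the SHA256 digest above with Python's standard hashlib module (a minimal sketch; adjust the filename to wherever you saved the archive):

import hashlib

# Compute the SHA256 digest of the downloaded sdist and compare it
# to the value listed above.
with open("pyprocessingpipeline-0.3.1.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

expected = "ff865a3acb6d0b874d0a163bb46cf0953fea2b98e22e0840e451b3774e85ab87"
assert digest == expected, "SHA256 mismatch: file may be corrupt"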

File details

Details for the file pyProcessingPipeline-0.3.1-py3-none-any.whl.

File metadata

  • Download URL: pyProcessingPipeline-0.3.1-py3-none-any.whl
  • Size: 66.0 kB
  • Tags: Python 3

File hashes

Hashes for pyProcessingPipeline-0.3.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       474e59009a0899907251bc3466fb7c41ca9b750cb4513efb5d063ea859751633
MD5          56f7d0cea2040c7c30616c6de13cbebc
BLAKE2b-256  a1a1b06572e9406f6637b35c4298d18b678530acbadbf0bb7df73455bf5cf240

See the pip documentation for more details on using hashes.
