A tool for building automated data-processing pipelines.
Autopipe
How to use
To create a pipeline, you must create a Coordinator. To do so, create a file containing a class that subclasses the base `Coordinator` class. For example, here is the code for a simple coordinator that downloads images matching a query from Google Images, read through an RSS feed.
```python
from typing import List, Union, Callable

from autopipe import Coordinator, Pipe, APData, Output
from autopipe.input import RssInput
from autopipe.pipe import FileData, DownloaderPipe


class DownloadExample(Coordinator):
    def __init__(self, query: str = "raccoon"):
        super().__init__()
        self.query = query

    @classmethod
    def name(cls):
        return "DownloadExample"

    @property
    def input(self):
        # Turn each RSS entry into a FileData pointing at the image URL.
        return RssInput(f"http://www.obsrv.com/General/ImageFeed.aspx?{self.query}",
                        lambda x: FileData(None, x["media_content"][0]["url"], False))

    @property
    def pipeline(self) -> List[Union[Pipe, Callable[[APData], Union[APData, Pipe]]]]:
        # A single step: download the file, then end the pipeline.
        return [Output(DownloaderPipe())]
```
For autopipe to find this coordinator, use one of the following three methods:
- Place your coordinator file in the `autopipe/coordinators` folder, import your coordinator in the `autopipe/coordinators/__init__.py` file, and add your coordinator's name to the `__all__` list of that file.
- Run autopipe with the coordinator argument set to the path of your file, followed by `:` and the coordinator's class name. For example, if your coordinator's file is named `coordinator.py`, is located in the current directory, and your coordinator's class is named `DownloadExample`, the coordinator argument would be `./coordinator.py:DownloadExample`.
- Send your coordinator file to the standard input of autopipe and use `-` as your coordinator name. (Coming soon.)
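To make the `path:ClassName` form concrete, here is a minimal sketch of how such an argument could be resolved with the standard library. `load_coordinator` is a hypothetical helper written for illustration; it is not autopipe's actual loader, which may behave differently.

```python
import importlib.util
from pathlib import Path


def load_coordinator(spec: str):
    """Load a class from a '<file>:<ClassName>' argument (hypothetical helper)."""
    # Split on the last ':' so that a path containing ':' is kept whole.
    path_str, _, class_name = spec.rpartition(":")
    if not path_str or not class_name:
        raise ValueError(f"expected '<file>:<ClassName>', got {spec!r}")

    path = Path(path_str).resolve()
    module_spec = importlib.util.spec_from_file_location(path.stem, path)
    if module_spec is None or module_spec.loader is None:
        raise ImportError(f"cannot load a module from {path}")

    # Execute the file as a module, then look the class up by name.
    module = importlib.util.module_from_spec(module_spec)
    module_spec.loader.exec_module(module)
    return getattr(module, class_name)
```

With the example above, `load_coordinator("./coordinator.py:DownloadExample")` would return the `DownloadExample` class, provided `coordinator.py` exists in the current directory and its imports resolve.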
There are three ways to run a coordinator:
- Use the `autopipe` file in the `bin` folder: `./autopipe <coordinator> [coordinator_parameters]`
- Use the module syntax: `python -m autopipe <coordinator> [coordinator_parameters]`
- Use the shebang `#!/usr/bin/env autopipe -`, make your coordinator file executable (`chmod +x file.py`), and execute it. (Coming soon.)
Coordinator options
A pipeline always starts with an `Input`. You specify the instance of the input manager you want to use in the `input` property of your coordinator.
An input returns one or more data items, which are sent to your pipeline one by one.
The pipeline
Each item is sent through the steps you specify in the `pipeline` property of your coordinator. In this property, you can place instances of pipes, or functions that take a single `APData` as a parameter and return an `APData` or a `Pipe`.
The pipeline is finished for an item when an `Output` pipe is reached. This happens either when you use one of the premade outputs or when you wrap a `Pipe` or an `APData` in an `Output()` call.
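The flow described above can be sketched with small stand-in classes. Everything in this block is an assumption made for illustration: the stand-ins only borrow the names `APData`, `Pipe` and `Output`, the `pipe()` method name and the `run_item` helper are hypothetical, and autopipe's real classes and runner may work differently.

```python
from typing import Callable, List, Tuple, Union


class APData:
    """Stand-in for a piece of data travelling through the pipeline."""
    def __init__(self, value):
        self.value = value


class Pipe:
    """Stand-in for a pipe: transforms one APData into another."""
    def pipe(self, data: APData) -> APData:  # hypothetical method name
        raise NotImplementedError


class Output(Pipe):
    """Stand-in for Output: wraps a Pipe or an APData and ends the pipeline."""
    def __init__(self, wrapped: Union[Pipe, APData]):
        self.wrapped = wrapped

    def pipe(self, data: APData) -> APData:
        if isinstance(self.wrapped, Pipe):
            return self.wrapped.pipe(data)
        return self.wrapped


class Upper(Pipe):
    """Example pipe: upper-case the value."""
    def pipe(self, data: APData) -> APData:
        return APData(data.value.upper())


Step = Union[Pipe, Callable[[APData], Union[APData, Pipe]]]


def run_item(pipeline: List[Step], data: APData) -> Tuple[APData, bool]:
    """Push one item through the steps; return (data, reached_output)."""
    for step in pipeline:
        # A function step returns either new data or a Pipe to apply.
        result = step if isinstance(step, Pipe) else step(data)
        if isinstance(result, Pipe):
            data = result.pipe(data)
            if isinstance(result, Output):
                return data, True  # an Output ends the pipeline for this item
        else:
            data = result
    return data, False


pipeline = [
    lambda d: APData(d.value.strip()),  # function returning an APData
    Upper(),                            # pipe instance
    lambda d: Output(d),                # function returning an Output
    Upper(),                            # never reached
]
result, finished = run_item(pipeline, APData("  raccoon "))
```

Here `finished` is `True` and `result.value` is `"RACCOON"`: the third step returns an `Output`, so the last step is never executed.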
Interceptors
You can add interceptors to your coordinator. An interceptor is a function that is called between the steps of your pipeline whenever the specified condition matches. This allows you to handle invalid data, or special cases that do not warrant a dedicated step in the pipeline. You can declare an interceptor with the `@autopipe.interceptor(lambda data: condition)` decorator.
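As a rough illustration of the idea, the sketch below implements a stand-in `interceptor` decorator and a helper that runs the first matching interceptor. This is not autopipe's implementation: the dictionary-shaped data, the `interceptors()` method and the `apply_interceptors` helper are all assumptions made for the example.

```python
from typing import Callable


def interceptor(condition: Callable[[dict], bool]):
    """Stand-in decorator: attach a condition to the decorated function."""
    def decorator(func):
        func.condition = condition
        return func
    return decorator


class ExampleCoordinator:
    @interceptor(lambda data: data.get("url") is None)
    def skip_missing_url(self, data):
        # Handle the invalid case: flag the item so later steps can skip it.
        return {**data, "skipped": True}

    def interceptors(self):
        # Collect every method tagged by the decorator above.
        members = (getattr(self, name) for name in dir(self))
        return [m for m in members if callable(m) and hasattr(m, "condition")]


def apply_interceptors(coordinator, data):
    """Run the first interceptor whose condition matches, else return data unchanged."""
    for handler in coordinator.interceptors():
        if handler.condition(data):
            return handler(data)
    return data


coordinator = ExampleCoordinator()
flagged = apply_interceptors(coordinator, {"url": None})
untouched = apply_interceptors(coordinator, {"url": "http://example.com/a.png"})
```

A runner would call something like `apply_interceptors` between two steps, so an item without a URL is flagged before the next pipe sees it, while valid items pass through unchanged.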
The default handler
A `default_handler` method can be specified in your coordinator. This special method is called once the whole pipeline has been consumed without any `Output` being returned. You can also use this method instead of the `pipeline` property, either by removing the property from your coordinator or by returning an empty list from it.
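The fallback behaviour can be pictured with the following sketch. It uses plain callables as steps and a stand-in `Output` class; the `process` helper and the return value of `default_handler` are assumptions for illustration, not autopipe's real behaviour.

```python
class Output:
    """Stand-in marker for a finished item."""
    def __init__(self, value):
        self.value = value


class HandlerOnly:
    """Coordinator-like object with no pipeline, only a default handler."""
    pipeline = []

    def default_handler(self, data):
        return Output(f"unhandled: {data}")


class WithPipeline(HandlerOnly):
    """Same handler, but the pipeline produces an Output on its own."""
    pipeline = [lambda data: Output(data.upper())]


def process(coordinator, data):
    """Run the steps; fall back to default_handler if no Output was produced."""
    for step in getattr(coordinator, "pipeline", []):
        data = step(data)
        if isinstance(data, Output):
            return data
    return coordinator.default_handler(data)


fallback = process(HandlerOnly(), "item")
direct = process(WithPipeline(), "item")
```

`fallback.value` is `"unhandled: item"` because the empty pipeline never yields an `Output`, while `direct.value` is `"ITEM"` and the default handler is never called.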
Usage
```
usage: autopipe [-h] [-V] [-v [lvl]] [-d] [-w dir] coordinator [coordinator ...]

Easily run advanced pipelines in a daemon or in one run sessions.

positional arguments:
  coordinator          The name of your pipeline coordinator.

optional arguments:
  -h, --help           show this help message and exit
  -V, --version        show program's version number and exit
  -v, --verbose [lvl]  Set the logging level. (default: warn ; available: trace, debug, info, warning, error)
  -d, --daemon         Enable the daemon mode (rerun input generators after a sleep cooldown)
  -w, --workdir dir    Change the workdir, default is the pwd.
```
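For readers who want to see how these options fit together, here is a hypothetical `argparse` parser that accepts the same flags. It is a reconstruction from the help text only, not autopipe's source: the version string is a placeholder, and the value used when `-v` is given without a level (`info`) is an assumption.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="autopipe",
        description="Easily run advanced pipelines in a daemon or in one run sessions.",
    )
    # Placeholder version string, not read from the real package.
    parser.add_argument("-V", "--version", action="version",
                        version="%(prog)s (version placeholder)")
    # nargs="?" makes the level optional; const="info" is an assumed fallback.
    parser.add_argument("-v", "--verbose", nargs="?", metavar="lvl",
                        default="warn", const="info",
                        help="Set the logging level.")
    parser.add_argument("-d", "--daemon", action="store_true",
                        help="Enable the daemon mode.")
    parser.add_argument("-w", "--workdir", metavar="dir", default=os.getcwd(),
                        help="Change the workdir, default is the pwd.")
    # All positional values are collected in one list.
    parser.add_argument("coordinator", nargs="+",
                        help="The name of your pipeline coordinator.")
    return parser


args = build_parser().parse_args(
    ["-v", "debug", "-d", "-w", "/tmp", "./coordinator.py:DownloadExample", "raccoon"]
)
```

After parsing, `args.verbose` is `"debug"`, `args.daemon` is `True`, `args.workdir` is `"/tmp"`, and `args.coordinator` holds the two positional values. Note that with an optional level, `-v` placed directly before the coordinator name would consume that name as the level, so in this sketch the level should be given explicitly or `-v` followed by another flag.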
Installation
To install autopipe, run `sudo pip install autopipe`. To use a development version, you can clone this project and run `pip install -e .` from its root folder.