
Autopipe

A tool that allows you to create pipelines of automatic data processing.

How to use

To create a pipeline, you must create a Coordinator. To do so, create a file with a class that extends the base Coordinator class. For example, here is the code for a simple coordinator that downloads images matching a query from Google Images.

from typing import List, Union, Callable
from autopipe import Coordinator, Pipe, APData, Output
from autopipe.input import RssInput
from autopipe.pipe import FileData, DownloaderPipe


class DownloadExample(Coordinator):
	def __init__(self, query: str = "raccoon"):
		super().__init__()
		self.query = query

	@classmethod
	def name(cls):
		return "DownloadExample"

	@property
	def input(self):
		return RssInput(f"http://www.obsrv.com/General/ImageFeed.aspx?{self.query}",
		                lambda x: FileData(None, x["media_content"][0]["url"], False))

	@property
	def pipeline(self) -> List[Union[Pipe, Callable[[APData], Union[APData, Pipe]]]]:
		return [Output(DownloaderPipe())]

For this coordinator to be found by autopipe, you must use one of the three following ways:

  1. Place your coordinator file in the autopipe/coordinators folder, import your coordinator in the autopipe/coordinators/__init__.py file, and add your coordinator's name to the __all__ array of that file.
  2. Run autopipe with the coordinator argument set to the path of your file followed by ':' and the coordinator's class name. For example, if your coordinator's file is named coordinator.py, is located in the parent directory, and your coordinator's class is DownloadExample, your coordinator argument would be ../coordinator.py:DownloadExample.
  3. Send your coordinator file to the standard input of autopipe and use - as your coordinator name. (Coming soon.)

There are three ways to run it:

  1. Use the autopipe file in the bin folder like so: ./autopipe <coordinator> [coordinator_parameters]
  2. Use the module syntax like so: python -m autopipe <coordinator> [coordinator_parameters]
  3. Use the shebang #!/usr/bin/env autopipe -, make your coordinator file executable (chmod +x file.py) and execute it. (Coming soon.)

Coordinator options

A pipeline always starts with an Input. You specify the instance of the input manager you will use in the input property.

An input will return one or more data items that will be sent to your pipeline one by one.
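
Conceptually, an input behaves like a generator of data items, each of which is fed to the pipeline in turn. The following plain-Python sketch illustrates that flow; fake_rss_input and run_all are illustrative stand-ins, not the autopipe RssInput API:

```python
# Conceptual sketch: an input yields items that are fed to the pipeline
# one by one (plain Python, not the autopipe API).
def fake_rss_input(entries):
    """Stand-in for an input manager: yield one data item per feed entry."""
    for entry in entries:
        yield entry["url"]

def run_all(input_items, pipeline):
    """Send each input item through the pipeline, one by one."""
    return [pipeline(item) for item in input_items]

entries = [{"url": "http://example.com/a.jpg"}, {"url": "http://example.com/b.jpg"}]
print(run_all(fake_rss_input(entries), lambda url: url.rsplit("/", 1)[-1]))
# -> ['a.jpg', 'b.jpg']
```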

The pipeline

Each item is sent to the Pipes you specify in the pipeline property of your coordinator. In this property, you can place instances of pipes or functions that take a single APData as a parameter and return an APData or a Pipe. A pipeline is finished for an item when an Output pipe is reached. That can be one of the premade outputs, or a Pipe or an APData wrapped in an Output() call.
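
The dispatch rules above can be sketched in plain Python. The Pipe, Output, DoublePipe, and run_pipeline names below are simplified stand-ins for illustration, not the actual autopipe implementation:

```python
# Simplified stand-ins for the described pipeline semantics.
class Pipe:
    def process(self, data):
        return data

class Output(Pipe):
    """Wrapping a Pipe (or data) in Output marks the end of the pipeline."""
    def __init__(self, inner):
        self.inner = inner

    def process(self, data):
        return self.inner.process(data) if isinstance(self.inner, Pipe) else self.inner

class DoublePipe(Pipe):
    def process(self, data):
        return data * 2

def run_pipeline(item, pipeline):
    """Send one item through the pipeline; stop when an Output is reached."""
    for step in pipeline:
        if isinstance(step, Output):  # check Output first: it is also a Pipe
            return step.process(item)
        # A step is either a Pipe instance or a function returning data or a Pipe.
        result = step.process(item) if isinstance(step, Pipe) else step(item)
        if isinstance(result, Pipe):
            result = result.process(item)
        item = result
    return None  # pipeline exhausted without reaching an Output

# 3 -> doubled to 6 -> function adds 1 -> 7 -> Output doubles -> 14
print(run_pipeline(3, [DoublePipe(), lambda x: x + 1, Output(DoublePipe())]))
```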

Interceptors

You can add interceptors to your coordinator. An interceptor is a function that will be called between steps of your pipeline when a specified condition matches. This allows you to handle invalid data or special cases that don't deserve a dedicated step in the pipeline. You can register an interceptor using the @autopipe.interceptor(lambda data: condition) decorator.
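
The mechanism can be sketched as follows in plain Python; the registry and the run_step helper are illustrative stand-ins for what the real decorator does, not the autopipe API:

```python
# Illustrative sketch: run interceptors between pipeline steps when their
# condition matches (hypothetical names, not the autopipe implementation).
interceptors = []

def interceptor(condition):
    """Register a function to be called when condition(data) is true."""
    def register(func):
        interceptors.append((condition, func))
        return func
    return register

@interceptor(lambda data: data is None)
def drop_invalid(data):
    return "skipped"

def run_step(step, data):
    # Before each step, give matching interceptors a chance to handle the item.
    for condition, handler in interceptors:
        if condition(data):
            return handler(data)
    return step(data)

print(run_step(str.upper, "ok"))  # regular step runs -> "OK"
print(run_step(str.upper, None))  # interceptor catches the invalid item -> "skipped"
```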

The default handler

A default_handler method can be specified in your coordinator. This special method will be called once the whole pipeline has been consumed but no Output has been returned. You can also use this method instead of the pipeline property by removing the property from your coordinator (or returning an empty list).
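
The fallback behavior can be sketched like this; the OUTPUT sentinel and the run helper are hypothetical stand-ins mirroring the described semantics, not the autopipe implementation:

```python
# Sketch: if the pipeline is consumed without reaching an Output, hand the
# item to the default handler (illustrative names only).
OUTPUT = object()  # sentinel standing in for reaching an Output pipe

def run(item, steps, default_handler):
    """Run steps; if none signals OUTPUT, fall back to default_handler."""
    for step in steps:
        item, signal = step(item)
        if signal is OUTPUT:
            return item
    return default_handler(item)

steps = [lambda x: (x + 1, None)]  # no step ever reaches an Output
print(run(1, steps, lambda x: f"handled {x}"))  # -> "handled 2"
print(run(1, [], lambda x: f"handled {x}"))     # empty pipeline -> "handled 1"
```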

Usage

usage: autopipe [-h] [-V] [-v [lvl]] [-d] [-w dir] coordinator [coordinator ...]

Easily run advanced pipelines in a daemon or in one run sessions.

positional arguments:
  coordinator          The name of your pipeline coordinator.

optional arguments:
  -h, --help           show this help message and exit
  -V, --version        show program's version number and exit
  -v, --verbose [lvl]  Set the logging level. (default: warn ; available: trace, debug, info, warning, error)
  -d, --daemon         Enable the daemon mode (rerun input generators after a sleep cooldown)
  -w, --workdir dir    Change the workdir, default is the pwd.

Installation

To install autopipe, run sudo pip install autopipe. To use a development version, you can clone this project and run pip install -e ..
