
Hence is a framework for orchestrating data pipeline, scraping, and automation workflows.


Hence - A minimal Python workflow engine

Introduction

Welcome to Hence, a powerful framework designed to streamline your workflow orchestration process.

Whether you're involved in web scraping, data loading, fetching, or any other repetitive task, Hence offers a comprehensive solution to break down these tasks into manageable units of work.

By orchestrating these units sequentially, Hence empowers you to focus on the big picture without the hassle of manually ensuring the success of each step.

Features

  • Task Breakdown – Hence breaks complex tasks into smaller, manageable units for better organization and execution.
  • Workflow Orchestration – Automate workflows with Hence to ensure smooth, sequential execution without manual effort.
  • Error Handling – Hence manages errors gracefully, preventing workflow interruptions and ensuring seamless execution.
  • Scalability – Whether small or large-scale, Hence adapts effortlessly to your needs for optimal performance.

Use-cases

  • Web Scraping – Hence automates web scraping by breaking tasks into fetching, extracting, and storing data.
  • Data Loading/Fetching – Hence streamlines fetching from APIs and loading data into databases effortlessly.
  • Repetitive Tasks – Automate reports, file processing, and data transformations with Hence to save time and effort.
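
As a rough illustration of the web scraping breakdown above, the sketch below splits a job into fetch, extract, and store units using plain Python functions. It does not use Hence: the function names, the hard-coded HTML, and the in-memory `storage` list are all assumptions made for illustration, and a workflow engine would take over the sequencing that is done by hand at the end.

```python
import re

# Hypothetical stand-ins: no network access and no real database.
storage: list[str] = []


def fetch() -> str:
    """Unit 1: pretend to download a page (hard-coded for this sketch)."""
    return "<html><title>Example page</title></html>"


def extract(html: str) -> str:
    """Unit 2: pull the <title> text out of the page."""
    match = re.search(r"<title>(.*?)</title>", html)
    return match.group(1) if match else ""


def store(title: str) -> int:
    """Unit 3: save the extracted value and report how many rows exist."""
    storage.append(title)
    return len(storage)


# Run the units one after another, feeding each result to the next step.
html = fetch()
title = extract(html)
row_count = store(title)
```

Keeping each unit small makes it possible to test or replace one step without touching the others.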

Setup / Installation

Use as library

Install from Pypi

pip install -U hence

Install from Github

pip install -U git+https://github.com/0hsn/hence.git@main

or a specific tag

pip install -U git+https://github.com/0hsn/hence.git@v0.11.0
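
After installing, one way to confirm which version is present is to query the package metadata with the standard library. This is a minimal sketch; the helper name `installed_version` is an assumption made for illustration and is not part of Hence.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string when Hence is installed, otherwise None.
print(installed_version("hence"))
```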

Development setup

Prerequisite

Local installation steps

  • First, clone the repository

  • Setup with development tools

    pipenv install --dev
    

Testing

poetry run py.test -s

Samples

poetry run python -m samples.web_scraping

API

Pipeline

Pipeline.add_task

Add a task to the pipeline using a decorator. This decorator is useful when you want to define a function and make it a pipeline task at the same time.

Signature
def add_task(uid: typing.Optional[str] = None, pass_ctx: bool = False) -> typing.Any
Parameters

uid: str | None Optional. Default: None. A unique name for a task function in a pipeline. If the same uid is passed again, the new assignment replaces the older one.

pass_ctx: bool Optional. Default: False. Pass PipelineContext as the 1st parameter to the task function. If True, the 1st parameter of the function must be reserved for the context.

Example
@pipeline.add_task(pass_ctx=True)
def function_1(ctx: PipelineContext, a: str):
    return a
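
To make the decorator form more concrete, here is a minimal sketch of how a decorator that accepts optional arguments can register a function under a uid. `TaskRegistry` and its attributes are hypothetical stand-ins written for this illustration; they are not Hence's implementation, and the fallback to the function name is an assumption based on the description of `Pipeline.run` output.

```python
import typing


class TaskRegistry:
    """Hypothetical stand-in, not Hence's Pipeline class."""

    def __init__(self) -> None:
        self.functions: dict[str, typing.Callable] = {}
        self.needs_ctx: dict[str, bool] = {}

    def add_task(self, uid: typing.Optional[str] = None, pass_ctx: bool = False):
        def decorator(function: typing.Callable) -> typing.Callable:
            # Assumption: fall back to the function name when no uid is given.
            key = uid or function.__name__
            self.functions[key] = function
            self.needs_ctx[key] = pass_ctx
            # Return the function unchanged so it stays directly callable.
            return function

        return decorator


registry = TaskRegistry()


@registry.add_task(pass_ctx=True)
def function_1(ctx, a: str):
    return a


@registry.add_task(uid="shout")
def function_2(a: str):
    return a.upper()
```

Because `add_task` is called with arguments, it returns the real decorator; that is why the parentheses are needed even when no arguments are passed.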

Pipeline.re_add_task

Add a task to the pipeline. This function is useful when you want to define a function early and make it a pipeline task later.

Signature
def re_add_task(function: typing.Callable, uid: typing.Optional[str] = None, pass_ctx: bool = False) -> None
Parameters

function: typing.Callable Required. A function to act as a pipeline task.

uid: str | None Optional. Default: None. A unique name for a task function in a pipeline. If the same uid is passed again, the new assignment replaces the older one.

pass_ctx: bool Optional. Default: False. Pass PipelineContext as the 1st parameter to the task function. If True, the 1st parameter of the function must be reserved for the context.

Example
def function_1(ctx: PipelineContext, a: str):
    return a

pipeline.re_add_task(function_1, pass_ctx=True)
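
The uid replacement rule described above can be pictured with a plain dictionary. This is a sketch under stated assumptions: the module-level `tasks` table and this standalone `re_add_task` helper are hypothetical and only mimic the documented behaviour; they are not Hence's code.

```python
import typing

# Hypothetical stand-in for a pipeline's task table, keyed by uid.
tasks: dict[str, typing.Callable] = {}


def re_add_task(function: typing.Callable, uid: typing.Optional[str] = None) -> None:
    """Register an already-defined function; a repeated uid overwrites."""
    tasks[uid or function.__name__] = function


def load_v1(a: str):
    return f"v1:{a}"


def load_v2(a: str):
    return f"v2:{a}"


re_add_task(load_v1, uid="load")
re_add_task(load_v2, uid="load")  # same uid: replaces load_v1
re_add_task(load_v1)              # no uid: keyed by the function name
```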

Pipeline.parameter

Set the parameters for tasks before calling Pipeline.run. The parameters are passed to each task when it runs.

Signature
def parameter(self, **kwargs) -> typing.Self
Parameters

**kwargs: Each keyword is a task's function name, or its registered uid; each value is a dictionary of arguments for that task.

Example
def function_1(ctx: PipelineContext, a: str):
    return a

# Registered without pass_ctx, so this task takes no context parameter.
def function_2(a: str):
    return a

pipeline.re_add_task(function_1, pass_ctx=True)
pipeline.re_add_task(function_2, uid="r_func")

# Parentheses allow the chained calls to span multiple lines.
(
    pipeline
    .parameter(function_1={"a": "Some string"})
    .parameter(r_func={"a": "Some string"})
)
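
Chaining works because `parameter` returns the object it was called on. The sketch below shows that pattern in isolation; `ParameterStore` is a hypothetical class written for this illustration and is not how Hence necessarily stores parameters.

```python
import typing


class ParameterStore:
    """Hypothetical stand-in showing chainable, keyword-based parameters."""

    def __init__(self) -> None:
        self.parameters: dict[str, dict[str, typing.Any]] = {}

    def parameter(self, **kwargs: dict) -> "ParameterStore":
        # Each keyword is a task name or uid; each value is that task's kwargs.
        self.parameters.update(kwargs)
        # Returning self is what makes the calls chainable.
        return self


def greet(a: str):
    return f"hello {a}"


store = (
    ParameterStore()
    .parameter(greet={"a": "world"})
    .parameter(r_func={"a": "Some string"})
)

# At run time the stored dictionary can be unpacked into the task call.
result = greet(**store.parameters["greet"])
```

Since each task name is used as a Python keyword argument here, a uid written this way has to be a valid identifier.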

Pipeline.run

Run the pipeline.

Signature
def run(self, is_parallel: bool = False) -> dict[str, typing.Any]
Parameters

is_parallel: bool Optional. Default: False. Run the added tasks in parallel instead of sequentially.

Example
def function_1(ctx: PipelineContext, a: str):
    return a

# Registered without pass_ctx, so this task takes no context parameter.
def function_2(a: str):
    return a

pipeline.re_add_task(function_1, pass_ctx=True)
pipeline.re_add_task(function_2, uid="r_func")

output = pipeline.run()

# or in parallel, since these tasks are not dependent
output = pipeline.run(is_parallel=True)

Pipeline.run returns a dictionary of every task's return value, keyed by function name, or by uid where one was given.
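
The difference between the two modes can be sketched with the standard library. The `run_tasks` helper below is hypothetical and uses a thread pool for the parallel branch; how Hence actually runs tasks in parallel is not stated here, so treat the thread pool as an assumption rather than a description of the library.

```python
import typing
from concurrent.futures import ThreadPoolExecutor


def run_tasks(
    tasks: dict[str, typing.Callable],
    parameters: dict[str, dict[str, typing.Any]],
    is_parallel: bool = False,
) -> dict[str, typing.Any]:
    """Run every task and return its result keyed by name or uid."""
    if not is_parallel:
        # Sequential: tasks run one at a time, in insertion order.
        return {key: fn(**parameters.get(key, {})) for key, fn in tasks.items()}

    # Parallel: submit everything first, then collect the results by key.
    with ThreadPoolExecutor() as pool:
        futures = {
            key: pool.submit(fn, **parameters.get(key, {}))
            for key, fn in tasks.items()
        }
        return {key: future.result() for key, future in futures.items()}


def function_1(a: str):
    return a


def function_2(a: str):
    return a.upper()


tasks = {"function_1": function_1, "r_func": function_2}
parameters = {"function_1": {"a": "one"}, "r_func": {"a": "two"}}

sequential = run_tasks(tasks, parameters)
parallel = run_tasks(tasks, parameters, is_parallel=True)
```

Both modes produce the same dictionary as long as the tasks are independent of one another, which is the condition the example above relies on.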

PipelineContext

PipelineContext is a class that holds all the operation data for a certain Pipeline.

  • PipelineContext is passed to a task that is registered with .add_task(pass_ctx=True, ...) or .re_add_task(..., pass_ctx=True, ...).
  • Remember to reserve the function's 1st parameter for the context when pass_ctx is True.
Members

result: dict[str, typing.Any] A dictionary containing returns from the executed functions in a certain pipeline.

parameters: dict[str, dict[str, typing.Any]] A dictionary containing all the parameters passed using Pipeline.parameter.

sequence: list[str] A list containing all the functions added as task to a certain pipeline.

functions: dict[str, typing.Callable] A dictionary containing all the functions added as task via Pipeline.add_task and Pipeline.re_add_task.
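
The four members can be pictured as a small data container. The sketch below uses a hypothetical `ContextSketch` dataclass and a hand-written loop in place of the real PipelineContext and Pipeline.run; that a later task can read an earlier task's return through `result` is an assumption drawn from the member description above.

```python
import typing
from dataclasses import dataclass, field


@dataclass
class ContextSketch:
    """Hypothetical stand-in mirroring the four documented members."""

    result: dict[str, typing.Any] = field(default_factory=dict)
    parameters: dict[str, dict[str, typing.Any]] = field(default_factory=dict)
    sequence: list[str] = field(default_factory=list)
    functions: dict[str, typing.Callable] = field(default_factory=dict)


def fetch(ctx: ContextSketch, url: str) -> str:
    return f"content of {url}"


def summarize(ctx: ContextSketch) -> int:
    # A later task reads what an earlier task returned.
    return len(ctx.result["fetch"])


ctx = ContextSketch(
    parameters={"fetch": {"url": "https://example.com"}},
    sequence=["fetch", "summarize"],
    functions={"fetch": fetch, "summarize": summarize},
)

# Run tasks in order, passing the context first and recording each return.
for name in ctx.sequence:
    kwargs = ctx.parameters.get(name, {})
    ctx.result[name] = ctx.functions[name](ctx, **kwargs)
```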

Contributions


Licensed under AGPL-3.0
