Salt

Salt is a Task Scheduler system with a difference.

You can write DAGs and tasks in your favourite language (among the supported ones), each with its own set of dependencies, and build them with the Salt framework.

After building the tasks, Salt can schedule them in isolated contexts that have all the dependencies needed to run.

Why Salt over Airflow

Airflow dependency management is complex. Because Airflow is built around Python, it forces you to write Python and to manage dependencies, Docker images and environments yourself. Most of the time this leads to complex, clashing environments that take a lot of work to keep organized.

Salt removes all of this. Your DAG is self-contained, ships with its dependencies, and needs nothing but the Salt framework to execute.

Getting Started

Salt development requires the pdm Python package manager to be installed.

Install the project in editable mode:

pip install -e ".[dev]"

Core Concepts

DAGs and Tasks

DAGs and Tasks are defined in user packages using the Salt framework.

The Salt framework exposes Python bindings that let users build tasks and DAGs. The tasks and DAGs are built into a one-file package using Nuitka and then executed by Salt.

The scheduling logic is completely detached from the task/dag logic and is configured on the side through the Web UI.

When Salt executes the compiled tasks and DAGs, the underlying framework automatically communicates with the Salt backend to register the DAG and link it to its configuration, so that the scheduler can schedule it.
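The separation between DAG structure and scheduling can be pictured with a small sketch. The Dag class, the task decorator and the describe payload below are hypothetical stand-ins written for illustration; they are not the Salt API, which this document does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Dag:
    """Hypothetical stand-in for a DAG object; not the real Salt bindings."""
    name: str
    tasks: Dict[str, Callable[[], object]] = field(default_factory=dict)
    upstream: Dict[str, List[str]] = field(default_factory=dict)

    def task(self, *, after: Tuple[str, ...] = ()):
        """Decorator that adds a function to the DAG together with its upstream tasks."""
        def wrap(fn):
            self.tasks[fn.__name__] = fn
            self.upstream[fn.__name__] = list(after)
            return fn
        return wrap

    def describe(self) -> dict:
        """Payload a framework could send to a backend when registering the DAG (assumed shape)."""
        return {"dag": self.name, "tasks": dict(self.upstream)}


etl = Dag("etl")


@etl.task()
def extract():
    return [1, 2, 3]


@etl.task(after=("extract",))
def load():
    return "done"


# Only the structure is declared in code; when the DAG runs would be configured elsewhere.
print(etl.describe())
```

Nothing in the sketch mentions a schedule, which mirrors the idea that scheduling is configured on the side rather than in the task code.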

To register a DAG module, the user publishes it using Salt commands.

TODO: The publishing process is yet to be defined.

Feature Considerations

  • Allow cycles and loops in DAGs
  • Add validation features for data computed by tasks, not just task success/failure.
  • Allow DAGs to change their shape at run-time. Each Task is pushed only at the moment of execution, and never pre-parsed.
    • This allows DAGs to change depending on data computed during execution.
  • Provide a solid data-driven and event-driven integration/approach
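The run-time shape change described above can be sketched as an executor that never sees the whole graph: each task returns its result together with the names of the tasks to push next. This is a minimal illustration under assumed names (run_dynamic, fetch, aggregate, report), not Salt's scheduler.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# A task receives the data computed so far and returns (result, tasks to push next).
Task = Callable[[dict], Tuple[object, List[str]]]


def run_dynamic(tasks: Dict[str, Task], start: str, max_steps: int = 100) -> Tuple[List[str], dict]:
    """Run tasks one at a time; successors are known only after each task returns."""
    queue = deque([start])
    order: List[str] = []
    data: dict = {}
    # max_steps bounds the run, since nothing here forbids a task from re-queuing itself.
    while queue and len(order) < max_steps:
        name = queue.popleft()
        result, next_tasks = tasks[name](data)
        data[name] = result
        order.append(name)
        queue.extend(next_tasks)
    return order, data


def fetch(data):
    rows = [3, 5, 8]
    # Branch on the computed data: only larger batches get an aggregation step.
    next_tasks = ["aggregate"] if len(rows) > 2 else ["report"]
    return rows, next_tasks


def aggregate(data):
    return sum(data["fetch"]), ["report"]


def report(data):
    return f"total={data.get('aggregate')}", []


order, data = run_dynamic({"fetch": fetch, "aggregate": aggregate, "report": report}, "fetch")
print(order)  # ['fetch', 'aggregate', 'report']
```

Because successors are returned rather than declared up front, the same mechanism would also permit loops, which is why the sketch caps the number of steps.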

Architecture Notes

Developer [Dev's Code: graph.py / graph.cs]
  → Developer writes code in Python or C# using the Salt framework to define Workflows and Tasks.
  ↓

Workflow Build
  [Salt Build Tool - Python] → Builds PEX/.exe/Docker/.zip
  [Salt Build Tool - C#] → TODO

Workflow Registry [Workflow Registry + Metadata DB]
  → User registers workflow using Salt Workflow-Registry command
  → Workflow registry stores workflow metadata in Redis backend and binary in S3 storage.

Scheduling [Scheduler]
  → Event/Data change triggers workflows or task runs.
  → Queues task (task type + binary ref + input data)
    Queuing happens by publishing ready-to-be-picked tasks on a table (e.g. a Redis backend or any resource that can be locked to avoid race conditions). Workers lock a task and execute it, finally storing the returned data in the backend database so the scheduler can access it.
    Q. How do inputs work for the first queued task?
  ↓

Workflow Pickup & Execution / Workers [Generic Worker Fleet (K8s / Celery)]
  → Locks and picks task on Tasks page
  → Pulls task binary from task resource
    Caching is vital here so the binary is not pulled every time.
  → Runs it (e.g., ./task.bin --input <args-id>)
    Args are stored in a backend resource such as Redis. The Task framework automatically pulls these and passes them to the code.
    XCOM similar approach? A lot of problems with serialization, especially with custom types.
  → Reports status/output
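The queue-and-pickup flow in these notes can be sketched in memory: a lockable task table, workers that claim one task at a time, a binary cache, and arguments looked up by id. All class and variable names are assumptions for illustration; a Python callable stands in for the task binary, a dict stands in for Redis, and JSON is used for arguments, which only covers plain types.

```python
import json
import threading
from typing import Callable, Dict, List, Optional


class TaskTable:
    """In-memory stand-in for the lockable task table."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._rows: Dict[str, dict] = {}

    def publish(self, task_id: str, binary_ref: str, args_id: str) -> None:
        with self._lock:
            self._rows[task_id] = {"binary_ref": binary_ref, "args_id": args_id,
                                   "status": "ready", "output": None}

    def claim(self) -> Optional[dict]:
        """Mark one ready task as running under the lock, so no two workers get it."""
        with self._lock:
            for task_id, row in self._rows.items():
                if row["status"] == "ready":
                    row["status"] = "running"
                    return {"task_id": task_id, **row}
        return None

    def report(self, task_id: str, output: object) -> None:
        with self._lock:
            self._rows[task_id].update(status="done", output=output)

    def output(self, task_id: str) -> object:
        with self._lock:
            return self._rows[task_id]["output"]


class BinaryCache:
    """Fetches each binary reference once and reuses it afterwards."""

    def __init__(self, fetch: Callable[[str], Callable[[dict], object]]) -> None:
        self._fetch = fetch
        self._lock = threading.Lock()
        self._cache: Dict[str, Callable[[dict], object]] = {}
        self.pulls = 0

    def get(self, binary_ref: str) -> Callable[[dict], object]:
        with self._lock:
            if binary_ref not in self._cache:
                self._cache[binary_ref] = self._fetch(binary_ref)
                self.pulls += 1
            return self._cache[binary_ref]


# Stand-ins for binary storage and for the argument backend.
remote_binaries = {"bin/square": lambda args: args["x"] ** 2}
args_backend = {f"args-{i}": json.dumps({"x": i}) for i in range(20)}

table = TaskTable()
for i in range(20):
    table.publish(f"t{i}", "bin/square", f"args-{i}")

cache = BinaryCache(remote_binaries.__getitem__)
claimed: List[str] = []


def worker() -> None:
    while (task := table.claim()) is not None:
        run = cache.get(task["binary_ref"])                # pulled once, then cached
        args = json.loads(args_backend[task["args_id"]])   # args resolved by id
        table.report(task["task_id"], run(args))
        claimed.append(task["task_id"])                    # list.append is thread-safe in CPython


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(claimed), len(set(claimed)), cache.pulls)  # 20 20 1
```

The claim step is the part that needs atomicity; with a real backend the same effect would have to come from whatever locking primitive that backend offers.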

Build Python Workflows

pip install salt_py
salt build <project_path>

The previous command outputs a main.pex file built from your Python workflow wheel. This file is a standalone executable that bundles all the dependencies needed to execute your workflow.

Register a Python Workflow

To register a Python workflow and schedule it:

pip install salt_py
salt register-workflow <pyproject_path>

The project must have been built already through salt build.

Generate Server gRPC Code

pip install salt_py
salt generate-server-code

e.g.

salt generate-server-code /Users/Iacopo/Documents/PyCharm/Salt/Salt/src/salt/server/grpc/protos /Users/Iacopo/Documents/PyCharm/Salt/Salt/src

Note: Protobuf Python codegen relies on the protos folder structure to generate Python imports. It is therefore important to keep a mirrored sub-folder tree inside /grpc/protos, because that tree is used to build the imports in the generated packages. For example, in workflow_pb2_grpc we then get:

from salt.server import workflow_pb2 as salt_dot_server_dot_workflow__pb2

instead of

import workflow_pb2
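The relationship between folder layout and generated imports can be sketched as follows. The helper names are hypothetical, and whether salt generate-server-code calls grpc_tools.protoc internally is an assumption; the flags shown (-I, --python_out, --grpc_python_out) are the standard ones from the grpcio-tools package. The sketch only builds the argument list and does not run the compiler.

```python
import tempfile
from pathlib import Path
from typing import List


def protoc_args(proto_root: Path, out_dir: Path) -> List[str]:
    """Arguments for `python -m grpc_tools.protoc`, with proto_root as the import root."""
    protos = sorted(str(p) for p in proto_root.rglob("*.proto"))
    return [
        f"-I{proto_root}",
        f"--python_out={out_dir}",
        f"--grpc_python_out={out_dir}",
        *protos,
    ]


def generated_module(proto_root: Path, proto_file: Path) -> str:
    """Module name the generated code imports, derived from the path relative to the root."""
    relative = proto_file.relative_to(proto_root).with_suffix("")
    return ".".join(relative.parts) + "_pb2"


# Hypothetical layout that mirrors the package tree: protos/salt/server/workflow.proto
root = Path(tempfile.mkdtemp()) / "protos"
proto = root / "salt" / "server" / "workflow.proto"
proto.parent.mkdir(parents=True)
proto.write_text('syntax = "proto3";\n')

args = protoc_args(root, root.parent / "src")
print(generated_module(root, proto))  # salt.server.workflow_pb2
```

With a flat layout (workflow.proto directly under the root), the same helper yields workflow_pb2, which corresponds to the bare import shown above.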

Workflow Registry

The workflow registry backend takes care of ingesting, registering and storing workflow binaries.

Workflow metadata is stored in a registry table that uses Redis as a backend, and the binaries are stored in an S3 bucket. Both the metadata and the binary of a workflow can be found using its unique key.

The Scheduler then consults the workflow registry table and, together with the scheduling configuration, takes care of executing workflows.
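A minimal in-memory sketch of the shared-key idea follows. Two dicts stand in for the Redis table and the S3 bucket, and the name:version key format and the metadata fields are assumptions made for illustration; the document does not say how the unique key is built.

```python
import hashlib
from typing import Dict, Tuple


class WorkflowRegistry:
    """Toy registry: one key addresses both the metadata and the binary."""

    def __init__(self) -> None:
        self.metadata: Dict[str, dict] = {}   # stands in for the Redis-backed table
        self.binaries: Dict[str, bytes] = {}  # stands in for the S3 bucket

    def register(self, name: str, version: str, binary: bytes) -> str:
        key = f"{name}:{version}"  # assumed key format
        self.metadata[key] = {
            "name": name,
            "version": version,
            "size": len(binary),
            "sha256": hashlib.sha256(binary).hexdigest(),
        }
        self.binaries[key] = binary
        return key

    def lookup(self, key: str) -> Tuple[dict, bytes]:
        """Return the metadata and binary stored under the same key."""
        return self.metadata[key], self.binaries[key]


registry = WorkflowRegistry()
key = registry.register("etl", "0.1.0", b"\x00pex")
meta, blob = registry.lookup(key)
print(key, meta["size"])  # etl:0.1.0 4
```

Storing a digest next to the metadata is one way a worker could later check that a cached binary still matches the registered one.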

Scheduler

Scheduling must be configured through a salt.yaml file placed in the workflow folder.

The salt.yaml file is evaluated and pushed to the Workflow Registry at workflow registration (the salt register-workflow command).

Download files

Source Distribution

salt_py-0.2.0.dev0.tar.gz (17.3 kB)

Uploaded Source

Built Distribution

salt_py-0.2.0.dev0-py3-none-any.whl (20.2 kB)

Uploaded Python 3

File details

Details for the file salt_py-0.2.0.dev0.tar.gz.

File metadata

  • Download URL: salt_py-0.2.0.dev0.tar.gz
  • Upload date:
  • Size: 17.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for salt_py-0.2.0.dev0.tar.gz
Algorithm Hash digest
SHA256 672f55d972bcadb4884df0189667c0a3fdaad01c0fffc79ebb4a8729f7c35bbf
MD5 ede4cbdd9d2471a154c91484d7cc9c58
BLAKE2b-256 9d49e67e43c383d3c2ec177da1e9895bee5722f8af27b3bc7d023f49964883f1

File details

Details for the file salt_py-0.2.0.dev0-py3-none-any.whl.

File metadata

  • Download URL: salt_py-0.2.0.dev0-py3-none-any.whl
  • Upload date:
  • Size: 20.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for salt_py-0.2.0.dev0-py3-none-any.whl
Algorithm Hash digest
SHA256 f5ebcd9340c8b419459c9596152534ccb36cebebaa8eb9b1cede55f37ba25ca3
MD5 aab88e28d2a4b54973b86bba8d6fca80
BLAKE2b-256 10daad381822839137e113918288e671be7473af4cbea8f6a62e71bcd5c73f4d
