unipipe

Unified pipeline library.

⚠️ Experimental ⚠️

  • Build batch pipelines in Python that run anywhere -- on your laptop, on the server, and in the cloud.
  • Easily scale local experiments to the cloud without changing your pipeline code
  • Save time by only writing each pipeline once
  • Save money by only paying for the compute infrastructure you need

About

unipipe makes it easy to build batch pipelines in Python, then run them either locally or in the cloud. It was originally created for machine learning workflows, but it works for any batch data processing pipeline.

Install

From PyPI:

# Minimal install
pip install unipipe

# With additional executors (e.g. 'docker', 'vertex')
pip install "unipipe[vertex]"

From source:

# Minimal install
pip install "unipipe @ git+ssh://git@github.com/fkodom/unipipe.git"

# With additional executors (e.g. 'docker', 'vertex')
pip install "unipipe[vertex] @ git+ssh://git@github.com/fkodom/unipipe.git"

If you'd like to contribute, install all dependencies and pre-commit hooks:

# Install all dependencies
pip install "unipipe[all] @ git+ssh://git@github.com/fkodom/unipipe.git"
# Set up pre-commit hooks
pre-commit install
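
To confirm that the install succeeded, one option is to query the package metadata from Python. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper written for this example, not part of unipipe.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("unipipe")
    if version is None:
        print("unipipe is not installed in this environment")
    else:
        print(f"unipipe {version} is installed")
```

The helper only reports whether the base package is present. It does not check which optional executors were installed.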

Getting Started

Build a pipeline once using the unipipe DSL:

from unipipe import dsl

@dsl.component
def say_hello(name: str) -> str:
    return f"Hello, {name}!"

@dsl.pipeline
def pipeline():
    say_hello(name="world")

Then, run the pipeline using any of the supported executors:

from unipipe import run

run(
    # Supported executors include:
    #   'python' --> runs in the current Python process
    #   'docker' --> runs each component in a separate Docker container
    #   'vertex' --> runs in GCP through Vertex, which in turn uses KFP
    executor="python",
    pipeline=pipeline(),
)

Expected output:

INFO:root:[say_hello-1603ae3e] - Hello, world!
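
The "define once, pick an executor at run time" pattern can be illustrated with a toy model in plain Python. This sketch does not import unipipe and is not how unipipe is implemented. The decorators, the `EXECUTORS` table, and the `dry-run` executor are all hypothetical stand-ins that show only the general idea: components are recorded while the pipeline is built, then handed to whichever executor was requested.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[Callable[..., Any], Dict[str, Any]]

# Steps recorded while a pipeline function is being traced (toy global state).
_recorded: List[Step] = []


def component(func: Callable[..., Any]) -> Callable[..., None]:
    """Toy decorator: calling the component records it instead of running it."""
    def record(**kwargs: Any) -> None:
        _recorded.append((func, kwargs))
    return record


def pipeline(func: Callable[..., Any]) -> Callable[..., List[Step]]:
    """Toy decorator: calling the pipeline returns the list of recorded steps."""
    def build(**kwargs: Any) -> List[Step]:
        _recorded.clear()
        func(**kwargs)
        return list(_recorded)
    return build


def run_in_process(steps: List[Step]) -> List[Any]:
    """Toy stand-in for a 'python' executor: run each step in this process."""
    return [func(**kwargs) for func, kwargs in steps]


def run_dry(steps: List[Step]) -> List[str]:
    """Hypothetical executor that only describes what it would run."""
    return [f"{func.__name__}({kwargs})" for func, kwargs in steps]


EXECUTORS = {"python": run_in_process, "dry-run": run_dry}


def run(executor: str, pipeline: List[Step]) -> List[Any]:
    """Dispatch the same recorded steps to the chosen executor."""
    return EXECUTORS[executor](pipeline)


@component
def say_hello(name: str) -> str:
    return f"Hello, {name}!"


@pipeline
def hello_pipeline() -> None:
    say_hello(name="world")


if __name__ == "__main__":
    print(run(executor="python", pipeline=hello_pipeline()))
    print(run(executor="dry-run", pipeline=hello_pipeline()))
```

The pipeline definition never mentions an executor, so switching backends changes only the `executor` argument.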

Run Any Python Script

You can also scale any Python script to the cloud using the unipipe CLI:

# Same choices of executors as above.
unipipe run-script \
    --executor vertex \
    --pipeline-root "gs://bucket-name/artifact-root/" \
    ./examples/ex01_hello_world.py

This makes experimentation easy. unipipe will automatically compose your script into a pipeline, and launch it with your chosen executor. See this example for more details.
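
How unipipe composes a script into a pipeline is not described here. As a rough mental model only, a script can be treated as a single step by wrapping it in a callable. The sketch below does this with the standard-library `runpy` module; `script_as_step` is a hypothetical helper, and nothing in it reflects unipipe's actual mechanism.

```python
import runpy
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict


def script_as_step(script_path: str) -> Callable[[], Dict[str, Any]]:
    """Wrap a script file in a zero-argument callable (a toy one-step pipeline)."""
    def step() -> Dict[str, Any]:
        # run_name="__main__" makes any `if __name__ == "__main__":` block execute.
        # The return value is the script's global namespace after it finishes.
        return runpy.run_path(script_path, run_name="__main__")
    return step


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Write a throwaway script so the example is self-contained.
        script = Path(tmp) / "hello.py"
        script.write_text('greeting = "Hello, world!"\nprint(greeting)\n')

        step = script_as_step(str(script))
        namespace = step()
        print(namespace["greeting"])
```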

More Examples

Link                      Description
Hello World               Create/run your first unipipe pipeline
Hello Pipeline            Create pipelines with multiple steps
Multi-output Components   Build components that return more than one type-checked value
Pipeline Arguments        Make pipelines reusable with dynamic inputs
Dependency Management     Install and use other Python packages in your pipelines
Hardware Specs            Request hardware (CPUs, Memory, GPUs) for your pipeline runs
Nested Pipelines          Call existing pipelines from inside another pipeline
Control Flow              Add conditional control flow to your pipelines
Advanced Control Flow     Best practices for advanced control flow
Private Dependencies      Using private Python packages
Run Any Python Script     Run any Python script using unipipe

Why unipipe?

  1. unipipe was designed to mitigate issues with Kubeflow Pipelines (KFP).
    • Kubeflow and KFP are often used by machine learning engineers to orchestrate training jobs, data preprocessing, and other computationally intensive tasks.
  2. KFP pipelines only run on Kubeflow.
    • Kubeflow requires specialized knowledge and additional compute resources. It can be expensive and/or impractical for individuals and small teams.
    • Managed, serverless platforms like Vertex (Google Cloud) automate that setup. Even so, pipelines only run on KFP/Vertex -- not on your laptop.
  3. Why write the same pipeline twice?
    • KFP developers often write multiple pipeline scripts: one for their laptop, and another for the cloud.
    • TODO: Finish this section...

TODO

  1. Add executor for KFP clusters, in addition to Vertex.
  2. Better up-front type checking (in progress).
