Pipelines for machine learning workloads.

About

Pipeline is a Python library that provides a simple way to construct computational graphs for AI/ML. The library is suitable for both development and production environments, and supports inference as well as training and fine-tuning. It is also a direct interface to Pipeline.ai, which provides a compute engine to run pipelines at scale and on enterprise GPUs.

The syntax used for defining AI/ML pipelines is similar to sessions in TensorFlow v1 and Flows in Prefect. In future releases we will be moving away from this syntax to a C-based graph compiler that interprets Python (and other languages) directly, allowing users of the API to compose graphs in a way that is more native to their chosen language.

Version roadmap

v0.4.0 (Jan 2023)

  • Custom environments on PipelineCloud (remote compute services)
  • Kwarg inputs to runs
  • Extended IO inputs to pipeline_function objects

v0.5.0 (Jan/Feb 2023)

  • Pipeline chaining
  • if statements & while/for loops

Beyond

  • Run log streaming
  • Run progress tracking
  • Resource dedication
  • Pipeline-specific remote load balancer (10% of traffic to one pipeline, 80% to another)
  • Usage capping
  • Run result streaming
  • Programmatic autoscaling
  • Alerts
  • Events
  • Different python versions on remote compute services

Quickstart

Warning: Uploading pipelines to Pipeline Cloud works best in Python 3.9. We strongly recommend using Python 3.9 when uploading pipelines, because the pipeline-ai library is still in beta and is known to cause opaque errors when pipelines are serialised from a non-3.9 environment.
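
A quick way to catch a mismatched interpreter before uploading is to check the running Python version. The snippet below is a minimal sketch using only the standard library; the `RECOMMENDED` constant and the `is_recommended_python` helper are hypothetical names introduced for illustration and are not part of pipeline-ai.

```python
import sys

# Assumption: 3.9 is the recommended minor version, as stated in the warning above.
RECOMMENDED = (3, 9)


def is_recommended_python(version_info=sys.version_info) -> bool:
    """Return True when the major.minor version matches RECOMMENDED."""
    return tuple(version_info[:2]) == RECOMMENDED


if not is_recommended_python():
    # Only a reminder; nothing is blocked or changed.
    print(
        f"Warning: running Python {sys.version_info[0]}.{sys.version_info[1]}; "
        "uploading pipelines is recommended from Python 3.9."
    )
```

Running this at the top of an upload script gives an early, readable message instead of an opaque serialisation error later on.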

Basic maths

from pipeline import Pipeline, Variable, pipeline_function


@pipeline_function
def square(a: float) -> float:
    return a**2

@pipeline_function
def multiply(a: float, b: float) -> float:
    return a * b

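# Define the graph inside the Pipeline context
with Pipeline("maths") as pipeline:
    # Declare the two float inputs of the pipeline
    flt_1 = Variable(type_class=float, is_input=True)
    flt_2 = Variable(type_class=float, is_input=True)
    pipeline.add_variables(flt_1, flt_2)

    # res_1 = flt_2 * flt_1**2
    sq_1 = square(flt_1)
    res_1 = multiply(flt_2, sq_1)
    pipeline.output(res_1)

# Fetch the pipeline by name and run it with flt_1=5.0, flt_2=6.0
output_pipeline = Pipeline.get_pipeline("maths")
print(output_pipeline.run(5.0, 6.0))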
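
To make the idea of a computational graph more concrete, here is a toy sketch of deferred execution in plain Python: calls are recorded while the graph is being built and are only evaluated when `run` is called. This is an illustration of the general technique and is not how pipeline-ai is implemented; `ToyVariable` and `ToyGraph` are hypothetical names that do not exist in the library.

```python
# Illustrative only: a toy deferred-execution graph, NOT pipeline-ai's implementation.
from typing import Any, Callable, List, Optional, Tuple


class ToyVariable:
    """Placeholder for a value that is only known at run time."""


class ToyGraph:
    def __init__(self) -> None:
        self.inputs: List[ToyVariable] = []
        # Each node is (function, argument placeholders, output placeholder).
        self.nodes: List[Tuple[Callable[..., Any], Tuple[ToyVariable, ...], ToyVariable]] = []
        self.result: Optional[ToyVariable] = None

    def add_input(self) -> ToyVariable:
        var = ToyVariable()
        self.inputs.append(var)
        return var

    def call(self, fn: Callable[..., Any], *args: ToyVariable) -> ToyVariable:
        """Record a function call instead of executing it."""
        out = ToyVariable()
        self.nodes.append((fn, args, out))
        return out

    def output(self, var: ToyVariable) -> None:
        self.result = var

    def run(self, *values: Any) -> Any:
        """Bind concrete values to the inputs and evaluate nodes in recorded order."""
        if len(values) != len(self.inputs):
            raise ValueError("expected one value per input")
        if self.result is None:
            raise ValueError("no output was set")
        env = dict(zip(self.inputs, values))
        for fn, args, out in self.nodes:
            env[out] = fn(*(env[arg] for arg in args))
        return env[self.result]


# Same computation as the maths example: b * a**2
graph = ToyGraph()
a = graph.add_input()
b = graph.add_input()
squared = graph.call(lambda x: x**2, a)
graph.output(graph.call(lambda x, y: x * y, b, squared))

result = graph.run(5.0, 6.0)  # 6.0 * 5.0**2 = 150.0
print(result)
```

Because nodes are recorded in the order they are created, every node's arguments are already available when it is evaluated, so no separate topological sort is needed in this sketch.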

Transformers (GPT-Neo 125M)

Note: requires torch and transformers as dependencies.

from pipeline import Pipeline, Variable
from pipeline.objects.huggingface.TransformersModelForCausalLM import (
    TransformersModelForCausalLM,
)

with Pipeline("hf-pipeline") as builder:
    input_str = Variable(str, is_input=True)
    model_kwargs = Variable(dict, is_input=True)

    builder.add_variables(input_str, model_kwargs)

    hf_model = TransformersModelForCausalLM(
        model_path="EleutherAI/gpt-neo-125M",
        tokenizer_path="EleutherAI/gpt-neo-125M",
    )
    hf_model.load()
    output_str = hf_model.predict(input_str, model_kwargs)

    builder.output(output_str)

output_pipeline = Pipeline.get_pipeline("hf-pipeline")

print(
    output_pipeline.run(
        "Hello my name is", {"min_length": 100, "max_length": 150, "temperature": 0.5}
    )
)

Installation instructions

Linux, Mac (Intel)

pip install -U pipeline-ai

Mac (ARM/M1)

Due to the ARM architecture of the M1 chip, additional steps are needed to install Pipeline, mostly because of the transformers library. We recommend running inside a conda environment as shown below.

  1. Make sure Rosetta2 is disabled.
  2. From a terminal, run:
     xcode-select --install
  3. Install Miniforge, following the instructions at https://github.com/conda-forge/miniforge or the steps below:
    1. Download the Miniforge install script: https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
    2. Make the script executable and run it:
       sudo chmod 775 Miniforge3-MacOSX-arm64.sh
       ./Miniforge3-MacOSX-arm64.sh
  4. Create a conda-based virtual environment and activate it:
     conda create --name pipeline-env python=3.9
     conda activate pipeline-env
  5. Install TensorFlow:
     conda install -c apple tensorflow-deps
     python -m pip install -U pip
     python -m pip install -U tensorflow-macos
     python -m pip install -U tensorflow-metal
  6. Install transformers:
     conda install -c huggingface transformers -y
  7. Install Pipeline:
     python -m pip install -U pipeline-ai
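
After working through the steps above, a short script can confirm that the environment looks as expected. The sketch below uses only the standard library; `check_environment` and `PACKAGES` are hypothetical names introduced here. It assumes that a native (non-Rosetta) Python on Apple Silicon reports `arm64` from `platform.machine()`, and that the packages are importable as `tensorflow`, `transformers` and `pipeline`.

```python
import importlib.util
import platform

# Top-level import names; "pipeline" is the module imported in the Quickstart.
PACKAGES = ("tensorflow", "transformers", "pipeline")


def check_environment(packages=PACKAGES) -> dict:
    """Report the CPU architecture and whether each package can be found."""
    report = {"machine": platform.machine()}
    for name in packages:
        # find_spec returns None when a top-level module is not installed;
        # it locates the module without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for key, value in check_environment().items():
    print(f"{key}: {value}")
```

If `machine` shows `x86_64` on an M1 Mac, the interpreter is most likely running under Rosetta, which the first step asks you to avoid.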

Development

This project is built with Poetry, so first set up Poetry on your machine.

Once that is done, run:

sh setup.sh

With this you should be good to go. This sets up dependencies, pre-commit hooks and pre-push hooks.

You can run the pre-commit hooks manually with:

pre-commit run --all-files

To run the tests manually, run:

pytest

License

Pipeline is licensed under the Apache Software License, Version 2.0.
