Pipelines for machine learning workloads.

About

Pipeline is a Python library that provides a simple way to construct computational graphs for AI/ML. The library is suitable for both development and production environments, and supports inference and training/fine-tuning. It is also a direct interface to Pipeline.ai, which provides a compute engine for running pipelines at scale on enterprise GPUs.

The syntax used for defining AI/ML pipelines is similar to sessions in TensorFlow v1 and Flows in Prefect. In future releases we will be moving away from this syntax to a C-based graph compiler that interprets Python (and other languages) directly, allowing users of the API to compose graphs in a way that is more native to their chosen language.
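
To make the comparison with TensorFlow v1 sessions concrete, here is a toy sketch of the define-then-run idea: calling a wrapped function records a graph node, and values are only computed when the graph is evaluated. This is not how Pipeline is implemented; `Placeholder`, `Node`, `defer`, and `evaluate` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Placeholder:
    """A named graph input whose value is supplied at run time (hypothetical)."""
    name: str


@dataclass(frozen=True)
class Node:
    """A deferred call: a function plus its (possibly deferred) arguments."""
    func: Callable[..., Any]
    args: Tuple[Any, ...]


def defer(func: Callable[..., Any]) -> Callable[..., Node]:
    """Wrap func so that calling it records a Node instead of executing it."""
    def wrapper(*args: Any) -> Node:
        return Node(func, args)
    return wrapper


def evaluate(item: Any, feed: Dict[str, Any]) -> Any:
    """Recursively resolve placeholders and nodes into concrete values."""
    if isinstance(item, Placeholder):
        return feed[item.name]
    if isinstance(item, Node):
        return item.func(*(evaluate(arg, feed) for arg in item.args))
    return item  # plain constants pass through unchanged


@defer
def square(a: float) -> float:
    return a ** 2


@defer
def multiply(a: float, b: float) -> float:
    return a * b


# Build the graph first; nothing is computed at this point.
x, y = Placeholder("x"), Placeholder("y")
graph = multiply(y, square(x))

# Run the graph later with concrete inputs.
result = evaluate(graph, {"x": 5.0, "y": 6.0})
print(result)
```

Separating graph definition from execution is what allows a graph to be serialised and run elsewhere, for example on remote compute.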

Version roadmap

v0.4.0 (Jan 2023)

  • Custom environments on PipelineCloud (remote compute services)
  • Kwarg inputs to runs
  • Extended IO inputs to pipeline_function objects

v0.5.0 (Jan/Feb 2023)

  • Pipeline chaining
  • if statements & while/for loops

Beyond

  • Run log streaming
  • Run progress tracking
  • Resource dedication
  • Pipeline-specific remote load balancer (10% of traffic to one pipeline, 80% to another)
  • Usage capping
  • Run result streaming
  • Programmatic autoscaling
  • Alerts
  • Events
  • Different python versions on remote compute services

Quickstart

Warning: Uploading pipelines to Pipeline Cloud works best in Python 3.9. We strongly recommend using Python 3.9 when uploading pipelines, because the pipeline-ai library is still in beta and is known to cause opaque errors when pipelines are serialised from a non-3.9 environment.
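
One way to catch a mismatched interpreter early is to check the version before uploading. The sketch below uses only the standard library; `is_recommended_python` is a hypothetical helper written for this example, not part of pipeline-ai.

```python
import sys
from typing import Tuple

# The interpreter version recommended above for uploading pipelines.
RECOMMENDED = (3, 9)


def is_recommended_python(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True if the given (major, minor) version matches the recommendation."""
    return tuple(version) == RECOMMENDED


if not is_recommended_python():
    print(
        f"Warning: running Python {sys.version_info.major}.{sys.version_info.minor}; "
        "Python 3.9 is recommended when uploading pipelines."
    )
```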

Basic maths

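from pipeline import Pipeline, Variable, pipeline_function


# Functions decorated with pipeline_function can be called inside a pipeline definition.
@pipeline_function
def square(a: float) -> float:
    return a**2


@pipeline_function
def multiply(a: float, b: float) -> float:
    return a * b


# Define the graph inside the Pipeline context manager.
with Pipeline("maths") as pipeline:
    # Declare two float inputs and register them with the pipeline.
    flt_1 = Variable(type_class=float, is_input=True)
    flt_2 = Variable(type_class=float, is_input=True)
    pipeline.add_variables(flt_1, flt_2)

    # Wire the functions together: res_1 = flt_2 * flt_1**2.
    sq_1 = square(flt_1)
    res_1 = multiply(flt_2, sq_1)

    # Mark res_1 as the pipeline output.
    pipeline.output(res_1)

# Retrieve the defined pipeline and run it with flt_1=5.0 and flt_2=6.0.
output_pipeline = Pipeline.get_pipeline("maths")
print(output_pipeline.run(5.0, 6.0))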

Transformers (GPT-Neo 125M)

Note: requires torch and transformers as dependencies.

from pipeline import Pipeline, Variable
from pipeline.objects.huggingface.TransformersModelForCausalLM import (
    TransformersModelForCausalLM,
)

with Pipeline("hf-pipeline") as builder:
    input_str = Variable(str, is_input=True)
    model_kwargs = Variable(dict, is_input=True)

    builder.add_variables(input_str, model_kwargs)

    hf_model = TransformersModelForCausalLM(
        model_path="EleutherAI/gpt-neo-125M",
        tokenizer_path="EleutherAI/gpt-neo-125M",
    )
    hf_model.load()
    output_str = hf_model.predict(input_str, model_kwargs)

    builder.output(output_str)

output_pipeline = Pipeline.get_pipeline("hf-pipeline")

print(
    output_pipeline.run(
        "Hello my name is", {"min_length": 100, "max_length": 150, "temperature": 0.5}
    )
)

Installation instructions

Linux, Mac (Intel)

pip install -U pipeline-ai

Mac (arm/M1)

Due to the ARM architecture of the M1 chip, additional steps are needed to install Pipeline, mostly because of the transformers library. We recommend running inside a conda environment, as shown below.

  1. Make sure Rosetta2 is disabled.
  2. From the terminal, run:
xcode-select --install
  3. Install Miniforge. Instructions are here: https://github.com/conda-forge/miniforge or follow the steps below:
    1. Download the Miniforge install script here: https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
    2. Make the shell script executable and run it:
    sudo chmod 775 Miniforge3-MacOSX-arm64.sh
    ./Miniforge3-MacOSX-arm64.sh
  4. Create a conda-based virtual environment and activate it:
conda create --name pipeline-env python=3.9
conda activate pipeline-env
  5. Install tensorflow:
conda install -c apple tensorflow-deps
python -m pip install -U pip
python -m pip install -U tensorflow-macos
python -m pip install -U tensorflow-metal
  6. Install transformers:
conda install -c huggingface transformers -y
  7. Install pipeline:
python -m pip install -U pipeline-ai
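
After the steps above, it can be useful to confirm that the interpreter in the conda environment is running natively on ARM rather than under Rosetta 2. The sketch below assumes that a native Apple silicon build reports `arm64` from `platform.machine()` and that a translated interpreter reports `x86_64`; `is_native_apple_silicon` is a hypothetical helper, not part of pipeline-ai.

```python
import platform
from typing import Optional


def is_native_apple_silicon(
    system: Optional[str] = None, machine: Optional[str] = None
) -> bool:
    """Return True when running on macOS with a native arm64 interpreter.

    Both arguments default to the values reported for the current interpreter;
    they are parameters only so the check can be exercised on other platforms.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


print(f"Native Apple silicon interpreter: {is_native_apple_silicon()}")
```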

Development

This project is built with Poetry, so first set up Poetry on your machine.

Once that is done, please run

sh setup.sh

With this you should be good to go. This sets up dependencies, pre-commit hooks and pre-push hooks.

You can run the pre-commit hooks manually with

pre-commit run --all-files

To run tests manually please run

pytest

License

Pipeline is licensed under Apache Software License Version 2.0.

Download files

Source Distribution

pipeline_ai-0.5.0b10.tar.gz (47.8 kB view details)

Uploaded Source

Built Distribution

pipeline_ai-0.5.0b10-py3-none-any.whl (68.0 kB view details)

Uploaded Python 3

File details

Details for the file pipeline_ai-0.5.0b10.tar.gz.

File metadata

  • Download URL: pipeline_ai-0.5.0b10.tar.gz
  • Upload date:
  • Size: 47.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for pipeline_ai-0.5.0b10.tar.gz
Algorithm Hash digest
SHA256 85536625eed280f8276bd0afecb7fac4b76e5df73a8b6ddffd266330a708a383
MD5 eeda83ce7fb38ecfc27f878cfbecfcd5
BLAKE2b-256 468982aadbe0f03dd1696dc61fdc187fe5399a1cdb8ea1620ebd0f925e73e06d

File details

Details for the file pipeline_ai-0.5.0b10-py3-none-any.whl.

File metadata

  • Download URL: pipeline_ai-0.5.0b10-py3-none-any.whl
  • Upload date:
  • Size: 68.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for pipeline_ai-0.5.0b10-py3-none-any.whl
Algorithm Hash digest
SHA256 05a811d5a34462b5b2f19f1fc7d717fa41611232e1d607a5ba3c762be74bb3f3
MD5 5bbb5ddb07f3546d2615824bbf81f190
BLAKE2b-256 d334797ec710fda729c194493fe3904ee5a3646ccb298f970ca7dc9fa3cce9d6
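
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using the standard library's `hashlib`; `sha256_of` and `matches_published` are hypothetical helper names, and the file is assumed to have been downloaded into the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # File name and digest as listed for the source distribution above.
    sdist = "pipeline_ai-0.5.0b10.tar.gz"
    published = "85536625eed280f8276bd0afecb7fac4b76e5df73a8b6ddffd266330a708a383"
    if Path(sdist).exists():
        print("digest matches:", matches_published(sdist, published))
    else:
        print(f"{sdist} not found; download it first")
```

pip can also enforce digests at install time through its hash-checking mode, using a requirements file that pins `--hash=sha256:...` values.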
