
Pipelines for machine learning workloads.

About

Pipeline is a Python library that provides a simple way to construct computational graphs for AI/ML. It is suitable for both development and production environments, and supports inference as well as training and fine-tuning. It is also a direct interface to Pipeline.ai, which provides a compute engine for running pipelines at scale on enterprise GPUs.

The syntax for defining AI/ML pipelines resembles sessions in TensorFlow v1 and Flows in Prefect. In future releases we will move away from this syntax to a C-based graph compiler that interprets Python (and other languages) directly, allowing users of the API to compose graphs in a way that is more native to their chosen language.
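To illustrate the define-then-run style mentioned above, here is a minimal sketch in plain Python of how a context manager can record function calls into a graph and evaluate them later. It is not the pipeline-ai implementation: `TinyGraph`, `Placeholder`, and `graph_function` are hypothetical names invented for this example.

```python
# Hypothetical illustration only: TinyGraph, Placeholder and graph_function
# are invented names and are not part of the pipeline-ai API.
from functools import wraps
from typing import Any, Callable, List, Optional


class Placeholder:
    """Stands in for a value that is only known when the graph runs."""

    def __init__(self, func: Optional[Callable] = None, args: tuple = ()):
        self.func = func  # None marks a graph input
        self.args = args  # upstream placeholders this node depends on


class TinyGraph:
    _active: Optional["TinyGraph"] = None  # graph currently being built

    def __init__(self) -> None:
        self.inputs: List[Placeholder] = []
        self.result: Optional[Placeholder] = None

    def __enter__(self) -> "TinyGraph":
        TinyGraph._active = self
        return self

    def __exit__(self, *exc_info: Any) -> None:
        TinyGraph._active = None

    def input(self) -> Placeholder:
        placeholder = Placeholder()
        self.inputs.append(placeholder)
        return placeholder

    def output(self, placeholder: Placeholder) -> None:
        self.result = placeholder

    def run(self, *values: Any) -> Any:
        if self.result is None:
            raise ValueError("no output was registered")
        if len(values) != len(self.inputs):
            raise ValueError("wrong number of input values")
        # Bind concrete values to the inputs, then evaluate recursively.
        known = {id(p): v for p, v in zip(self.inputs, values)}

        def evaluate(node: Placeholder) -> Any:
            if id(node) not in known:
                known[id(node)] = node.func(*(evaluate(a) for a in node.args))
            return known[id(node)]

        return evaluate(self.result)


def graph_function(func: Callable) -> Callable:
    """While a graph is being built, record the call instead of running it."""

    @wraps(func)
    def wrapper(*args: Any) -> Any:
        if TinyGraph._active is None:
            return func(*args)  # outside a graph: behave like a normal call
        return Placeholder(func, args)

    return wrapper


@graph_function
def square(a: float) -> float:
    return a ** 2


with TinyGraph() as graph:
    x = graph.input()
    graph.output(square(x))

print(graph.run(3.0))  # 9.0
```

The point of the sketch is the separation of phases: nothing is computed inside the `with` block, and the same recorded graph can be run repeatedly with different inputs.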

Version roadmap

v0.4.0 (Jan 2023)

  • Custom environments on PipelineCloud (remote compute services)
  • Kwarg inputs to runs
  • Extended IO inputs to pipeline_function objects

v0.5.0 (Jan/Feb 2023)

  • Pipeline chaining
  • if statements & while/for loops

Beyond

  • Run log streaming
  • Run progress tracking
  • Resource dedication
  • Pipeline-specific remote load balancer (10% of traffic to one pipeline, 80% to another)
  • Usage capping
  • Run result streaming
  • Programmatic autoscaling
  • Alerts
  • Events
  • Different Python versions on remote compute services

Quickstart

Warning: Uploading pipelines to Pipeline Cloud works best with Python 3.9. We strongly recommend Python 3.9 for uploads, because the pipeline-ai library is still in beta and is known to produce opaque errors when pipelines are serialised from a non-3.9 environment.
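One way to catch a mismatched interpreter early is a small guard at the top of an upload script. This is a sketch using only the standard library; `is_recommended_python` and `RECOMMENDED` are hypothetical names for this example and are not part of pipeline-ai.

```python
import sys

# The version recommended above for uploading pipelines.
RECOMMENDED = (3, 9)


def is_recommended_python(version_info=sys.version_info) -> bool:
    """Return True when the major.minor version matches the recommendation."""
    return tuple(version_info[:2]) == RECOMMENDED


if not is_recommended_python():
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(
        f"Warning: running Python {found}; Python 3.9 is recommended "
        "when uploading pipelines.",
        file=sys.stderr,
    )
```

The check only warns rather than exiting, since local runs may still work on other versions.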

Basic maths

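from pipeline import Pipeline, Variable, pipeline_function


# Functions decorated with @pipeline_function can be used as graph nodes.
@pipeline_function
def square(a: float) -> float:
    return a**2


@pipeline_function
def multiply(a: float, b: float) -> float:
    return a * b


# Build the graph inside the Pipeline context manager.
with Pipeline("maths") as pipeline:
    # Declare the two float inputs of the pipeline.
    flt_1 = Variable(type_class=float, is_input=True)
    flt_2 = Variable(type_class=float, is_input=True)
    pipeline.add_variables(flt_1, flt_2)

    # Compute flt_2 * flt_1**2 and mark it as the pipeline output.
    sq_1 = square(flt_1)
    res_1 = multiply(flt_2, sq_1)
    pipeline.output(res_1)

# Retrieve the built pipeline by name and run it locally.
output_pipeline = Pipeline.get_pipeline("maths")
print(output_pipeline.run(5.0, 6.0))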
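As a sanity check, the maths example above computes `flt_2 * flt_1**2`. The same arithmetic in plain Python, without the library, gives the value to compare against. This sketch deliberately makes no assumption about how `run` wraps its return value.

```python
# Plain-Python equivalent of the "maths" graph, for cross-checking only.
def square(a: float) -> float:
    return a**2


def multiply(a: float, b: float) -> float:
    return a * b


flt_1, flt_2 = 5.0, 6.0
expected = multiply(flt_2, square(flt_1))
print(expected)  # 150.0
```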

Transformers (GPT-Neo 125M)

Note: requires torch and transformers as dependencies.

from pipeline import Pipeline, Variable
from pipeline.objects.huggingface.TransformersModelForCausalLM import (
    TransformersModelForCausalLM,
)

with Pipeline("hf-pipeline") as builder:
    input_str = Variable(str, is_input=True)
    model_kwargs = Variable(dict, is_input=True)

    builder.add_variables(input_str, model_kwargs)

    hf_model = TransformersModelForCausalLM(
        model_path="EleutherAI/gpt-neo-125M",
        tokenizer_path="EleutherAI/gpt-neo-125M",
    )
    hf_model.load()
    output_str = hf_model.predict(input_str, model_kwargs)

    builder.output(output_str)

output_pipeline = Pipeline.get_pipeline("hf-pipeline")

print(
    output_pipeline.run(
        "Hello my name is", {"min_length": 100, "max_length": 150, "temperature": 0.5}
    )
)

Installation instructions

Linux, Mac (intel)

pip install -U pipeline-ai

Mac (arm/M1)

Because of the ARM architecture of M1 chips, installing Pipeline requires additional steps, mostly because of the transformers library. We recommend running inside a conda environment, as shown below.

  1. Make sure Rosetta 2 is disabled.
  2. From terminal run:
xcode-select --install
  3. Install Miniforge, instructions here: https://github.com/conda-forge/miniforge or follow the below:
    1. Download the Miniforge install script here: https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
    2. Make the script executable and run it:
    sudo chmod 775 Miniforge3-MacOSX-arm64.sh
    ./Miniforge3-MacOSX-arm64.sh
    
  4. Create a conda based virtual env and activate:
conda create --name pipeline-env python=3.9
conda activate pipeline-env
  5. Install tensorflow
conda install -c apple tensorflow-deps
python -m pip install -U pip
python -m pip install -U tensorflow-macos
python -m pip install -U tensorflow-metal
  6. Install transformers
conda install -c huggingface transformers -y
  7. Install pipeline
python -m pip install -U pipeline-ai
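After installing, a quick standard-library check can confirm that the package is visible to the active interpreter and show which architecture that interpreter reports. This is a sketch: `installed_version` is a hypothetical helper, and the expectation that a native Apple silicon interpreter reports `arm64` (rather than `x86_64` under Rosetta) is an assumption to verify on your own machine.

```python
import platform
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "pipeline-ai") -> Optional[str]:
    """Return the installed version of a distribution, or None if missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Architecture as seen by the running interpreter.
print("machine:", platform.machine())
print("pipeline-ai:", installed_version() or "not installed")
```

Run it inside the activated `pipeline-env` environment so that it inspects the same interpreter used for the install.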

Development

This project uses Poetry, so first set up Poetry on your machine.

Once that is done, please run

sh setup.sh

With this you should be good to go. This sets up dependencies, pre-commit hooks and pre-push hooks.

You can manually run the pre-commit hooks with

pre-commit run --all-files

To run tests manually please run

pytest

License

Pipeline is licensed under Apache Software License Version 2.0.
