The potassium package is a Flask-like HTTP server for serving large AI models

Potassium

An HTTP server designed for AI, by Banana

Quickstart

Install the potassium package

pip3 install potassium

Create a Python file called app.py with this:

from potassium import Potassium

from transformers import pipeline
import torch

app = Potassium("server")

@app.init
def init():
    device = 0 if torch.cuda.is_available() else -1
    model = pipeline('fill-mask', model='bert-base-uncased', device=device)

    app.optimize(model)

    return app.set_cache({
        "model": model
    })

@app.handler
def handler(cache: dict, json_in: dict) -> dict:
    prompt = json_in.get('prompt', None)
    model = cache.get("model")

    outputs = model(prompt)
    return {"outputs": outputs}

if __name__ == "__main__":
    app.serve()

This runs a Hugging Face BERT model. For this example, you'll also need to install transformers and torch.

pip3 install transformers torch

Start the server with:

python3 app.py

Test the running server with:

curl -X POST -H "Content-Type: application/json" -d '{"prompt": "Hello I am a [MASK] model."}' http://localhost:8000
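
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server from app.py is listening on http://localhost:8000 as in the curl command. The names build_request and send_prompt are illustrative helpers, not part of potassium.

```python
import json
import urllib.request

SERVER_URL = "http://localhost:8000"  # assumed: same address as the curl example


def build_request(prompt: str, url: str = SERVER_URL) -> urllib.request.Request:
    """Build a JSON POST request equivalent to the curl command."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_prompt(prompt: str, url: str = SERVER_URL) -> dict:
    """Send the prompt to a running server and decode the JSON reply."""
    with urllib.request.urlopen(build_request(prompt, url)) as response:
        return json.loads(response.read().decode("utf-8"))


# With the server running (python3 app.py), uncomment to try it:
# print(send_prompt("Hello I am a [MASK] model."))
```

build_request is kept separate from send_prompt so the request can be inspected without a server running.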

Documentation

potassium.Potassium

from potassium import Potassium

app = Potassium("server")

This instantiates your HTTP app, similar to popular frameworks like Flask.

This HTTP server is production-ready out of the box.

@app.init

@app.init
def init():
    device = 0 if torch.cuda.is_available() else -1
    model = pipeline('fill-mask', model='bert-base-uncased', device=device)

    app.optimize(model)

    return app.set_cache({
        "model": model
    })

The @app.init decorated function runs once on server startup, and is used to load any reusable, heavy objects such as:

  • Your AI model, loaded to GPU
  • Tokenizers
  • Precalculated embeddings

Once initialized, you must save those variables to the cache with app.set_cache({}) so they can be referenced later.

There may only be one @app.init function.

@app.handler

@app.handler
def handler(cache: dict, json_in: dict) -> dict:
    prompt = json_in.get('prompt', None)
    model = cache.get("model")

    outputs = model(prompt)
    return {"outputs": outputs}

The @app.handler decorated function runs for every HTTP call, and is used to run inference or training workloads against your model(s).

Arg      Type  Description
cache    dict  The app's cache, set with set_cache()
json_in  dict  The json body of the input call. If using the Banana client SDK, this is the same as model_inputs

Return Val  Type  Description
json_out    dict  The json body to return to the client. If using the Banana client SDK, this is the same as model_outputs

There may only be one @app.handler function.
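
Because json_in is whatever JSON body the client sent, a handler may want to check its inputs before calling the model. The following is a minimal sketch, assuming errors are reported as an ordinary JSON dictionary; the "error" key is an illustrative convention, not something potassium defines. The function is written without the decorator so it can be exercised on its own; in app.py it would be registered with @app.handler.

```python
def handler(cache: dict, json_in: dict) -> dict:
    """Validate the request body before running the model."""
    prompt = json_in.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        # Illustrative error shape; potassium does not prescribe one.
        return {"error": "'prompt' must be a non-empty string"}

    model = cache.get("model")
    if model is None:
        return {"error": "model is not loaded; did the init function run?"}

    return {"outputs": model(prompt)}


# Exercise the handler with a stand-in model (a plain callable).
fake_cache = {"model": lambda text: [{"sequence": text.upper()}]}
print(handler(fake_cache, {"prompt": "hello"}))
print(handler(fake_cache, {}))
```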

app.serve

app.serve() starts the server and blocks the calling thread while the server is running.

app.set_cache()

app.set_cache({})

app.set_cache saves the input dictionary to the app's cache, for reuse in future calls. It may be used in both the @app.init and @app.handler functions.

app.set_cache overwrites any preexisting cache.

app.get_cache()

cache = app.get_cache()

app.get_cache fetches the app's cache dictionary. This value is automatically provided for you as the cache argument in the @app.handler function.
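
Because app.set_cache replaces the whole dictionary, changing a single key means reading the current cache, copying it, and writing the merged result back. The following is a minimal sketch; update_cache is a hypothetical helper, not part of potassium, and StandInApp is a stand-in that only mimics the get_cache and set_cache behavior described here so the example runs without a server.

```python
def update_cache(app, **updates) -> dict:
    """Merge `updates` into the app's cache instead of replacing it.

    `app` is any object exposing get_cache() and set_cache(dict).
    Hypothetical helper, not part of potassium.
    """
    merged = dict(app.get_cache() or {})  # copy; tolerate an unset cache
    merged.update(updates)
    app.set_cache(merged)
    return merged


class StandInApp:
    """Stand-in for demonstration only: set_cache overwrites the cache."""

    def __init__(self):
        self._cache = {}

    def get_cache(self) -> dict:
        return self._cache

    def set_cache(self, cache: dict) -> dict:
        self._cache = cache
        return cache


demo = StandInApp()
demo.set_cache({"model": "bert"})
demo.set_cache({"tokenizer": "tok"})        # overwrites: "model" is gone
update_cache(demo, model="bert")            # merges: both keys are kept
print(sorted(demo.get_cache()))
```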

app.optimize(model)

model = ...  # some PyTorch model
app.optimize(model)

app.optimize is a feature specific to users hosting on Banana's serverless GPU infrastructure. It is run at build time rather than at runtime, and is used to locate the model(s) to be targeted for Banana's Fastboot optimization.

Multiple models may be optimized. Only PyTorch models are currently supported.

