The potassium package is a Flask-like HTTP server for serving large AI models.

Development Status
- 3 - Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Programming Language
- Python :: 3.7
- Python :: 3.8
Topic
- Software Development :: Build Tools

Project description

Potassium

Potassium is an open source web framework, built to tackle the unique challenges of serving custom models in production.

The goal of this project is to:

- Provide a familiar web framework similar to Flask/FastAPI
- Bake in best practices for handling large, GPU-bound ML models
- Provide a set of primitives common in ML serving, such as:
  - POST request handlers
  - Websocket / streaming connections
  - Async handlers w/ webhooks
- Maintain a standard interface, to allow the code and models to compile to specialized hardware (ideally on Banana Serverless GPUs 😉)

Stability Notes:

Potassium uses Semantic Versioning, in that major versions imply breaking changes, and v0 implies instability even between minor/patch versions. Be sure to lock your versions, as we're still in v0!
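
Because v0 releases may break between minor or patch versions, a startup check can confirm that the installed release is the one you locked. The following is a minimal sketch using the standard library's importlib.metadata (available from Python 3.8); the function name installed_matches and the example version string are illustrative and not part of Potassium.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_matches(package: str, pinned: str) -> bool:
    """Return True only if `package` is installed at exactly the `pinned` version."""
    try:
        return version(package) == pinned
    except PackageNotFoundError:
        # The package is not installed in this environment at all.
        return False
```

For example, installed_matches("potassium", "0.5.0") could be asserted at the top of app.py, with "0.5.0" replaced by whichever release you tested against.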

Quickstart: Serving a Huggingface BERT model

The fastest way to get up and running is to use the Banana CLI, which downloads and runs your first model.

Here's a demo video

Install the CLI with pip

pip3 install banana-cli

This downloads boilerplate for your potassium app, and automatically installs potassium into the venv.

Create a new project directory with

banana init my-app
cd my-app

Start the dev server

. ./venv/bin/activate
python3 app.py

Call your API (from a separate terminal)

curl -X POST -H "Content-Type: application/json" -d '{"prompt": "Hello I am a [MASK] model."}' http://localhost:8000/

Or do it yourself:

Install the potassium package

pip3 install potassium

Create a python file called app.py containing:

from potassium import Potassium, Request, Response
from transformers import pipeline
import torch
import time

app = Potassium("my_app")

# @app.init runs at startup, and initializes the app's context
@app.init
def init():
    device = 0 if torch.cuda.is_available() else -1
    model = pipeline('fill-mask', model='bert-base-uncased', device=device)
   
    context = {
        "model": model,
        "hello": "world"
    }

    return context

# @app.handler is an HTTP POST handler that runs for every call
@app.handler()
def handler(context: dict, request: Request) -> Response:
    
    prompt = request.json.get("prompt")
    model = context.get("model")
    outputs = model(prompt)

    return Response(
        json = {"outputs": outputs}, 
        status=200
    )

if __name__ == "__main__":
    app.serve()

This runs a Huggingface BERT model.

For this example, you'll also need to install transformers and torch.

pip3 install transformers torch

Start the server with:

python3 app.py

Test the running server with:

curl -X POST -H "Content-Type: application/json" -d '{"prompt": "Hello I am a [MASK] model."}' http://localhost:8000
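
The same call can be made from Python. This is a sketch using only the standard library; build_request and call_api are hypothetical helper names, and the URL assumes the dev server from the steps above is listening on localhost:8000.

```python
import json
import urllib.request


def build_request(prompt: str, url: str = "http://localhost:8000/") -> urllib.request.Request:
    """Build the same JSON POST request as the curl command, without sending it."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(prompt: str, url: str = "http://localhost:8000/") -> dict:
    """Send the request and decode the JSON reply; the server must be running."""
    with urllib.request.urlopen(build_request(prompt, url), timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, call_api("Hello I am a [MASK] model.") should return the decoded JSON body that the handler put in its Response.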

Documentation

potassium.Potassium

from potassium import Potassium

app = Potassium("server")

This instantiates your HTTP app, similar to popular frameworks like Flask.

@app.init

@app.init
def init():
    device = 0 if torch.cuda.is_available() else -1
    model = pipeline('fill-mask', model='bert-base-uncased', device=device)

    return {
        "model": model
    }

The @app.init decorated function runs once on server startup, and is used to load any reusable, heavy objects such as:

- Your AI model, loaded to GPU
- Tokenizers
- Precalculated embeddings

The return value is a dictionary that is saved to the app's context and used later in the handler functions.

There may only be one @app.init function.

@app.handler()

@app.handler("/")
def handler(context: dict, request: Request) -> Response:
    
    prompt = request.json.get("prompt")
    model = context.get("model")
    outputs = model(prompt)

    return Response(
        json = {"outputs": outputs}, 
        status=200
    )

The @app.handler decorated function runs for every HTTP call, and is used to run inference or training workloads against your model(s).

You may configure as many @app.handler functions as you'd like, with unique API routes.

The context dict passed in is a mutable reference, so you can modify it in-place to persist objects between warm handlers.
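
As an illustration of modifying the context in place, the sketch below shows two plain functions that could be called from inside a handler body. They operate on an ordinary dict, so they run without Potassium installed; get_or_create and count_call are hypothetical names, not part of the library.

```python
from typing import Any, Callable


def get_or_create(context: dict, key: str, factory: Callable[[], Any]) -> Any:
    """Return context[key], building it with factory() on first use only."""
    if key not in context:
        context[key] = factory()
    return context[key]


def count_call(context: dict) -> int:
    """Increment and return a per-replica call counter stored in the context."""
    context["calls"] = context.get("calls", 0) + 1
    return context["calls"]
```

Inside a handler, something like get_or_create(context, "tokenizer", load_tokenizer) would build the object on the first call and reuse it on later warm calls, where load_tokenizer stands for whatever loader you supply.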

@app.background(path="/background")

@app.background("/background")
def handler(context: dict, request: Request) -> Response:

    prompt = request.json.get("prompt")
    model = context.get("model")
    outputs = model(prompt)

    send_webhook(url="http://localhost:8001", json={"outputs": outputs})

    return

The @app.background() decorated function runs a nonblocking job in the background, for tasks whose results aren't expected to return to the client. It's up to you to forward the data wherever it needs to go. Potassium supplies a send_webhook() helper function for POSTing data onward to a URL, or you may add your own custom upload/pipeline code.

When invoked, the server immediately returns a {"success": true} message.

You may configure as many @app.background functions as you'd like, with unique API routes.

The context dict passed in is a mutable reference, so you can modify it in-place to persist objects between warm handlers.
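
The "custom upload/pipeline code" option for a background handler could look roughly like the following. It is a sketch using only the standard library and is not Potassium's send_webhook() implementation; post_json, WebhookReceiver, and demo are hypothetical names, and the small receiver exists only so the example can run end to end on one machine.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

received = []  # payloads captured by the demo receiver


class WebhookReceiver(BaseHTTPRequestHandler):
    """Toy receiver that records each JSON body POSTed to it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging


def post_json(url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST `payload` as JSON to `url` and return the HTTP status code."""
    parts = urlsplit(url)
    use_tls = parts.scheme == "https"
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        body = json.dumps(payload).encode("utf-8")
        conn.request("POST", path, body=body, headers={"Content-Type": "application/json"})
        resp = conn.getresponse()
        resp.read()  # drain the body before closing the connection
        return resp.status
    finally:
        conn.close()


def demo() -> int:
    """Start the receiver on a free local port, post one payload, and shut down."""
    server = HTTPServer(("127.0.0.1", 0), WebhookReceiver)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return post_json(f"http://127.0.0.1:{port}/", {"outputs": ["example"]})
    finally:
        server.shutdown()
        server.server_close()
```

In a background handler, post_json would be called with your real destination URL in place of the local receiver.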

app.serve()

app.serve runs the server, and is a blocking operation.

Pre-warming your app

Potassium comes with a built-in endpoint for those cases where you want to "warm up" your app to better control the timing of your inference calls. You don't need to call it, since your inference call requires init() to have run once on server startup anyway, but this gives you a bit more control.

Once your model is warm (i.e., cold boot finished), this endpoint returns a 200. If a cold boot is required, the init() function is first called while the server starts up, and then a 200 is returned from this endpoint.

You don't need any extra code to enable it: it works out of the box, and you can call it at /_k/warmup with either a GET or a POST request.
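
A client could call the warmup endpoint before sending its first inference request. The sketch below uses only the standard library; warmup_url and warm_up are hypothetical names, the default base URL assumes a local dev server, and the generous default timeout is an assumption meant to leave room for a cold boot.

```python
import urllib.request


def warmup_url(base_url: str) -> str:
    """Append the built-in warmup path to the server's base URL."""
    return base_url.rstrip("/") + "/_k/warmup"


def warm_up(base_url: str = "http://localhost:8000", timeout: float = 300.0) -> bool:
    """Send a GET to the warmup endpoint; True means the server answered 200."""
    try:
        with urllib.request.urlopen(warmup_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection failures, timeouts, and non-2xx HTTP errors.
        return False
```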

Store

Potassium includes a key-value storage primitive, to help users persist data between calls.

Example usage: your own Redis backend (encouraged)

from potassium.store import Store, RedisConfig

store = Store(
    backend="redis",
    config = RedisConfig(
        host = "localhost",
        port = 6379
    )
)

# in one handler
store.set("key", "value", ttl=60)

# in another handler
value = store.get("key")

Example usage: using local storage

Note: this is not encouraged on Banana serverless or in multi-replica environments, as data is stored only on a single replica.

from potassium.store import Store

store = Store(
    backend="local"
)

# in one handler
store.set("key", "value", ttl=60)

# in another handler
value = store.get("key")
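
To make the set(key, value, ttl) / get(key) pattern concrete, here is a small in-memory stand-in with the same call shape. It is a sketch for illustration only, not Potassium's Store: the class name TTLDict is hypothetical, the ttl is assumed to be in seconds, and returning None for a missing or expired key is an assumption the text above does not confirm.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLDict:
    """Hypothetical in-memory key-value store whose entries expire after `ttl` seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any, ttl: float) -> None:
        """Store `value` under `key` until `ttl` seconds from now."""
        self._items[key] = (value, self._clock() + ttl)

    def get(self, key: str) -> Optional[Any]:
        """Return the stored value, or None if the key is missing or expired."""
        item = self._items.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._items[key]  # drop the stale entry on read
            return None
        return value
```

Like the local backend, a structure of this kind lives in one process only, which is why a shared backend such as Redis is the better fit once more than one replica is serving requests.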

Release history

This version

0.5.0

Dec 12, 2023

0.4.1

Nov 28, 2023

0.4.0

Nov 7, 2023

0.3.2

Oct 25, 2023

0.3.1

Oct 23, 2023

0.3.0

Oct 19, 2023

0.2.1

Oct 18, 2023

0.2.0

Oct 12, 2023

0.1.2

Jul 24, 2023

0.1.1

Jun 29, 2023

0.1.0

Jun 23, 2023

0.0.10

Jun 21, 2023

0.0.9

May 25, 2023

0.0.8

Mar 26, 2023

0.0.7

Mar 24, 2023

0.0.6

Mar 24, 2023

0.0.5

Mar 23, 2023

0.0.4

Mar 15, 2023

0.0.3

Mar 2, 2023

0.0.2

Mar 1, 2023

0.0.1

Mar 1, 2023

Download files

Source Distribution

potassium-0.5.0.tar.gz (21.3 kB)

Uploaded Dec 12, 2023 Source

Built Distribution

potassium-0.5.0-py3-none-any.whl (18.3 kB)

Uploaded Dec 12, 2023 Python 3

Hashes for potassium-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`08dcd45522737a7919a6a9014a60ab1550ef977b9183035eebdb8b5d8dff7fc5`
MD5	`821c552e3b87253031690b6f6408ad45`
BLAKE2b-256	`2add19f1d81ee3d5bc02d9f853024587dfb93aca9d681ac255f96e533690642e`

Hashes for potassium-0.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`149127b984fe9277f716c3d744b9a2227b44c8269f3aca2a8d7d088ca13f22bb`
MD5	`a7d926121a3bde5ebf7fbc75a1b450b8`
BLAKE2b-256	`60f3ae9562b80fd7d8f445e1e530e9e8ab990ff1b56f0e8ce410d8e51f14aefd`