Model Serving made Efficient in the Cloud.

MOSEC

Introduction

Mosec is a high-performance and flexible model serving framework for building ML model-enabled backends and microservices. It bridges the gap between any machine learning model you have just trained and an efficient online service API.

  • Highly performant: the web layer and task coordination are built with Rust 🦀, which offers high speed and efficient CPU utilization powered by async I/O
  • Ease of use: the user interface is purely in Python 🐍, so users can serve their models in an ML framework-agnostic manner using the same code they use for offline testing
  • Dynamic batching: aggregate requests from different users for batched inference and distribute results back
  • Pipelined stages: spawn multiple processes for pipelined stages to handle CPU/GPU/IO mixed workloads
  • Cloud friendly: designed to run in the cloud, with model warmup, graceful shutdown, and Prometheus monitoring metrics, and easily managed by Kubernetes or any other container orchestration system
  • Do one thing well: Mosec focuses on the online serving part, so users can concentrate on model performance and business logic
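
The dynamic batching idea in the list above can be illustrated with a small, framework-free sketch. This is not how Mosec implements it (the source says coordination lives in the Rust layer); the names collect_batch, run_batch, and max_wait_s are hypothetical and exist only for this illustration, which assumes requests arrive on an in-process queue.

```python
import queue
import time
from typing import Any, Callable, List


def collect_batch(q: queue.Queue, max_batch_size: int, max_wait_s: float) -> List[Any]:
    """Wait for one request, then gather more until the batch is full or time is up."""
    batch = [q.get()]  # block until at least one request arrives
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived in time
    return batch


def run_batch(batch: List[Any], batched_forward: Callable[[List[Any]], List[Any]]) -> List[Any]:
    """Run one batched call and return results in request order."""
    results = batched_forward(batch)
    if len(results) != len(batch):
        raise ValueError("batched_forward must return one result per request")
    return results


# Five pending requests, batches of at most three.
pending = queue.Queue()
for value in range(5):
    pending.put(value)

first = collect_batch(pending, max_batch_size=3, max_wait_s=0.05)
print(run_batch(first, lambda xs: [x * 2 for x in xs]))  # [0, 2, 4]
```

The wait bound trades latency for throughput: a larger max_wait_s tends to fill batches more often, at the cost of holding the first request longer.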

Installation

Mosec requires Python 3.7 or above. Install the latest PyPI package with:

> pip install -U mosec

Usage

Write the server

Import the libraries and set up a basic logger to better observe what happens.

import logging

from mosec import Server, Worker
from mosec.errors import ValidationError

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter(
    "%(asctime)s - %(process)d - %(levelname)s - %(filename)s:%(lineno)s - %(message)s"
)
sh = logging.StreamHandler()
sh.setFormatter(formatter)
logger.addHandler(sh)

Then, we build an API that calculates the exponential with base e of a given number. To do so, we inherit from the Worker class and override its forward method. Note that the input req is by default a JSON-decoded object, e.g., a dictionary here (ideally it receives data like {"x": 1}). We also wrap the input parsing in a try...except block to reject invalid input (e.g., no key named "x", or a field "x" that cannot be converted to float).

import math


class CalculateExp(Worker):
    def forward(self, req: dict) -> dict:
        try:
            x = float(req["x"])
        except KeyError as err:
            raise ValidationError("cannot find key 'x'") from err
        except (TypeError, ValueError) as err:  # e.g. {"x": null} raises TypeError
            raise ValidationError("cannot convert 'x' value to float") from err
        y = math.exp(x)  # f(x) = e ^ x
        logger.debug(f"e ^ {x} = {y}")
        return {"y": y}
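
Because the worker logic is plain Python, it can be exercised offline before any server is started. The sketch below mirrors the parsing logic above in a standalone function so that it runs without Mosec installed; InputError is a hypothetical stand-in for mosec.errors.ValidationError, exp_from_request is a name made up for this example, and catching TypeError is an addition assumed here to cover values such as null.

```python
import math


class InputError(Exception):
    """Hypothetical stand-in for mosec.errors.ValidationError."""


def exp_from_request(req: dict) -> dict:
    """Same parsing and computation as CalculateExp.forward, without the server."""
    try:
        x = float(req["x"])
    except KeyError as err:
        raise InputError("cannot find key 'x'") from err
    except (TypeError, ValueError) as err:
        raise InputError("cannot convert 'x' value to float") from err
    return {"y": math.exp(x)}


print(exp_from_request({"x": 0}))  # {'y': 1.0}

# Invalid inputs are rejected with a readable message.
for bad in ({"input": 2}, {"x": "abc"}, {"x": None}):
    try:
        exp_from_request(bad)
    except InputError as err:
        print(f"rejected {bad!r}: {err}")
```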

Finally, we append the worker to the server to construct a single-stage workflow, and we specify the number of processes we want it to run in parallel. Then we run the server.

if __name__ == "__main__":
    server = Server()
    server.append_worker(
        CalculateExp, num=2
    )  # we spawn two processes for parallel computing
    server.run()
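
A workflow with more than one stage would presumably be assembled by calling append_worker once per stage, in order. The sketch below assumes exactly that and uses only the append_worker(worker, num=...) call shown above; RecordingServer is a stand-in that records calls so the example runs without Mosec, and Preprocess and Inference are hypothetical stage names.

```python
from typing import Any, List, Tuple


def build_pipeline(server: Any, stages: List[Tuple[type, int]]) -> Any:
    """Append each (worker_class, num_processes) pair to the server, in order."""
    for worker_cls, num in stages:
        server.append_worker(worker_cls, num=num)
    return server


class RecordingServer:
    """Stand-in for mosec.Server that only records append_worker calls."""

    def __init__(self) -> None:
        self.stages = []

    def append_worker(self, worker: type, num: int = 1) -> None:
        self.stages.append((worker.__name__, num))


class Preprocess:  # hypothetical stage; a real one would subclass mosec.Worker
    pass


class Inference:  # hypothetical stage; a real one would subclass mosec.Worker
    pass


demo = build_pipeline(RecordingServer(), [(Preprocess, 4), (Inference, 1)])
print(demo.stages)  # [('Preprocess', 4), ('Inference', 1)]
```

Giving each stage its own process count lets a CPU-bound stage scale independently of a stage bound to a single GPU.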

Run the server

After merging the snippets above into a file named server.py, we can first have a look at the command line arguments:

> python server.py --help

Then let's start the server...

> python server.py

and in another terminal, test it:

> curl -X POST http://127.0.0.1:8000/inference -d '{"x": 2}'
{
  "y": 7.38905609893065
}

> curl -X POST http://127.0.0.1:8000/inference -d '{"input": 2}' # wrong schema
validation error: cannot find key 'x'
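
The same requests can be sent from Python with only the standard library. This is a minimal sketch that assumes the server from the steps above is listening on 127.0.0.1:8000; build_inference_request and call_inference are helper names made up for this example.

```python
import json
import urllib.error
import urllib.request


def build_inference_request(x, base_url: str = "http://127.0.0.1:8000") -> urllib.request.Request:
    """Build the POST request that the curl command above sends."""
    body = json.dumps({"x": x}).encode("utf-8")
    return urllib.request.Request(f"{base_url}/inference", data=body, method="POST")


def call_inference(x, base_url: str = "http://127.0.0.1:8000", timeout: float = 5.0):
    """Send the request and return (HTTP status, response body as text)."""
    req = build_inference_request(x, base_url)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Error responses (e.g. a rejected input) still carry a readable body.
        return err.code, err.read().decode("utf-8")


req = build_inference_request(2)
print(req.get_method(), req.full_url, req.data)
```

With the server running, call_inference(2) should return the JSON body shown in the curl output above.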

or check the metrics:

> curl http://127.0.0.1:8000/metrics
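
The /metrics endpoint is described as serving Prometheus metrics, so its output can be read with a few lines of Python. The sketch below assumes the Prometheus text exposition format without trailing timestamps; the metric name demo_requests_total is hypothetical and is not a real Mosec metric.

```python
from typing import Dict

SAMPLE = """\
# HELP demo_requests_total Hypothetical counter, not a real Mosec metric name.
# TYPE demo_requests_total counter
demo_requests_total{code="200"} 12
demo_requests_total{code="422"} 3
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each 'name{labels}' sample to its value, skipping comments and blanks."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.rsplit(None, 1)  # the value is the last whitespace-separated token
        if len(parts) != 2:
            continue
        samples[parts[0]] = float(parts[1])
    return samples


print(parse_metrics(SAMPLE))
```

For anything beyond a quick check, a Prometheus server scraping the endpoint is the usual setup.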

To see more debug logs, change both the Python and the Rust log levels:

logger.setLevel(logging.DEBUG)
> RUST_LOG=debug python server.py
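
To switch the Python log level from the shell in the same way as RUST_LOG, the level can be read from an environment variable. This is only a sketch: APP_LOG_LEVEL is a hypothetical variable name chosen for this example, not something Mosec reads.

```python
import logging
import os
from typing import Mapping, Optional

_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def python_log_level(env: Optional[Mapping[str, str]] = None, var: str = "APP_LOG_LEVEL") -> int:
    """Return the logging level named in the environment, or INFO if unset or unknown."""
    env = os.environ if env is None else env
    name = env.get(var, "").strip().upper()
    return _LEVELS.get(name, logging.INFO)


logging.getLogger().setLevel(python_log_level())
```

Both levels could then be set in one command, for example APP_LOG_LEVEL=debug RUST_LOG=debug python server.py.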

That's it! You have just hosted your exponential-computing model as a server! 😉

Example

More ready-to-use examples can be found in the Example section. They include:

  • Multi-stage workflow
  • Batch processing worker
  • Shared memory IPC
  • PyTorch deep learning models:
    • sentiment analysis
    • image recognition

Qualitative Comparison*

Columns: Batcher, Pipeline, Parallel, I/O Format(1), Framework(2), Backend, Activity. The Batcher, Pipeline, Parallel, and Activity cells were icons and are not reproduced here.

             I/O Format(1)   Framework(2)   Backend
TF Serving   Limited(a)      Heavily TF     C++
Triton       Limited         Multiple       C++
MMS          Limited         Heavily MX     Java
BentoML      Limited(b)      Multiple       Python
Streamer     Customizable    Agnostic       Python
Flask(3)     Customizable    Agnostic       Python
Mosec        Customizable    Agnostic       Rust

*As accessed on 08 Oct 2021. By no means is this comparison showing that other frameworks are inferior, but rather it is used to illustrate the trade-off. The information is not guaranteed to be absolutely accurate. Please let us know if you find anything that may be incorrect.

(1): Data format of the service's request and response. "Limited" in the sense that the framework has pre-defined requirements on the format.

(2): Supported machine learning frameworks. "Heavily" means the serving framework is designed towards a specific ML framework. Thus it is hard, if not impossible, to adapt to others. "Multiple" means the serving framework provides adaptation to several existing ML frameworks. "Agnostic" means the serving framework does not necessarily care about the ML framework. Hence it supports all ML frameworks (in Python).

(3): Flask is representative of the general-purpose web frameworks used to host ML models.

Contributing

We welcome any kind of contribution. Please give us feedback by raising issues, or contribute your code directly through a pull request!

This version

0.3.4

Download files

Download the file for your platform.

Source Distribution

  • mosec-0.3.4.tar.gz (23.3 kB): Source

Built Distributions

  • mosec-0.3.4-cp310-cp310-manylinux1_x86_64.whl (2.8 MB): CPython 3.10
  • mosec-0.3.4-cp310-cp310-macosx_10_9_x86_64.whl (1.8 MB): CPython 3.10, macOS 10.9+ x86-64
  • mosec-0.3.4-cp39-cp39-manylinux1_x86_64.whl (2.8 MB): CPython 3.9
  • mosec-0.3.4-cp39-cp39-macosx_10_9_x86_64.whl (1.8 MB): CPython 3.9, macOS 10.9+ x86-64
  • mosec-0.3.4-cp38-cp38-manylinux1_x86_64.whl (2.8 MB): CPython 3.8
  • mosec-0.3.4-cp38-cp38-macosx_10_9_x86_64.whl (1.8 MB): CPython 3.8, macOS 10.9+ x86-64
  • mosec-0.3.4-cp37-cp37m-manylinux1_x86_64.whl (2.8 MB): CPython 3.7m
  • mosec-0.3.4-cp37-cp37m-macosx_10_9_x86_64.whl (1.8 MB): CPython 3.7m, macOS 10.9+ x86-64

