Skip to main content

Model Serving made Efficient in the Cloud.

Project description

MOSEC

PyPI version PyPi Downloads License Check status

Model Serving made Efficient in the Cloud.

Introduction

Mosec is a high-performance and flexible model serving framework for building ML model-enabled backend and microservices. It bridges the gap between any machine learning models you just trained and the efficient online service API.

  • Highly performant: web layer and task coordination built with Rust 🦀, which offers blazing speed in addition to efficient CPU utilization powered by async I/O
  • Ease of use: user interface purely in Python 🐍, by which users can serve their models in an ML framework-agnostic manner using the same code as they do for offline testing
  • Dynamic batching: aggregate requests from different users for batched inference and distribute results back
  • Pipelined stages: spawn multiple processes for pipelined stages to handle CPU/GPU/IO mixed workloads
  • Cloud friendly: designed to run in the cloud, with the model warmup, graceful shutdown, and Prometheus monitoring metrics, easily managed by Kubernetes or any container orchestration systems
  • Do one thing well: focus on the online serving part, users can pay attention to the model performance and business logic

Installation

Mosec requires Python 3.7 or above. Install the latest PyPI package with:

> pip install -U mosec

Usage

Write the server

Import the libraries and set up a basic logger to better observe what happens.

import logging

from mosec import Server, Worker
from mosec.errors import ValidationError

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter(
    "%(asctime)s - %(process)d - %(levelname)s - %(filename)s:%(lineno)s - %(message)s"
)
sh = logging.StreamHandler()
sh.setFormatter(formatter)
logger.addHandler(sh)

Then, we build an API to calculate the exponential with base e for a given number. To achieve that, we simply inherit the Worker class and override the forward method. Note that the input req is by default a JSON-decoded object, e.g., a dictionary here (wishfully it receives data like {"x": 1}). We also enclose the input parsing part with a try...except... block to reject invalid input (e.g., no key named "x" or field "x" cannot be converted to float).

import math


class CalculateExp(Worker):
    def forward(self, req: dict) -> dict:
        try:
            x = float(req["x"])
        except KeyError:
            raise ValidationError("cannot find key 'x'")
        except ValueError:
            raise ValidationError("cannot convert 'x' value to float")
        y = math.exp(x)  # f(x) = e ^ x
        logger.debug(f"e ^ {x} = {y}")
        return {"y": y}

Finally, we append the worker to the server to construct a single-stage workflow, and we specify the number of processes we want it to run in parallel. Then we run the server.

if __name__ == "__main__":
    server = Server()
    server.append_worker(
        CalculateExp, num=2
    )  # we spawn two processes for parallel computing
    server.run()

Run the server

After merging the snippets above into a file named server.py, we can first have a look at the command line arguments:

> python server.py --help

Then let's start the server...

> python server.py

and in another terminal, test it:

> curl -X POST http://127.0.0.1:8000/inference -d '{"x": 2}'
{
  "y": 7.38905609893065
}

> curl -X POST http://127.0.0.1:8000/inference -d '{"input": 2}' # wrong schema
validation error: cannot find key 'x'

or check the metrics:

> curl http://127.0.0.1:8000/metrics

That's it! You have just hosted your exponential-computing model as a server! 😉

Example

More ready-to-use examples can be found in the Example section. It includes:

  • Multi-stage workflow
  • Batch processing worker
  • Shared memory IPC
  • PyTorch deep learning models:
    • sentiment analysis
    • image recognition

Qualitative Comparison*

Batcher Pipeline Parallel I/O Format(1) Framework(2) Backend Activity
TF Serving Limited(a) Heavily TF C++
Triton Limited Multiple C++
MMS Limited Heavily MX Java
BentoML Limited(b) Multiple Python
Streamer Customizable Agnostic Python
Flask(3) Customizable Agnostic Python
Mosec Customizable Agnostic Rust

*As accessed on 08 Oct 2021. By no means is this comparison showing that other frameworks are inferior, but rather it is used to illustrate the trade-off. The information is not guaranteed to be absolutely accurate. Please let us know if you find anything that may be incorrect.

(1): Data format of the service's request and response. "Limited" in the sense that the framework has pre-defined requirements on the format. (2): Supported machine learning frameworks. "Heavily" means the serving framework is designed towards a specific ML framework. Thus it is hard, if not impossible, to adapt to others. "Multiple" means the serving framework provides adaptation to several existing ML frameworks. "Agnostic" means the serving framework does not necessarily care about the ML framework. Hence it supports all ML frameworks (in Python). (3): Flask is a representative of general purpose web frameworks to host ML models.

Contributing

We welcome any kind of contribution. Please give us feedback by raising issues or directly contribute your code and pull request!

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mosec-0.3.0.tar.gz (23.1 kB view details)

Uploaded Source

Built Distributions

mosec-0.3.0-cp39-cp39-manylinux1_x86_64.whl (2.8 MB view details)

Uploaded CPython 3.9

mosec-0.3.0-cp39-cp39-macosx_10_9_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.9 macOS 10.9+ x86-64

mosec-0.3.0-cp38-cp38-manylinux1_x86_64.whl (2.8 MB view details)

Uploaded CPython 3.8

mosec-0.3.0-cp38-cp38-macosx_10_9_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.8 macOS 10.9+ x86-64

mosec-0.3.0-cp37-cp37m-manylinux1_x86_64.whl (2.8 MB view details)

Uploaded CPython 3.7m

mosec-0.3.0-cp37-cp37m-macosx_10_9_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.7m macOS 10.9+ x86-64

mosec-0.3.0-cp36-cp36m-manylinux1_x86_64.whl (2.8 MB view details)

Uploaded CPython 3.6m

mosec-0.3.0-cp36-cp36m-macosx_10_9_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.6m macOS 10.9+ x86-64

File details

Details for the file mosec-0.3.0.tar.gz.

File metadata

  • Download URL: mosec-0.3.0.tar.gz
  • Upload date:
  • Size: 23.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for mosec-0.3.0.tar.gz
Algorithm Hash digest
SHA256 3741071d1cafc6404ea43810ecd8113e589982d402d24dc963c75a7aefddd535
MD5 1a493a34bd904a4972c325217aa2d799
BLAKE2b-256 32679a13d748f874e1ac288afebf3fb23c938553c3dad60c062190059f9dcce6

See more details on using hashes here.

File details

Details for the file mosec-0.3.0-cp39-cp39-manylinux1_x86_64.whl.

File metadata

  • Download URL: mosec-0.3.0-cp39-cp39-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.8 MB
  • Tags: CPython 3.9
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for mosec-0.3.0-cp39-cp39-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 d8af9769192d6008e5737bfd6f65c192218ffa90157e9c1f59d839456be976a4
MD5 69db49e4be31feba5efe8933f54df5d2
BLAKE2b-256 cbfcd44bc0314a49b2881177a643454d0e18633f64d55f712bb7297508263e46

See more details on using hashes here.

File details

Details for the file mosec-0.3.0-cp39-cp39-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: mosec-0.3.0-cp39-cp39-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.9, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for mosec-0.3.0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 6a96451349a12bf4bf6cb7df4dcf3f8801be6335cc3cddea744a61c20a2aeafb
MD5 943129b22b3109259980bb34854de0f1
BLAKE2b-256 6f0538bc689feaceaf7f1a2dcb150e6472ce5f208a46e21983136b7ff22f1e6e

See more details on using hashes here.

File details

Details for the file mosec-0.3.0-cp38-cp38-manylinux1_x86_64.whl.

File metadata

  • Download URL: mosec-0.3.0-cp38-cp38-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.8 MB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for mosec-0.3.0-cp38-cp38-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 604a8391c46773d25ba2f54e47f6bf979a6530043651306920b22cc4cefad05b
MD5 13a9fcd4b09ded453c0273a090dc5863
BLAKE2b-256 1caf427b7d08fe79b623e26efdb16edc8c54d50d7923b5abfeefd8a0ffe42274

See more details on using hashes here.

File details

Details for the file mosec-0.3.0-cp38-cp38-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: mosec-0.3.0-cp38-cp38-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.8, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for mosec-0.3.0-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 76dcfdc0cb4799e36a61e0f35c589340f9d0659e562526c06d4334e21f30b8f7
MD5 1a6ccba18a531ad2f709e3a913e37806
BLAKE2b-256 162d027e0016ba3d3541da034ca3c1f9aa4a1268ac54bc258225bed0f3dc734d

See more details on using hashes here.

File details

Details for the file mosec-0.3.0-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: mosec-0.3.0-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.8 MB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for mosec-0.3.0-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 d2b3aa888342f9ef9f8b826e387e0a261590d5424a2f45474bbb5811b0ce8ac5
MD5 6cae6cee67ef0b1856694fed04faa829
BLAKE2b-256 3691d76634f42fca4b79cb7ff33c0a1d3fcf665ffbd181b0c1c6e29690d6b175

See more details on using hashes here.

File details

Details for the file mosec-0.3.0-cp37-cp37m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: mosec-0.3.0-cp37-cp37m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.7m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for mosec-0.3.0-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 850e50142c0d2b0244353710018a19ced884d1376fe6ecd3276f31a5d16249bb
MD5 c535327b3e20747afaddc97d24735be0
BLAKE2b-256 b011047c417346f01e719c65a8c704aed40b024b9a75e3561fc519a72f468e60

See more details on using hashes here.

File details

Details for the file mosec-0.3.0-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: mosec-0.3.0-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.8 MB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for mosec-0.3.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 31dbe39b2d0126b32c1d34c8fc947652ff57534f96688a3162e8ac1ef340f111
MD5 6108a756cf583a57f438ccaac5a20db5
BLAKE2b-256 edeafdd907c469d926081a0a6d1b596ae2da2e73ca21de34de3552ad3c60f06f

See more details on using hashes here.

File details

Details for the file mosec-0.3.0-cp36-cp36m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: mosec-0.3.0-cp36-cp36m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.6m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for mosec-0.3.0-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 048c60af5e13dd6337bb611373303508d0d596272191587fba03a3f191332823
MD5 ded442c7033761974637689b27d9d67b
BLAKE2b-256 678c864a8ead16667b789a56dd1efe5262acba3ec07fb60571a582da77bdf531

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page