Model Serving made Efficient in the Cloud.

MOSEC

Introduction

Mosec is a high-performance and flexible model serving framework for building ML model-enabled backends and microservices. It bridges the gap between the machine learning model you just trained and an efficient online service API.

  • Highly performant: the web layer and task coordination are built with Rust 🦀, which offers high speed and efficient CPU utilization through async I/O
  • Ease of use: user interface purely in Python 🐍, by which users can serve their models in an ML framework-agnostic manner using the same code as they do for offline testing
  • Dynamic batching: aggregate requests from different users for batched inference and distribute results back
  • Pipelined stages: spawn multiple processes for pipelined stages to handle CPU/GPU/IO mixed workloads
  • Cloud friendly: designed to run in the cloud, with model warmup, graceful shutdown, and Prometheus monitoring metrics; easily managed by Kubernetes or any other container orchestration system
  • Do one thing well: Mosec focuses on online serving, so users can concentrate on model performance and business logic
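
To make the dynamic batching idea concrete, here is a minimal, framework-free sketch. It is not Mosec's implementation; the names run_in_batches, batch_fn, and double_all are hypothetical, and the only point is the shape of the technique: group pending requests, run one batched call per group, and hand each result back in request order.

```python
from typing import Callable, List, Sequence


def run_in_batches(
    requests: Sequence[float],
    batch_fn: Callable[[List[float]], List[float]],
    max_batch_size: int,
) -> List[float]:
    """Group pending requests, run batched inference, return results in order."""
    results: List[float] = []
    for start in range(0, len(requests), max_batch_size):
        batch = list(requests[start : start + max_batch_size])
        outputs = batch_fn(batch)  # one call serves several requests
        if len(outputs) != len(batch):
            raise ValueError("batch_fn must return one output per input")
        results.extend(outputs)  # results stay aligned with their requests
    return results


def double_all(batch: List[float]) -> List[float]:
    # Hypothetical stand-in for a model's batched forward pass.
    return [2 * x for x in batch]
```

With five requests and max_batch_size=2, batch_fn is called three times (batch sizes 2, 2, and 1), and the caller still receives five results in the original order.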

Installation

Mosec requires Python 3.7 or above. Install the latest PyPI package with:

> pip install -U mosec

Usage

Write the server

Import the libraries and set up a basic logger to better observe what happens.

import logging

from mosec import Server, Worker
from mosec.errors import ValidationError

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter(
    "%(asctime)s - %(process)d - %(levelname)s - %(filename)s:%(lineno)s - %(message)s"
)
sh = logging.StreamHandler()
sh.setFormatter(formatter)
logger.addHandler(sh)

Then, we build an API to calculate the exponential with base e for a given number. To achieve that, we simply inherit from the Worker class and override the forward method. Note that the input req is by default a JSON-decoded object, here a dictionary (ideally it receives data like {"x": 1}). We also enclose the input parsing in a try...except block to reject invalid input (e.g., no key named "x", or a field "x" that cannot be converted to float).

import math


class CalculateExp(Worker):
    def forward(self, req: dict) -> dict:
        try:
            x = float(req["x"])
        except KeyError:
            raise ValidationError("cannot find key 'x'")
        except (TypeError, ValueError):
            # TypeError covers JSON values such as null or a list
            raise ValidationError("cannot convert 'x' value to float")
        y = math.exp(x)  # f(x) = e ^ x
        logger.debug(f"e ^ {x} = {y}")
        return {"y": y}
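
Because the worker body is plain Python, the same parsing logic can be exercised offline before any server is started. The sketch below is an assumption-laden stand-in: it defines a local ValidationError class in place of mosec.errors.ValidationError and a free function calculate_exp (a hypothetical name) that follows the logic of forward, so it runs without Mosec installed.

```python
import math


class ValidationError(Exception):
    """Local stand-in for mosec.errors.ValidationError (assumption for offline use)."""


def calculate_exp(req: dict) -> dict:
    """Same parsing and computation as the worker's forward method."""
    try:
        x = float(req["x"])
    except KeyError:
        raise ValidationError("cannot find key 'x'")
    except (TypeError, ValueError):
        # Rejects values float() cannot convert, e.g. "abc" or None.
        raise ValidationError("cannot convert 'x' value to float")
    return {"y": math.exp(x)}
```

Calling calculate_exp({"x": 0}) returns {"y": 1.0}, while calculate_exp({"input": 2}) raises the stand-in ValidationError, which is the behavior the server is expected to turn into a rejected request.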

Finally, we append the worker to the server to construct a single-stage workflow, and we specify the number of processes we want it to run in parallel. Then we run the server.

if __name__ == "__main__":
    server = Server()
    # we spawn two processes for parallel computing
    server.append_worker(CalculateExp, num=2)
    server.run()

Run the server

After merging the snippets above into a file named server.py, we can first have a look at the command line arguments:

> python server.py --help

Then let's start the server...

> python server.py

and in another terminal, test it:

> curl -X POST http://127.0.0.1:8000/inference -d '{"x": 2}'
{
  "y": 7.38905609893065
}

> curl -X POST http://127.0.0.1:8000/inference -d '{"input": 2}' # wrong schema
validation error: cannot find key 'x'

or check the metrics:

> curl http://127.0.0.1:8000/metrics
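
The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names (build_request, call_inference, parse_metrics, fetch_metrics) are hypothetical, the address is the one used in the curl commands, and the metrics parser assumes the endpoint returns the usual Prometheus text format. It makes no assumption about which metric names Mosec exposes.

```python
import json
import urllib.error
import urllib.request
from typing import Dict

# Base address taken from the curl commands above.
BASE_URL = "http://127.0.0.1:8000"


def build_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request to /inference with a JSON-encoded body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(f"{base_url}/inference", data=body, method="POST")


def call_inference(payload: dict, base_url: str = BASE_URL, timeout: float = 5.0) -> str:
    """Send the request and return the response body as text."""
    try:
        with urllib.request.urlopen(build_request(payload, base_url), timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # If the server answers with an error status, the body is still readable.
        return err.read().decode("utf-8")


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse Prometheus text-format samples into {sample name: value}."""
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "}" in line:
            # Keep the label set as part of the sample name.
            end = line.rindex("}") + 1
            name, rest = line[:end], line[end:].split()
        else:
            name, *rest = line.split()
        samples[name] = float(rest[0])  # rest[1], if present, is a timestamp
    return samples


def fetch_metrics(base_url: str = BASE_URL, timeout: float = 5.0) -> Dict[str, float]:
    """Download /metrics and parse it."""
    with urllib.request.urlopen(f"{base_url}/metrics", timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

With the server running, call_inference({"x": 2}) should return the same body as the first curl call, and fetch_metrics() returns a dictionary that is easier to inspect than the raw text.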

To get more debug logs, set the log level to debug in both the Python logger and the Rust layer:

logger.setLevel(logging.DEBUG)
> RUST_LOG=debug python server.py

That's it! You have just hosted your exponential-computing model as a server! 😉

Example

More ready-to-use examples can be found in the Example section. They include:

  • Multi-stage workflow
  • Batch processing worker
  • Shared memory IPC
  • PyTorch deep learning models:
    • sentiment analysis
    • image recognition
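
As a rough picture of what a multi-stage workflow means, the sketch below chains two stages in a single process: the output of one stage is the input of the next. This only illustrates the data flow, under the assumption that stages are simple callables; it does not use Mosec, and the names run_stages, parse, and compute are hypothetical. In Mosec the stages are pipelined across separate processes rather than called in a loop.

```python
import math
from typing import Any, Callable, Sequence

Stage = Callable[[Any], Any]


def run_stages(stages: Sequence[Stage], request: Any) -> Any:
    """Feed the request through each stage; a stage's output is the next stage's input."""
    data = request
    for stage in stages:
        data = stage(data)
    return data


def parse(req: dict) -> float:
    # Stage 1 (hypothetical): extract and convert the input.
    return float(req["x"])


def compute(x: float) -> dict:
    # Stage 2 (hypothetical): the actual computation.
    return {"y": math.exp(x)}
```

Splitting the work this way lets each stage be sized independently, for example more processes for a CPU-bound parsing stage than for a GPU-bound inference stage.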

Qualitative Comparison*

            Batcher  Pipeline  Parallel  I/O Format(1)  Framework(2)  Backend  Activity
TF Serving                               Limited(a)     Heavily TF    C++
Triton                                   Limited        Multiple      C++
MMS                                      Limited        Heavily MX    Java
BentoML                                  Limited(b)     Multiple      Python
Streamer                                 Customizable   Agnostic      Python
Flask(3)                                 Customizable   Agnostic      Python
Mosec                                    Customizable   Agnostic      Rust

*As accessed on 08 Oct 2021. By no means is this comparison showing that other frameworks are inferior, but rather it is used to illustrate the trade-off. The information is not guaranteed to be absolutely accurate. Please let us know if you find anything that may be incorrect.

(1): Data format of the service's request and response. "Limited" in the sense that the framework has pre-defined requirements on the format.

(2): Supported machine learning frameworks. "Heavily" means the serving framework is designed towards a specific ML framework. Thus it is hard, if not impossible, to adapt to others. "Multiple" means the serving framework provides adaptation to several existing ML frameworks. "Agnostic" means the serving framework does not necessarily care about the ML framework. Hence it supports all ML frameworks (in Python).

(3): Flask is a representative of general purpose web frameworks to host ML models.

Contributing

We welcome any kind of contribution. Please give us feedback by raising issues, or contribute code directly through a pull request!
