
MOSEC


Model Serving made Efficient in the Cloud.

Introduction

Mosec is a high-performance and flexible model serving framework for building ML model-enabled backends and microservices. It bridges the gap between any machine learning model you have just trained and an efficient online service API.

  • Highly performant: web layer and task coordination built with Rust 🦀, which offers blazing speed in addition to efficient CPU utilization powered by async I/O
  • Ease of use: user interface purely in Python 🐍, by which users can serve their models in an ML framework-agnostic manner using the same code as they do for offline testing
  • Dynamic batching: aggregate requests from different users for batched inference and distribute results back
  • Pipelined stages: spawn multiple processes for pipelined stages to handle CPU/GPU/IO mixed workloads
  • Cloud friendly: designed to run in the cloud, with model warmup, graceful shutdown, and Prometheus monitoring metrics; easily managed by Kubernetes or any other container orchestration system
  • Do one thing well: focus on the online serving part, so that users can concentrate on model performance and business logic
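The dynamic batching idea can be illustrated with a small, framework-free sketch. This is a toy model of the concept, not mosec's actual implementation; `collect_batch` and its parameters are names made up for this illustration:

```python
import queue
import time


def collect_batch(q: "queue.Queue", max_batch_size: int, max_wait_s: float) -> list:
    """Collect up to max_batch_size requests from q, waiting at most
    max_wait_s after the first request arrives (toy dynamic batching)."""
    batch = [q.get()]  # block until at least one request arrives
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


if __name__ == "__main__":
    q = queue.Queue()
    for i in range(5):  # five requests from different users
        q.put({"x": i})
    # the first four are aggregated into one batch for a single inference call
    print(collect_batch(q, max_batch_size=4, max_wait_s=0.01))
```

A real implementation would run this collection loop continuously and distribute each result in the batch back to the client that sent the corresponding request.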

Installation

Mosec requires Python 3.7 or above. Install the latest PyPI package with:

> pip install -U mosec

Usage

Write the server

Import the libraries and set up a basic logger to better observe what happens.

import logging

from mosec import Server, Worker
from mosec.errors import ValidationError

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter(
    "%(asctime)s - %(process)d - %(levelname)s - %(filename)s:%(lineno)s - %(message)s"
)
sh = logging.StreamHandler()
sh.setFormatter(formatter)
logger.addHandler(sh)

Then, we build an API to compute the exponential function (base e) for a given number. To achieve that, we simply inherit the Worker class and override the forward method. Note that the input req is by default a JSON-decoded object, e.g., a dictionary here (ideally it receives data like {"x": 1}). We also enclose the input-parsing part in a try...except... block to reject invalid input (e.g., no key named "x", or a field "x" that cannot be converted to float).

import math


class CalculateExp(Worker):
    def forward(self, req: dict) -> dict:
        try:
            x = float(req["x"])
        except KeyError:
            raise ValidationError("cannot find key 'x'")
        except ValueError:
            raise ValidationError("cannot convert 'x' value to float")
        y = math.exp(x)  # f(x) = e ^ x
        logger.debug(f"e ^ {x} = {y}")
        return {"y": y}
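Before wiring this into a server, the core logic can be sanity-checked offline. The standalone function below mirrors the body of forward without any mosec dependency (`calc_exp` is a name made up for this sketch, and it raises a plain ValueError instead of ValidationError):

```python
import math


def calc_exp(req: dict) -> dict:
    """Standalone mirror of CalculateExp.forward for offline testing."""
    try:
        x = float(req["x"])
    except KeyError:
        raise ValueError("cannot find key 'x'")
    except (TypeError, ValueError):
        raise ValueError("cannot convert 'x' value to float")
    return {"y": math.exp(x)}  # f(x) = e ^ x


print(calc_exp({"x": 2}))  # {'y': 7.38905609893065}
```

The same inputs and outputs apply once the logic is served by mosec; only the error type and the transport change.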

Finally, we append the worker to the server to construct a single-stage workflow, specifying the number of processes we want running in parallel. Then we run the server.

if __name__ == "__main__":
    server = Server()
    server.append_worker(
        CalculateExp, num=2
    )  # we spawn two processes for parallel computing
    server.run()
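Appending more workers would add more stages to the workflow. Conceptually, each appended worker becomes one stage that requests flow through in order; here is a toy single-process sketch of that data flow (real mosec runs each stage as its own pool of worker processes, so this is an illustration of the concept, not the framework's mechanics):

```python
import math


def run_pipeline(stages: list, req: dict) -> dict:
    """Pass one request through each stage in order (toy model of a
    multi-stage workflow; each stage would normally be a process pool)."""
    for stage in stages:
        req = stage(req)
    return req


def preprocess(req: dict) -> dict:
    # stage 1: parse and validate the raw request (CPU/IO-bound work)
    return {"x": float(req["x"])}


def inference(req: dict) -> dict:
    # stage 2: the actual model computation (e.g., GPU-bound work)
    return {"y": math.exp(req["x"])}


print(run_pipeline([preprocess, inference], {"x": "2"}))  # {'y': 7.38905609893065}
```

Splitting CPU-bound preprocessing from GPU-bound inference like this lets each stage scale its process count independently, which is the point of the pipelined-stages feature.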

Run the server

After merging the snippets above into a file named server.py, we can first have a look at the command line arguments:

> python server.py --help

Then let's start the server...

> python server.py

and in another terminal, test it:

> curl -X POST http://127.0.0.1:8000/inference -d '{"x": 2}'
{
  "y": 7.38905609893065
}

> curl -X POST http://127.0.0.1:8000/inference -d '{"input": 2}' # wrong schema
validation error: cannot find key 'x'

or check the metrics:

> curl http://127.0.0.1:8000/metrics
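Besides curl, the endpoint can be called from any HTTP client. Below is a minimal Python sketch using only the standard library; the URL and port match the defaults shown above, and `build_request` is a helper name made up for this example:

```python
import json
import urllib.request


def build_request(x: float, url: str = "http://127.0.0.1:8000/inference"):
    """Build a POST request mirroring the curl call above."""
    payload = json.dumps({"x": x}).encode()
    return urllib.request.Request(url, data=payload, method="POST")


req = build_request(2)
# With the server running, sending it returns the same JSON as curl:
#     urllib.request.urlopen(req).read()
print(req.get_method(), req.full_url)  # POST http://127.0.0.1:8000/inference
```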

For more debug logs, raise both the Python and the Rust log levels:

logger.setLevel(logging.DEBUG)  # in server.py

> RUST_LOG=debug python server.py

That's it! You have just hosted your exponential-computing model as a server! 😉

Example

More ready-to-use examples can be found in the Example section. It includes:

  • Multi-stage workflow
  • Batch processing worker
  • Shared memory IPC
  • PyTorch deep learning models:
    • sentiment analysis
    • image recognition
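Shared memory IPC, one of the examples listed above, avoids copying large payloads (e.g., image tensors) between worker processes. Here is a minimal standalone sketch of the idea using Python's stdlib shared memory; it illustrates the technique only and is unrelated to mosec's internal implementation:

```python
from multiprocessing import shared_memory

payload = b"large tensor bytes ..."

# producer: write the payload once into a named shared-memory segment
shm = shared_memory.SharedMemory(create=True, size=len(payload))
try:
    shm.buf[: len(payload)] = payload

    # consumer: another process would attach by name instead of
    # receiving a copy over a pipe or socket
    view = shared_memory.SharedMemory(name=shm.name)
    received = bytes(view.buf[: len(payload)])  # read in place
    view.close()
    print(received == payload)  # True
finally:
    shm.close()
    shm.unlink()  # free the segment when the last user is done
```

Only the segment name needs to cross the process boundary, so the cost of passing a large payload between pipeline stages stays constant regardless of its size.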

Qualitative Comparison*

              | I/O Format(1) | Framework(2) | Backend
TF Serving    | Limited(a)    | Heavily TF   | C++
Triton        | Limited       | Multiple     | C++
MMS           | Limited       | Heavily MX   | Java
BentoML       | Limited(b)    | Multiple     | Python
Streamer      | Customizable  | Agnostic     | Python
Flask(3)      | Customizable  | Agnostic     | Python
Mosec         | Customizable  | Agnostic     | Rust

*As accessed on 08 Oct 2021. By no means does this comparison suggest that the other frameworks are inferior; rather, it is meant to illustrate the trade-offs. The information is not guaranteed to be perfectly accurate. Please let us know if you find anything that may be incorrect.

(1): Data format of the service's request and response. "Limited" means the framework has pre-defined requirements on the format.
(2): Supported machine learning frameworks. "Heavily" means the serving framework is designed for one specific ML framework and is hard, if not impossible, to adapt to others. "Multiple" means the serving framework provides adaptations for several existing ML frameworks. "Agnostic" means the serving framework does not depend on any particular ML framework, and hence supports all of them (in Python).
(3): Flask is a representative of the general-purpose web frameworks used to host ML models.

Contributing

We welcome all kinds of contributions. Please give us feedback by raising issues, or contribute directly by opening pull requests!

Project details


Release history

This version

0.3.5

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • mosec-0.3.5.tar.gz (39.4 kB), Source

Built Distributions

  • mosec-0.3.5-cp310-cp310-manylinux1_x86_64.whl (2.7 MB), CPython 3.10
  • mosec-0.3.5-cp310-cp310-macosx_10_9_x86_64.whl (1.7 MB), CPython 3.10, macOS 10.9+ x86-64
  • mosec-0.3.5-cp39-cp39-manylinux1_x86_64.whl (2.7 MB), CPython 3.9
  • mosec-0.3.5-cp39-cp39-macosx_10_9_x86_64.whl (1.7 MB), CPython 3.9, macOS 10.9+ x86-64
  • mosec-0.3.5-cp38-cp38-manylinux1_x86_64.whl (2.7 MB), CPython 3.8
  • mosec-0.3.5-cp38-cp38-macosx_10_9_x86_64.whl (1.7 MB), CPython 3.8, macOS 10.9+ x86-64
  • mosec-0.3.5-cp37-cp37m-manylinux1_x86_64.whl (2.7 MB), CPython 3.7m
  • mosec-0.3.5-cp37-cp37m-macosx_10_9_x86_64.whl (1.7 MB), CPython 3.7m, macOS 10.9+ x86-64

File details

mosec-0.3.5.tar.gz
  • Size: 39.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.9.12
  • SHA256: e8577095879d322667a81b4defbf6f7d4b81c2e0fa2d218852f6943ba624d41a
  • MD5: 3db57110569ecb2eabe44c5f79d73298
  • BLAKE2b-256: 2be6aa80fb67709faa2a373bb9c0bbcec0232b072edec409ea35e9749cf826f7

mosec-0.3.5-cp310-cp310-manylinux1_x86_64.whl
  • SHA256: 0d25f5ced15818b5135d4ead7a756d976232eb5f5f8756491ac108c6254e917a
  • MD5: 881e5c2f972c57091494fc20193281bf
  • BLAKE2b-256: 59859089c92e2f9ddcb463ea230d4a8b14cbf8ecf3a7ec0e051728624580ede9

mosec-0.3.5-cp310-cp310-macosx_10_9_x86_64.whl
  • SHA256: 185f1a9bc91121c12732562f1e7f5d0cc69c09d2ce9f7355d8b8382a56f16bc3
  • MD5: 72ac43bf4fd3f83dbf20baee5760ab0a
  • BLAKE2b-256: 99b7c9e3b8798b4cba175ffd6aefdf1306da91011554b3574efd396e867fa706

mosec-0.3.5-cp39-cp39-manylinux1_x86_64.whl
  • SHA256: 3d0888600c8483d13198549880c4f7e1c883cc03ae919816135cca1a28a65d1e
  • MD5: 6bb9462238fa3dfa7eb915d16a0016c9
  • BLAKE2b-256: 9301f42ef768e80a5c1b1feda1e19f0263e3f60303eaace0e547c16508eb1e21

mosec-0.3.5-cp39-cp39-macosx_10_9_x86_64.whl
  • SHA256: f0df4640a97511fa276252ace2e800d67844222e5f89c20b6e4840f7a908fa1a
  • MD5: 244d6529656103a91214510879be21e6
  • BLAKE2b-256: eefe9c228e23dd93b9e2902edbdb9b22f17ca20039340289856021c27ff78988

mosec-0.3.5-cp38-cp38-manylinux1_x86_64.whl
  • SHA256: cdaa75146a1764cd637d85dfe838542058dfe0dd12ab85746bcf4f434093da69
  • MD5: 7024f58ca3d1959287cd396334485f94
  • BLAKE2b-256: 681470729ffa35d4d836f46a44aa1867c674f928091adf6f0152b18cc2684d9a

mosec-0.3.5-cp38-cp38-macosx_10_9_x86_64.whl
  • SHA256: 4162d1015f8466b0c7fe50465c47dcf25a5f4e16b0097c6fa7bb38a734dc92d0
  • MD5: 1f7d26b719d17493d55e81ee0e878821
  • BLAKE2b-256: 660407c67f867be97aa4b4fdfa18d023612138ddc9920570f8247549a6ecf4c2

mosec-0.3.5-cp37-cp37m-manylinux1_x86_64.whl
  • SHA256: a69fa6e5d2af4cbf8a86f442d7afecaa4c1f3d3a8a9a26ba407f058060871741
  • MD5: fc88a3b6189ec4d0658e83ac20b6595c
  • BLAKE2b-256: c7ca527b6bb9aec4b3c1b09fbdff62aaedbfc6ee914b02ed85fdc1ef0ea7da2c

mosec-0.3.5-cp37-cp37m-macosx_10_9_x86_64.whl
  • SHA256: 2400ddd1d9d920aca90e5fb50d6738fe131fa9937f83fdddb060852e2517b76f
  • MD5: 3e9db197387682b05843afe28cd68fff
  • BLAKE2b-256: 8a6a945ef6b550c65e58f80f93029bdd5f965f1d501519fd3174130377795e3f
