Model Serving made Efficient in the Cloud.

MOSEC

Introduction

Mosec is a high-performance, flexible model serving framework for building ML-enabled backends and microservices. It bridges the gap between the machine learning models you have just trained and an efficient online service API.

  • Highly performant: the web layer and task coordination are built with Rust 🦀, which offers blazing speed and efficient CPU utilization powered by async I/O
  • Ease of use: the user interface is purely in Python 🐍, so users can serve their models in an ML framework-agnostic manner with the same code they use for offline testing
  • Dynamic batching: aggregate requests from different users for batched inference and distribute the results back
  • Pipelined stages: spawn multiple processes for pipelined stages to handle CPU/GPU/IO mixed workloads
  • Cloud friendly: designed to run in the cloud, with model warmup, graceful shutdown, and Prometheus monitoring metrics; easily managed by Kubernetes or any other container orchestration system
  • Do one thing well: Mosec focuses on the online serving part, so users can concentrate on model performance and business logic
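
To make the dynamic batching idea concrete, here is a minimal, framework-independent sketch. It is not Mosec's implementation: the names collect_batch, batched_square, max_batch_size, and max_wait_s are hypothetical and chosen only for illustration. Requests are gathered until the batch is full or a short waiting window expires, processed in one call, and the outputs keep the input order so each result can be returned to its caller.

```python
import queue
import time


def collect_batch(requests: queue.Queue, max_batch_size: int, max_wait_s: float) -> list:
    """Gather up to max_batch_size items, waiting at most max_wait_s after the first one."""
    batch = [requests.get()]  # block until at least one request arrives
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # waiting window is over; go with a partial batch
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            break  # no more requests arrived in time
    return batch


def batched_square(xs: list) -> list:
    """Stand-in for a batched model call: one output per input, same order."""
    return [x * x for x in xs]


if __name__ == "__main__":
    pending = queue.Queue()
    for value in (1, 2, 3, 4, 5):
        pending.put(value)
    first = collect_batch(pending, max_batch_size=4, max_wait_s=0.01)
    print(first, batched_square(first))  # [1, 2, 3, 4] [1, 4, 9, 16]
```

The waiting window trades a little latency for larger batches; a window of zero degenerates to handling one request at a time.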

Installation

Mosec requires Python 3.7 or above. Install the latest PyPI package with:

> pip install -U mosec

Usage

Write the server

Import the libraries and set up a basic logger to better observe what happens.

import logging

from mosec import Server, Worker
from mosec.errors import ValidationError

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter(
    "%(asctime)s - %(process)d - %(levelname)s - %(filename)s:%(lineno)s - %(message)s"
)
sh = logging.StreamHandler()
sh.setFormatter(formatter)
logger.addHandler(sh)

Then, we build an API that calculates the exponential with base e of a given number. To achieve that, we simply inherit from the Worker class and override the forward method. Note that the input req is by default a JSON-decoded object, here a dictionary (ideally it receives data like {"x": 1}). We also enclose the input parsing in a try...except block to reject invalid input (e.g., no key named "x", or a field "x" that cannot be converted to float).

import math


class CalculateExp(Worker):
    def forward(self, req: dict) -> dict:
        try:
            x = float(req["x"])
        except KeyError:
            raise ValidationError("cannot find key 'x'")
        except (TypeError, ValueError):  # e.g. {"x": null} or {"x": "two"}
            raise ValidationError("cannot convert 'x' value to float")
        y = math.exp(x)  # f(x) = e ^ x
        logger.debug(f"e ^ {x} = {y}")
        return {"y": y}
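
Because forward is plain Python, its parsing rules can be checked offline before the worker is attached to a server. The sketch below mirrors the KeyError and ValueError checks in a free function. The names parse_x and expect_rejected are hypothetical, and the local ValidationError class is only a stand-in for mosec.errors.ValidationError so that the snippet runs without Mosec installed.

```python
import math


class ValidationError(Exception):
    """Local stand-in for mosec.errors.ValidationError (assumption for this sketch)."""


def parse_x(req: dict) -> float:
    """Extract req["x"] as a float, mirroring the checks in CalculateExp.forward."""
    try:
        return float(req["x"])
    except KeyError:
        raise ValidationError("cannot find key 'x'")
    except ValueError:
        raise ValidationError("cannot convert 'x' value to float")


def expect_rejected(req: dict) -> str:
    """Return the validation message for a bad request, or "accepted"."""
    try:
        parse_x(req)
    except ValidationError as err:
        return str(err)
    return "accepted"


if __name__ == "__main__":
    print(math.exp(parse_x({"x": 2})))     # same computation as forward
    print(expect_rejected({"input": 2}))   # cannot find key 'x'
    print(expect_rejected({"x": "two"}))   # cannot convert 'x' value to float
```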

Finally, we append the worker to the server to construct a single-stage workflow, and we specify the number of processes we want it to run in parallel. Then we run the server.

if __name__ == "__main__":
    server = Server()
    server.append_worker(
        CalculateExp, num=2
    )  # we spawn two processes for parallel computing
    server.run()

Run the server

After merging the snippets above into a file named server.py, we can first have a look at the command line arguments:

> python server.py --help

Then let's start the server...

> python server.py

Alternatively, start it with Rust debug logging enabled:

> RUST_LOG=debug python server.py

and in another terminal, test it:

> curl -X POST http://127.0.0.1:8000/inference -d '{"x": 2}'
{
  "y": 7.38905609893065
}

> curl -X POST http://127.0.0.1:8000/inference -d '{"input": 2}' # wrong schema
validation error: cannot find key 'x'
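
The same requests can be sent from Python. This is a minimal sketch using only the standard library. The helper names build_request and infer are hypothetical, the URL is the one used in the curl calls, and the server from server.py is assumed to be running locally. HTTP error responses are returned rather than raised, so the validation message can be inspected.

```python
import json
import urllib.error
import urllib.request

INFERENCE_URL = "http://127.0.0.1:8000/inference"  # endpoint used by the curl calls


def build_request(payload: dict, url: str = INFERENCE_URL) -> urllib.request.Request:
    """Build a POST request whose body is the JSON-encoded payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def infer(payload: dict, url: str = INFERENCE_URL, timeout: float = 5.0) -> tuple:
    """Return (status_code, response_text); HTTP errors are returned, not raised."""
    try:
        with urllib.request.urlopen(build_request(payload, url), timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")


if __name__ == "__main__":
    try:
        print(infer({"x": 2}))       # valid schema
        print(infer({"input": 2}))   # wrong schema, expected to be rejected
    except OSError as err:           # covers refused connections and timeouts
        print(f"server not reachable: {err}")
```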

or check the metrics:

> curl http://127.0.0.1:8000/metrics
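
The metrics endpoint is meant for Prometheus, which scrapes a line-oriented text format. The following rough sketch reads such text into a dictionary. The function name parse_metrics is hypothetical, and the metric names in SAMPLE are made up for illustration; they are not Mosec's actual metric names. Optional timestamps after the value are ignored.

```python
def parse_metrics(text: str) -> dict:
    """Map each series (name plus optional labels) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            end = line.rindex("}") + 1  # labels may contain spaces
            series, rest = line[:end], line[end:]
        else:
            series, _, rest = line.partition(" ")
        samples[series] = float(rest.split()[0])
    return samples


# Hypothetical sample text, not real Mosec output.
SAMPLE = """\
# HELP demo_requests_total Made-up counter for illustration.
# TYPE demo_requests_total counter
demo_requests_total{stage="1"} 12
demo_requests_total{stage="2"} 3
demo_queue_size 0
"""

if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```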

That's it! You have just hosted your exponential-computing model as a server! 😉

Example

More ready-to-use examples can be found in the Example section. They include:

  • Multi-stage workflow
  • Batch processing worker
  • Shared memory IPC
  • PyTorch deep learning models:
    • sentiment analysis
    • image recognition

Qualitative Comparison*

            Batcher  Pipeline  Parallel  I/O Format(1)  Framework(2)  Backend  Activity
TF Serving                               Limited(a)     Heavily TF    C++
Triton                                   Limited        Multiple      C++
MMS                                      Limited        Heavily MX    Java
BentoML                                  Limited(b)     Multiple      Python
Streamer                                 Customizable   Agnostic      Python
Flask(3)                                 Customizable   Agnostic      Python
Mosec                                    Customizable   Agnostic      Rust

*As accessed on 08 Oct 2021. By no means is this comparison showing that other frameworks are inferior, but rather it is used to illustrate the trade-off. The information is not guaranteed to be absolutely accurate. Please let us know if you find anything that may be incorrect.

(1): Data format of the service's request and response. "Limited" in the sense that the framework has pre-defined requirements on the format.

(2): Supported machine learning frameworks. "Heavily" means the serving framework is designed towards a specific ML framework. Thus it is hard, if not impossible, to adapt to others. "Multiple" means the serving framework provides adaptation to several existing ML frameworks. "Agnostic" means the serving framework does not necessarily care about the ML framework. Hence it supports all ML frameworks (in Python).

(3): Flask is a representative of general purpose web frameworks to host ML models.

Contributing

We welcome any kind of contribution. Please give us feedback by raising issues, or contribute code directly by opening a pull request!

