A middleware for model serving that speeds up online inference.

Project description

<h1 align="center">Serving Agent</h1>

<p align="center">A middleware for model serving that speeds up online inference. <a href="./README_zh.md">中文</a></p>

<h2 align="center">What is Serving Agent</h2>

Serving Agent is a middleware that sits between the web server and the model server. It helps the service improve GPU utilization and thereby speed up online inference. In a service backed by a machine learning model, client requests usually arrive as a stream, one at a time. To exploit the parallel computing capability of GPUs, a common approach is to introduce a message queue (message broker) that caches the requests from the web server so that the model server can process them in batches (the figure below shows the architecture). Serving Agent encapsulates the details of this pattern, such as serializing the request data, communicating with the message queue (Redis), and deserializing the data on the other side. With Serving Agent, it is easy to build a scalable service with only a few lines of code.

![model serving architecture](img/architecture.png)
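
The queue-and-batch pattern described above can be sketched with the Python standard library alone. This is a minimal illustration and not Serving Agent's actual API: `queue.Queue` stands in for Redis, JSON stands in for whatever serialization the library uses, and `submit`, `fetch`, `serve_one_batch`, and `toy_model` are hypothetical names chosen for this example.

```python
import json
import queue
import threading
import uuid

# Assumption: an in-process queue stands in for the Redis message queue.
request_queue = queue.Queue()
results = {}        # request id -> serialized result
results_ready = {}  # request id -> threading.Event set when the result exists


def submit(data):
    """Web-server side: serialize one request and enqueue it."""
    request_id = str(uuid.uuid4())
    results_ready[request_id] = threading.Event()
    request_queue.put(json.dumps({"id": request_id, "data": data}))
    return request_id


def fetch(request_id, timeout=5.0):
    """Web-server side: wait for the result, then deserialize it."""
    if not results_ready[request_id].wait(timeout):
        raise TimeoutError(request_id)
    del results_ready[request_id]
    return json.loads(results.pop(request_id))


def toy_model(batch):
    """Hypothetical model: handles a whole batch in a single call."""
    return [x * 2 for x in batch]


def serve_one_batch(max_batch_size=4, wait=0.05):
    """Model-server side: drain up to max_batch_size requests and run them together."""
    messages = []
    try:
        # Block briefly for the first request, then take whatever else is queued.
        messages.append(request_queue.get(timeout=wait))
        while len(messages) < max_batch_size:
            messages.append(request_queue.get_nowait())
    except queue.Empty:
        pass
    if not messages:
        return 0
    requests = [json.loads(m) for m in messages]
    outputs = toy_model([r["data"] for r in requests])
    for r, out in zip(requests, outputs):
        results[r["id"]] = json.dumps(out)
        results_ready[r["id"]].set()
    return len(requests)


if __name__ == "__main__":
    ids = [submit(x) for x in (1, 2, 3)]
    serve_one_batch()                # one model call covers all three requests
    print([fetch(i) for i in ids])   # [2, 4, 6]
```

In a real deployment the web server and model server would run as separate processes, with the queue and results held in Redis; the point here is only that individual requests are collected into one batch before the model runs.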

Download files

Source Distribution

serving_agent-0.1.0.tar.gz (2.6 kB)

File details

Details for the file serving_agent-0.1.0.tar.gz.

File metadata

  • Download URL: serving_agent-0.1.0.tar.gz
  • Upload date:
  • Size: 2.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: Python-urllib/3.7

File hashes

Hashes for serving_agent-0.1.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 16ef505e00df08f76a363823ddfc74fc3e931288b46ab25a08a2205842b26cc5 |
| MD5 | 09d63c37e510d1973d64042f6e990680 |
| BLAKE2b-256 | 344ab4c78a55a0575583e8c51b78d91629a51d7098c55001b4d05febe4b94214 |
