
A real-time inference server


Inference service dataflow:

flowchart TD
    A[ModelArtifact] --> B(Model Instance)
    G[InferenceFeatures] --> B
    B --> C[VenueRatings]
    C --> D(Search List)
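
The diagram reads top to bottom: a serialized model artifact is loaded into a model instance, incoming inference features are scored against it, and the resulting venue ratings are used to order the search list. A rough Python sketch of that flow (the names below are illustrative, not the package's actual API):

import pickle

import pandas as pd

# ModelArtifact -> Model Instance: load the serialized model.
with open("artifacts/rate_venues.pickle", "rb") as f:
    model = pickle.load(f)

def rate_and_rank(features: pd.DataFrame) -> pd.DataFrame:
    """InferenceFeatures -> VenueRatings -> Search List."""
    scored = features.copy()
    # Model Instance -> VenueRatings: one predicted rating per venue row.
    scored["venue_rating"] = model.predict(features)
    # VenueRatings -> Search List: order venues by predicted rating.
    return scored.sort_values("venue_rating", ascending=False)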



Training pipeline source code: https://github.com/ra312/personalization
Model server source code: https://github.com/ra312/model-server


A service to rate venues

Installation

python3 -m pip install recommendation-model-server

Running locally on the host

If you choose to use the pre-trained model in artifacts/rate_venues.pickle, run:

python3 -m recommendation_model_server \
--host 0.0.0.0 \
--port 8000 \
--recommendation-model-path artifacts/rate_venues.pickle

In a separate tab, run:

curl -X 'POST' \
'http://0.0.0.0:8000/predict' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '[
  {
    "venue_id": -4202398962129790000,
    "conversions_per_impression": 0.3556765815,
    "price_range": 1,
    "rating": 8.6,
    "popularity": 4.4884057024,
    "retention_rate": 8.6,
    "session_id_hashed": 3352618370338455600,
    "position_in_list": 31,
    "is_from_order_again": 0,
    "is_recommended": 0
  }
]'
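
Equivalently, you can call the endpoint from Python (a minimal sketch using the requests library; it mirrors the curl payload above):

import requests

payload = [
    {
        "venue_id": -4202398962129790000,
        "conversions_per_impression": 0.3556765815,
        "price_range": 1,
        "rating": 8.6,
        "popularity": 4.4884057024,
        "retention_rate": 8.6,
        "session_id_hashed": 3352618370338455600,
        "position_in_list": 31,
        "is_from_order_again": 0,
        "is_recommended": 0,
    }
]

response = requests.post("http://0.0.0.0:8000/predict", json=payload)
response.raise_for_status()  # fail loudly on non-2xx responses
print(response.json())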

Running in a container

docker pull akylzhanov/my-fastapi-app
docker run -d --name my-fastapi-container -p 8000:8000 --rm akylzhanov/my-fastapi-app

Development

  • Clone this repository
  • Requirements: Python 3 and Poetry
  • Create a virtual environment and install the dependencies:
poetry install
  • Activate the virtual environment:
poetry shell

Testing

pytest tests
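
If you add your own tests, a minimal sketch using FastAPI's TestClient could look like this; the import path of the app object is an assumption and may differ in this package:

from fastapi.testclient import TestClient

# Hypothetical import path; point this at the package's FastAPI app object.
from recommendation_model_server.app import app

client = TestClient(app)

def test_predict_returns_ok() -> None:
    # Same example record as in the curl call above.
    payload = [{
        "venue_id": -4202398962129790000,
        "conversions_per_impression": 0.3556765815,
        "price_range": 1,
        "rating": 8.6,
        "popularity": 4.4884057024,
        "retention_rate": 8.6,
        "session_id_hashed": 3352618370338455600,
        "position_in_list": 31,
        "is_from_order_again": 0,
        "is_recommended": 0,
    }]
    response = client.post("/predict", json=payload)
    assert response.status_code == 200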

Pre-commit

Pre-commit hooks run all the auto-formatters (e.g. black, isort), linters (e.g. mypy, flake8), and other quality checks to make sure the changeset is in good shape before a commit/push happens.

You can install the hooks with (runs for each commit):

pre-commit install

Or if you want them to run only for each push:

pre-commit install -t pre-push

Or if you want to run all checks manually for all files:

pre-commit run --all-files

How to run load tests

  1. Start the service locally:
python3 -m recommendation_model_server \
--host 0.0.0.0 \
--port 8000 \
--recommendation-model-path artifacts/rate_venues.pickle
  2. Run a load test with Locust, ramping to 1 million users at a spawn rate of 100 users per second:
poetry shell && pytest tests/test_invokust_load.py -s
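
The test drives Locust programmatically through invokust (as the file name suggests). A plain Locust user exercising the same endpoint would look roughly like the sketch below, which is not the repository's actual test code:

from locust import HttpUser, between, task

class PredictUser(HttpUser):
    wait_time = between(1, 2)  # illustrative pacing between requests

    @task
    def predict(self) -> None:
        # Same example record as in the curl call above.
        payload = [{
            "venue_id": -4202398962129790000,
            "conversions_per_impression": 0.3556765815,
            "price_range": 1,
            "rating": 8.6,
            "popularity": 4.4884057024,
            "retention_rate": 8.6,
            "session_id_hashed": 3352618370338455600,
            "position_in_list": 31,
            "is_from_order_again": 0,
            "is_recommended": 0,
        }]
        self.client.post("/predict", json=payload)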

The output is similar to the following (times are in milliseconds):

Ramping to 1000000 users at a rate of 100.00 per second
Type     Name           # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s
---------|-------------|---------|----------|-------|-------|-------|------|---------|-----------
POST     /predict         1453     0(0.00%) |    448       5    1948    390 |  167.83        0.00
---------|-------------|---------|----------|-------|-------|-------|------|---------|-----------
         Aggregated       1453     0(0.00%) |    448       5    1948    390 |  167.83        0.00

This project was generated using the wolt-python-package-cookiecutter template.

