A real-time inference server
inference service
flowchart TD
A[ModelArtifact] -->B(Model Instance)
G[InferenceFeatures] --> B
B --> C[VenueRatings]
C -->D(Search List)
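The flow in the diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `StandInModel`, its `predict` method, and the toy scoring rule are hypothetical and stand in for whatever object is actually stored in the model artifact; only the feature names come from the sample request in this README.

```python
from typing import Dict, List, Sequence

Features = Dict[str, float]


class StandInModel:
    """Hypothetical stand-in for the model instance loaded from the artifact."""

    def predict(self, rows: Sequence[Features]) -> List[float]:
        # Toy score (an assumption): venue rating weighted by conversion rate.
        return [r["rating"] * r["conversions_per_impression"] for r in rows]


def rate_venues(model, rows: Sequence[Features]) -> Dict[int, float]:
    """InferenceFeatures -> VenueRatings: one score per venue_id."""
    scores = model.predict(rows)
    return {int(r["venue_id"]): float(s) for r, s in zip(rows, scores)}


def search_list(ratings: Dict[int, float]) -> List[int]:
    """VenueRatings -> Search List: venue ids ordered by score, best first."""
    return sorted(ratings, key=ratings.get, reverse=True)


rows = [
    {"venue_id": 1, "rating": 8.6, "conversions_per_impression": 0.35},
    {"venue_id": 2, "rating": 9.1, "conversions_per_impression": 0.10},
    {"venue_id": 3, "rating": 7.0, "conversions_per_impression": 0.50},
]
ratings = rate_venues(StandInModel(), rows)
ranking = search_list(ratings)
print(ranking)
```

With the toy scores above, venue 3 ranks first because its conversion rate outweighs its lower rating.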
Training pipeline source code: https://github.com/ra312/personalization
Model server source code: https://github.com/ra312/model-server
A service to rate venues
Installation
python3 -m pip install recommendation-model-server
Running locally on host
To use the pre-trained model in artifacts/rate_venues.pickle, run:
python3 -m recommendation_model_server \
--host 0.0.0.0 \
--port 8000 \
--recommendation-model-path artifacts/rate_venues.pickle
In a separate terminal tab, send a sample request:
curl -X 'POST' \
'http://0.0.0.0:8000/predict' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '[
{
"venue_id": -4202398962129790000,
"conversions_per_impression": 0.3556765815,
"price_range": 1,
"rating": 8.6,
"popularity": 4.4884057024,
"retention_rate": 8.6,
"session_id_hashed": 3352618370338455600,
"position_in_list": 31,
"is_from_order_again": 0,
"is_recommended": 0
}
]'
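The same request can be sent from Python using only the standard library. This is a minimal sketch: the helper names `build_predict_request` and `predict` are illustrative and not part of the package, and because the response schema is not documented here, the response body is returned as raw text.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_predict_request(
    rows: List[Dict[str, Any]], base_url: str = "http://0.0.0.0:8000"
) -> urllib.request.Request:
    """Build a POST /predict request whose body is a JSON list of feature rows."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(rows).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def predict(rows: List[Dict[str, Any]], base_url: str = "http://0.0.0.0:8000") -> str:
    """Send the request and return the raw response body (server must be running)."""
    with urllib.request.urlopen(build_predict_request(rows, base_url), timeout=10) as resp:
        return resp.read().decode("utf-8")


# Same feature row as in the curl example.
sample_rows = [
    {
        "venue_id": -4202398962129790000,
        "conversions_per_impression": 0.3556765815,
        "price_range": 1,
        "rating": 8.6,
        "popularity": 4.4884057024,
        "retention_rate": 8.6,
        "session_id_hashed": 3352618370338455600,
        "position_in_list": 31,
        "is_from_order_again": 0,
        "is_recommended": 0,
    }
]
request = build_predict_request(sample_rows)
```

With the server running, `print(predict(sample_rows))` sends the request and prints whatever the service returns.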
Running in container
docker pull akylzhanov/my-fastapi-app
docker run -d --name my-fastapi-container -p 8000:8000 --rm akylzhanov/my-fastapi-app
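Because the container starts detached, the service may not accept connections immediately. A small readiness check like the one below can help before sending requests; it is a generic TCP probe, not something the package provides, and the function name `wait_for_port` is hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # not up yet; retry shortly
    return False
```

For example, `wait_for_port("127.0.0.1", 8000)` returns `True` once the published port is reachable. It only shows that the port is open, not that the model has finished loading.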
Development
- Clone this repository
- Requirements:
- Poetry
- Python 3.8.1+
- Create a virtual environment and install the dependencies
poetry install
- Activate the virtual environment
poetry shell
Testing
pytest tests
Pre-commit
Pre-commit hooks run all the auto-formatters (e.g. black, isort), linters (e.g. mypy, flake8), and other quality checks to make sure the changeset is in good shape before a commit/push happens.
You can install the hooks with (runs for each commit):
pre-commit install
Or if you want them to run only for each push:
pre-commit install -t pre-push
Or if you want to run all checks manually for all files:
pre-commit run --all-files
How to run load tests
- Start the service locally:
python3 -m recommendation_model_server \
--host 0.0.0.0 \
--port 8000 \
--recommendation-model-path artifacts/rate_venues.pickle
- Run the load test with Locust, targeting 1 million users at a spawn rate of 100 users per second:
poetry run pytest tests/test_invokust_load.py -s
The output is similar to the following (times are in milliseconds):
Ramping to 1000000 users at a rate of 100.00 per second
Type  Name        # reqs  # fails   |  Avg  Min   Max  Med | req/s   failures/s
--------||-------|-------------|-------|-------|-------|-------|--------|-----------
POST  /predict    1453    0(0.00%)  |  448    5  1948  390 | 167.83  0.00
--------||-------|-------------|-------|-------|-------|-------|--------|-----------
      Aggregated  1453    0(0.00%)  |  448    5  1948  390 | 167.83  0.00
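A quick back-of-the-envelope check helps put these numbers in context. The sketch below uses only the figures printed above. It assumes the req/s column is an average over the whole run, which may not match exactly how Locust computes it, so the elapsed-time estimate is approximate.

```python
TARGET_USERS = 1_000_000      # from "Ramping to 1000000 users"
SPAWN_RATE = 100.0            # users per second
TOTAL_REQUESTS = 1453         # "# reqs" in the Aggregated row
REQUESTS_PER_SECOND = 167.83  # "req/s" in the Aggregated row

# Time needed to reach the full user count at the given spawn rate.
full_ramp_seconds = TARGET_USERS / SPAWN_RATE
full_ramp_hours = full_ramp_seconds / 3600

# Rough run duration implied by the sample output (assumes req/s is a run-wide average).
approx_elapsed_seconds = TOTAL_REQUESTS / REQUESTS_PER_SECOND
approx_users_spawned = SPAWN_RATE * approx_elapsed_seconds

print(f"Full ramp-up: {full_ramp_seconds:.0f} s (~{full_ramp_hours:.1f} h)")
print(f"Sample output covers ~{approx_elapsed_seconds:.1f} s, ~{approx_users_spawned:.0f} users spawned")
```

Under that assumption, the sample output reflects only the first few seconds of the ramp, well short of the 1 million user target.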
This project was generated using the wolt-python-package-cookiecutter template.