A real-time inference server
inference service
flowchart TD
A[ModelArtifact] -->B(Model Instance)
G[InferenceFeatures] --> B
B --> C[VenueRatings]
C -->D(Search List)
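The flow in the diagram can be sketched in Python as follows. This is a minimal illustration, not the package's actual code: `StubModel`, `load_model`, and `rank_venues` are hypothetical names, the artifact is assumed to be a pickle, and the stub scores each venue by its `rating` alone so the example runs without the real artifact.

```python
import pickle
from typing import Any, Dict, List


class StubModel:
    """Hypothetical stand-in for the trained model; scores a venue by its rating."""

    def predict(self, features: List[Dict[str, Any]]) -> List[float]:
        return [float(row["rating"]) for row in features]


def load_model(path: str) -> Any:
    """ModelArtifact -> Model Instance (assumes a pickle file from a trusted source)."""
    with open(path, "rb") as artifact:
        return pickle.load(artifact)


def rank_venues(model: Any, features: List[Dict[str, Any]]) -> List[int]:
    """InferenceFeatures -> VenueRatings -> Search List (best venue first)."""
    ratings = model.predict(features)
    ranked = sorted(zip(features, ratings), key=lambda pair: pair[1], reverse=True)
    return [row["venue_id"] for row, _ in ranked]


# The stub is used directly here; load_model would be called with the artifact path instead.
features = [
    {"venue_id": 1, "rating": 7.2},
    {"venue_id": 2, "rating": 8.6},
]
search_list = rank_venues(StubModel(), features)
```

Unpickling executes code embedded in the file, so an artifact should only be loaded when its origin is trusted.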
Training pipeline source code: https://github.com/ra312/personalization
Source code: https://github.com/ra312/model-server
A service to rate venues
Installation
python3 -m pip install recommendation-model-server
Running locally on host
To use the pre-trained model in artifacts/rate_venues.pickle, run:
python3 -m recommendation_model_server \
--host 0.0.0.0 \
--port 8000 \
--recommendation-model-path artifacts/rate_venues.pickle
In a separate terminal tab, run:
curl -X 'POST' \
'http://0.0.0.0:8000/predict' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '[
{
"venue_id": -4202398962129790000,
"conversions_per_impression": 0.3556765815,
"price_range": 1,
"rating": 8.6,
"popularity": 4.4884057024,
"retention_rate": 8.6,
"session_id_hashed": 3352618370338455600,
"position_in_list": 31,
"is_from_order_again": 0,
"is_recommended": 0
}
]'
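The same request can be sent from Python. The sketch below uses only the standard library; `build_predict_request` and `rate_venues` are hypothetical helper names, and because the response schema is not documented here, the parsed JSON is returned unchanged.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_predict_request(
    features: List[Dict[str, Any]],
    base_url: str = "http://0.0.0.0:8000",
) -> urllib.request.Request:
    """Build the same POST /predict request as the curl command."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(features).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def rate_venues(
    features: List[Dict[str, Any]],
    base_url: str = "http://0.0.0.0:8000",
    timeout: float = 5.0,
) -> Any:
    """Send the request and return the parsed JSON body (server must be running)."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = [
    {
        "venue_id": -4202398962129790000,
        "conversions_per_impression": 0.3556765815,
        "price_range": 1,
        "rating": 8.6,
        "popularity": 4.4884057024,
        "retention_rate": 8.6,
        "session_id_hashed": 3352618370338455600,
        "position_in_list": 31,
        "is_from_order_again": 0,
        "is_recommended": 0,
    }
]
request = build_predict_request(payload)
```

With the server started as shown earlier, `rate_venues(payload)` sends the request; building the request separately makes it possible to inspect the body without a running server.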
Running in container
docker pull akylzhanov/my-fastapi-app
docker run -d --name my-fastapi-container -p 8000:8000 --rm akylzhanov/my-fastapi-app
Development
- Clone this repository
- Requirements:
- Poetry
- Python 3.8.1+
- Create a virtual environment and install the dependencies
poetry install
- Activate the virtual environment
poetry shell
Testing
pytest tests
Pre-commit
Pre-commit hooks run all the auto-formatters (e.g. black, isort), linters (e.g. mypy, flake8), and other quality
checks to make sure the changeset is in good shape before a commit/push happens.
You can install the hooks with (runs for each commit):
pre-commit install
Or if you want them to run only for each push:
pre-commit install -t pre-push
Or, if you want to run all checks manually on all files:
pre-commit run --all-files
How to run load tests
- Start the service locally:
python3 -m recommendation_model_server \
--host 0.0.0.0 \
--port 8000 \
--recommendation-model-path artifacts/rate_venues.pickle
- Run the load test with Locust, ramping to 1 million users at a spawn rate of 100 users per second:
poetry run pytest tests/test_invokust_load.py -s
The output is similar to the following (times are in milliseconds):
Ramping to 1000000 users at a rate of 100.00 per second
Type Name # reqs # fails | Avg Min Max Med | req/s failures/s
--------||-------|-------------|-------|-------|-------|-------|--------|-----------
POST /predict 1453 0(0.00%) | 448 5 1948 390 | 167.83 0.00
--------||-------|-------------|-------|-------|-------|-------|--------|-----------
Aggregated 1453 0(0.00%) | 448 5 1948 390 | 167.83 0.00
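A quick calculation helps put the sample output above in context. The sketch below uses only the figures shown and assumes that the reported req/s is an average over the whole run, which may not hold exactly; under that assumption the run was far shorter than a full ramp to the target user count.

```python
# Figures taken from the sample output above.
target_users = 1_000_000
spawn_rate = 100.0            # users per second
total_requests = 1453
requests_per_second = 167.83

# Time needed to reach the full target at the configured spawn rate.
full_ramp_seconds = target_users / spawn_rate

# Assumption: req/s is averaged over the whole run, so this estimates the run length.
elapsed_seconds = total_requests / requests_per_second

# Users spawned by the end of the run, capped at the target.
users_spawned = min(target_users, int(elapsed_seconds * spawn_rate))

print(f"full ramp: {full_ramp_seconds / 3600:.2f} h")
print(f"estimated run length: {elapsed_seconds:.1f} s")
print(f"estimated users spawned: {users_spawned}")
```

If the estimate holds, the latency figures describe the service under hundreds of concurrent users, not one million.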
This project was generated using the wolt-python-package-cookiecutter template.