ML Model Serving Package
Project description
[Some Info]
- Currently supported request type: `request_content_type=application/json`
[Instructions]
- Save your model in `.joblib` format. Example:

```python
from joblib import dump

your_model_artifact = {
    "model": your_model,
    # other metadata
    "tokenizer": ...,
    "quantization": ...,
}

dump(your_model_artifact, "MODEL_ARTIFACT_PATH.joblib")
```
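To sanity-check a saved artifact, here is a minimal round-trip sketch (assuming `joblib` is installed); the toy callable stands in for a real model and is purely illustrative:

```python
import os
import tempfile

from joblib import dump, load


# Toy stand-in for a real model: any callable works for this sketch.
def toy_model(values):
    return [v * 2 for v in values]


artifact = {
    "model": toy_model,
    "tokenizer": None,      # placeholder metadata
    "quantization": None,
}

# Dump the artifact dict and load it back from a temp path.
path = os.path.join(tempfile.mkdtemp(), "toy_artifact.joblib")
dump(artifact, path)

loaded = load(path)
print(loaded["model"]([1, 2, 3]))  # -> [2, 4, 6]
```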
- Create an inference script `inference.py` with two functions, `input_fn` and `predict_fn` (similar to how SageMaker inference works). Usually you'll create one inference file for each model you register. Example:

```python
def input_fn(data):
    processed_data_for_model_input = ...  # some transformation logic
    return processed_data_for_model_input

def predict_fn(model_input, model):
    result = model(model_input)
    return result
```
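As a concrete, hypothetical example, an `inference.py` for a model that expects a list of floats might look like this; only the function names `input_fn` and `predict_fn` come from the package, everything else is illustrative:

```python
# inference.py -- illustrative example for a model taking a list of floats.

def input_fn(data):
    # Expect a JSON payload like {"data": ["1.5", "2.0"]};
    # coerce the values to floats for the model.
    return [float(v) for v in data["data"]]

def predict_fn(model_input, model):
    # The model is whatever was stored under "model" in the artifact;
    # here it is assumed to be a plain callable.
    return model(model_input)
```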
- Register your model: run

```shell
deployaible register --name=MODEL_NAME --model_path=MODEL_ARTIFACT_PATH_JOBLIB --inference_path=INFERENCE_SCRIPT_PATH
```

- Serve your model: run

```shell
deployaible serve --port=your_port
```

  You will get a backend running on `your_port` (the default is 9000). A sample endpoint will be `localhost:9000/your_model_name/predict`.
- Test the endpoint: run

```shell
curl -X POST -H "Content-Type: application/json" -d '{"data": ["val"]}' http://localhost:9100/GPT4/predict
```
- You can also explore the APIs via the Swagger UI at `http://localhost:your_port/docs`.
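The flow behind a `/your_model_name/predict` endpoint can be sketched as: parse the request body, transform it with `input_fn`, then run `predict_fn` against the registered artifact. This is an illustrative sketch (`handle_predict` and the toy wiring are hypothetical, not the package's actual implementation):

```python
# Hypothetical sketch of what a /your_model_name/predict handler does:
# transform the request JSON with input_fn, then run predict_fn
# against the model stored in the registered artifact.

def handle_predict(request_json, artifact, input_fn, predict_fn):
    model_input = input_fn(request_json)
    return predict_fn(model_input, artifact["model"])


# Example wiring with toy pieces standing in for a real model
# and a real inference script:
artifact = {"model": lambda xs: [x + 1 for x in xs]}
result = handle_predict(
    {"data": [1, 2]},
    artifact,
    input_fn=lambda req: req["data"],
    predict_fn=lambda inp, model: model(inp),
)
print(result)  # -> [2, 3]
```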
sample_notebook_placeholder
sample_architecture_placeholder
Highlights
- Supports multiple types of model serving
- Sample UI
- Works on Linux/MacOS/Windows
Install
[TODO] git instruction or pip install instruction
Basic Usage
Advanced Usage
Misc
Performance
Documentation
[TODO] set up using this link
Bugs/Requests
License
TODO's
- `models.py` - the `init` method needs to use a model loader and allow torch/pickle/sklearn types
- `model_manager.py` - enforce the Singleton pattern with the right locking mechanisms (also need to change the test case)
- bugs: `AssertionError: write() before start_response` when going to the predict page and then going back
- Celery component: add `try/except KeyboardInterrupt` as a potential fix for keeping the celery worker running
- Kafka component: add APIs for submitting data to and listening for results from Kafka
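The Singleton item above could be implemented with double-checked locking; a minimal sketch (the `ModelManager` body here is a placeholder, not the real class):

```python
import threading


class ModelManager:
    """Placeholder sketch of a thread-safe Singleton ModelManager."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking: take the lock only when no instance
        # exists yet, then re-check inside the lock so concurrent
        # callers cannot each create an instance.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance


a = ModelManager()
b = ModelManager()
print(a is b)  # -> True
```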
To publish:

```shell
python setup.py sdist
twine upload dist/*
```