Deploy DL/ML inference pipelines with minimal extra code.
Project description
fastDeploy
Easy and performant microservices for Python Deep Learning inference pipelines
- Deploy any Python inference pipeline with minimal extra code
- Auto-batching of concurrent inputs is enabled out of the box
- No changes to inference code (unlike tf-serving etc.); the entire pipeline runs as is
- Prometheus metrics (OpenMetrics) are exposed for monitoring
- Auto-generates clean Dockerfiles and Kubernetes-friendly health-check and scaling APIs
- Sequentially chained inference pipelines are supported out of the box
- Can be queried from any language via easy-to-use REST APIs
- Easy to understand (simple consumer-producer architecture) and simple codebase
Installation:
pip install --upgrade fastdeploy fdclient
# fdclient is optional; only needed if you want to use the Python client
CLI explained
Start fastDeploy server on a recipe:
# Invoke fastdeploy
python -m fastdeploy --help
# or
fastdeploy --help
# Start prediction "loop" for recipe "echo"
fastdeploy --loop --recipe recipes/echo
# Start REST APIs for recipe "echo"
fastdeploy --rest --recipe recipes/echo
Send a request and get predictions:
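Below is a minimal sketch of querying the running service with the optional Python client. The FDClient class, its infer() method, and the default port are assumptions; check the fdclient package and the bundled recipes for the exact interface. Any language that can speak HTTP can hit the same REST API directly.
# query_echo.py -- minimal client sketch (assumes the REST server from the
# previous step is running on localhost:8080 and fdclient is installed)
from fdclient import FDClient

client = FDClient("http://localhost:8080")

# The "echo" recipe simply returns its inputs, so the output should mirror them
inputs = ["hello", "world"]
print(client.infer(inputs))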
Auto-generate a Dockerfile and build the Docker image:
# Writes the Dockerfile for recipe "echo"
# and builds the Docker image if Docker is installed
# Base image defaults to python:3.8-slim
fastdeploy --build --recipe recipes/echo
# Run docker image
docker run -it -p8080:8080 fastdeploy_echo
Serving your model (recipe):
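A recipe is just a directory holding your prediction code. The sketch below shows the general shape, assuming the usual convention of a predictor.py that exposes a predictor(inputs, batch_size=...) function mapping a list of inputs to a list of outputs (plus a pickled list of example inputs for warmup); see recipes/echo in the repository for the canonical layout.
# recipes/my_recipe/predictor.py -- minimal recipe sketch (hypothetical recipe name).
# Assumed convention: the prediction loop imports this module and calls
# predictor() on auto-formed batches.

def predictor(inputs, batch_size=4):
    # `inputs` is a list of raw inputs collected by auto-batching.
    # A real recipe would load its model once at module level; here we just
    # upper-case strings to keep the sketch self-contained.
    return [str(x).upper() for x in inputs]

if __name__ == "__main__":
    # Quick local sanity check before starting --loop / --rest
    print(predictor(["hello", "world"]))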
Where to use fastDeploy?
- To deploy any non-ultra-lightweight model, i.e. most DL models with >50 ms inference time per example
- If the model/pipeline benefits from batch inference, fastDeploy is perfect for your use case
- If you are going to receive individual inputs (for example, a user's search query that needs to be vectorized, or an image to be classified)
- For individual inputs, requests coming in at close intervals are batched together and sent to the model as a single batch (see the sketch after this list)
- Perfect for creating internal microservices that separate your model and pre/post-processing from business logic
- Since the prediction loop and the inference endpoints are separate and connected via a SQLite-backed queue, they can be scaled independently
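To make the auto-batching point above concrete, the sketch below fires several single-input requests concurrently; requests landing within a short window are expected to be grouped into one batch by the prediction loop. It reuses the assumed fdclient interface from the earlier example against a hypothetical service on port 8080.
# concurrent_requests.py -- sketch of individual inputs sent concurrently
from concurrent.futures import ThreadPoolExecutor

from fdclient import FDClient

client = FDClient("http://localhost:8080")

def single_request(i):
    # Each call carries one input; the server is free to batch calls that
    # arrive close together before running the predictor
    return client.infer([f"input-{i}"])

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(single_request, range(8)))

print(results)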
Where not to use fastDeploy?
- Models that are not CPU/GPU heavy and are better off running in parallel rather than in batches
- If your predictor calls an external API, uploads to S3, etc. in a blocking way
- I/O-heavy, non-batching use cases (e.g. querying Elasticsearch or a database for each input)
- For these cases it is better to do the work directly in the REST API code (instead of going through the consumer-producer mechanism) so that high concurrency can be achieved; see the sketch below
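For contrast, here is a minimal sketch (not part of fastDeploy) of handling I/O-heavy, non-batching work directly in the API process using asyncio concurrency; the external dependency is simulated with asyncio.sleep.
# direct_io.py -- sketch of the alternative for I/O-bound, non-batching work
import asyncio

async def handle_one(item):
    # Stand-in for an external dependency (DB/ES query, S3 upload, ...)
    await asyncio.sleep(0.1)
    return {"input": item, "status": "done"}

async def handle_many(items):
    # Requests are awaited concurrently; throughput is bounded by I/O, not by
    # batch formation, so there is nothing for a batching queue to help with
    return await asyncio.gather(*(handle_one(i) for i in items))

if __name__ == "__main__":
    print(asyncio.run(handle_many(["a", "b", "c"])))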
Download files
Source Distribution
fastdeploy-3.1.1.tar.gz (16.9 kB)
Built Distribution
fastdeploy-3.1.1-py3-none-any.whl (16.7 kB)
File details
Details for the file fastdeploy-3.1.1.tar.gz.
File metadata
- Download URL: fastdeploy-3.1.1.tar.gz
- Size: 16.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | d080338d0806b8176df0a8779f6c0fbb4f1fa6379cb527eca2457d5608e7f329
MD5 | d195699531dc2a0a0774b7a2ba4a72a5
BLAKE2b-256 | f08735d833b9c938f29ab873714e4a744c83545a6a27725bfb7e2b63e13c8a9e
Provenance
The following attestation bundles were made for fastdeploy-3.1.1.tar.gz:
Publisher: main.yml on notAI-tech/fastDeploy
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fastdeploy-3.1.1.tar.gz
- Subject digest: d080338d0806b8176df0a8779f6c0fbb4f1fa6379cb527eca2457d5608e7f329
- Sigstore transparency entry: 147945044
File details
Details for the file fastdeploy-3.1.1-py3-none-any.whl.
File metadata
- Download URL: fastdeploy-3.1.1-py3-none-any.whl
- Size: 16.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4c209874f6884c74d3d33d3b9f76b54ef8641b1bb8f0384eab782c3f6b36180b
MD5 | f8ee98bafd0622a2f0e87dd82fec0f3c
BLAKE2b-256 | 84f1a8bf61b9dd0b58a8d0a73e42e73c4fe7e77206757dca7035018003f8bf99
Provenance
The following attestation bundles were made for fastdeploy-3.1.1-py3-none-any.whl:
Publisher: main.yml on notAI-tech/fastDeploy
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fastdeploy-3.1.1-py3-none-any.whl
- Subject digest: 4c209874f6884c74d3d33d3b9f76b54ef8641b1bb8f0384eab782c3f6b36180b
- Sigstore transparency entry: 147945045