Deploy DL/ML inference pipelines with minimal extra code.

Project description

fastDeploy

Easy and performant microservices for Python Deep Learning inference pipelines

  • Deploy any Python inference pipeline with minimal extra code
  • Auto-batching of concurrent inputs is enabled out of the box
  • No changes to inference code (unlike TF Serving etc.); the entire pipeline runs as-is
  • Prometheus metrics (OpenMetrics) are exposed for monitoring
  • Auto-generates clean Dockerfiles and APIs friendly to Kubernetes health checks and scaling
  • Sequentially chained inference pipelines are supported out of the box
  • Can be queried from any language via easy-to-use REST APIs
  • Easy-to-understand, simple codebase (simple consumer-producer architecture)

Installation:

pip install --upgrade fastdeploy fdclient
# fdclient is optional, only needed if you want to use the Python client

CLI explained

Start fastDeploy server on a recipe:

# Invoke fastdeploy 
python -m fastdeploy --help
# or
fastdeploy --help

# Start prediction "loop" for recipe "echo"
fastdeploy --loop --recipe recipes/echo

# Start rest apis for recipe "echo"
fastdeploy --rest --recipe recipes/echo

Send a request and get predictions:
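
With the REST server for the "echo" recipe running, predictions can be fetched with the optional Python client. A minimal sketch, assuming the server is listening on localhost:8080 and that your installed fdclient version exposes the client shown below:

from fdclient import FDClient  # optional client, installed above

# Assumes the REST server started in the previous step is
# listening locally on port 8080
client = FDClient("http://localhost:8080")

# The "echo" recipe returns its inputs unchanged, so this should
# print the two strings back
print(client.infer(["hello", "world"]))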

Auto-generate a Dockerfile and build a Docker image:

# Writes the Dockerfile for recipe "echo"
# and builds the Docker image if Docker is installed
# Base image defaults to python:3.8-slim
fastdeploy --build --recipe recipes/echo

# Run docker image
docker run -it -p 8080:8080 fastdeploy_echo

Serving your model (recipe):
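
A recipe is a directory containing a predictor.py that exposes a predictor function: it receives a list of inputs (auto-batched from concurrent requests) and must return one output per input. A minimal sketch along the lines of the bundled "echo" recipe; exact file and argument conventions may vary between fastDeploy versions:

# recipes/echo/predictor.py (sketch)

def predictor(inputs, batch_size=1):
    # In a real recipe, load your model once at module import time
    # and run batch inference here; fastDeploy calls this function
    # from the prediction loop. "echo" just returns the auto-batched
    # inputs unchanged, one output per input.
    return inputs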

Where to use fastDeploy?

  • To deploy any non-ultra-lightweight model, i.e., most DL models with >50 ms inference time per example
  • If the model/pipeline benefits from batch inference, fastDeploy is perfect for your use case
  • If you receive individual inputs (e.g., a user's search query that needs to be vectorized, or an image to be classified)
  • With individual inputs, requests arriving at close intervals are batched together and sent to the model as a single batch
  • Perfect for creating internal microservices that separate your model and pre-/post-processing from business logic
  • Since the prediction loop and the inference endpoints are separate and connected via a SQLite-backed queue, they can be scaled independently (see the sketch after this list)
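
For example, because the prediction loop and the REST endpoints are separate processes, throughput can be raised by running extra prediction loops against the same recipe. A sketch, assuming the processes share the recipe directory and hence its queue:

# One REST process, two prediction loops consuming the same queue
fastdeploy --rest --recipe recipes/echo &
fastdeploy --loop --recipe recipes/echo &
fastdeploy --loop --recipe recipes/echo &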

Where not to use fastDeploy?

  • Models that are not CPU/GPU heavy and are better off running in parallel rather than in batches
  • If your predictor calls an external API, uploads to S3, etc. in a blocking way
  • I/O-heavy, non-batching use cases (e.g., querying Elasticsearch or a database for each input)
  • For these cases, it is better to serve directly from the REST API code (instead of the consumer-producer mechanism) so that high concurrency can be achieved


Download files

Download the file for your platform.

Source Distribution

  • fastdeploy-3.1.1.tar.gz (16.9 kB, Source)

Built Distribution

  • fastdeploy-3.1.1-py3-none-any.whl (16.7 kB, Python 3)

File details

Details for the file fastdeploy-3.1.1.tar.gz.

File metadata

  • Download URL: fastdeploy-3.1.1.tar.gz
  • Upload date:
  • Size: 16.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for fastdeploy-3.1.1.tar.gz:

  • SHA256: d080338d0806b8176df0a8779f6c0fbb4f1fa6379cb527eca2457d5608e7f329
  • MD5: d195699531dc2a0a0774b7a2ba4a72a5
  • BLAKE2b-256: f08735d833b9c938f29ab873714e4a744c83545a6a27725bfb7e2b63e13c8a9e


Provenance

The following attestation bundles were made for fastdeploy-3.1.1.tar.gz:

Publisher: main.yml on notAI-tech/fastDeploy


File details

Details for the file fastdeploy-3.1.1-py3-none-any.whl.

File metadata

  • Download URL: fastdeploy-3.1.1-py3-none-any.whl
  • Upload date:
  • Size: 16.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for fastdeploy-3.1.1-py3-none-any.whl:

  • SHA256: 4c209874f6884c74d3d33d3b9f76b54ef8641b1bb8f0384eab782c3f6b36180b
  • MD5: f8ee98bafd0622a2f0e87dd82fec0f3c
  • BLAKE2b-256: 84f1a8bf61b9dd0b58a8d0a73e42e73c4fe7e77206757dca7035018003f8bf99


Provenance

The following attestation bundles were made for fastdeploy-3.1.1-py3-none-any.whl:

Publisher: main.yml on notAI-tech/fastDeploy

