Python client library for the AMD Inference Server: unified inference across AMD CPUs, GPUs, and FPGAs

Project description

The AMD Inference Server is an open-source tool for deploying your machine learning models and making them accessible to clients for inference. Out of the box, the server supports selected models that run on AMD CPUs, GPUs, or FPGAs by leveraging existing libraries. For all of these models and hardware accelerators, the server presents a common user interface based on community standards, so clients can make requests to any of them using the same API. The server provides HTTP/REST and gRPC interfaces for clients to submit requests, and both have C++ and Python bindings to simplify writing client programs. You can also call the server backend directly through its native C++ API to write local applications.

Features

  • Supports client requests over HTTP/REST, gRPC, and WebSocket protocols using an API based on KServe's v2 specification

  • Custom applications can bypass these protocols and call the backend directly using the native C++ API

  • C++ library with Python bindings to simplify making requests to the server

  • Incoming requests are transparently batched according to user-defined specifications

  • Users can define how many models, and how many instances of each, to run in parallel
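To make the KServe v2 API above concrete, here is a sketch of what an inference request body looks like on the wire, assembled with nothing but the standard library. The tensor name, shape, and data are illustrative placeholders, not values the server requires:

```python
import json

# Build a KServe v2-style inference request body by hand.
# "input_0" and the FP32 tensor below are illustrative placeholders.
request_body = {
    "inputs": [
        {
            "name": "input_0",
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3, 0.4],
        }
    ]
}

# Serialize to the JSON payload a v2 server would accept at
# POST /v2/models/<model>/infer
payload = json.dumps(request_body)
print(payload)
```

In practice the amdinfer client bindings construct and send this payload for you; the sketch only shows the shape of the protocol.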

The AMD Inference Server is integrated with the following libraries out of the gate:

  • TensorFlow and PyTorch models with ZenDNN on CPUs (optimized for AMD CPUs)

  • ONNX models with MIGraphX on AMD GPUs

  • XModel models with Vitis AI on AMD FPGAs

  • A graph of computation, including pre- and post-processing, can be written using AKS on AMD FPGAs for end-to-end inference

Quick Start Deployment and Inference

The following example demonstrates how to deploy the server locally and run a sample inference. This example runs on the CPU and does not require any special hardware. You can see a more detailed version of this example in the quickstart.

# Step 1: Download the example files and create a model repository
wget https://github.com/Xilinx/inference-server/raw/main/examples/resnet50/quickstart-setup.sh
chmod +x ./quickstart-setup.sh
./quickstart-setup.sh

# Step 2: Launch the AMD Inference Server
docker run -d --net=host -v ${PWD}/model_repository:/mnt/models:rw amdih/serve:uif1.1_zendnn_amdinfer_0.3.0 amdinfer-server --enable-repository-watcher

# Step 3: Install the Python client library
pip install amdinfer

# Step 4: Send an inference request
python3 tfzendnn.py --endpoint resnet50 --image ./dog-3619020_640.jpg --labels ./imagenet_classes.txt

# Inference should print the following:
#
#     Running the TF+ZenDNN example for ResNet50 in Python
#     Waiting until the server is ready...
#     Making inferences...
#     Top 5 classes for ../../tests/assets/dog-3619020_640.jpg:
#       n02112018 Pomeranian
#       n02112350 keeshond
#       n02086079 Pekinese, Pekingese, Peke
#       n02112137 chow, chow chow
#       n02113023 Pembroke, Pembroke Welsh corgi
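The `tfzendnn.py` script wraps these steps using the amdinfer bindings. For illustration, a hand-rolled client can be sketched with only the standard library; the server address (the default HTTP port is assumed here) and the URL helpers are assumptions based on the KServe v2 protocol, not the amdinfer API itself:

```python
import json
import urllib.request

# Assumed server address; the quickstart container runs on the host network.
SERVER = "http://127.0.0.1:8998"

def ready_url(base: str) -> str:
    """URL a KServe v2 client polls to check server readiness."""
    return f"{base}/v2/health/ready"

def infer_url(base: str, model: str) -> str:
    """URL a KServe v2 client posts inference requests to."""
    return f"{base}/v2/models/{model}/infer"

def make_request(data: list, shape: list) -> bytes:
    """Serialize a minimal v2 inference request body."""
    body = {"inputs": [{"name": "input_0", "shape": shape,
                        "datatype": "FP32", "data": data}]}
    return json.dumps(body).encode()

if __name__ == "__main__":
    # Only attempted when a server is actually running locally.
    req = urllib.request.Request(
        infer_url(SERVER, "resnet50"),
        data=make_request([0.0] * 4, [1, 4]),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```

The amdinfer Python client hides this plumbing behind a higher-level API, which is why the quickstart installs it rather than using raw HTTP.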

Learn more

The documentation for the AMD Inference Server is available online.

Check out the quickstart online to help you get started.

Support

Raise issues if you find a bug or need help. Refer to Contributing for more information.

License

The AMD Inference Server is licensed under the terms of Apache 2.0 (see LICENSE). The LICENSE file contains additional license information for third-party files distributed with this work. More license information can be found in the dependencies documentation.

IMPORTANT NOTICE CONCERNING OPEN-SOURCE SOFTWARE

Materials in this release may be licensed by Xilinx or third parties and may be subject to the GNU General Public License, the GNU Lesser General Public License, or other licenses.

Licenses and source files may be downloaded from:

Note: You are solely responsible for checking the header files and other accompanying source files (i) provided within, in support of, or that otherwise accompanies these materials or (ii) created from the use of third party software and tools (and associated libraries and utilities) that are supplied with these materials, because such header and/or source files may contain or describe various copyright notices and license terms and conditions governing such files, which vary from case to case based on your usage and are beyond the control of Xilinx. You are solely responsible for complying with the terms and conditions imposed by third parties as applicable to your software applications created from the use of third party software and tools (and associated libraries and utilities) that are supplied with the materials.

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • amdinfer-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23.5 MB)
    CPython 3.10, manylinux: glibc 2.17+ x86-64

  • amdinfer-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23.5 MB)
    CPython 3.9, manylinux: glibc 2.17+ x86-64

  • amdinfer-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23.5 MB)
    CPython 3.8, manylinux: glibc 2.17+ x86-64

  • amdinfer-0.4.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23.5 MB)
    CPython 3.7m, manylinux: glibc 2.17+ x86-64

File details

Details for the file amdinfer-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

  • Download URL: amdinfer-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
  • Upload date:
  • Size: 23.5 MB
  • Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.3 readme-renderer/34.0 requests/2.25.1 requests-toolbelt/0.10.1 urllib3/1.26.6 tqdm/4.64.1 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.13

File hashes

Hashes for amdinfer-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: 76653c057fa09b84862942450568abd68e7a634a4791499ff17de6d75ada66a0
  • MD5: d69ab46592897e20e15023d09fcb941b
  • BLAKE2b-256: 78b25f68d8dcf510955baee14957ed74cc6e710f86494bd4b749a143d18be731
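The published hashes can be checked locally after downloading a wheel. A minimal sketch using Python's hashlib (the filename in the comment is illustrative):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Compute the SHA256 hex digest of a file, reading in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Example: compare against the digest published above for the cp310 wheel.
# expected = "76653c057fa09b84862942450568abd68e7a634a4791499ff17de6d75ada66a0"
# assert sha256_of("amdinfer-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl") == expected
```

pip can also enforce hashes automatically when installing from a requirements file with the --require-hashes option.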

File details

Details for the file amdinfer-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

  • Download URL: amdinfer-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
  • Upload date:
  • Size: 23.5 MB
  • Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.3 readme-renderer/34.0 requests/2.25.1 requests-toolbelt/0.10.1 urllib3/1.26.6 tqdm/4.64.1 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.13

File hashes

Hashes for amdinfer-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: eb5ed9a2a27507a8fa319be65f16126ec8ff4d2d7744376fcbfb8c956c4ab883
  • MD5: db615b2e6e3edef42408ab931d3461e2
  • BLAKE2b-256: 678804525ca37a38645fd44bbe3581ebb361371d8074852b4633c5675360600a

File details

Details for the file amdinfer-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

  • Download URL: amdinfer-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
  • Upload date:
  • Size: 23.5 MB
  • Tags: CPython 3.8, manylinux: glibc 2.17+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.3 readme-renderer/34.0 requests/2.25.1 requests-toolbelt/0.10.1 urllib3/1.26.6 tqdm/4.64.1 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.13

File hashes

Hashes for amdinfer-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: fbc41cb11dfcc0f2a17ef3e1cbc672cd902b618aa8adf461bc2ec1cdd0f87421
  • MD5: 8d75305f39806d153ac3c3bd78291787
  • BLAKE2b-256: 8943a829dc916b10307ac0a326b8c3bcfbdf40085925670eea30379e2587fe87

File details

Details for the file amdinfer-0.4.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

  • Download URL: amdinfer-0.4.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
  • Upload date:
  • Size: 23.5 MB
  • Tags: CPython 3.7m, manylinux: glibc 2.17+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.3 readme-renderer/34.0 requests/2.25.1 requests-toolbelt/0.10.1 urllib3/1.26.6 tqdm/4.64.1 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.5 CPython/3.6.13

File hashes

Hashes for amdinfer-0.4.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: ef5723fbb81645b4181d9dbe8bccdc81f6fdabc3cf88ab757935f8c873f1fdba
  • MD5: eca6bde6616850fad8536006b6e1e4f2
  • BLAKE2b-256: c74915787aad7f4919a70f2912fcda0a60e512474b6b0353d3abcda330b1e7d5
